Monitoring Setup

Prerequisites

  • Docker and Docker Compose installed and running (lab 09/10)
  • Inventory API running in Docker Compose at /home/centos/lab10/docker-compose.yml

Procedure: Configure Inventory API Tracing

When to use: The inventory API source has been updated with OpenTelemetry instrumentation. Configure and deploy the new version so traces appear in the central Jaeger instance.

Steps:

  1. Pull the latest inventory API source and rebuild the Docker image (run from the source checkout; git pull assumes a git-based checkout):

    git pull
    sudo docker build -t inventory-api:lab11 .
    

  2. Update the inventory service in /home/centos/lab10/docker-compose.yml — use the new image tag and add three environment variables:

    inventory:
      image: inventory-api:lab11
      environment:
        - OTEL_EXPORTER_OTLP_ENDPOINT=https://jaeger.sysadm.ee
        - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
        - OTEL_SERVICE_NAME=inventory.<vm_name>.sysadm.ee
    
    Replace <vm_name> with your VM's short hostname.

  3. Restart the inventory service:

    cd /home/centos/lab10
    sudo docker compose up -d --force-recreate inventory
    

  4. Send a test request to generate a trace:

    curl -H "Authorization: Bearer 845e6732f32b81dd778972703474ccbb" \
      http://inventory.<vm_name>.sysadm.ee/api/v1/inventory
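
    Before moving on, it can help to confirm the variables actually reached the container and that Jaeger has registered the service. A quick check, assuming the Jaeger query API is exposed at the same jaeger.sysadm.ee hostname:

    # All three OTEL_* variables should appear in the container environment
    sudo docker compose exec inventory env | grep '^OTEL_'

    # List the services Jaeger has seen; yours should appear shortly after
    # the test request above
    curl -s https://jaeger.sysadm.ee/api/services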
    

Troubleshooting:

  • Service not in Jaeger after 5–10 seconds: check container logs for OTLP exporter errors (sudo docker compose logs inventory). Ensure all three env vars are set and OTEL_EXPORTER_OTLP_PROTOCOL is http/protobuf.
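
  To narrow the search, grep the container logs for exporter-related lines and confirm the OTLP endpoint is reachable from the host. The grep pattern is a rough guess at typical OpenTelemetry SDK messages, and any HTTP response code (even 405 for a plain GET) shows the endpoint is reachable:

    sudo docker compose logs inventory 2>&1 | grep -iE 'otlp|exporter|trace'
    curl -s -o /dev/null -w '%{http_code}\n' https://jaeger.sysadm.ee/v1/traces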

Procedure: Deploy Prometheus and node_exporter

When to use: Collecting host metrics with Prometheus scraping node_exporter.

Steps:

  1. Create prometheus.yml in your chosen working directory:

    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'node-exporter'
        static_configs:
          - targets: ['node-exporter:9100']
    

  2. Add node-exporter and prometheus services to a docker-compose.yml (see Technologies: Prometheus for the full service definitions).

  3. Add a prometheus-data volume entry to the top-level volumes: section.

  4. Start the services:

    sudo docker compose up -d node-exporter prometheus
    

  5. Open firewall ports:

    sudo firewall-cmd --permanent --add-port=9100/tcp
    sudo firewall-cmd --permanent --add-port=9090/tcp
    sudo firewall-cmd --reload
    

  6. Verify node_exporter is working:

    curl -s http://localhost:9100/metrics | head -5
    

  7. Verify Prometheus is healthy and scraping:

    curl http://localhost:9090/-/healthy
    curl http://localhost:9090/api/v1/targets
    
    Browse to http://<vm-ip>:9090 and open Status → Targets to confirm the node-exporter job is UP.
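
    If a target stays DOWN or Prometheus refuses to start, validating the config file is a good first step. A sketch using the promtool binary that ships in the prom/prometheus image (image tag assumed):

    # Validate prometheus.yml syntax without starting a server
    sudo docker run --rm --entrypoint promtool \
      -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml:ro" \
      prom/prometheus:latest check config /etc/prometheus/prometheus.yml

    # Once the target is UP, an instant query for `up` should return value 1
    # for the node-exporter instance
    curl -s 'http://localhost:9090/api/v1/query?query=up'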

Troubleshooting:

  • node-exporter target DOWN: wait 15 seconds for the first scrape interval to elapse, then check the target name in prometheus.yml for typos.
  • Port not accessible: verify the firewall rules and that the ports are actually bound (ss -tlnp | grep -E '9090|9100').

Procedure: Deploy Loki and Promtail

When to use: Centralizing system logs with Loki and Promtail.

Steps:

  1. Create loki-config.yaml and promtail-config.yaml (see Technologies: Loki and Technologies: Promtail for the full configs).

  2. Add loki and promtail services to docker-compose.yml. Add a loki-data volume entry.

  3. Start the services:

    sudo docker compose up -d loki promtail
    

  4. Open the Loki firewall port:

    sudo firewall-cmd --permanent --add-port=3100/tcp
    sudo firewall-cmd --reload
    

  5. Wait 30–60 seconds, then verify Loki is ready:

    curl http://localhost:3100/ready
    

  6. Verify Promtail has shipped logs:

    curl 'http://localhost:3100/loki/api/v1/query?query={job="varlogs"}&limit=5'
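
    If the query returns no streams, first check what Loki has ingested at all. Two label endpoints make a quick sanity check (the varlogs job name comes from the Promtail config):

    # List known label names; an empty list means nothing was ingested yet
    curl -s http://localhost:3100/loki/api/v1/labels

    # List values of the "job" label; expect "varlogs" once logs arrive
    curl -s http://localhost:3100/loki/api/v1/label/job/values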
    

Troubleshooting:

  • Loki not ready: check logs with sudo docker compose logs loki. First start takes 30–60 s.
  • No logs in Loki: check /var/log/messages exists (ls /var/log/messages) and Promtail can reach Loki.

Procedure: Deploy Grafana and Connect Data Sources

When to use: Adding dashboards to visualize Prometheus metrics and Loki logs.

Steps:

  1. Add the grafana service to docker-compose.yml with a grafana-data volume, publishing the port as 127.0.0.1:3000:3000 so Grafana listens only on the loopback interface (external access goes through the Apache vhost set up in step 3).

  2. Start Grafana:

    sudo docker compose up -d grafana
    

  3. Add a grafana DNS record in your zone file and reload Knot DNS (see Technologies: Grafana for the full DNS, Apache vhost, and reload steps).

  4. Browse to https://grafana.<vm_name>.sysadm.ee and log in with the default credentials (admin / admin); Grafana prompts you to change the password on first login.

  5. Add Prometheus data source:

    • Connections → Data Sources → Add new data source → Prometheus
    • URL: http://prometheus:9090
    • Click Save & Test

  6. Import the Node Exporter dashboard:

    • Dashboards → New → Import
    • Dashboard ID: 1860
    • Select the Prometheus data source → Import

  7. Add Loki data source:

    • Connections → Data Sources → Add new data source → Loki
    • URL: http://loki:3100
    • Click Save & Test

  8. Import the Loki logs dashboard:

    • Dashboard ID: 13639 → Select the Loki data source → Import
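
  Both data sources can also be verified from the shell through Grafana's HTTP API on the loopback binding from step 1 (default admin credentials assumed; adjust if you already changed the password):

    # List configured data sources; expect a Prometheus and a Loki entry
    curl -s -u admin:admin http://127.0.0.1:3000/api/datasources

    # Basic instance health (database status and version)
    curl -s http://127.0.0.1:3000/api/health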

Troubleshooting:

  • Bad Gateway on data source test: Grafana cannot reach Prometheus/Loki. Verify they are in the same Docker network (same docker-compose.yml or same network name).
  • Dashboard shows no data: data source test must succeed first.
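
  To check the shared network directly, inspect it and list the attached containers (the lab10_default network name is an assumption based on the compose project directory; confirm it with sudo docker network ls):

    sudo docker network inspect lab10_default \
      --format '{{range .Containers}}{{.Name}} {{end}}'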

Quick Reference

Action                          Command
──────────────────────────────  ──────────────────────────────────────────────────────
Start all monitoring services   sudo docker compose up -d
Check all containers            sudo docker compose ps
View any service logs           sudo docker compose logs <service>
Check Prometheus targets        curl http://localhost:9090/api/v1/targets
Check Loki ready                curl http://localhost:3100/ready
Check Grafana health            curl -sk https://grafana.<vm_name>.sysadm.ee/api/health
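
For routine checks, the health endpoints above can be swept in one go. A small sketch, assuming all services run on this host:

    #!/usr/bin/env bash
    # monitoring-check.sh — print an HTTP status code per health endpoint
    set -u
    for url in \
      http://localhost:9090/-/healthy \
      http://localhost:3100/ready \
      http://127.0.0.1:3000/api/health
    do
      code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
      echo "$url -> HTTP $code"
    done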