Prometheus Operator for simplifying deployment and management of Prometheus instances

Yogesh Kumar
6 min read · May 2, 2021

Photo by Markus Spiske on Unsplash

This post investigates the Prometheus Operator with the following objectives:

  • Understand the Prometheus Operator architecture.
  • Configure Prometheus to collect custom metrics from applications running in different namespaces.
  • Let a non-admin user configure custom metrics in their own namespace, rather than in the namespace where Prometheus is deployed.
  • Collect custom metrics from external services running outside the cluster where Prometheus is deployed.

Overview

Prometheus is an open-source monitoring and alerting toolkit originally built at SoundCloud, starting in 2012. For a detailed introduction, see:

https://prometheus.io/

Configuring Prometheus is not a trivial task: it requires domain-specific knowledge, including the Prometheus configuration format and the Kubernetes service-discovery settings. Acquiring this knowledge takes time and effort.

The diagram below shows the Prometheus architecture:

We can dramatically simplify the deployment and management of our Prometheus instances with the Prometheus Operator developed by CoreOS.

The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.

The Prometheus Operator is far more dynamic than a default Prometheus install. It adds several CRDs (CustomResourceDefinitions) that let you reconfigure your Prometheus setup dynamically and transparently.

A “ServiceMonitor” is one of these resources: it describes which Services to monitor, and Prometheus scrapes the pods behind those Services.

The diagram below shows the Prometheus Operator architecture:

Installation and Configuration of Prometheus Operator

This involves the following steps.

Create a k8s cluster with “kubeadm”. The steps are described here:

https://gist.github.com/ykumar-rb/20df3608445160f81f1da4a7622c5faa

Prerequisites for the Spike

The spike uses the source code at:
https://github.com/ykumar-rb/POC/tree/master/Prometheus-Operator-monitoring/

git clone https://github.com/ykumar-rb/POC.git

cd POC/

Create Prometheus Operator in admin namespace (say monitoring)

Create the Prometheus Operator pod and CRDs with the commands below.

kubectl create ns monitoring

kubectl apply -f prometheus-operator -n monitoring

Create a Prometheus resource in admin namespace (say monitoring)

kubectl apply -f prometheus-cluster-monitoring -n monitoring

Ensure the Prometheus manifest sets serviceMonitorNamespaceSelector: {} if ServiceMonitors from other namespaces should be discovered. An empty selector matches every namespace, which can be a security concern, so we may instead select only the namespaces whose metrics should be scraped.

RBAC needs to be changed accordingly.
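As a rough illustration of what "changed accordingly" could mean, here is a minimal sketch of a namespaced Role and RoleBinding that lets the Prometheus service account discover scrape targets in one extra namespace. The service account name "prometheus-k8s" and the object names are assumptions, not taken from the repository; "apps" is the namespace created later in this post.

```yaml
# Hypothetical sketch: grant Prometheus read access to scrape targets
# in the "apps" namespace only (names below are assumptions).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: apps
rules:
  # Prometheus needs to list and watch these objects to discover targets.
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
  # Assumed service account used by the Prometheus pods in "monitoring".
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: monitoring
```

Repeating such a Role and RoleBinding per namespace keeps access narrower than a cluster-wide ClusterRole, at the cost of one extra manifest for each namespace to be scraped.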

serviceMonitorNamespaceSelector: {}
serviceMonitorSelector:
  matchExpressions

With “serviceMonitorSelector”, Prometheus scrapes metrics only for the “ServiceMonitors” that carry the matching labels.
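To make the two selectors concrete, here is a minimal sketch of a Prometheus resource. It is not the manifest from the repository: the resource name "k8s" is inferred from the pod name prometheus-k8s-0 shown later, and the service account name and the matchExpressions entry are assumptions for illustration.

```yaml
# Hypothetical sketch of a Prometheus custom resource.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                 # assumed; the operator names pods prometheus-<name>-<n>
  namespace: monitoring
spec:
  replicas: 1
  serviceAccountName: prometheus-k8s   # assumed service account
  # Empty selector: ServiceMonitors in every namespace are considered.
  serviceMonitorNamespaceSelector: {}
  # Only ServiceMonitors carrying a "k8s-app" label (any value) are picked up.
  serviceMonitorSelector:
    matchExpressions:
      - key: k8s-app
        operator: Exists
```

To narrow discovery, the empty namespace selector could be replaced with a matchLabels block that matches a label placed on the chosen namespaces.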

Deploy an app shipping Prometheus-format metrics in a different namespace (apps)

kubectl create ns apps

kubectl create ns test-cluster

kubectl apply -f 3rd-party-apps/node-exporter-apps/apps -n apps

kubectl create -f minimal_prometheus_operator_poc/3rd-party-apps-for-metrics-scraping/rpc-app -n test-cluster

# kubectl get pods -n test-cluster
NAME                                 READY   STATUS    RESTARTS   AGE
rpc-app-deployment-b556c5494-4885f   1/1     Running   0          5m40s
rpc-app-deployment-b556c5494-hc67j   1/1     Running   0          5m40s

# kubectl get servicemonitor -n test-cluster
NAME      AGE
rpc-app   6m12s

Create a ServiceMonitor in a different namespace (apps)

The Prometheus manifest defines the serviceMonitorSelector that associates ServiceMonitors with the Prometheus instance. The value of this field should match the label k8s-app specified in the ServiceMonitor manifest used below. Selecting ServiceMonitors by label makes it easy to reconfigure Prometheus dynamically.

kubectl apply -f 3rd-party-apps/node-exporter-apps/service-monitors -n apps
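The manifests in that directory are not reproduced here. As a rough sketch of how the pieces fit together, a ServiceMonitor selects a Service by label and names the Service port to scrape. Everything named "example-app", the port number and the label values are hypothetical.

```yaml
# Hypothetical Service exposing an app's metrics port.
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: apps
  labels:
    app: example-app        # the ServiceMonitor selects on this label
spec:
  selector:
    app: example-app
  ports:
    - name: web             # the ServiceMonitor refers to this port name
      port: 8080
      targetPort: 8080
---
# Hypothetical ServiceMonitor living in the same, non-admin namespace.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: apps
  labels:
    k8s-app: example-app    # must satisfy the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web
      path: /metrics
      interval: 30s
```

Without a namespaceSelector, a ServiceMonitor only matches Services in its own namespace, which fits the goal of letting a user manage scraping from their own namespace.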

Listing Kubernetes Resources Created

Expose Prometheus UI

Here “172.29.55.252” is the host IP, reachable from the local system’s browser.

# kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
grafana-5874b66f87-25msx               1/1     Running   0          25h
kube-state-metrics-957fd6c75-6xppd     3/3     Running   0          25h
prometheus-k8s-0                       3/3     Running   1          7h19m
prometheus-operator-59b7c5584c-pvqh8   2/2     Running   0          8h

kubectl -n monitoring port-forward --address 172.29.55.252 prometheus-k8s-0 9090

Prometheus Dashboard

The Prometheus Operator automatically created a working Prometheus configuration with the kubernetes_sd_configs for the auto-discovery of Kubernetes service endpoints.

The configuration page can be accessed at: http://172.29.55.252:9090/config

Scraping metrics from external services

External services here are services running outside the k8s cluster (“cluster-1”). This section focuses on how the Prometheus instance scrapes metrics from an endpoint exposed by an external service. To scrape metrics from an external service, we create a Service that does not use selectors and manually define Endpoints for that Service. Finally, we create a ServiceMonitor for the newly created Service.

For the spike, we create an “rpc-app” in a different k8s cluster (cluster-2), and Prometheus running on the previous k8s cluster (cluster-1) scrapes metrics from this external service.

For demonstration, create an “rpc-app” on cluster-2

Open a new terminal and export the kubeconfig of cluster-2 before creating “rpc-app”.

kubectl create ns custom-ns

kubectl create -f Prometheus-Operator-monitoring/external-svc/rpc_apps/external_entity -n custom-ns

$ kubectl get all -n custom-ns | grep rpc-app
pod/rpc-app-deployment-7fc8fc987b-65fv7   1/1   Running   0   168m
pod/rpc-app-deployment-7fc8fc987b-x7cpf   1/1   Running   0   168m
pod/svclb-rpc-app-service-nwr86           1/1   Running   0   168m
service/rpc-app-service   LoadBalancer   10.43.125.246   172.29.55.41   8081:30298/TCP   168m
daemonset.apps/svclb-rpc-app-service   3   3   1   3   1   <none>   168m
deployment.apps/rpc-app-deployment   2/2   2   2   168m
replicaset.apps/rpc-app-deployment-7fc8fc987b   2   2   2   168m

Once the app is up and running on cluster-2, we can switch back to the “cluster-1” context to create the Service, Endpoints and ServiceMonitor used to scrape data from it.

NOTE: The name of the Endpoints object must match the name of the Service (the labels are kept identical here as well). This is how a Service without selectors gets associated with its Endpoints.
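A minimal sketch of such a selector-less Service and its manually defined Endpoints could look like this. The names, labels, ports and IP are taken from the command output shown further down in this post; the exact layout of the manifests in the repository is an assumption.

```yaml
# Sketch: Service without a selector, so Kubernetes creates no Endpoints for it.
apiVersion: v1
kind: Service
metadata:
  name: custom-external-rpc-svc
  namespace: test-cluster
  labels:
    app: custom-external-rpc-svc
spec:
  type: ClusterIP
  ports:
    - name: web
      port: 9106
      targetPort: 8081
---
# Sketch: Endpoints with the SAME name as the Service, pointing at the
# external address exposed by rpc-app on cluster-2.
apiVersion: v1
kind: Endpoints
metadata:
  name: custom-external-rpc-svc
  namespace: test-cluster
  labels:
    app: custom-external-rpc-svc
subsets:
  - addresses:
      - ip: 172.29.55.41
    ports:
      - name: web           # same port name as in the Service
        port: 8081
```

Keeping the port name identical on both objects matters because a ServiceMonitor refers to the port by name rather than by number.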

The Service, Endpoints and ServiceMonitor can be created by applying the manifests as below:

kubectl create -f Prometheus-Operator-monitoring/external-svc/rpc_apps/ServiceMonitorAndEndpoint -n test-cluster
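For orientation, a ServiceMonitor for this external service might look roughly like the sketch below. The name, namespace, Service label, port name and interval are consistent with the output and generated job shown in this post; the k8s-app label on the ServiceMonitor itself is an assumption and has to match whatever the Prometheus serviceMonitorSelector expects.

```yaml
# Sketch of a ServiceMonitor for the selector-less external Service.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: custom-external-rpc-svc
  namespace: test-cluster
  labels:
    k8s-app: custom-external-rpc-svc   # assumed label for the Prometheus selector
spec:
  # Select the Service by its "app" label.
  selector:
    matchLabels:
      app: custom-external-rpc-svc
  namespaceSelector:
    matchNames:
      - test-cluster
  endpoints:
    - port: web             # port name defined on the Service and Endpoints
      path: /metrics
      scheme: http
      interval: 30s
```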

# kubectl get servicemonitor -n test-cluster | grep custom-external-rpc
custom-external-rpc-svc   145m

# kubectl get svc -n test-cluster | grep custom-external-rpc
custom-external-rpc-svc   ClusterIP   10.108.91.141   <none>   9106/TCP   146m

# kubectl get endpoints -n test-cluster | grep custom-external-rpc
custom-external-rpc-svc   172.29.55.41:8081   146m

## To check the service-to-endpoint association, verify the Endpoints reflected in the service description

# kubectl describe svc custom-external-rpc-svc -n test-cluster
Name:              custom-external-rpc-svc
Namespace:         test-cluster
Labels:            app=custom-external-rpc-svc
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP:                10.102.242.107
Port:              web  9106/TCP
TargetPort:        8081/TCP
Endpoints:         172.29.55.41:8081
Session Affinity:  None
Events:            <none>

Services Discovered in Prometheus UI

We can see the external service “custom-external-rpc-svc” from “cluster-2” discovered in the Prometheus UI running on “cluster-1”.

Metrics from “custom-external-rpc-svc” can be fetched from the address displayed in the Prometheus targets view: http://172.29.55.41:8081/metrics

Similarly, the generated jobs can be viewed at: http://172.29.55.252:9090/config

The job configuration for, say, “custom-external-rpc-svc” is auto-generated like this:

- job_name: test-cluster/custom-external-rpc-svc/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - test-cluster
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: custom-external-rpc-svc
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: web
    action: replace

