Wednesday, October 2, 2024

Building an observability platform for Kubernetes using CNCF (Cloud Native Computing Foundation) open-source projects

Building an observability platform for Kubernetes using CNCF (Cloud Native Computing Foundation) open-source projects is a great approach to ensuring your cloud-native applications are scalable, reliable, and easy to troubleshoot. Observability platforms typically focus on three pillars: metrics, logs, and traces, providing insights into the health, performance, and behaviour of your Kubernetes clusters and applications.

Here is a step-by-step guide to building an observability platform for Kubernetes using CNCF open-source tools:

1. Metrics Collection with Prometheus

Prometheus is a leading open-source monitoring solution from CNCF, known for scraping metrics from services running in a Kubernetes cluster.

Steps:

  • Install Prometheus Operator: The Prometheus Operator simplifies the deployment and configuration of Prometheus on Kubernetes.
    kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml

  • Configure ServiceMonitors: Prometheus collects metrics by scraping HTTP endpoints. Create ServiceMonitor resources to tell Prometheus which services to scrape; the Prometheus Operator turns them into scrape configuration for you.
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: my-app-metrics
    spec:
      selector:
        matchLabels:
          app: my-app
      endpoints:
        - port: metrics

  • Visualize Metrics with Grafana: Use Grafana to create dashboards for visualizing the metrics scraped by Prometheus. Install Grafana via Helm:
    helm repo add grafana https://grafana.github.io/helm-charts
    helm install grafana grafana/grafana

Connect Grafana to Prometheus and create customized dashboards for various components.
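
To make "scraping endpoints" a bit more concrete, here is a minimal sketch of the text format that Prometheus reads from a metrics endpoint. It uses only the Python standard library; the function name and the sample metric are my own illustration, not part of any Prometheus client library, and label values are not escaped, so treat it as a sketch rather than a production exporter.

```python
def render_metrics(samples):
    """Render samples in the Prometheus text exposition format.

    `samples` is a list of (name, labels, value) tuples. The metric
    names used by callers are hypothetical examples.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            # Labels are rendered as key="value" pairs inside braces.
            # Note: this sketch does not escape quotes or backslashes.
            label_text = ",".join(
                f'{key}="{val}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline.
    return "\n".join(lines) + "\n"
```

A service that returns this text from the port named in the ServiceMonitor is what Prometheus ends up scraping. In a real application you would normally use an official client library instead of formatting the text by hand.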

2. Log Aggregation with Fluentd and Loki

Logs provide detailed information about the system's events. Fluentd is a CNCF graduated project for collecting and forwarding logs, and Loki is an open-source log aggregation system from Grafana Labs (not itself a CNCF project) that is designed to work well in Kubernetes environments.

Steps:

  • Install Fluentd: Fluentd collects, processes, and forwards logs from your Kubernetes nodes and applications.
    helm repo add fluent https://fluent.github.io/helm-charts
    helm install fluentd fluent/fluentd

Configure Fluentd to collect logs from Kubernetes pods and services.

  • Install Loki: Loki stores logs efficiently by indexing only metadata. It works seamlessly with Prometheus and Grafana.
    helm repo add grafana https://grafana.github.io/helm-charts
    helm install loki grafana/loki-stack

  • Integrate with Grafana: Add Loki as a data source in Grafana for querying logs in conjunction with metrics from Prometheus.
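
The point about Loki "indexing only metadata" is easier to see in the shape of a push request. The sketch below builds the JSON body for Loki's push endpoint (/loki/api/v1/push): the labels in "stream" are what gets indexed, while the log lines in "values" are stored as-is. The function names and the label values are assumptions for illustration, and the sketch only builds the payload; it does not send it.

```python
import json
import time


def build_loki_push_payload(labels, lines, now_ns=None):
    """Build a Loki push payload for one stream.

    `labels` is the indexed metadata (for example {"app": "my-app"}).
    `lines` are raw log lines. Timestamps are nanoseconds since the
    epoch, encoded as strings.
    """
    start = now_ns if now_ns is not None else time.time_ns()
    # Give each line a distinct, increasing timestamp.
    values = [[str(start + offset), line] for offset, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def to_request_body(payload):
    """Serialize the payload as the JSON request body."""
    return json.dumps(payload)
```

In this setup Fluentd does the equivalent work for you through its Loki output plugin, so you would rarely build this payload by hand.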

3. Distributed Tracing with Jaeger

Tracing helps with understanding the flow of requests across microservices. Jaeger is a CNCF project for distributed tracing, making it easier to debug and monitor complex, microservice-based applications.

Steps:

  • Install Jaeger Operator: The Jaeger Operator simplifies the deployment and management of Jaeger on Kubernetes.
    kubectl create -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.22.1/jaeger-operator.yaml

Deploy a Jaeger instance by applying the following custom resource:
    apiVersion: jaegertracing.io/v1
    kind: Jaeger
    metadata:
      name: simple-prod
    spec:
      strategy: production

  • Instrument Applications: Use the OpenTelemetry SDK to instrument your applications. OpenTelemetry is another CNCF project for telemetry data collection. Make sure your microservices export traces in a format that Jaeger can ingest.

  • Visualize Traces in Jaeger: Jaeger’s UI allows you to visualize traces, which helps in analyzing request flow across services. You can also integrate Jaeger with Grafana for centralized visualization.
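
To show how a trace follows a request across microservices, here is a small sketch of the W3C Trace Context "traceparent" header, which OpenTelemetry uses by default for propagation. Every service keeps the same trace ID and creates a new span ID for its own work. The function names are my own and this is not the OpenTelemetry API, only an illustration of the header format.

```python
import secrets


def new_traceparent(sampled=True):
    """Create a traceparent header value for the first service in a request."""
    trace_id = secrets.token_hex(16)  # 32 hex characters, shared by all spans
    span_id = secrets.token_hex(8)    # 16 hex characters, unique per span
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(parent):
    """Derive the header a service sends downstream.

    The trace ID and flags are kept; only the span ID changes.
    """
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Jaeger groups all spans that share a trace ID into one trace, which is what you see in its UI.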

4. Kubernetes Cluster Monitoring with Prometheus and Node Exporter

Kubernetes metrics such as CPU, memory, and disk usage from nodes and pods are critical for monitoring the cluster health.

Steps:

  • Install Kubernetes Metrics Server: The Metrics Server collects resource metrics from Kubernetes nodes and pods. Install it with:
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

  • Install Node Exporter: Prometheus Node Exporter is used to collect hardware and OS-level metrics from Kubernetes nodes.
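    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm install node-exporter prometheus-community/prometheus-node-exporter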

The Node Exporter metrics will be scraped by Prometheus and can be visualized in Grafana.
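
One thing worth knowing before building dashboards: Node Exporter reports CPU time as cumulative counters (node_cpu_seconds_total, split by mode), not as a percentage. Utilisation has to be derived from how fast the idle counter grows. The sketch below shows that arithmetic for two samples of the idle counter; the function name and the sample numbers are hypothetical.

```python
def cpu_utilisation_percent(idle_start, idle_end, elapsed_seconds, cores=1):
    """Estimate CPU utilisation from two samples of the idle-seconds counter.

    `idle_start` and `idle_end` are the summed idle seconds across all
    cores at the start and end of the window.
    """
    if elapsed_seconds <= 0 or cores <= 0:
        raise ValueError("elapsed_seconds and cores must be positive")
    # Fraction of available CPU time that was spent idle in the window.
    idle_fraction = (idle_end - idle_start) / (elapsed_seconds * cores)
    return 100.0 * (1.0 - idle_fraction)
```

For example, if the idle counter of a single-core node grows by 15 seconds over a 60 second window, the node was busy for 75% of that window. PromQL's rate() function does the same kind of calculation over the scraped samples.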

5. Alerting and Notifications with Alertmanager

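Alertmanager is a separate component of the Prometheus project. Prometheus evaluates alerting rules and sends the resulting alerts to Alertmanager, which deduplicates, groups, and routes them to receivers.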

Steps:

  • Set up Prometheus Alerts: Configure alerting rules in Prometheus. For example, you can set an alert for high CPU usage:
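    groups:
    - name: example-alert
      rules:
      - alert: HighCpuUsage
        # node_cpu_seconds_total is a cumulative counter, so derive a percentage from the idle rate.
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"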

  • Configure Alertmanager: Set up Alertmanager to handle alerts and send notifications via channels like Slack, email, or PagerDuty.
    receivers:
    - name: 'slack-notifications'
      slack_configs:
      - send_resolved: true
        channel: '#alerts'
        username: 'prometheus'
        api_url: '<slack-webhook-url>'
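
The "for: 1m" field in the alerting rule is easy to misread, so here is a simplified simulation of what it does: the condition must hold continuously for the given duration before the alert moves from pending to firing, and only firing alerts are sent to Alertmanager. This is my own model of the behaviour with hypothetical sample values, not Prometheus code.

```python
def alert_states(samples, threshold, for_seconds):
    """Return the alert state at each evaluation.

    `samples` is a time-ordered list of (timestamp_seconds, value).
    States are "inactive", "pending", or "firing".
    """
    states = []
    active_since = None  # when the condition first became true
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp
            held_for = timestamp - active_since
            states.append("firing" if held_for >= for_seconds else "pending")
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states
```

With a threshold of 80 and a 60 second "for" duration, a short spike that lasts only one or two evaluations stays pending and never notifies anyone, which is usually what you want for noisy metrics like CPU.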

6. Visualization and Centralized Dashboards with Grafana

Grafana acts as a centralized hub for visualizing metrics, logs, and traces from Prometheus, Loki, and Jaeger.

Steps:

  • Create Dashboards: Use pre-built Grafana dashboards from the Grafana dashboard library, or build custom dashboards based on your metrics and traces.

  • Monitor Across Metrics, Logs, and Traces: Grafana allows you to correlate metrics, logs, and traces in a single interface. You can drill down from a metric anomaly to related logs and traces, making it easier to debug issues.

Optional CNCF Tools:

  • kube-state-metrics: This tool exposes detailed metrics about the state of Kubernetes objects (such as Deployments and DaemonSets), which Prometheus scrapes.

  • Thanos: For long-term storage of Prometheus metrics and a global view of multiple Prometheus instances.

  • OpenTelemetry: For unified collection of telemetry data across metrics, logs, and traces in Kubernetes.

Final Architecture Overview:

  1. Prometheus for metrics collection, scraping data from Kubernetes and applications.

  2. Fluentd and Loki for log aggregation and query.

  3. Jaeger for distributed tracing.

  4. Grafana as the unified interface to visualize and correlate metrics, logs, and traces.

  5. Alertmanager for alerting and notification.

This setup ensures you have complete observability into your Kubernetes clusters and applications, leveraging CNCF open-source tools.

Saturday, April 24, 2021

[Ballerina] ClassNotFoundException NoClassDefFoundError when upgrading ballerina distribution

I was working with a simple SQL query. The only change I made was upgrading the Ballerina distribution from slalpha3 to slalpha4, after which I got the following error.

[2021-04-20 18:15:15,527] SEVERE {b7a.log.crash} - ballerinax/mysql/0_7_0-alpha7/$ConfigurationMapper
java.lang.NoClassDefFoundError: ballerinax/mysql/0_7_0-alpha7/$ConfigurationMapper
    at suhan.expose_mysql_data.0_1_0.$ConfigurationMapper.$configureInit(Unknown Source)
    at suhan.expose_mysql_data.0_1_0.$_init.main(expose_mysql_data)
Caused by: java.lang.ClassNotFoundException: ballerinax.mysql.0_7_0-alpha7.$ConfigurationMapper
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    ... 2 more


Reason (from the Slack channel):
From slalpha4 onwards, packages such as mysql and java.jdbc, which are categorized as ballerinax, no longer ship with the Ballerina distribution. If a Ballerina user needs such dependencies, they can always be downloaded from Ballerina Central. I think this particular issue comes from either the caching mechanism we use not picking up up-to-date dependencies, or an erroneous package pushed to Ballerina Central.

Workaround (answer from the Slack channel, which fixed the issue):
Clear the cache at ~/.ballerina/repositories/central.ballerina.io
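
If you need to do this repeatedly, for example on a build machine, the cache clearing can be scripted. The following is a minimal sketch, assuming the cache lives at the path mentioned above under the user's home directory; the function name and the optional home parameter are my own additions so the function can be tried against a scratch directory first.

```python
import shutil
from pathlib import Path


def clear_central_cache(home=None):
    """Delete the Ballerina Central cache directory if it exists.

    `home` defaults to the current user's home directory. Returns True
    if a cache directory was removed, False if there was nothing to do.
    """
    home_dir = Path(home) if home is not None else Path.home()
    cache_dir = home_dir / ".ballerina" / "repositories" / "central.ballerina.io"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

After clearing the cache, the next build has to fetch the ballerinax packages from Ballerina Central again.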


Friday, April 23, 2021

[Ballerina] [HTTPS Listener] Cannot use a Direct Certificate File for Service Listener Configuration - Fix

OS: macOS Big Sur 11.1

Ballerina Version: slalpha4

To enable SSL on the listener side using certificates and keys, provide the keyFile and certFile configurations. Ballerina supports key files in PKCS#8 format.

Commands:

1. openssl req -x509 -newkey rsa:4096 -out cert.pem
2. Copy the content printed in the terminal, starting with -----BEGIN ENCRYPTED PRIVATE KEY----- and ending with -----END ENCRYPTED PRIVATE KEY-----, into a file named privkey.pem
3. openssl pkcs8 -topk8 -nocrypt -in privkey.pem -out pkcs8_key.pem
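
A quick way to confirm that step 3 produced what the listener expects is to look at the PEM header of the key file. An unencrypted PKCS#8 key starts with "BEGIN PRIVATE KEY", an encrypted one with "BEGIN ENCRYPTED PRIVATE KEY", and a traditional PKCS#1 RSA key with "BEGIN RSA PRIVATE KEY". The helper below is a small sketch of that check; the function name and the returned labels are my own.

```python
import re

# Matches the first PEM header, for example "-----BEGIN PRIVATE KEY-----".
_PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_key_kind(pem_text):
    """Classify a PEM private key by its header line."""
    match = _PEM_HEADER.search(pem_text)
    if match is None:
        return "unknown"
    kinds = {
        "PRIVATE KEY": "pkcs8-unencrypted",
        "ENCRYPTED PRIVATE KEY": "pkcs8-encrypted",
        "RSA PRIVATE KEY": "pkcs1-rsa",
    }
    return kinds.get(match.group(1), "unknown")
```

With the commands above, privkey.pem should be classified as "pkcs8-encrypted" and pkcs8_key.pem as "pkcs8-unencrypted".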

Sample https_listener.bal file.

import ballerina/http;

http:ListenerConfiguration helloWorldEPConfig = {
    secureSocket: {
        key: {
            certFile: "../path/to/cert.pem",
            keyFile: "../path/to/pkcs8_key.pem"
        }
    }
};

listener http:Listener helloWorldEP = new (9095, helloWorldEPConfig);

service /hello on helloWorldEP {
    resource function get .() returns string {
        return "Hello World!";
    }
}

Run the ballerina file as follows.

suhan@Suhan httpslistener % bal run https_listener.bal
Compiling source
    https_listener.bal
Running executable
[ballerina/http] started HTTPS/WSS listener 0.0.0.0:9095

Issue a cURL command as follows.

suhan@Suhan httpslistener % curl -k https://localhost:9095/hello
Hello World!%


Friday, September 4, 2020

[Tutorial] Using Asgardio JavaScript OIDC Authentication SDK with WSO2 Identity Server 5.10.0

WSO2 Identity Server is an API-driven open source Identity and Access Management (IAM) product designed to help you build effective Customer Identity and Access Management (CIAM) solutions. It is based on open standards such as SAML, OAuth and OpenID Connect (OIDC) with the deployment options of on-premise, cloud, and hybrid. It supports complex IAM requirements given its high extensibility. Capabilities: SSO, Identity Federation, Multi-factor Authentication or Adaptive Authentication, and many more.

Asgardio's OIDC SDK for JavaScript [1] allows Single Page Applications (SPA) to use OIDC/OAuth2 authentication in a simple and secure way. By using Asgardio and the JavaScript OIDC SDK, developers will be able to add identity management quickly to their Single Page Applications.

For this tutorial we will be using the sample application in the GitHub repo [1]. Let's get started.

Step 1: Setup and Run WSO2 Identity Server (WSO2 IS) 

1. Download WSO2 IS from here. I have selected the "Zip Archive" option for my exercise. 


At the time of writing, the latest WSO2 IS release was 5.10.0, and this exercise was carried out with that version.

If 5.10.0 is no longer the latest release when you read this article, you can find it in the previous releases list.

2. Extract the zip archive to your working directory. The extracted wso2is-5.10.0 directory is referred to as IS_HOME from now on.

3. Add CORS filter to oauth2 and api#identity#user#v1.0 web.xml files.

When JavaScript in a web app hosted on a different domain than the Identity Server invokes a REST endpoint in the oauth2 or api#identity#user#v1.0 web apps, the browser treats it as a cross-origin request and blocks it with the error No 'Access-Control-Allow-Origin' header is present on the requested resource. To allow the request, send the CORS (Cross-Origin Resource Sharing) headers through a custom filter by adding the following configuration to both web.xml files.

i. IS_HOME/repository/deployment/server/webapps/oauth2/WEB-INF/web.xml  

    <filter>
        <filter-name>CORS</filter-name>
        <filter-class>com.thetransactioncompany.cors.CORSFilter</filter-class>
        <init-param>
            <param-name>cors.allowOrigin</param-name>
            <param-value>http://localhost:3000</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>CORS</filter-name>
        <url-pattern>*</url-pattern>
    </filter-mapping>

Without this filter, clicking the Sign In button of the application produces the following error.

Access to XMLHttpRequest at 'https://localhost:9443/oauth2/token' from origin 'http://localhost:3000' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.

ii. IS_HOME/repository/deployment/server/webapps/api\#identity\#user\#v1.0/WEB-INF/web.xml

    <filter>
        <filter-name>CORS</filter-name>
        <filter-class>com.thetransactioncompany.cors.CORSFilter</filter-class>
        <init-param>
            <param-name>cors.allowOrigin</param-name>
            <param-value>http://localhost:3000</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>CORS</filter-name>
        <url-pattern>*</url-pattern>
    </filter-mapping>

Without this filter, clicking the Get user info button of the application produces the following error.

Access to XMLHttpRequest at 'https://localhost:9443/api/identity/user/v1.0/me' from origin 'http://localhost:3000' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.
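
To see why the cors.allowOrigin value has to match the origin of the web app exactly, here is a simplified model of the decision a CORS filter makes. This is not how the CORSFilter class is implemented; it is only a sketch of the rule the browser enforces, with a function name of my own.

```python
def cors_response_headers(request_origin, allowed_origins):
    """Return the CORS headers a server would add for this request.

    If the request's Origin is in the allow list, it is echoed back in
    Access-Control-Allow-Origin. Otherwise no CORS header is added and
    the browser blocks the response.
    """
    if request_origin in allowed_origins:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}
```

With http://localhost:3000 in the allow list, a request from the sample app gets the header and succeeds. A request from any other origin, including the same host on a different port, gets no header and fails with the error shown above.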

5. Start WSO2 IS and log in to the management console using the default credentials admin:admin

sh IS_HOME/bin/wso2server.sh start; tail -f IS_HOME/repository/logs/wso2carbon.log;
Management Console: https://localhost:9443/carbon

6. Go to  Main  ->  Identity  ->  Service Providers  and click  Add  to add a new service provider.

7. Enter Sample as the service provider name and click the Register button.


8. Expand the Inbound Authentication Configuration section. Then expand OAuth/OpenID Connect Configuration section and click on Configure.


9. Configure as follows.

i. Under Allowed Grant Types uncheck everything except Code and Refresh Token.

ii. Enter http://localhost:3000 as the Callback Url.

iii. Check Allow authentication without the client secret.


iv. Finally, click the Add button at the bottom.


11. Copy the OAuth Client Key. This will be added to your JavaScript application configuration later.
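
The settings above (Code grant, callback URL, client key) come together in the authorization request that the SDK sends when the user clicks Sign In. The sketch below builds such a URL against the WSO2 IS /oauth2/authorize endpoint using only the standard library. The function name and the "abc" client ID in the usage note are placeholders of mine, and the real SDK adds more parameters (for example the PKCE challenge), so this is an illustration rather than what the SDK emits.

```python
from urllib.parse import urlencode


def build_authorize_url(server_origin, client_id, redirect_uri,
                        scope="openid", state=None):
    """Build an OAuth2 authorization code request URL."""
    params = {
        "response_type": "code",       # matches the Code grant type
        "client_id": client_id,        # the OAuth Client Key
        "redirect_uri": redirect_uri,  # must equal the configured callback URL
        "scope": scope,
    }
    if state is not None:
        params["state"] = state
    return f"{server_origin}/oauth2/authorize?{urlencode(params)}"
```

Calling build_authorize_url("https://localhost:9443", "abc", "http://localhost:3000") gives a URL whose redirect_uri is the percent-encoded callback URL. If that value differs from what is registered on the service provider, the Identity Server rejects the request.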


Step 2: Setup and Run the sample JavaScript OIDC application

1. Clone the GitHub repo in [1] to your local machine.

 git clone https://github.com/asgardio/asgardio-js-oidc-sdk.git 

2. Go to directory: asgardio-js-oidc-sdk and issue the following command.

 npm run build 

You will see a response as follows.

 > asgardio-js-oidc-sdk@0.1.0 prebuild /Users/Shared/wso2/asgardioOIDCsample/asgardio-js-oidc-sdk
 > npm install && lerna bootstrap
 added 753 packages from 386 contributors and audited 753 packages in 13.512s
 ...
 lerna success run Ran npm script 'build' in 1 package in 31.6s:
 lerna success - @asgardio/oidc-js


3. Open the index.html file and find the JavaScript section where the app logic is written.

Paste the previously copied OAuth Client Key as the value of the clientID attribute, as follows.

// Initialize the client
auth.initialize({
    baseUrls: [serverOrigin],
    callbackURL: clientHost,
    clientHost: clientHost,
    clientID: "AupLfEfrCLsa0f8yf1SGfXUmqGAa",
    enablePKCE: true,
    serverOrigin: serverOrigin,
    storage: "webWorker"
});


If you don't add the clientID correctly, you will get the following error.

Cannot find an application associated with the given consumer key : X7idgY33LQYhKoFXHmJFHhT5o7Ma
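
The enablePKCE: true setting is what makes it safe to use the Code grant without a client secret. With PKCE (RFC 7636), the client generates a random code verifier, sends only its SHA-256 hash (the code challenge) in the authorization request, and later proves possession by sending the verifier to the token endpoint. The sketch below shows the S256 calculation with the Python standard library; the function names are my own, and this is not the SDK's code.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    """Generate a random code verifier (base64url, no padding)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier):
    """Compute the S256 code challenge for a verifier."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

Because only the app that started the flow knows the verifier, an authorization code intercepted on the redirect cannot be exchanged for tokens by anyone else.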



4. Run the app by entering the following command

 npm start 

> @asgardio/oidc-sample-vanilla@0.1.0 start /Users/Shared/wso2/asgardioOIDCsample/asgardio-js-oidc-sdk/samples/vanilla-js-app
> node server.js

Server listening on 3000

The application should be accessible via http://localhost:3000


5. Click on the Sign In button to log in using WSO2 IS. You will be redirected to the SSO page.


For this exercise, use the default credentials admin:admin to log in.

6. You will be asked to provide consent for this application to access your data.

Once you press Continue, you will be redirected to the application as follows.

7. When you press the Get user info button you will get user data as follows.


-=< End of Tutorial >=-

Reference:

[1] https://github.com/asgardio/asgardio-js-oidc-sdk

[2] https://docs.wso2.com/display/IS530/Invoking+an+Endpoint+from+a+Different+Domain