[DISCUSS] Introducing OpenTelemetry to Apache Flink

John Gerassimou Sat, 04 Feb 2023 11:13:44 -0800

Dear Flink Community,

I am writing to propose the integration of OpenTelemetry into Apache Flink.
As we all know, observability is crucial for ensuring the reliability and
performance of applications. OpenTelemetry provides a comprehensive,
vendor-neutral, open-source way to gather telemetry data, including
metrics, traces, and logs.


OpenTelemetry has gained significant traction in the cloud-native and
open-source communities and is widely adopted by popular projects such as
Istio, Jaeger, and Kubernetes. Integrating it into Apache Flink will allow
us to take advantage of its rich features and easy integration with
existing observability tools to improve the observability of Flink
applications.

However, integrating OpenTelemetry into Apache Flink may also involve
significant changes. We must thoroughly and openly discuss this proposal's
potential benefits, challenges, and trade-offs to reach a consensus on the
best way forward.

Here are some of the questions that we need to consider:

   - What are the benefits of using OpenTelemetry in Apache Flink, and how
   will it improve the observability of Flink applications?
   - What are the potential challenges and trade-offs of integrating
   OpenTelemetry into Apache Flink, and how can we mitigate them?
   - How can we ensure a smooth and seamless transition for existing Flink
   users and observability tools during the integration process?
   - What are the steps and timeline for integrating OpenTelemetry into
   Apache Flink, and what is the expected impact on the development and
   maintenance of the Flink codebase?
   - Will the integration of OpenTelemetry alter the behaviour of features
   or components in a way that may break existing users' programs and setups?
   If yes, is this change desirable?
   - Is the integration conceptually a good fit for Flink? Will it
   complicate the typical case or bloat the abstractions/APIs?
   - Does the integration fit well into Flink's architecture, and will it
   scale and keep Flink flexible for the future?
   - Do you think this is a significant new addition to Flink, and will the
   community commit to maintaining it? Does the integration align well with
   Flink's roadmap and ongoing efforts?
   - Does the integration produce added value for Flink users or
   developers, or does it introduce the risk of regression without adding
   relevant user or developer benefits?
   - Could the integration be done in another repository?

I encourage everyone in the Flink community to participate in this
discussion and share their thoughts and opinions. Let's work together to
make Apache Flink an even better and more observable big data platform.

Best regards,
John
