Brian Putt, All,

Are you aware of any good tools/services that can ingest the traces and provide an interesting view, story, or report on them?
I could see us emitting OTel events instead of our current provenance mechanism, using them internally to do what we already do while also having a clear, spec-friendly way of exporting them to others.

Thanks

On Sat, Jul 30, 2022 at 7:43 AM u...@moosheimer.com <u...@moosheimer.com> wrote:

> Hello Brian, Bryan, Greg, NiFi devs,
>
> Integrating OpenTelemetry is a very good idea, especially since the major cloud providers also rely on it. This could also be interesting for Stateless NiFi.
>
> I have a suggestion that I would like to put up for discussion.
>
> Would it be useful to make a list of which extensions or new developments would be helpful for a complete integration of OpenTelemetry?
>
> I'm thinking of ConsumeMQTT and PublishMQTT, for example. Currently these support at most MQTT version 3.1.1, but since version 5 there are User Properties, which are similar to HTTP header fields.
> Thus one could implement OpenTelemetry in the MQTT processors in much the same way as for HTTP.
>
> With a list we could make an overview of the "necessary" adjustments and advertise for support.
>
> If what I write is nonsense, then I may not have understood something and I take it all back :)
>
> Best regards
> Kay-Uwe Moosheimer
>
> > On 29.07.2022 at 05:09, Brian Putt <puttbr...@gmail.com> wrote:
> >
> > Hello Bryan / Greg / NiFi devs,
> >
> > Distributed tracing (DT) is similar to provenance in that it shows the path a particular flowfile travels, but its core selling point is that it supports tracing across multiple systems/services regardless of what's receiving the data. Provenance is a fantastic feature, but there are instances where one might want to draw the bigger picture and identify bottlenecks as data flows from one system to another, where the other system may or may not be using NiFi.
> >
> > DT utilizes three ids: traceId, parentId, and spanId.
> > While a tree can be built using two ids, the third id (traceId) makes it easier to pull all of the relevant information out of a datastore.
> > DT is focused more on performance and identifying bottlenecks in one or more systems. Imagine NiFi receiving data from various sources (e.g. HTTP, Kafka, SQS) and egressing to other systems (HTTP, Kafka, NiFi).
> > DT provides a spec that we'd be able to follow to correlate the data as it traverses from system to system. Each system that participates in the DT ecosystem would simply emit information (a trace is made up of one or more spans), and a collection system would aggregate all of these spans, draw a bigger picture of the path the data took, and help identify key bottlenecks.
> >
> > OpenTelemetry (OTEL) provides clients (across many languages, including Java) with which developers can instrument their library's APIs and participate in a DT ecosystem, as it adheres to the tracing spec. Egressing trace data is possible without using OTEL, but then we may find ourselves reinventing the wheel, although the result could be optimized for NiFi.
> >
> > Creating a reporting task could certainly be a path; I mainly have a few concerns with that:
> >
> > 1. If provenance is disabled, will provenance events still be emitted and be collected by a new reporting task?
> > 2. There'll be an impact on performance; how much is unknown. OTEL is gaining traction across industry and there are ways to mitigate the performance impact, mainly sampling and the fact that *tracing is best effort*. Spans would be emitted from NiFi via UDP to a collector on the same network.
> > 3. Would there be any issues with appending a flowfile attribute that is carried throughout the flow and maintains the traceId, parentSpanId, and trace flags?
> >    See below for more details.
> >
> > There's a W3C spec (Trace Context) which includes a formatted string that would be propagated to services (HTTP, Kafka, etc.). So if NiFi were to put information onto Kafka, any consumers of that data would be able to continue the trace and help draw the bigger picture.
> >
> > W3C Spec: https://www.w3.org/TR/trace-context/#traceparent-header
> >
> > For #2, since DT is focused on performance, sampling can help alleviate chatter over the wire and, ideally, sampling 0.01% would draw the same picture as 1% or 10%+. This is certainly different from provenance, as DT is focused on performance over quality of the data and should not be thought of as auditing.
> >
> > https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sampler
> >
> >> On Thu, Jul 28, 2022 at 5:01 PM Bryan Bende <bbe...@gmail.com> wrote:
> >>
> >> Hi Greg,
> >>
> >> I don't really know anything about OpenTelemetry, but from the perspective of integrating something into the framework, some things to consider...
> >>
> >> Is there some way to piggy-back on provenance and use a ReportingTask to process provenance events and report something to OpenTelemetry?
> >>
> >> If something new does need to be added, it should probably be an extension point where there is an interface in the framework-api and different implementations can be plugged in.
> >> Ideally the framework itself wouldn't have any knowledge of OpenTelemetry specifically; it would only be reporting some information, which could then be used in some way by the OpenTelemetry implementation.
> >>
> >> How does NiFi actually communicate with OpenTelemetry? Are you expecting to send data to OpenTelemetry in this new method you are suggesting?
> >> That would likely have a significant impact on the performance of the flow.
> >>
> >> Thanks,
> >>
> >> Bryan
> >>
> >>> On Thu, Jul 28, 2022 at 3:17 PM glma...@uwe.nsa.gov <glma...@uwe.nsa.gov> wrote:
> >>>
> >>> NiFi Devs,
> >>>
> >>> My team and I are looking for guidance on how we can extend Apache NiFi's capabilities. Specifically, we're looking to include distributed tracing. We'll approach this effort as if we're the tracing experts and simply seeking implementation guidance. Our developers have good exposure to working with NiFi and creating custom processors. We plan to fork the project to begin this effort but want to make sure we approach this with the best possible direction for community adoption.
> >>>
> >>> Our initial thought is to piggyback on how provenance was implemented. We essentially want to include a subroutine or method that gets implicitly invoked upon a processor's 'onTrigger' method. From there we would analyze the FlowFile's attributes to check for the existence of 'traceId' and/or propagate one if found.
> >>>
> >>> We can expound upon all of these tracing/observability details if that helps in any way. We're able to provide a more detailed scope of this task as well, but for now we just want to get feedback on our overall goal and proposed approach.
> >>>
> >>> Thanks,
> >>> Greg Marshall
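
For anyone following the thread, here is a rough sketch of how the W3C traceparent string mentioned above might be carried in a flowfile attribute and continued from one hop to the next. It is only an illustration, not NiFi or OpenTelemetry code: the attribute name `traceparent`, the `TraceContext` class, and the `start_child_span` helper are all hypothetical, a plain dict stands in for flowfile attributes, and only version 00 of the header is handled. Marking brand-new traces as sampled is also just an assumption for the example.

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# Version 00 layout from the W3C Trace Context spec:
#   "00" - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")

# Hypothetical attribute name; the thread does not settle on one.
TRACEPARENT_ATTR = "traceparent"


@dataclass(frozen=True)
class TraceContext:
    trace_id: str   # shared by every span in the trace
    parent_id: str  # id of the span that hands the data to the next hop
    flags: int      # bit 0 is the "sampled" flag

    @property
    def sampled(self) -> bool:
        return bool(self.flags & 0x01)

    def to_header(self) -> str:
        return f"00-{self.trace_id}-{self.parent_id}-{self.flags:02x}"


def parse_traceparent(value: str) -> Optional[TraceContext]:
    """Return a TraceContext, or None if value is not a valid version-00 header."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # The spec treats all-zero ids as invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, int(flags, 16))


def start_child_span(attributes: dict) -> TraceContext:
    """Continue the incoming trace if one is present, otherwise start a new one."""
    incoming = parse_traceparent(attributes.get(TRACEPARENT_ATTR, ""))
    span_id = secrets.token_hex(8)  # 16 hex characters
    if incoming is None:
        # Assumption for this sketch: new traces start out sampled.
        ctx = TraceContext(secrets.token_hex(16), span_id, 0x01)
    else:
        # Keep the trace id and flags; this span becomes the parent downstream.
        ctx = TraceContext(incoming.trace_id, span_id, incoming.flags)
    attributes[TRACEPARENT_ATTR] = ctx.to_header()
    return ctx
```

Calling `start_child_span(attrs)` on attributes that already hold a valid traceparent keeps the traceId and swaps in a fresh span id, so a downstream consumer (HTTP, Kafka, another NiFi) could continue the same trace. If the attribute is missing or malformed, a new trace is started instead.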