Re: [C++] Adopting a library for (distributed) tracing

David Li Fri, 19 Nov 2021 07:48:27 -0800

Ah, sorry, I meant Perfetto solely as an example of a related library and not 
something that we had actually evaluated. Thanks for the details on the use 
cases.


I think the PR should be ready now, and then we can start instrumenting the 
engine/Flight and seeing how we can best make use of this. The PR now doesn't 
enable OpenTelemetry at all unless a build flag is passed.

For the visualization, Phillip showed how to use the "native" OTel collector in 
that PR and I confirmed it works for me. That at least handles collecting data, 
but visualization will need more work.

-David

On Wed, Nov 17, 2021, at 16:22, Weston Pace wrote:
> Hmm, I see the mention but I don't recall actually working with
> Perfetto (though, it's entirely possible I did and just forgot).  My
> goal isn't entirely identifying code bottlenecks however.  I'd divide
> it into two:
> 
> Improving Arrow's C++ engine: OT is very helpful here, especially when
> working on threading / scheduling type concerns, because it isn't so
> much an "am I computing XYZ as fast as possible?" but more "are we
> working on the correct tasks and utilizing the cores efficiently?"  I
> have found OT is necessary but not sufficient as OT doesn't handle
> analysis / visualization.  I experimented a bit with different
> visualization tools (maybe I mentioned Perfetto then) but I've yet to
> successfully get one configured (you and I encountered issues with
> Jaeger and I haven't tried since then but I think you fixed the
> issues).  So the latest (though not great) workflow I've been using is
> OT + python notebook + perf/vtune/etc.  This sort of task is a
> development-focused task.  Perfetto might be useful here, I can't say.
> 
> Query visibility: This task is less of an "improving the C++ engine"
> and more "introducing visibility into the engine for consumers".  For
> example, people might wonder why a particular query is running slowly
> and need to be able to trace down further.  The resulting fix _might_
> be a JIRA on the C++ engine but it also might be a realization that
> the user has an inefficient query and the user switches to some other
> query.  This case isn't a development use case but more of a user use
> case.  I don't think Perfetto would fit this use case very well.
> 
> -Weston
> 
> On Wed, Nov 17, 2021 at 10:21 AM David Li <lidav...@apache.org> wrote:
> >
> > Ah, right - I'm not suggesting we use Perfetto, rather I'm just generally 
> > curious about people's experience with these kinds of tools.
> >
> > -David
> >
> > On Wed, Nov 17, 2021, at 13:00, Antoine Pitrou wrote:
> > >
> > > On 16/11/2021 at 17:18, David Li wrote:
> > > > Following up here: I'm hoping we can enable this in 7.0.0 and am still 
> > > > working on getting all the builds passing (currently RPM packages fail 
> > > > to build with it enabled). OpenTelemetry released their v1.0.0 recently 
> > > > so that should not be a problem anymore.
> > > >
> > > > Some changes in approach:
> > > >   * For now, I've removed integration with Flight and any other 
> > > > components, focusing on just getting the builds working. I'll file 
> > > > follow-up issues for the Flight integration.
> > > >   * Unlike before, I'll change this to be built only when enabled, 
> > > > instead of always. Flight will implicitly enable OpenTelemetry once 
> > > > integrated. (Thanks to @Kou for questioning this.)
> > > >   * I'm now looking at using this for evaluating performance 
> > > > issues/bottlenecks in the C++ query engine, instead of/in addition to 
> > > > the original use case in Flight. I'm curious if others have used 
> > > > OpenTelemetry or similar libraries for this purpose before. I know 
> > > > tools like Perfetto [1] are similar in concept if not approach, and 
> > > > @Weston was experimenting with it for this purpose as well earlier in 
> > > > the thread.
> > > > [1]: https://perfetto.dev/
> > >
> > > Isn't OpenTelemetry language-agnostic while Perfetto is a C++-only
> > > library? (or are the two interoperable?)
> > >
> > > It seems that being language-agnostic would make OpenTelemetry a better
> > > fit for Arrow (ideally, one could mingle C++, Rust or Java calls and
> > > trace them together).
> > >
> > > Regards
> > >
> > > Antoine.
> > >
>
