Re: [C++] Adopting a library for (distributed) tracing

2021-11-19 Thread David Li
Ah, sorry, I meant Perfetto solely as an example of a related library and not something that we had actually evaluated. Thanks for the details on the use cases. I think the PR should be ready now, and then we can start instrumenting the engine/Flight and seeing how we can best make use of this.

Re: [C++] Adopting a library for (distributed) tracing

2021-11-17 Thread Weston Pace
Hmm, I see the mention but I don't recall actually working with Perfetto (though it's entirely possible I did and just forgot). My goal isn't entirely identifying code bottlenecks, however. I'd divide it into two: Improving Arrow's C++ engine: OT is very helpful here, especially when working on

Re: [C++] Adopting a library for (distributed) tracing

2021-11-17 Thread David Li
Ah, right - I'm not suggesting we use Perfetto, rather I'm just generally curious about people's experience with these kinds of tools.
-David
On Wed, Nov 17, 2021, at 13:00, Antoine Pitrou wrote:
> On 16/11/2021 at 17:18, David Li wrote:
> > Following up here: I'm hoping we can enable this

Re: [C++] Adopting a library for (distributed) tracing

2021-11-17 Thread Antoine Pitrou
On 16/11/2021 at 17:18, David Li wrote:
> Following up here: I'm hoping we can enable this in 7.0.0 and am still working on getting all the builds passing (currently RPM packages fail to build with it enabled). OpenTelemetry released their v1.0.0 recently so that should not be a problem

Re: [C++] Adopting a library for (distributed) tracing

2021-11-16 Thread David Li
Following up here: I'm hoping we can enable this in 7.0.0 and am still working on getting all the builds passing (currently RPM packages fail to build with it enabled). OpenTelemetry released their v1.0.0 recently so that should not be a problem anymore. Some changes in approach: * For now,

Re: [C++] Adopting a library for (distributed) tracing

2021-07-12 Thread David Li
A quick update on this: I don't think this will happen for 5.0. The upstream library still hasn't reached 1.0, and I don't want to cram this in at the end of a cycle, especially as each of their release candidates has needed an upstream fix in order to keep all our CI platforms working.

Re: [C++] Adopting a library for (distributed) tracing

2021-06-09 Thread David Li
I just updated the PR with support for exporting to Jaeger[1], which has a built-in trace viewer.
1. Download and run the all-in-one Jaeger binary locally[2] (or their Docker image)
2. Build Arrow with `-DARROW_WITH_OPENTELEMETRY=ON -DARROW_THRIFT=ON`
3. Run your application with `env

Re: [C++] Adopting a library for (distributed) tracing

2021-06-08 Thread David Li
I'll have to do some more digging into that and get back to you. So far I've been using a quick-and-dirty tool that I whipped up using Vega-Lite but that's probably not something we want to maintain. I tried the Chrome trace viewer ("Catapult") but it's not quite built for this kind of trace; I

Re: [C++] Adopting a library for (distributed) tracing

2021-06-08 Thread Weston Pace
FWIW, I tried this out yesterday since I was profiling the execution of the async API reader. It worked great so +1 from me on that basis. I did struggle to find a good, simple visualization tool. Do you have any good recommendations on that front?
On Mon, Jun 7, 2021 at 10:50 AM David Li

Re: [C++] Adopting a library for (distributed) tracing

2021-06-07 Thread David Li
Just to give an update on where this stands: Upstream recently released v1.0.0-RC1 and I've updated the PR[1] to use it. This contains a few fixes I submitted for the platforms our various CI jobs use, as well as an explicit build flag to support header-only use - I think this should alleviate

Re: [C++] Adopting a library for (distributed) tracing

2021-05-06 Thread David Li
I've created ARROW-12671 [1] to track this work and filed a draft PR [2]; I'd appreciate any feedback, particularly from anyone already trying to use OpenTelemetry/Tracing/Census with Arrow. For dependencies: now we use OpenTelemetry as header-only by default. I also slimmed down the build,

Re: [C++] Adopting a library for (distributed) tracing

2021-05-01 Thread David Li
Thanks everyone for all the comments. Responding to a few things:
> It seems to me it would be fairly implementation dependent -- so each
> language implementation would choose if it made sense for them and then
> implement the appropriate connection to that language's open telemetry
> ecosystem.

Re: [C++] Adopting a library for (distributed) tracing

2021-05-01 Thread Bob Tinsman
I agree that OpenTelemetry is the future; I have been following the observability space off and on and I knew about OpenTracing; I just realized that OpenTelemetry is its successor. [1] I have found tracing to be a very powerful approach; at one point, I did a POC of a trace recorder inside a

Re: [C++] Adopting a library for (distributed) tracing

2021-05-01 Thread Antoine Pitrou
Hi David, I'm in favor of adopting a tracing library. My main question is: does integrating OpenTracing complicate our build procedure? Is it header-only as long as you use the no-op tracer? Or do you have to build it and link with it nonetheless? The opentracing-cpp documentation

Re: [C++] Adopting a library for (distributed) tracing

2021-05-01 Thread Andrew Lamb
I agree that having Arrow Flight play well with OpenTelemetry-based observability systems would be great (and the future). It seems to me it would be fairly implementation dependent -- so each language implementation would choose if it made sense for them and then implement the appropriate

Re: [C++] Adopting a library for (distributed) tracing

2021-04-30 Thread Evan Chan
Dear David, OpenTelemetry tracing is definitely the future; I guess the question is how far down the stack we want to put it. I think it would be useful for Flight and other higher-level modules, and for DataFusion, for example, it would be really useful. As for being alpha, I don’t think it

[C++] Adopting a library for (distributed) tracing

2021-04-29 Thread David Li
Hello, For Arrow Datasets, I've been working to instrument the scanner to find bottlenecks. For example, here's a demo comparing the current async scanner, which doesn't truly read asynchronously, to one that does; it should be fairly evident where the bottleneck is: