westonpace commented on a change in pull request #11964:
URL: https://github.com/apache/arrow/pull/11964#discussion_r778522222
##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -97,6 +98,58 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
     return fut;
   };
 }
+
+/// \brief Start a new span for each invocation of a generator.
+///
+/// The parent span of the new span will be the currently active span
+/// (if any) as of when WrapAsyncGenerator was itself called.
+template <typename T>
+AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
+                                     const std::string& span_name) {
+  opentelemetry::trace::StartSpanOptions options;
+  options.parent = GetTracer()->GetCurrentSpan()->GetContext();
+  return WrapAsyncGenerator(std::move(wrapped), std::move(options), span_name);
+}
+
+/// \brief End the given span when the given async generator ends.
+///
+/// The span will be made the active span each time the generator is called.
+template <typename T>
+AsyncGenerator<T> TieSpanToAsyncGenerator(
+    AsyncGenerator<T> wrapped,
+    opentelemetry::nostd::shared_ptr<opentelemetry::trace::Span> span) {
+  return [=]() mutable -> Future<T> {
+    auto scope = GetTracer()->WithActiveSpan(span);
+    return wrapped().Then(
+        [span](const T& result) -> Result<T> {
+          span->SetStatus(opentelemetry::trace::StatusCode::kOk);
+          return result;
+        },
+        [span](const Status& status) -> Result<T> {
+          MarkSpan(status, span.get());
+          return status;
+        });
+  };
+}
+
+/// \brief Activate the given span on each invocation of an async generator.
+template <typename T>
+AsyncGenerator<T> PropagateSpanThroughAsyncGenerator(
+    AsyncGenerator<T> wrapped,
+    opentelemetry::nostd::shared_ptr<opentelemetry::trace::Span> span) {
+  return [=]() mutable -> Future<T> {
+    auto scope = GetTracer()->WithActiveSpan(span);
+    return wrapped();
+  };
+}

Review comment:
Concretely I'm thinking of...
```
#ifdef ARROW_WITH_OPENTELEMETRY
  batch_gen_gen = arrow::internal::tracing::PropagateSpanThroughAsyncGenerator(
      std::move(batch_gen_gen));
#endif
```

If you are I/O bound, then I would expect `batch_gen_gen` to transfer to an I/O thread (and back) for every item.

There are "async-local" concepts (e.g. https://docs.microsoft.com/en-us/dotnet/api/system.threading.asynclocal-1?view=net-6.0), so maybe we need to adopt something like that. I think that's the same thing as "instrumenting the executor and possibly future classes themselves".

I think it would be fairly affordable: submitting a thread task would have to copy a handle to the active span (or "async context") into the task, and the first thing the task does would be to set the active span from that handle.
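
To make the "copy a handle at submit time, restore it at task start" idea concrete, here is a rough, self-contained sketch. It is not Arrow or OpenTelemetry code: `ToySpan`, `ActiveSpan`, `ScopedActiveSpan`, `CaptureActiveSpan`, and `RunOnOtherThread` are all hypothetical names made up for illustration, the "active span" is just a thread-local `std::shared_ptr`, and the "executor" is a single throwaway thread. A real version would presumably live in the executor's submit path rather than in a wrapper the caller has to remember to apply.

```cpp
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a span handle (assumption: a real build would
// hold an OpenTelemetry span or some broader "async context" here).
struct ToySpan {
  std::string name;
};
using SpanHandle = std::shared_ptr<ToySpan>;

// Per-thread "active span" slot, standing in for the tracer's notion of the
// currently active span.
inline SpanHandle& ActiveSpan() {
  thread_local SpanHandle active;
  return active;
}

// RAII guard: activates a span on the current thread and restores the
// previously active one when the guard goes out of scope.
class ScopedActiveSpan {
 public:
  explicit ScopedActiveSpan(SpanHandle span) : previous_(ActiveSpan()) {
    ActiveSpan() = std::move(span);
  }
  ~ScopedActiveSpan() { ActiveSpan() = std::move(previous_); }
  ScopedActiveSpan(const ScopedActiveSpan&) = delete;
  ScopedActiveSpan& operator=(const ScopedActiveSpan&) = delete;

 private:
  SpanHandle previous_;
};

// Wraps a task so that it carries the submitter's active span with it.
template <typename Fn>
auto CaptureActiveSpan(Fn fn) {
  // Copied on the submitting thread, at submit time.
  SpanHandle captured = ActiveSpan();
  return [captured = std::move(captured), fn = std::move(fn)]() mutable {
    // First thing the task does on the worker thread: re-activate the span.
    ScopedActiveSpan scope(captured);
    return fn();
  };
}

// Toy "executor": runs the task on a fresh thread and returns its result.
template <typename Fn>
auto RunOnOtherThread(Fn fn) -> decltype(fn()) {
  std::packaged_task<decltype(fn())()> task(std::move(fn));
  auto result = task.get_future();
  std::thread worker(std::move(task));
  worker.join();
  return result.get();
}

// Name of the span active on the calling thread, or "" if there is none.
inline std::string ActiveSpanNameOrEmpty() {
  const SpanHandle& span = ActiveSpan();
  return span ? span->name : std::string();
}
```

With a span active on the submitting thread, a plain task passed to `RunOnOtherThread` sees no active span, while the same task wrapped in `CaptureActiveSpan` sees the submitter's span. The per-task cost in this sketch is one `shared_ptr` copy at submit time plus one guard on the worker.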