Surya Hebbar has posted comments on this change. ( http://gerrit.cloudera.org:8080/23777 )
Change subject: IMPALA-14370: [Patch 1 of 2] - OpenTelemetry Query Tracing Skips Queries with Leading Comments
......................................................................


Patch Set 20:

(6 comments)

Thank you for this change; it's a great improvement! I've left a few inline comments regarding encapsulation and potential lock contention, along with some minor nits and suggestions.

http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/buffered-span.h
File be/src/observe/buffered-span.h:

http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/buffered-span.h@51
PS20, Line 51: using BufferedAttributeValue =
: opentelemetry::nostd::variant<bool, int32_t, int64_t, std::string>;
: using BufferedAttributesMap = std::unordered_map<std::string, BufferedAttributeValue>;
nit: The using alias declarations here are not indented at all. Consider aligning them with the rest of the class members for readability.


http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/buffered-span.h@101
PS20, Line 101: // Returns a string representation of the span id of the underlying span or the empty
: // string if the span is not started.
: const std::string SpanId() const noexcept;
:
: // Returns a string representation of the trace id of the underlying span or the empty
: // string if the span is not started.
: const std::string TraceId() const noexcept;
nit: Returning a const object by value (const std::string) is generally considered an anti-pattern in modern C++ (C++11 and later). The returned temporary is const, so callers cannot move from it; assigning the result to an existing std::string, for example, falls back to a copy.

Suggestion: How about removing the leading const so it reads std::string SpanId() const noexcept; (and likewise for TraceId())?

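To illustrate the const-return nit above, here is a minimal sketch. Tracked, ConstId(), PlainId(), and AssignFrom() are hypothetical names made up for this example and are not part of the patch. Tracked stands in for std::string and counts only assignments, so copy elision on construction does not affect the numbers.

```cpp
#include <cassert>

// Hypothetical stand-in for std::string that counts assignments by kind.
// Constructors are left uncounted on purpose so that copy elision does not
// change the result.
struct Tracked {
  static int copy_assigns;
  static int move_assigns;
  Tracked() = default;
  Tracked(const Tracked&) = default;
  Tracked(Tracked&&) = default;
  Tracked& operator=(const Tracked&) { ++copy_assigns; return *this; }
  Tracked& operator=(Tracked&&) noexcept { ++move_assigns; return *this; }
};
int Tracked::copy_assigns = 0;
int Tracked::move_assigns = 0;

// Mirrors the current signature style: const object returned by value.
const Tracked ConstId() { return Tracked(); }

// Mirrors the suggested signature style: plain object returned by value.
Tracked PlainId() { return Tracked(); }

struct AssignCounts {
  int copies;
  int moves;
};

// Assigns the result of one of the two functions to an existing object and
// reports which assignment operator was selected.
AssignCounts AssignFrom(bool use_const_return) {
  Tracked::copy_assigns = 0;
  Tracked::move_assigns = 0;
  Tracked t;
  if (use_const_return) {
    t = ConstId();  // const temporary cannot bind to Tracked&&: copy assignment
  } else {
    t = PlainId();  // non-const temporary: move assignment
  }
  return {Tracked::copy_assigns, Tracked::move_assigns};
}

// Example use: the const-returning variant costs a copy, the plain one a move.
void DemoAssignCounts() {
  assert(AssignFrom(true).copies == 1);
  assert(AssignFrom(false).moves == 1);
}
```

When the result only initializes a new variable, elision usually hides the difference; the cost shows up when the result is assigned to an existing string or passed on with std::move.
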
http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/span-manager.cc
File be/src/observe/span-manager.cc:

http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/span-manager.cc@260
PS20, Line 260: VLOG(2) << strings::Substitute("Adding event named '$0' to child span '$1' "
: "trace_id=\"$2\" span_id=\"$3", name.data(), to_string(a.first),
: span->TraceId(), span->SpanId());
: return;
: }
: }
:
: LOG(WARNING) << strings::Substitute("Attempted to add event '$0' with no active "
: "child span trace_id=\"$1\" span_id=\"$2\"\n$3"
nit: The VLOG(2) message is missing the closing quote for the span_id attribute. It currently reads "span_id=\"$3" but should probably be "span_id=\"$3\"".


http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/observe/span-manager.cc@367
PS20, Line 367: impala::to_string(client_request_state_->exec_request().stmt_type)}}
nit: Since this entire file is already inside the impala namespace, the impala:: prefix on impala::to_string is redundant and can be removed.


http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/23777/20/be/src/service/client-request-state.h@554
PS20, Line 554: otel_span_manager_->HasEnded()
Calling otel_span_manager_->HasEnded() here means that every call to otel_trace_query() acquires the child_span_mu_ mutex inside HasEnded(). Since otel_trace_query() acts as a gatekeeper for tracing and might be called on hot paths, this could introduce severe lock contention.

Suggestion: How about backing the "ended" state with a lock-free std::atomic<bool> inside SpanManager so that HasEnded() doesn't need to acquire child_span_mu_?

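A rough sketch of what the atomic suggestion above could look like. SpanManagerSketch, End(), and ended_ are hypothetical names for illustration only; the real SpanManager has more state and its teardown logic is elided here. The assumption is that the mutex keeps guarding span teardown, while readers of the flag never take it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical, heavily simplified stand-in for SpanManager. Only the
// "ended" flag and the mutex around teardown are modeled.
class SpanManagerSketch {
 public:
  // Lock-free read intended for hot-path callers such as a tracing gate.
  bool HasEnded() const noexcept {
    return ended_.load(std::memory_order_acquire);
  }

  // Ends the span at most once. The mutex still serializes teardown.
  // Returns true only for the call that performed the transition.
  bool End() {
    std::lock_guard<std::mutex> l(child_span_mu_);
    if (ended_.load(std::memory_order_relaxed)) return false;
    // ... close child and root spans here (elided) ...
    ended_.store(true, std::memory_order_release);
    return true;
  }

 private:
  std::mutex child_span_mu_;
  std::atomic<bool> ended_{false};
};

// Example use: the flag flips exactly once and reads never take the mutex.
void DemoHasEnded() {
  SpanManagerSketch m;
  assert(!m.HasEnded());
  assert(m.End());
  assert(m.HasEnded());
  assert(!m.End());
}
```

Note that a lock-free HasEnded() only gives a snapshot: a caller that must act atomically with respect to End() would still need child_span_mu_.
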
http://gerrit.cloudera.org:8080/#/c/23777/20/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/23777/20/fe/src/main/java/org/apache/impala/service/Frontend.java@3325
PS20, Line 3325: if (planCtx.queryTraced_ == null) {
: planCtx.queryTraced_ = Boolean.valueOf(
: !parsedStmt.isExplain()
: && !parsedStmt.isValuesStmt()
: && (
: parsedStmt.isQueryStmt()
: || parsedStmt.isAlterTableStmt()
: || parsedStmt.isComputeStatsStmt()
: || parsedStmt.isCreateDbStmt()
: || parsedStmt.isCreateTableAsSelectStmt()
: || parsedStmt.isCreateTableLikeStmt()
: || parsedStmt.isCreateTableStmt()
: || parsedStmt.isCreateViewStmt()
: || parsedStmt.isDeleteStmt()
: || parsedStmt.isDropDbStmt()
: || parsedStmt.isDropTableOrViewStmt()
: || parsedStmt.isInsertStmt()
: || parsedStmt.isInvalidateMetadata()
: || parsedStmt.isUpdateStmt()));
:
: updateQueryOtelTracingInBE(queryId, planCtx.queryTraced_.booleanValue());
: }
Checking all these specific statement types here breaks encapsulation and means we have to update Frontend.java whenever a new traceable statement is added. It also required adding 14+ new is[Type]Stmt() methods to the ParsedStatement interface and its implementations.

Suggestion: Would it be simpler to replace this long boolean chain (and all the new is[Type]Stmt() interface methods) with a single boolean isOtelTraceable() method on the ParsedStatement interface? Frontend.java could then just call planCtx.queryTraced_ = Boolean.valueOf(parsedStmt.isOtelTraceable());, and the specific ParsedStatementImpl / CalciteParsedStatement classes can encapsulate the instanceof logic internally.

--
To view, visit http://gerrit.cloudera.org:8080/23777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1425b32006f81586bf75c2e4045d23bab91e1611
Gerrit-Change-Number: 23777
Gerrit-PatchSet: 20
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Fri, 27 Mar 2026 10:13:00 +0000
Gerrit-HasComments: Yes
