Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/24008 )
Change subject: IMPALA-14332: Add X-Request-Id as HttpRequestId attribute on root OTel span ...................................................................... Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/otel.cc File be/src/observe/otel.cc: http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/otel.cc@391 PS1, Line 391: size_t last_hyphen = processed.rfind('-'); : if (last_hyphen != string::npos && last_hyphen < processed.length() - 1) { : // Check if everything after the last hyphen is a digit : bool all_digits = true; : for (size_t i = last_hyphen + 1; i < processed.length(); ++i) { : if (!isdigit(processed[i])) { : all_digits = false; : break; : } : } : // If all digits after the last hyphen, remove it (it's the iteration count) : if (all_digits) { : processed = processed.substr(0, last_hyphen); : } : } There is a potential issue here. UUIDs are not guaranteed to have letters in all portions. It's highly unlikely, but a UUID such as "a6dd52ba-18d1-11f1-ba97-665407384305" is possible. Given this UUID, the code will return "a6dd52ba-18d1-11f1-ba97". Per RFC9562, we can assume any provided request_id will be a UUID in string hex-and-dash format and thus will contain hyphens. https://datatracker.ietf.org/doc/html/rfc9562#:~:text=When%20in%20use%20with%20URNs%20or%20as%20text%20in%20applications%2C%20any%20given%20UUID%20should%20be%20represented%20by%20the%20%22hex%2Dand%2Ddash%22%20string%20format%20consisting%20of%20multiple%20groups%20of%20uppercase%20or%20lowercase%20alphanumeric%20hexadecimal%20characters%20separated%20by%20single%20dashes/hyphens http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc File be/src/observe/span-manager.cc: http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@36 PS1, Line 36: #include "observe/otel.h" I don't want the span-manager.h/.cc files to have a dependency on the otel.h/.cc files. It creates circular dependencies which make the code much more difficult to reason about. http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@59 PS1, Line 59: static constexpr char const* ATTR_HTTP_REQUEST_ID = "HttpRequestId"; This attribute also needs to be set on the Init span. http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@147 PS1, Line 147: // Store all string attributes outside the map to ensure string lifetime. Was there a specific issue you ran into that necessitated this code change? The contents of the key and value nostd::string_view are copied into new std::string objects while the Span object is being constructed. The string creation happens in the generated common.pb.h file inline void KeyValue::set_key(const char* value, size_t size) { key_.Set(::PROTOBUF_NAMESPACE_ID::internal::ArenaStringPtr::EmptyDefault{}, ::std::string( reinterpret_cast<const char*>(value), size), GetArena()); // @@protoc_insertion_point(field_set_pointer:opentelemetry.proto.common.v1.KeyValue.key) } inline void AnyValue::set_string_value(const char* value, size_t size) { if (!_internal_has_string_value()) { clear_value(); set_has_string_value(); value_.string_value_.UnsafeSetDefault(&::PROTOBUF_NAMESPACE_ID::internal::GetEmptyStringAlreadyInited()); } value_.string_value_.Set( ::PROTOBUF_NAMESPACE_ID::internal::ArenaStringPtr::EmptyDefault{}, ::std::string( reinterpret_cast<const char*>(value), size), GetArena()); // @@protoc_insertion_point(field_set_pointer:opentelemetry.proto.common.v1.AnyValue.string_value) } http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py File tests/custom_cluster/test_otel_trace.py: http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py@448 PS1, Line 448: def test_http_request_id_attribute(self): Most of the code in this class could be replaced by `calling self.execute_query_expect_failure(self.hs2_http_client, query)` and then adding an optional parameter for the request id to the `assert_trace` function in `otel_trace.py` -- https://github.com/apache/impala/blob/24611f7dba77723a805518a09ca6e334c4794796/tests/util/otel_trace.py#L361 http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py@496 PS1, Line 496: : http_request_id = trace.root_span.attributes["HttpRequestId"].value : : # Verify UUID format (8-4-4-4-12 hex digits with hyphens, 36 chars total) : assert len(http_request_id) == 36, \ : "HttpRequestId should be 36 characters, got: {}".format(len(http_request_id)) : : parts = http_request_id.split('-') : expected_parts_lengths = [8, 4, 4, 4, 12] : assert len(parts) == len(expected_parts_lengths), \ : "HttpRequestId should have {} parts, got: {}".format( : len(expected_parts_lengths), len(parts)) : : for i, expected_len in enumerate(expected_parts_lengths): : assert len(parts[i]) == expected_len, \ : "HttpRequestId part {} should have length {}, got: {}".format( : i, expected_len, len(parts[i])) : assert all(c in '0123456789abcdefABCDEF' for c in parts[i]), \ : "HttpRequestId part {} should contain only hex digits".format(i) Please investigate if the uuid library's `UUID` function can do this validation for us. -- To view, visit http://gerrit.cloudera.org:8080/24008 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e14f5b503ff7379463332bae34c266afc395524 Gerrit-Change-Number: 24008 Gerrit-PatchSet: 1 Gerrit-Owner: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Comment-Date: Thu, 05 Mar 2026 23:47:30 +0000 Gerrit-HasComments: Yes
