Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24008 )

Change subject: IMPALA-14332: Add X-Request-Id as HttpRequestId attribute on 
root OTel span
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/otel.cc
File be/src/observe/otel.cc:

http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/otel.cc@391
PS1, Line 391:   size_t last_hyphen = processed.rfind('-');
             :   if (last_hyphen != string::npos && last_hyphen < 
processed.length() - 1) {
             :     // Check if everything after the last hyphen is a digit
             :     bool all_digits = true;
             :     for (size_t i = last_hyphen + 1; i < processed.length(); 
++i) {
             :       if (!isdigit(processed[i])) {
             :         all_digits = false;
             :         break;
             :       }
             :     }
             :     // If all digits after the last hyphen, remove it (it's the 
iteration count)
             :     if (all_digits) {
             :       processed = processed.substr(0, last_hyphen);
             :     }
             :   }
There is a potential issue here.  UUIDs are not guaranteed to have letters in 
all portions.  It's highly unlikely, but a UUID such as 
"a6dd52ba-18d1-11f1-ba97-665407384305" is possible.  Given this UUID, the code 
will return "a6dd52ba-18d1-11f1-ba97".  Per RFC9562, we can assume any provided 
request_id will be a UUID in string hex-and-dash format and thus will contain 
hyphens.

https://datatracker.ietf.org/doc/html/rfc9562#:~:text=When%20in%20use%20with%20URNs%20or%20as%20text%20in%20applications%2C%20any%20given%20UUID%20should%20be%20represented%20by%20the%20%22hex%2Dand%2Ddash%22%20string%20format%20consisting%20of%20multiple%20groups%20of%20uppercase%20or%20lowercase%20alphanumeric%20hexadecimal%20characters%20separated%20by%20single%20dashes/hyphens


http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc
File be/src/observe/span-manager.cc:

http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@36
PS1, Line 36: #include "observe/otel.h"
I don't want the span-manager.h/.cc files to have a dependency on the 
otel.h/.cc files.  It creates circular dependencies which make the code much 
more difficult to reason about.


http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@59
PS1, Line 59: static constexpr char const* ATTR_HTTP_REQUEST_ID = 
"HttpRequestId";
This attribute also needs to be set on the Init span.


http://gerrit.cloudera.org:8080/#/c/24008/1/be/src/observe/span-manager.cc@147
PS1, Line 147:     // Store all string attributes outside the map to ensure 
string lifetime.
Was there a specific issue you ran into that necessitated this code change? The 
contents of the key and value nostd::string_view are copied into new 
std::string objects while the Span object is being constructed.  The string 
creation happens in the generated common.pb.h file

inline void KeyValue::set_key(const char* value,
    size_t size) {

  key_.Set(::PROTOBUF_NAMESPACE_ID::internal::ArenaStringPtr::EmptyDefault{}, 
::std::string(
      reinterpret_cast<const char*>(value), size), GetArena());
  // 
@@protoc_insertion_point(field_set_pointer:opentelemetry.proto.common.v1.KeyValue.key)
}

inline void AnyValue::set_string_value(const char* value,
                             size_t size) {
  if (!_internal_has_string_value()) {
    clear_value();
    set_has_string_value();
    
value_.string_value_.UnsafeSetDefault(&::PROTOBUF_NAMESPACE_ID::internal::GetEmptyStringAlreadyInited());
  }
  value_.string_value_.Set(
      ::PROTOBUF_NAMESPACE_ID::internal::ArenaStringPtr::EmptyDefault{}, 
::std::string(
      reinterpret_cast<const char*>(value), size),
      GetArena());
  // 
@@protoc_insertion_point(field_set_pointer:opentelemetry.proto.common.v1.AnyValue.string_value)
}


http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py
File tests/custom_cluster/test_otel_trace.py:

http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py@448
PS1, Line 448:   def test_http_request_id_attribute(self):
Most of the code in this class could be replaced by `calling 
self.execute_query_expect_failure(self.hs2_http_client, query)` and then adding 
an optional parameter for the request id to the `assert_trace` function in 
`otel_trace.py` -- 
https://github.com/apache/impala/blob/24611f7dba77723a805518a09ca6e334c4794796/tests/util/otel_trace.py#L361


http://gerrit.cloudera.org:8080/#/c/24008/1/tests/custom_cluster/test_otel_trace.py@496
PS1, Line 496:
             :       http_request_id = 
trace.root_span.attributes["HttpRequestId"].value
             :
             :       # Verify UUID format (8-4-4-4-12 hex digits with hyphens, 
36 chars total)
             :       assert len(http_request_id) == 36, \
             :           "HttpRequestId should be 36 characters, got: 
{}".format(len(http_request_id))
             :
             :       parts = http_request_id.split('-')
             :       expected_parts_lengths = [8, 4, 4, 4, 12]
             :       assert len(parts) == len(expected_parts_lengths), \
             :           "HttpRequestId should have {} parts, got: {}".format(
             :               len(expected_parts_lengths), len(parts))
             :
             :       for i, expected_len in enumerate(expected_parts_lengths):
             :         assert len(parts[i]) == expected_len, \
             :             "HttpRequestId part {} should have length {}, got: 
{}".format(
             :                 i, expected_len, len(parts[i]))
             :         assert all(c in '0123456789abcdefABCDEF' for c in 
parts[i]), \
             :             "HttpRequestId part {} should contain only hex 
digits".format(i)
Please investigate if the uuid library's `UUID` function can do this validation 
for us.



--
To view, visit http://gerrit.cloudera.org:8080/24008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e14f5b503ff7379463332bae34c266afc395524
Gerrit-Change-Number: 24008
Gerrit-PatchSet: 1
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Comment-Date: Thu, 05 Mar 2026 23:47:30 +0000
Gerrit-HasComments: Yes

Reply via email to