bzp2010 commented on code in PR #12686:
URL: https://github.com/apache/apisix/pull/12686#discussion_r2742044413
##########
apisix/ssl/router/radixtree_sni.lua:
##########
@@ -177,9 +179,10 @@ function _M.match_and_set(api_ctx, match_only, alt_sni)
-- with it sometimes
core.log.error("failed to find any SSL certificate by SNI: ", sni)
end
+ tracer.finish(api_ctx.ngx_ctx, tracer.status.ERROR, "failed match SNI")
return false
end
-
+ tracer.finish(api_ctx.ngx_ctx)
Review Comment:
Strictly, this API design is still strange, with many internal details
hidden within the stack data structure. For developers, it presents an opaque
data structure.
If any span of code begins with `tracer.start` but fails to conclude with
`tracer.finish`, the entire tracing process will encounter an error. This
unclosed span will encompass all subsequent spans, which is not the intended
behavior.
If we examine OTEL implementations in other major programming languages, we
find that many of them follow the following pattern:
Use the tracer's method to create a span, then obtain an instance of that
span. After internal operations complete, use the span instance's methods to
actively terminate the specified span. The ctx will be used to track the
current span, ensuring that parent-child relationships are correctly maintained
when creating child spans.
```lua
-- context-based
local parent_span = tracer.start(ctx, "parent")
local child_span = tracer.start(ctx, "child")
child_span:end()
parent_span:end()
```
ref:
https://opentelemetry.io/docs/languages/go/instrumentation/#create-nested-spans
ref:
https://opentelemetry.io/docs/languages/php/instrumentation/#create-nested-spans
ref:
https://opentelemetry.io/docs/languages/js/instrumentation/#create-nested-spans
ref: https://opentelemetry.io/docs/languages/java/api/#span
ref:
https://opentelemetry.io/docs/languages/cpp/instrumentation/#create-nested-spans
Alternatively, explicitly pass the parent span instance to the child span to
achieve precise hierarchical control.
```lua
-- pass span directly
local parent_span = tracer.start("parent")
local child_span = tracer.start("child", parent_span)
child_span:end()
parent_span:end()
```
Note that the following Rust code typically terminates spans automatically
via drop calls when the span instance's variable scopes end.
ref:
https://docs.rs/tracing/latest/tracing/span/index.html#span-relationships
None of the OTEL libraries for these languages organize and manage the
entire span stack using only global state. Each one terminates spans by
returning span instances and requiring developers to explicitly call
`span.end()` when operations conclude.
Our current approach requires any developer writing trace code to be
familiar with the stack mechanism we've created and to have complete clarity
about where their spans appear in the stack, whether parent spans exist,
whether they've closed as expected, or any other state. Otherwise, they cannot
write correct code. Therefore, I disagree with the current API design.
It's all too easy to make mistakes like the one below.
```lua
tracer.start("parent")
tracer.start("child")
tracer.finish()
-- missing last tracer.finish
```
The result is an output issue, because regardless of what you do, calling
`finish_all` will terminate the span. However, the parent-child relationships
and timestamps for all subsequent spans will be incorrect. This won't trigger
any obvious error messages, making it extremely difficult for users to debug.
---
I have already pointed this out in
https://github.com/apache/apisix/pull/12686#discussion_r2471659658, and
@membphis has clearly reiterated this point through images.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]