ashb commented on code in PR #68591:
URL: https://github.com/apache/airflow/pull/68591#discussion_r3419095658


##########
shared/observability/src/airflow_shared/observability/traces/__init__.py:
##########
@@ -61,15 +62,51 @@ def generate_trace_id(self):
 
 
 def new_dagrun_trace_carrier(task_span_detail_level=None) -> dict[str, str]:
-    """Generate a fresh W3C traceparent carrier without creating a recordable 
span."""
+    """
+    Generate a fresh W3C traceparent carrier without creating a recordable 
span.
+
+    The SAMPLED flag is set from an honest *root* sampling decision made by the
+    configured tracer provider's sampler (driven by ``OTEL_TRACES_SAMPLER`` /
+    ``OTEL_TRACES_SAMPLER_ARG``), rather than being hardcoded. This makes the
+    carrier the single head-sampling decision point for a DAG run: every
+    downstream span (dag_run, task_run, worker) rides on this flag.
+
+    Backcompat: when ``OTEL_TRACES_SAMPLER`` is unset the SDK defaults to
+    ``parentbased_always_on`` whose root decision is ALWAYS_ON, so the flag is
+    SAMPLED and behavior is identical to the previous hardcoded value.
+    """
     gen = RandomIdGenerator()
-    trace_state_entries = build_trace_state_entries(task_span_detail_level)
+    trace_id = gen.generate_trace_id()
+
+    provider = trace.get_tracer_provider()
+    sampler = getattr(provider, "sampler", None)
+    if sampler is not None:
+        result = sampler.should_sample(
+            parent_context=None,  # root decision
+            trace_id=trace_id,
+            name="dag_run",
+        )
+        sampled = result.decision == Decision.RECORD_AND_SAMPLE
+        sampler_trace_state = result.trace_state
+    else:
+        # No sampler attribute means a proxy/no-op provider (otel disabled).
+        # Nothing exports in that case, so the flag is irrelevant; mirror the
+        # observable behavior of today when otel is off.
+        sampled = False

Review Comment:
   Maybe I'm misunderstanding, but doesn't sampled=false mean this is filtering 
out?



##########
shared/observability/src/airflow_shared/observability/traces/__init__.py:
##########
@@ -61,15 +62,51 @@ def generate_trace_id(self):
 
 
 def new_dagrun_trace_carrier(task_span_detail_level=None) -> dict[str, str]:
-    """Generate a fresh W3C traceparent carrier without creating a recordable 
span."""
+    """
+    Generate a fresh W3C traceparent carrier without creating a recordable 
span.
+
+    The SAMPLED flag is set from an honest *root* sampling decision made by the
+    configured tracer provider's sampler (driven by ``OTEL_TRACES_SAMPLER`` /
+    ``OTEL_TRACES_SAMPLER_ARG``), rather than being hardcoded. This makes the
+    carrier the single head-sampling decision point for a DAG run: every
+    downstream span (dag_run, task_run, worker) rides on this flag.
+
+    Backcompat: when ``OTEL_TRACES_SAMPLER`` is unset the SDK defaults to
+    ``parentbased_always_on`` whose root decision is ALWAYS_ON, so the flag is
+    SAMPLED and behavior is identical to the previous hardcoded value.

Review Comment:
   Don't think this belongs in the doc strings, not relevant after this pr is 
merged



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to