nathadfield commented on code in PR #67285:
URL: https://github.com/apache/airflow/pull/67285#discussion_r3401685434


##########
airflow-core/src/airflow/assets/manager.py:
##########
@@ -691,6 +746,27 @@ def _get_or_create_apdr(
                 .limit(1)
             )
             if latest_apdr and latest_apdr.created_dag_run_id is None:
+                existing_partition_date = latest_apdr.partition_date
+                if (
+                    existing_partition_date is not None
+                    and target_partition_date is not None
+                    and existing_partition_date != target_partition_date
+                ):
+                    # Two contributing events carry conflicting partition_dates for the same
+                    # (target_key, target_dag). Choosing one would be order-dependent, so
+                    # suppress: the consumer DagRun gets partition_date=None rather than a
+                    # wrong, unstable value.
+                    log.warning(
+                        "Conflicting partition_date carried for the same target key; "
+                        "suppressing it so the consumer DagRun's partition_date is None. "
+                        "The producing assets likely disagree on the partition's datetime.",
+                        target_dag_id=target_dag.dag_id,
+                        target_key=target_key,
+                        existing_partition_date=existing_partition_date,
+                        incoming_partition_date=target_partition_date,
+                    )
+                    latest_apdr.partition_date = None

Review Comment:
   Good question, but yes, I think we do.
   
   Nulling the APDR is what makes the DagRun's `partition_date` None: at run 
creation the scheduler passes `apdr.partition_date` into 
`_resolve_partition_date` as the carried date, and for identity-mapped 
partitions the resolver returns that carried value. If we left the first 
event's date on the APDR, nothing would tell the scheduler that a conflict had 
happened, so the run would get the first date anyway. The conflicting values 
are still preserved in the warning log. We could add a separate conflict flag 
instead, but that felt heavier than letting the row own its resolved value.
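
   To make the flow concrete, here is a minimal standalone sketch of the idea, 
not the Airflow code itself. `PendingRow`, `suppress_on_conflict`, and 
`resolve_identity_partition_date` are hypothetical stand-ins for the APDR row, 
the conflict branch in this hunk, and the identity-mapped case of the resolver. 
It only models the conflict branch; how non-conflicting dates get written to 
the row is outside this hunk and not shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PendingRow:
    """Hypothetical stand-in for an APDR row with no DagRun created yet."""

    partition_date: Optional[datetime] = None


def suppress_on_conflict(row: PendingRow, incoming: Optional[datetime]) -> bool:
    """Null the row's date when it disagrees with the incoming one.

    Returns True if a conflict was detected and the date was suppressed.
    """
    existing = row.partition_date
    if existing is not None and incoming is not None and existing != incoming:
        row.partition_date = None
        return True
    return False


def resolve_identity_partition_date(carried: Optional[datetime]) -> Optional[datetime]:
    """Assumed identity-mapped behaviour: the resolver returns the carried date."""
    return carried


first = datetime(2024, 1, 1)
second = datetime(2024, 1, 2)

# With suppression, the row is the only signal the run-creation step needs.
row = PendingRow(partition_date=first)
conflicted = suppress_on_conflict(row, second)
run_partition_date = resolve_identity_partition_date(row.partition_date)

# Without suppression, the first event's date would flow through unchanged.
untouched = PendingRow(partition_date=first)
unsuppressed_date = resolve_identity_partition_date(untouched.partition_date)
```

   In the sketch, `run_partition_date` ends up None while `unsuppressed_date` 
is still the first event's date, which is the order-dependent outcome the 
suppression avoids.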



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.