bujjibabukatta opened a new pull request, #68426:
URL: https://github.com/apache/airflow/pull/68426
Closes #68417
## Problem
When you use a mapped task group (e.g.
`divide_and_report.expand(i=gen_examples())`),
tasks inside the group should evaluate their trigger rules **per index**. so
if
`divide(0)` fails, only `report_success(0)` should become `upstream_failed`,
while
`report_success(1)`, `(2)`, `(3)` still run successfully.
This was working correctly until PR #59691 landed. That PR fixed a real bug
in XCom
resolution (downstream tasks incorrectly referencing a placeholder that no
longer
existed after upstream expansion), but it accidentally broke trigger-rule
evaluation
as a side effect.
### What went wrong ?
When an upstream task expands (its summary placeholder at `map_index=-1`
gets replaced
by real instances at `0, 1, 2, ...`), PR #59691 introduced a rewrite that
maps the
downstream placeholder's dependency from `-1` to `0`. This rewrite is
correct for
XCom resolution, but it was also kicking in during **trigger-rule
evaluation** of the
downstream summary task instance (the unexpanded placeholder, also at
`map_index=-1`).
The result: the summary instance of `report_success` would evaluate its
`ALL_SUCCESS`
trigger rule against **only** `divide(0)` instead of all divide instances.
Since
`divide(0)` failed, `report_success(-1)` got marked `upstream_failed` before
it ever
had a chance to expand. Once the summary is `upstream_failed`, all its
expanded
instances (`0`, `1`, `2`, `3`) inherit that state so everything ends up
`upstream_failed` instead of just index `0`.
Before the fix:
report_success: {0: upstream_failed, 1: upstream_failed, 2: upstream_failed,
3: upstream_failed} ❌
After the fix:
report_success: {0: upstream_failed, 1: success, 2: success, 3: success} ✅
## The Fix
The fix is a small, targeted change in `trigger_rule_dep.py`.
When evaluating the trigger rule for a **summary task instance** (`map_index
< 0`)
inside a mapped task group, and the upstream task is **not** an expansion
dependency
(i.e. it lives inside the same group), we now skip the
placeholder-to-index-0 rewrite
entirely.
- For **fast-triggered rules** (`ONE_SUCCESS`, `ONE_FAILED`, `ONE_DONE`):
behavior is
unchanged we return `None` so all upstream instances are considered,
letting the
rule fire as soon as one instance meets the condition.
- For **all other rules** (`ALL_SUCCESS`, `NONE_FAILED`, etc.): we return
the summary
map index (`-1`) directly. After the upstream has expanded, its `-1`
placeholder no
longer exists in the database, so the trigger rule sees zero relevant
instances and
passes. The summary task instance is then free to expand, and each
resulting
per-index instance evaluates its trigger rule independently against the
correct
upstream instance.
The XCom resolution path is completely unaffected because it calls
`get_relevant_upstream_map_indexes` directly, not through the trigger-rule
evaluation
code path.
## Tests
The following existing tests cover this fix and were broken before this
change:
- `test_one_failed_trigger_rule_in_mapped_task_group_is_per_index`
verifies that a
single upstream failure only affects the downstream instance at the same
index
-
`test_one_failed_trigger_rule_runs_on_indirect_failure_in_mapped_task_group`
verifies that `ONE_FAILED` still fires correctly before expansion
- `test_downstream_placeholder_handles_upstream_post_expansion` verifies
that the
XCom resolution rewrite still works correctly and is unaffected by this
change
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]