SameerMesiah97 commented on PR #59691: URL: https://github.com/apache/airflow/pull/59691#issuecomment-4558518136
@Quantum0uasar Thanks for the review. I have added another phase in the existing test to cover the initial state where the upstream placeholder has not yet been expanded. A few clarifications on the other points: 1. I do agree that this DB query is in a scheduler-sensitive codepath and will therefore run frequently. However, it is a very narrow query scoped to a single DAG run, a single upstream/downstream task pair, and map indexes `-1` and `0`. Given the very small result set involved here, I do not currently think introducing caching/state-management complexity would be worth it without evidence that this lookup is a measurable bottleneck. 2. I agree the helper would likely work fine without the third condition since it is implied by the second condition. However, this is a fairly subtle placeholder-transition edge case, so I think the explicitness is beneficial for readability and future maintenance unless there is a stronger rationale for simplifying it. 3. I believe the trigger rule discussion is orthogonal to this PR. This fix is focused on resolving the correct upstream mapped task instance during placeholder transition handling, not determining state resolution semantics for the task instances themselves. Trigger rule evaluation lives within `trigger_rule_dep.py`, which is a bit outisde the scope of this PR. 4. In this implementation, we are intentionally only considering the upstream placeholder (`-1`) and its immediate post-expansion successor (`0`). The goal is specifically to handle the transition where the upstream placeholder no longer exists but the downstream task is still referencing placeholder state. Matching against arbitrary expanded indexes (`>0`) would not reflect the semantics of the downstream placeholder task instance correctly. 5. I think there may be a misunderstanding around the caching example provided. The snippet appears to broaden the query scope to all mapped task instances for the DAG run rather than cache previously resolved state, which I suspect would increase the DB query cost rather than reduce it. If I am misunderstanding the intent there, please clarify. 6. I think you may have been looking at an earlier commit. The helper is now called `_should_use_post_expansion_placeholder`, and I believe the current docstring describes the intended contract clearly. The issue has been linked in the PR description. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
