SameerMesiah97 commented on PR #59691:
URL: https://github.com/apache/airflow/pull/59691#issuecomment-4558518136

   @Quantum0uasar
   
   Thanks for the review. I have added another phase in the existing test to 
cover the initial state where the upstream placeholder has not yet been 
expanded.
   
   A few clarifications on the other points:
   
   1. I do agree that this DB query is in a scheduler-sensitive codepath and 
will therefore run frequently. However, it is a very narrow query scoped to a 
single DAG run, a single upstream/downstream task pair, and map indexes `-1` 
and `0`. Given the very small result set involved here, I do not currently 
think introducing caching/state-management complexity would be worth it without 
evidence that this lookup is a measurable bottleneck.
   
   2. I agree the helper would likely work fine without the third condition 
since it is implied by the second condition. However, this is a fairly subtle 
placeholder-transition edge case, so I think the explicitness is beneficial for 
readability and future maintenance unless there is a stronger rationale for 
simplifying it.
   
   3. I believe the trigger rule discussion is orthogonal to this PR. This fix 
is focused on resolving the correct upstream mapped task instance during 
placeholder transition handling, not determining state resolution semantics for 
the task instances themselves. Trigger rule evaluation lives within 
`trigger_rule_dep.py`, which is a bit outisde the scope of this PR. 
   
   4. In this implementation, we are intentionally only considering the 
upstream placeholder (`-1`) and its immediate post-expansion successor (`0`). 
The goal is specifically to handle the transition where the upstream 
placeholder no longer exists but the downstream task is still referencing 
placeholder state. Matching against arbitrary expanded indexes (`>0`) would not 
reflect the semantics of the downstream placeholder task instance correctly.
   
   5. I think there may be a misunderstanding around the caching example 
provided. The snippet appears to broaden the query scope to all mapped task 
instances for the DAG run rather than cache previously resolved state, which I 
suspect would increase the DB query cost rather than reduce it. If I am 
misunderstanding the intent there, please clarify.
   
   6. I think you may have been looking at an earlier commit. The helper is now 
called `_should_use_post_expansion_placeholder`, and I believe the current 
docstring describes the intended contract clearly. The issue has been linked in 
the PR description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to