dstandish opened a new pull request, #68688:
URL: https://github.com/apache/airflow/pull/68688

   ## Summary
   
   When resolving the map length of a mapped upstream, `get_task_map_length` 
(for `SchedulerPlainXComArg`) issued **two sequential queries** once all 
expanded TIs had finished:
   
   1. an `EXISTS` check for any still-unfinished expanded task instance, and
   2. a `COUNT` of the produced XCom outputs.
   
   The scheduler evaluates this lookup repeatedly while deciding whether and 
how to expand downstream mapped tasks. This PR combines the two queries into a 
**single round-trip** using `select(exists(...), count_subquery)` and reads 
both values from one row.
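
As a rough illustration of the query shape only (not the code in this PR), the sketch below issues one statement that returns both the `EXISTS` flag and the count in a single row. It uses the standard-library `sqlite3` module rather than SQLAlchemy, and the `ti` and `xcom` tables, their columns, the set of finished states, and the `make_demo_db` and `map_length` helpers are all hypothetical simplifications, not Airflow's actual schema or API.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical schema, not Airflow's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ti (task_id TEXT, map_index INTEGER, state TEXT);
        CREATE TABLE xcom (task_id TEXT, map_index INTEGER, value TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO ti VALUES (?, ?, ?)",
        [("up", 0, "success"), ("up", 1, "running")],
    )
    conn.execute("INSERT INTO xcom VALUES ('up', 0, 'a')")
    return conn


def map_length(conn: sqlite3.Connection, task_id: str):
    """Return the XCom count, or None while any expanded TI is unfinished.

    Both values come back in one row, so only one round-trip is made.
    """
    unfinished, produced = conn.execute(
        """
        SELECT
            EXISTS (
                SELECT 1 FROM ti
                WHERE task_id = :task_id
                  AND (state IS NULL
                       OR state NOT IN ('success', 'failed', 'skipped'))
            ),
            (SELECT COUNT(*) FROM xcom WHERE task_id = :task_id)
        """,
        {"task_id": task_id},
    ).fetchone()
    # The count is computed either way, but it is only trusted once
    # no unfinished task instance remains.
    return None if unfinished else produced


if __name__ == "__main__":
    demo = make_demo_db()
    print(map_length(demo, "up"))  # one TI is still running
```

The count subquery runs even when the result will be discarded, which trades a small amount of database work for one fewer round-trip on the path where the upstream is ready.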
   
   ## Semantics
   
   Unchanged. A mapped upstream that still has unfinished TIs continues to 
report `None` (length not yet known) rather than a partial count — the boolean 
from the `EXISTS` column is checked exactly as before; only the number of 
round-trips changes. The non-mapped branch (reading `TaskMap.length`) is 
untouched.
   
   The previously separate `exists_query` helper import is dropped, since this 
was its only use in this module.
   
   ## Tests
   
   Existing mapped-expansion coverage passes locally — 
`test_mappedoperator.py`, `test_taskmap.py`, and the mapped/expand subset of 
`test_dagrun.py` (covering the not-ready → `None` path, the ready → count path, 
and re-expansion when the upstream length changes).
   
   > [!NOTE]
   > Draft for review. Internal optimization on the mapped-task scheduling 
path; no user-facing behavior change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
