Dandandan opened a new pull request, #20764:
URL: https://github.com/apache/datafusion/pull/20764
## Which issue does this PR close?
- Closes #.
## Rationale for this change
The current implementation sends error messages to output partitions
sequentially in a `for` loop, awaiting each `send()` before the next one
starts. This can be inefficient when there are many output partitions. With
`futures::future::join_all()`, the sends run concurrently instead: all of them
are polled within the same task, so one send that is still pending does not
hold up the others. This is concurrency, not multi-threaded parallelism.
This change should reduce the total time spent notifying all output
partitions of an error or of completion.
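
To illustrate the difference between the two execution models, here is a minimal, std-only sketch. It is not the code in this PR. `SlowSend`, `JoinAllSketch`, `block_on_counting`, `sequential_rounds` and `concurrent_rounds` are hypothetical names invented for the illustration: `SlowSend` stands in for a channel `send()` that needs a few polls before it completes, and `JoinAllSketch` is a hand-rolled stand-in for `futures::future::join_all` so the example has no external dependencies. The number of executor polls is used as a rough proxy for elapsed time.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the toy executor below simply re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for `send()`: returns `Pending` `remaining` times,
/// then `Ready`.
struct SlowSend {
    remaining: u32,
}

impl Future for SlowSend {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            return Poll::Ready(());
        }
        self.remaining -= 1;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

type BoxedSend = Pin<Box<dyn Future<Output = ()>>>;

/// Hand-rolled stand-in for `join_all` (assumption: same idea, much simpler).
/// Every unfinished future is polled once per round.
struct JoinAllSketch {
    slots: Vec<Option<BoxedSend>>,
}

impl Future for JoinAllSketch {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut all_done = true;
        for slot in self.slots.iter_mut() {
            let finished = match slot {
                Some(fut) => fut.as_mut().poll(cx).is_ready(),
                None => true,
            };
            if finished {
                *slot = None; // never poll a completed future again
            } else {
                all_done = false;
            }
        }
        if all_done { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// Toy executor: polls `fut` to completion and returns how many polls it took.
fn block_on_counting<F: Future<Output = ()>>(fut: F) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut rounds = 0;
    loop {
        rounds += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return rounds;
        }
    }
}

/// Sequential model: each send is awaited before the next one starts.
pub fn sequential_rounds(n: usize, delay: u32) -> u32 {
    block_on_counting(async move {
        for _ in 0..n {
            SlowSend { remaining: delay }.await;
        }
    })
}

/// Concurrent model: all sends are in flight at once and polled together.
pub fn concurrent_rounds(n: usize, delay: u32) -> u32 {
    let slots: Vec<Option<BoxedSend>> = (0..n)
        .map(|_| Some(Box::pin(SlowSend { remaining: delay }) as BoxedSend))
        .collect();
    block_on_counting(JoinAllSketch { slots })
}
```

Under these assumptions, `sequential_rounds(3, 2)` returns 7 while `concurrent_rounds(3, 2)` returns 3: in the sequential model the waits add up across sends, whereas in the concurrent model they overlap. How much this matters in practice depends on how often a real `send()` actually has to wait.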
## What changes are included in this PR?
- Added the `use futures::future::join_all;` import
- Refactored three error/completion handling paths in
  `RepartitionExec::run_input_partition()`, each changed from a sequential
  loop to concurrent sends using `join_all()`:
  1. Join error handling
  2. Input task error handling
  3. Successful completion
- Each path now uses `txs.into_values().map()` to map each channel sender
  to a future that performs the send; `join_all()` then runs these futures
  concurrently
## Are these changes tested?
The changes are covered by existing tests in the DataFusion test suite. The
refactoring maintains the same functional behavior (all error messages and
completion signals are still sent to all output partitions), only changing the
execution model from sequential to concurrent.
## Are there any user-facing changes?
No user-facing changes. This is an internal optimization to the physical
execution layer that improves performance without changing the external API or
behavior.
https://claude.ai/code/session_01GDTBavJzih6tSSBd9SRNmk
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]