pepijnve commented on PR #16322:
URL: https://github.com/apache/datafusion/pull/16322#issuecomment-2961538845

   > Previously, as soon as one stream returned Pending, the merge would 
short-circuit and return Pending, minimizing work per cycle. With the new 
approach, we now poll all N streams once before giving up, which can add extra 
CPU cost, especially when you have high parallelism
   
   @zhuqi-lucas this is a really interesting topic and tricky tradeoff. The 
previous code would indeed do one poll per cycle and then do a self-wake yield. 
The new code polls each stream once and then yields without self-wake.
   In terms of the amount of work done there is no difference, not any extra 
CPU cost. On the contrary, the self-wake yield is removal of pure waste. Unless 
I'm missing something, total elapsed time should be less.
   
   But the tradeoff is that we're going to extend the elapsed time per cycle. 
As a consequence it may take a bit longer to yield to the caller, and then the 
cancellation problem rears its ugly head again.
   
   If this actually matters or not is entirely dependent on the streams being 
polled.
   SortPreservingMergeExec for instance lets things up to use 
`RecordBatchReceiverStream` instances as children. Polling those repeatedly 
until the first record batch arrives is quite wasteful because you're really 
just spinning in place. Checking each receiver in a loop is going to be super 
quick, and yielding in between every partition is not useful at all.
   If the child streams are blocking though, then it's a different matter. You 
probably don't want to actually drive the progress of each stream synchronously 
in a loop.
   
   So... it's a complicated balancing act. Would love to hear how others look 
at this problem.
   PR #16319 ties into this. It's an experiment I'm doing to see if we can 
avoid exposing the potentially blocking portion of streams to the caller so 
that the problem described above kind of disappears. It's not yet clear if this 
can be achieved without a performance penalty.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to