Re: [I] Experiment: immediate-mode shuffle [datafusion-comet]

via GitHub Tue, 31 Mar 2026 07:45:15 -0700


andygrove commented on issue #3855:
URL: 
https://github.com/apache/datafusion-comet/issues/3855#issuecomment-4163190297


   Gluten follows this approach:
   
   - Pre-allocate one set of column buffers per partition
   - Scatter-write rows from each input batch directly into partition buffers
     using raw pointers
   - Accumulate across multiple input batches until buffers fill
   - Evict (flush) full buffers and reuse the allocation
   - Only resize when the estimated size changes beyond a threshold
   
   This avoids a lot of intermediate allocations.
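
   The steps above can be sketched roughly as follows. This is a minimal
   illustration in Rust, not Gluten's or Comet's actual code: the
   `PartitionBuffers` type, its method names, the single `i64` column, and the
   25% resize threshold are all assumptions made for the example. It also uses
   safe `Vec::push` where Gluten writes through raw pointers, and an in-memory
   `output` vector stands in for the real shuffle sink.

```rust
/// Hypothetical sketch of per-partition buffering for a single i64 column.
pub struct PartitionBuffers {
    /// One reusable buffer per partition, allocated once up front.
    buffers: Vec<Vec<i64>>,
    /// Number of rows a buffer holds before it is evicted.
    capacity: usize,
    /// Stand-in for each partition's shuffle output (assumption for the demo).
    output: Vec<Vec<i64>>,
    /// Count of non-empty evictions performed so far.
    evictions: usize,
}

impl PartitionBuffers {
    /// Pre-allocate one buffer per partition.
    pub fn new(num_partitions: usize, capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            buffers: (0..num_partitions)
                .map(|_| Vec::with_capacity(capacity))
                .collect(),
            capacity,
            output: vec![Vec::new(); num_partitions],
            evictions: 0,
        }
    }

    /// Scatter one input batch: row i goes to partition `partition_ids[i]`.
    /// Rows accumulate across calls; a buffer is evicted as soon as it fills.
    pub fn insert_batch(&mut self, values: &[i64], partition_ids: &[usize]) {
        assert_eq!(values.len(), partition_ids.len());
        for (&value, &p) in values.iter().zip(partition_ids) {
            self.buffers[p].push(value);
            if self.buffers[p].len() == self.capacity {
                self.evict(p);
            }
        }
    }

    /// Flush a buffer to the output and reuse its allocation.
    fn evict(&mut self, p: usize) {
        if self.buffers[p].is_empty() {
            return;
        }
        self.output[p].extend_from_slice(&self.buffers[p]);
        self.buffers[p].clear(); // clear() keeps the allocated capacity
        self.evictions += 1;
    }

    /// Flush whatever is still buffered, e.g. at end of input.
    pub fn finish(&mut self) {
        for p in 0..self.buffers.len() {
            self.evict(p);
        }
    }

    /// Resize only when the estimate differs from the current capacity by
    /// more than 25% (an arbitrary threshold chosen for this sketch).
    /// Returns true if the buffers were resized.
    pub fn maybe_resize(&mut self, estimated_rows: usize) -> bool {
        let diff = estimated_rows.abs_diff(self.capacity);
        if estimated_rows == 0 || diff * 4 <= self.capacity {
            return false;
        }
        self.finish(); // buffers are empty after this
        for buf in &mut self.buffers {
            if estimated_rows > buf.capacity() {
                buf.reserve(estimated_rows);
            } else {
                buf.shrink_to(estimated_rows);
            }
        }
        self.capacity = estimated_rows;
        true
    }

    pub fn output(&self, p: usize) -> &[i64] {
        &self.output[p]
    }

    pub fn buffered(&self, p: usize) -> usize {
        self.buffers[p].len()
    }

    pub fn evictions(&self) -> usize {
        self.evictions
    }
}
```

   In this sketch the only allocations on the hot path are the initial
   per-partition buffers and any growth of the stand-in output; each input
   batch is scattered straight into existing buffers without building
   per-batch, per-partition intermediates.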


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


