milenkovicm commented on issue #3941:
URL: 
https://github.com/apache/arrow-datafusion/issues/3941#issuecomment-1292030069

   Thanks for your comment, @yjshen.
   
   The concern raised in the previous comments is not the spill algorithm for 
aggregation; it is the interaction between `non-async` and `async` code.
   
   As currently implemented, `GroupedHashAggregateStreamV2` lives in the 
non-async world, but the memory manager / memory consumer expose `async` 
methods, which can't be awaited from `poll_next`: 
   
   ```rust
   impl Stream for GroupedHashAggregateStreamV2 {
       type Item = ArrowResult<RecordBatch>;
   
       fn poll_next(
           mut self: std::pin::Pin<&mut Self>,
           cx: &mut Context<'_>,
       ) -> Poll<Option<Self::Item>> {
           let this = &mut *self;
           // does not compile: `await` is only allowed inside `async` functions and blocks
           this.mem_manager.try_grow(200).await;
           // rest of the code which does aggregation and spill
       }
   }
   ```
   So the question is how to bridge that gap.
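
One possible direction, as a rough sketch rather than a proposal for the actual code: keep the in-flight reservation future as a field of the stream and poll it by hand from `poll_next`, returning `Poll::Pending` whenever it is not ready yet. Everything below is hypothetical and uses only the standard library: `MemManager`, `AggStream`, `YieldOnce` and `drive` are stand-ins invented for illustration, `try_grow` only mimics the `async` shape of the real method, and `poll_next` is an inherent method here because the `Stream` trait is not part of `std`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that returns `Pending` once, so the reservation really suspends.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref(); // ask to be polled again
        Poll::Pending
    }
}

/// Hypothetical stand-in for the memory manager (not DataFusion's type).
struct MemManager {
    limit: usize,
    used: AtomicUsize,
}
impl MemManager {
    /// Takes `Arc<Self>` so the returned future owns everything it needs.
    async fn try_grow(self: Arc<Self>, bytes: usize) -> Result<(), String> {
        YieldOnce(false).await;
        let prev = self.used.fetch_add(bytes, Ordering::SeqCst);
        if prev + bytes > self.limit {
            self.used.fetch_sub(bytes, Ordering::SeqCst);
            return Err(format!("cannot reserve {} bytes", bytes));
        }
        Ok(())
    }
}

type GrowFuture = Pin<Box<dyn Future<Output = Result<(), String>> + Send>>;

/// Hypothetical poll-based stream; items are reserved byte counts, not batches.
struct AggStream {
    mem_manager: Arc<MemManager>,
    pending_grow: Option<GrowFuture>,
    remaining_batches: usize,
}
impl AggStream {
    fn poll_next(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Result<usize, String>>> {
        let this = &mut *self;
        if this.remaining_batches == 0 {
            return Poll::Ready(None);
        }
        // Create the reservation future once and keep it across polls.
        if this.pending_grow.is_none() {
            let fut: GrowFuture = Box::pin(Arc::clone(&this.mem_manager).try_grow(200));
            this.pending_grow = Some(fut);
        }
        let fut = this.pending_grow.as_mut().expect("set above");
        match fut.as_mut().poll(cx) {
            // Not ready: propagate `Pending`; the stored future resumes next poll.
            Poll::Pending => Poll::Pending,
            Poll::Ready(res) => {
                this.pending_grow = None;
                match res {
                    Ok(()) => {
                        // Aggregation (and spilling) would happen here.
                        this.remaining_batches -= 1;
                        Poll::Ready(Some(Ok(200)))
                    }
                    Err(e) => {
                        this.remaining_batches = 0; // end the stream after an error
                        Poll::Ready(Some(Err(e)))
                    }
                }
            }
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the stream in a busy loop; returns the items and the `Pending` count.
fn drive(limit: usize, batches: usize) -> (Vec<Result<usize, String>>, usize) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut stream = AggStream {
        mem_manager: Arc::new(MemManager { limit, used: AtomicUsize::new(0) }),
        pending_grow: None,
        remaining_batches: batches,
    };
    let (mut items, mut pending_polls) = (Vec::new(), 0);
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Pending => pending_polls += 1,
            Poll::Ready(Some(item)) => items.push(item),
            Poll::Ready(None) => return (items, pending_polls),
        }
    }
}
```

The trade-off in this sketch is that the future has to own its inputs (hence the `Arc`), because a future borrowing from the stream could not be stored inside that same stream. Another option would be to write the whole aggregation as an `async` block and wrap that into a stream, but that would be a larger restructuring.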
   

