seddonm1 commented on pull request #9523:
URL: https://github.com/apache/arrow/pull/9523#issuecomment-786895577


   @edrevo This is interesting, and I wonder if there are perhaps multiple 
issues here, although the nature of `RepartitionExec` makes it hard to isolate 
from other DataFusion components.
   
   I ran many different scenarios yesterday to try to understand why 
`RepartitionExec::execute` was not spawning all of its tasks. My aim was to 
isolate the `tokio::spawn` calls from the crossbeam channels. I stripped it 
back to the point where it was basically just spawning `num_output_partition` 
tasks that had a single `println!` as their only action. I completely removed 
all references to `channels` from both the core `0..num_input_partitions` loop 
and `channels.empty()`, and added delays after the spawning to try to prevent a 
race condition. Consistently, the last `tokio::spawn` would not be invoked on 
the final loop iteration. (I verified this was not a limit on the `tokio` side 
by testing `0..num_input_partitions + 1`, which this time correctly started 
`num_input_partitions` tasks but not the new final iteration.)
   
   It appears to me (and I have only just started reading the `tokio` 
documentation) that these `tokio::spawn` calls are not yielding to the 
scheduler, as would normally happen via an `.await` call. As we are spawning 
'fire and forget' tasks (we don't want the blocking behavior), the scheduler 
does not get a chance to run that last `tokio::spawn`, which would explain why 
adding `tokio::task::yield_now().await` works.
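
   To make the hypothesis concrete, here is a minimal sketch of a toy 
single-threaded cooperative scheduler written with `std` only. It is not 
`tokio` and not DataFusion code: `block_on`, `YieldNow`, `run` and the queue 
are hypothetical names made up for illustration, and `tokio`'s multi-threaded 
runtime can start a spawned task on another worker without the spawner 
yielding. It only shows how, in a cooperative model, a spawned task may never 
get a turn unless the spawner yields.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: the toy scheduler simply re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for a yield point: pending once, then ready.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Task = Pin<Box<dyn Future<Output = ()>>>;
type Queue = Rc<RefCell<VecDeque<Task>>>;

/// Drive `main` to completion. Queued tasks only get a turn while `main`
/// is pending, i.e. after it has yielded.
fn block_on(mut main: Task, queue: &Queue) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while main.as_mut().poll(&mut cx).is_pending() {
        let queued = queue.borrow().len();
        for _ in 0..queued {
            let next = queue.borrow_mut().pop_front();
            if let Some(mut task) = next {
                if task.as_mut().poll(&mut cx).is_pending() {
                    queue.borrow_mut().push_back(task);
                }
            }
        }
    }
    // `main` has finished: anything still queued is dropped without running.
    queue.borrow_mut().clear();
}

/// Spawn three logging tasks, optionally yield, and return the log.
fn run(yield_after_spawn: bool) -> Vec<String> {
    let queue: Queue = Rc::new(RefCell::new(VecDeque::new()));
    let log = Rc::new(RefCell::new(Vec::new()));
    let (q, l) = (queue.clone(), log.clone());
    let main: Task = Box::pin(async move {
        for i in 0..3 {
            let l = l.clone();
            // "Spawning" only enqueues the task; nothing runs yet.
            q.borrow_mut().push_back(Box::pin(async move {
                l.borrow_mut().push(format!("task {i}"));
            }));
        }
        if yield_after_spawn {
            YieldNow(false).await;
        }
        l.borrow_mut().push("spawner done".to_string());
    });
    block_on(main, &queue);
    let out = log.borrow().clone();
    out
}

fn main() {
    println!("without yield: {:?}", run(false));
    println!("with yield:    {:?}", run(true));
}
```

   In this toy model, `run(false)` logs only the spawner, while `run(true)` 
lets all three tasks run before the spawner continues. Whether the real 
runtime behaves this way in the failing case is exactly the open question.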
   
   There may be a second problem, which you have identified, with `crossbeam` 
vs `tokio::sync::mpsc`. To me at least, the answer feels like switching to 
`tokio::sync::mpsc`, as the dependency is already imported and is likely to be 
well tested with/by `tokio`, and the `tokio` runtime is unlikely to be replaced 
any time soon.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

