seddonm1 commented on pull request #9523: URL: https://github.com/apache/arrow/pull/9523#issuecomment-786895577
@edrevo This is interesting, and I wonder if there are perhaps multiple issues here, although the nature of `RepartitionExec` makes it hard to isolate from other DataFusion components.

I ran many different scenarios yesterday to try to understand why `RepartitionExec::execute` was not spawning all of its tasks. My aim was to isolate the `tokio::spawn` calls from the crossbeam channels. I stripped the code back to the point where it was basically just spawning `num_output_partition` tasks that had a single `println!` as their only action. I completely removed all references to `channels` from both the core `0..num_input_partitions` loop and the `channels.empty()` check, and added delays after the spawning to try to rule out a race condition.

Consistently, the last `tokio::spawn` would not be invoked on the final loop iteration. I verified this was not a limit imposed by `tokio` by looping over `0..num_input_partitions + 1` instead: this time `num_input_partitions` tasks started correctly, but the new final iteration did not.

It appears to me (and I have only just started reading the `tokio` documentation) that the code making these `tokio::spawn` calls never yields to the scheduler, which would normally happen at an `.await`. As we are spawning 'start and forget' tasks (we don't want the blocking behavior), the scheduler never gets a chance to run that last spawned task, which is why adding `tokio::task::yield_now().await` works.

There may be a second problem, the one you have identified with `crossbeam` vs `tokio::mpsc`. To me at least, the answer there feels like switching to `tokio::mpsc`: the dependency is already imported, it is likely to be well tested with/by `tokio`, and the `tokio` runtime is unlikely to be replaced any time soon.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org