JanKaul commented on issue #5882: URL: https://github.com/apache/arrow-rs/issues/5882#issuecomment-2335133637
As described in [this blog post](https://ryhl.io/blog/async-what-is-blocking/), tokio already has two thread pools: one for normal async tasks and one for blocking tasks. You can run work on the blocking thread pool by calling the [spawn_blocking](https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html) function. If you set the number of blocking threads with [max_blocking_threads](https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html#method.max_blocking_threads) equal to the number of available CPU threads, this should work rather well for CPU-bound tasks. I think this could be used as an internal thread pool for DataFusion's CPU-bound work without requiring too much engineering overhead. If we identify the CPU-intensive sections of DataFusion and use `spawn_blocking` combined with a channel, we could move all the blocking tasks to that separate thread pool and keep the default tokio thread pool for IO.
