Antoine/Wes, thanks for the input. I will focus on the CSV reader and the minimal async support needed to get I/O off the thread pool and to support a nested task group. This is just to focus on one small thing at a time. I'll avoid any scheduler work for now, but maybe I can look at that in the future.
As for your feedback, I think #3 (adding items to the end of the thread pool) could also be mitigated if a promise executed its callbacks directly (instead of submitting them as new tasks). There is a bit of a "max recursion" case that has to be looked after (similar to what Antoine mentioned), but it could be handled. I may experiment with that a bit. The Tokio article you posted also talked about this (keeping a spot open for the last thing scheduled and running that if possible).

#5 sounds pretty straightforward, but I think you'd want a wide variety of test cases to make sure you're improving things overall. You could exceed a thread pool's capacity with just a single workload. The CSV reader, for example, will grow to occupy as many threads as there are available (assuming there are enough columns). There are a lot of things to balance here: cache cohesion, I/O vs. CPU workload, and fairness. It may not be obvious what exactly to aim for.

On Mon, Sep 28, 2020 at 2:32 AM Antoine Pitrou <anto...@python.org> wrote:
>
> On 28/09/2020 at 11:38, Antoine Pitrou wrote:
> >
> > Hi Weston,
> >
> > On 25/09/2020 at 23:21, Weston Pace wrote:
> >>
> >> * The current thread pool implementation deadlocks when used in a
> >> "nested" case, an asynchronous solution can work around this
> >
> > If required it may be possible to hack around this. For example, AFAIR
> > TBB has a simple heuristic to enable reentrant calls into the thread
> > pool until a hardcoded recursion level.
>
> Closely related: "TaskGroup::Finish should execute tasks"
> https://issues.apache.org/jira/browse/ARROW-10014
>
> Regards
>
> Antoine.
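
To make the "run callbacks directly, with a recursion cap" idea from the reply above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `SimpleFuture`, `MAX_INLINE_DEPTH`, and `run_chain` are hypothetical names, not Arrow's Future or thread pool API, and the cap of 4 is arbitrary. A finished future runs its callbacks inline on the current thread until the nesting depth reaches the cap, and only then falls back to submitting the callback as a new task.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_INLINE_DEPTH = 4  # hypothetical cap, chosen only for illustration

_depth = threading.local()  # per-thread count of nested inline callbacks


class SimpleFuture:
    """Toy future for illustration; this is not Arrow's Future API."""

    def __init__(self, executor):
        self._executor = executor
        self._lock = threading.Lock()
        self._done = False
        self._value = None
        self._callbacks = []

    def add_callback(self, fn):
        with self._lock:
            if not self._done:
                self._callbacks.append(fn)
                return
        self._run(fn)  # already finished: run the callback right away

    def mark_finished(self, value):
        with self._lock:
            self._done = True
            self._value = value
            callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            self._run(fn)

    def _run(self, fn):
        depth = getattr(_depth, "value", 0)
        if depth < MAX_INLINE_DEPTH:
            # Run inline on the current thread instead of queueing a task.
            _depth.value = depth + 1
            try:
                fn(self._value)
            finally:
                _depth.value = depth
        else:
            # Too deeply nested: fall back to the pool so the stack unwinds.
            self._executor.submit(fn, self._value)


def run_chain(length):
    """Chain `length` futures; return the inline depth seen by each callback."""
    depths = []
    done = threading.Event()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [SimpleFuture(pool) for _ in range(length)]

        def make_step(i):
            def step(value):
                depths.append(getattr(_depth, "value", 0))
                if i + 1 < length:
                    futures[i + 1].mark_finished(value + 1)
                else:
                    done.set()
            return step

        for i, fut in enumerate(futures):
            fut.add_callback(make_step(i))
        futures[0].mark_finished(0)
        # Wait before leaving the block: the pool rejects submits after shutdown.
        done.wait(timeout=10)
    return depths
```

Calling `run_chain(20)` runs all 20 callbacks without the stack ever nesting more than `MAX_INLINE_DEPTH` callbacks deep on any thread. The trade-off is that work past the cap still goes to the back of the pool's queue, so this reduces the #3 problem rather than eliminating it.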