Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
alamb closed issue #15323: Reduce number of tokio blocking threads in SortExec spill URL: https://github.com/apache/datafusion/issues/15323 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781673294

I created a draft PR with a solution, would appreciate your opinion:

- #15608
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781577131

> even if you use global tokio runtime and set the number of blocking threads to be a 1000 for example, there can be 1001 spill files. the problem is the same

At some point the system is going to be I/O bound, so having more blocking threads doing I/O isn't going to help and will likely consume non-trivial time context switching between them.

I think a better solution is to more carefully manage how many files are being spilled / read at any time. This will be more complicated (as we'll likely have to do multiple merge phases, etc.), but I think it is a better approach in the long run.
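
As a rough illustration of what capping the number of open files implies, here is a small sketch. It is not DataFusion code: `merge_passes` and `fan_in` are names invented for this example. If at most `fan_in` files are merged at once, the file count shrinks by about that factor per merge phase, and every row is read back from disk once per phase.

```rust
/// Number of files left after each merge phase when at most `fan_in` files
/// are merged together at a time. The length of the result is the number of
/// phases, i.e. how many times each row is read back from disk.
/// (Hypothetical helper for illustration; not part of DataFusion.)
fn merge_passes(num_files: usize, fan_in: usize) -> Vec<usize> {
    assert!(fan_in >= 2, "fan_in must be at least 2 to make progress");
    let mut remaining = num_files;
    let mut files_after_each_phase = Vec::new();
    while remaining > 1 {
        // Ceiling division: a partially filled last group still produces a file.
        remaining = (remaining + fan_in - 1) / fan_in;
        files_after_each_phase.push(remaining);
    }
    files_after_each_phase
}
```

For the 1001 spill files mentioned in the quote, `merge_passes(1001, 10)` yields `[101, 11, 2, 1]` (four phases), while `merge_passes(1001, 32)` yields `[32, 1]` (two phases), so the cap trades open files against extra I/O.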
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781460121

> Comet currently creates a new tokio runtime per plan but there is a proposal to move to a global tokio runtime (per executor) instead.
>
> [apache/datafusion-comet#1590](https://github.com/apache/datafusion-comet/issues/1590)

Even if you use a global tokio runtime and set the number of blocking threads to, say, 1000, there can be 1001 spill files. The problem is the same.
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
andygrove commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781458966

> I have a working version locally and will create a PR soon, just one problem, I don't think we can know the number of blocking threads tokio is configured with.
>
> this is important as for example Comet set this by default to 10, and tokio default is 512 IIRC.
>
> the working version can be improved with some optimization like prefetch and more, but it will be good enough for now and we can iterate further

Comet currently creates a new tokio runtime per plan, but there is a proposal to move to a global tokio runtime (per executor) instead.

https://github.com/apache/datafusion-comet/issues/1590
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781454412

I have a working version locally and will create a PR soon. Just one problem: I don't think I can find out the number of blocking threads tokio is configured with.

This is important because, for example, Comet sets this to 10 by default, and the tokio default is 512, IIRC.

The working version can be improved with optimizations such as prefetching, but it will be good enough for now and we can iterate further.
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2776820608

> I think I have the same problem but in `AggregateExec` when using `row_hash`, as it spills as well and use `SortPreservingMergeStream`.
>
> I think the solution should actually be in `SortPreservingMergeStream` rather than `SpillFileManager` no? although it does not spawn blocking threads it should support the multiple levels to merge

I am not sure; I am not familiar enough with the code to know off the top of my head.

I do think having hash and sort use the same codepath (that we can then go optimize a lot) sounds like a great idea.
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2776605858

I think I have the same problem, but in `AggregateExec` when using `row_hash`, as it spills as well and uses `SortPreservingMergeStream`.

I think the solution should actually be in `SortPreservingMergeStream` rather than `SpillFileManager`, no? Although it does not spawn blocking threads, it should support merging in multiple levels.
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2744217807

Makes sense -- with 183 spill files, we probably would need to merge in stages.

For example, starting with 183 spill files:

1. run 10 jobs, each merging about 10 files into one (results in 10 files)
2. run the final merge of 10 files

This results in 2x the IO (have to read / write each row twice), but it would be possible at least to parallelize the merges of the earlier step.

I think @2010YOUY01 was starting to look into a SpillFileManager -- this is the kind of behavior I would imagine being part of such a thing.
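
The staged merge described in this comment can be sketched roughly as follows, with in-memory sorted vectors standing in for spill files. This is only an illustration under those assumptions: `merge_runs`, `staged_merge` and `fan_in` are hypothetical names rather than DataFusion APIs, and the groups within a pass are merged one after another here, although they are independent and could run in parallel.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge several sorted runs into one sorted run using a min-heap (k-way merge).
/// Each heap entry is (value, index of the run, position inside that run).
fn merge_runs(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }
    let mut out = Vec::with_capacity(runs.iter().map(Vec::len).sum());
    while let Some(Reverse((value, run_idx, pos))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run that just produced the smallest value.
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    out
}

/// Repeatedly merge groups of at most `fan_in` runs until one run remains.
/// Returns the fully merged run and the number of passes that were needed.
fn staged_merge(mut runs: Vec<Vec<i64>>, fan_in: usize) -> (Vec<i64>, usize) {
    assert!(fan_in >= 2, "fan_in must be at least 2 to make progress");
    let mut passes = 0;
    while runs.len() > 1 {
        // Never more than `fan_in` inputs are open in any single merge.
        runs = runs.chunks(fan_in).map(merge_runs).collect();
        passes += 1;
    }
    (runs.pop().unwrap_or_default(), passes)
}
```

With 183 runs, a `fan_in` of 19 gives the two passes from the example (183 → 10 → 1). A `fan_in` of 10 leaves 19 runs after the first pass and therefore needs a third pass (183 → 19 → 2 → 1).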
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
andygrove commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2743795290

> Do you see too many threads when writing the spill files or when reading?

This is when reading, during the merge operation.

> In merge phase, each spill file will be wrapped by a stream backed by a blocking thread (see [read_spill_as_stream](https://github.com/apache/datafusion/blob/46.0.1/datafusion/physical-plan/src/spill.rs#L44-L55)), so we'll spawn at least 183 blocking threads when there are 183 spill files to merge spilled data.
Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2741967802

Do you see too many threads when writing the spill files or when reading?