andygrove commented on issue #824: URL: https://github.com/apache/arrow-datafusion/issues/824#issuecomment-894803873
I am very interested in this work. For some time I have been describing deterministic memory use as one of the advantages of Rust over the JVM, and it would be great to see this implemented.

I like the idea of passing in some form of context state with a memory tracker. It would be good if this were not tied specifically to a DataFusion context, so that physical operators can be used in other contexts.

I also think this gets us back into discussing scheduling, and I have just added the following note to #587:

We should also discuss creating a scheduler in DataFusion (see https://github.com/apache/arrow-datafusion/issues/64), since it is related to this work. Rather than trying to run everything at once, it would be better to schedule work based on the available resources (cores / memory). We would still need the ability to track and limit memory use within operators, but the scheduler could be aware of this and only allocate tasks if there is memory budget available.
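To make the idea concrete, here is a rough sketch of what a context-independent memory tracker and a budget-aware admission step might look like. None of this is existing DataFusion API: `MemoryTracker`, `Task`, `estimated_bytes`, and `admit` are hypothetical names used only for illustration, and a real scheduler would also need to account for cores and for operators growing beyond their estimates.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical memory tracker (not a DataFusion type). It only knows about
/// a byte limit, so it can be shared with operators outside any query context.
pub struct MemoryTracker {
    limit: usize,
    used: AtomicUsize,
}

impl MemoryTracker {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }

    /// Atomically reserve `bytes`. Returns false, and reserves nothing,
    /// if the reservation would exceed the limit.
    pub fn try_reserve(&self, bytes: usize) -> bool {
        self.used
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
                used.checked_add(bytes).filter(|total| *total <= self.limit)
            })
            .is_ok()
    }

    /// Return previously reserved bytes to the budget.
    /// Callers must only release what they successfully reserved.
    pub fn release(&self, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::SeqCst);
    }

    /// Bytes still available under the limit.
    pub fn available(&self) -> usize {
        self.limit.saturating_sub(self.used.load(Ordering::SeqCst))
    }
}

/// Hypothetical unit of work with an up-front memory estimate.
pub struct Task {
    pub name: &'static str,
    pub estimated_bytes: usize,
}

/// Admit tasks in order while their estimates fit in the remaining budget.
/// Returns (admitted, deferred); admitted tasks hold their reservation
/// until the caller releases it.
pub fn admit(tracker: &MemoryTracker, tasks: Vec<Task>) -> (Vec<Task>, Vec<Task>) {
    tasks
        .into_iter()
        .partition(|task| tracker.try_reserve(task.estimated_bytes))
}
```

With a 100-byte budget and tasks estimated at 60, 50, and 30 bytes, this sketch would admit the first and third and defer the second until memory is released. Operators would still call `try_reserve` themselves as they grow, so the scheduler and the operators share one view of the budget.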
