2010YOUY01 commented on issue #17267: URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3208947844
One challenge is deciding the partition count up front. It looks like a static configuration, so would it need to be configured differently from run to run?

I recommend getting sort merge join (SMJ) working reliably before experimenting with hash join (HJ) spilling, i.e. benchmarks should be able to finish under a modest memory limit, and there should perhaps be more tests. The existing solution is not production ready yet, but I think SMJ should have lower maintenance overhead, since its core reuses the external sort implementation.
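To illustrate the partition-count concern, here is a minimal sketch, not DataFusion code. `required_partitions` is a hypothetical helper, and it assumes the build-side size is known in advance and that rows are spread evenly across partitions (no skew). Under those assumptions the number of partitions needed changes with the input size, which is why a single static value is hard to pick:

```rust
/// Hypothetical helper (not part of DataFusion): the smallest number of hash
/// partitions such that each build-side partition is expected to fit in the
/// memory budget, assuming an even spread of rows across partitions.
fn required_partitions(build_side_bytes: u64, memory_budget_bytes: u64) -> u64 {
    assert!(memory_budget_bytes > 0, "memory budget must be non-zero");
    // Ceiling division without risking overflow on large inputs.
    let full = build_side_bytes / memory_budget_bytes;
    let partial = u64::from(build_side_bytes % memory_budget_bytes != 0);
    // Always use at least one partition, even for an empty build side.
    (full + partial).max(1)
}

fn main() {
    // 64 MiB is an illustrative memory limit, not a recommended setting.
    let budget = 64 * 1024 * 1024;
    for build_mib in [32u64, 512, 4096] {
        let n = required_partitions(build_mib * 1024 * 1024, budget);
        println!("build side {build_mib} MiB -> {n} partitions");
    }
}
```

With the same budget, the three inputs need 1, 8, and 64 partitions respectively, and real data with skew or inaccurate size estimates would make the choice harder still.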
