[GitHub] [arrow-datafusion] gruuya commented on pull request #7180: Top-K eager batch sorting

via GitHub Mon, 07 Aug 2023 14:35:53 -0700


gruuya commented on PR #7180:
URL: 
https://github.com/apache/arrow-datafusion/pull/7180#issuecomment-1668606350


   Thanks @alamb for the reviews and timely feedback.
   
   > in my opinion this code is now good enough to be merged.
   
   I'd like to emphasize that there are still regressions with this approach.
In fact, for larger files (> 1GB) with K in the 1000-8000 range, the runtime
seems to be hit the most, with probably negligible memory improvements (if
any). Anecdotally, the original file I've been testing with does now show a
considerable speedup, but that is perhaps not a typical file size (146M). So
it's a mixed bag really, and I'm not sure it's best for this to be merged
as is.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
