westonpace commented on pull request #11373: URL: https://github.com/apache/arrow/pull/11373#issuecomment-942665852
> We don't see huge speedups (or slowdowns) going from `... %>% arrange() %>% compute() %>% head() %>% collect()` to `... %>% arrange() %>% head() %>% collect()`. But like you mentioned, we expect a bigger speedup with topk, which is not this PR.

I would not expect much speedup with sort -> top-k, because all we're cutting out is the Arrow->R step; we still need to load all the data to service the sort. The only query I would expect to see a large speedup on is a `head` with no `arrange` (but I think Neal took out the StopProducing because of crashes, so I wouldn't expect a speedup in that case either).
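
To illustrate the reasoning above, here is a minimal, language-neutral sketch in plain Python. It is not Arrow's implementation. The helper names (`sort_then_head`, `top_k`, `head_only`, `rows_consumed`) are hypothetical, and `itertools.islice` merely stands in for a source that can be stopped early. It counts how many input rows each query shape pulls from the source.

```python
import heapq
import itertools
from typing import Callable, Iterable, Iterator, List, Tuple


def counting(rows: Iterable[int], counter: List[int]) -> Iterator[int]:
    """Yield rows unchanged while counting how many were pulled from the source."""
    for row in rows:
        counter[0] += 1
        yield row


def sort_then_head(rows: Iterable[int], k: int) -> List[int]:
    # Full sort followed by head: every row is read and kept in memory.
    return sorted(rows)[:k]


def top_k(rows: Iterable[int], k: int) -> List[int]:
    # Top-k: far fewer rows are retained, but every row must still be read.
    return heapq.nsmallest(k, rows)


def head_only(rows: Iterable[int], k: int) -> List[int]:
    # Head with no ordering: reading can stop after k rows,
    # provided the source actually supports stopping early.
    return list(itertools.islice(rows, k))


def rows_consumed(
    query: Callable[[Iterable[int], int], List[int]],
    data: List[int],
    k: int,
) -> Tuple[List[int], int]:
    """Run a query over a counted source and report (result, rows read)."""
    counter = [0]
    result = query(counting(data, counter), k)
    return result, counter[0]


if __name__ == "__main__":
    data = list(range(1000, 0, -1))  # hypothetical input, descending order
    for query in (sort_then_head, top_k, head_only):
        result, n_read = rows_consumed(query, data, 5)
        print(f"{query.__name__}: read {n_read} rows, result {result}")
```

In this sketch both the sort-then-head and top-k variants read the whole input, so swapping one for the other cannot reduce the amount of data loaded. Only the unordered `head` reads fewer rows, and only because the source can stop producing once enough rows have been delivered.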