RyanJamesStewart commented on PR #22416: URL: https://github.com/apache/datafusion/pull/22416#issuecomment-4520292900
Removed the shrink_to_fit in the split_off branch, as suggested. The point about the push-after-emit caller holds: when a caller pushes onto the emitted prefix right away (first_n_offsets.push in multi_group_by/bytes.rs), shrinking to length n just forces a realloc on the next push. I added a test, emitted_prefix_does_not_realloc_on_push, that pins this down. A prefix split from a capacity-64 Vec keeps capacity 64 and absorbs a push with no realloc; the same prefix shrunk to length 8 reallocs up to capacity 16 on that push, ending larger than where it started. The capacity-accounting concern is still valid, but it reads as a caller-side property rather than something to bake into the shared utility, so I have left shrink out for now. Pushed as 28655d8. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
