zhuqi-lucas commented on PR #15447: URL: https://github.com/apache/datafusion/pull/15447#issuecomment-2757931417
Thank you @berkaysynnada for the review.

Currently only sort-tpch Q3 hits the case where most strings are longer than 12 bytes and a high percentage of them share the same 4-byte prefix. Other StringView sort tests are not affected much. See Q11: it also exercises the StringView sort merge, but with short strings of less than 12 bytes. The result shows some overhead from gc(), but it is small:

```rust
Q11 │ 414.58ms │ 427.22ms │ no change
```

Our code also already has batch-level gc(), for example when we call organize_stringview_arrays for spilling:

https://github.com/apache/datafusion/blob/18feb8b2702b96a8a77ec4bc52fb67571e857d4d/datafusion/physical-plan/src/sorts/sort.rs#L493

So I think it will not cause too much overhead. Thanks!
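For readers less familiar with why the 12-byte length and the shared 4-byte prefix matter, here is a minimal sketch of the Arrow view layout: strings of up to 12 bytes are stored inline in the 16-byte view, while longer strings keep only a 4-byte prefix in the view and point into a data buffer. The names `View`, `make_view` and `may_need_buffer_lookup` are hypothetical and used for illustration only. This is not the arrow-rs or DataFusion implementation.

```rust
/// Longest string (in bytes) that fits entirely inside a 16-byte view.
const MAX_INLINE_LEN: usize = 12;

/// Simplified model of one view (hypothetical type, not the arrow-rs representation).
#[derive(Debug, PartialEq)]
enum View {
    /// Short string: all bytes live in the view itself, zero-padded.
    Inline { len: u32, data: [u8; 12] },
    /// Long string: the view holds a 4-byte prefix plus a pointer into a data buffer.
    Buffered { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

/// Builds a view for `s`, appending to `buffer` only when `s` is longer than 12 bytes.
fn make_view(s: &str, buffer_index: u32, buffer: &mut Vec<u8>) -> View {
    let bytes = s.as_bytes();
    let len = bytes.len() as u32;
    if bytes.len() <= MAX_INLINE_LEN {
        let mut data = [0u8; 12];
        data[..bytes.len()].copy_from_slice(bytes);
        View::Inline { len, data }
    } else {
        let offset = buffer.len() as u32;
        buffer.extend_from_slice(bytes);
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&bytes[..4]);
        View::Buffered { len, prefix, buffer_index, offset }
    }
}

/// First four bytes stored in the view (zero-padded for very short inline strings).
fn prefix_of(v: &View) -> [u8; 4] {
    match v {
        View::Inline { data, .. } => [data[0], data[1], data[2], data[3]],
        View::Buffered { prefix, .. } => *prefix,
    }
}

/// Conservative check: a comparison may have to read a data buffer only when
/// at least one side is a long string and the 4-byte prefixes are equal.
fn may_need_buffer_lookup(a: &View, b: &View) -> bool {
    let any_buffered =
        matches!(a, View::Buffered { .. }) || matches!(b, View::Buffered { .. });
    any_buffered && prefix_of(a) == prefix_of(b)
}
```

Under this model, two long keys such as `customer#000000001` and `customer#000000002` share the prefix `cust`, so comparing them has to follow the views into the data buffers. Short strings like those in Q11 never do, which is consistent with Q3 being the case that is sensitive to buffer layout.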