zhuqi-lucas commented on PR #15447:
URL: https://github.com/apache/datafusion/pull/15447#issuecomment-2757931417

   Thank you @berkaysynnada for the review. Currently only sort-tpch Q3 hits 
the case where most strings are longer than 12 bytes and a high percentage 
of them also share the same 4-byte prefix.
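
   To illustrate why the 12-byte and 4-byte-prefix thresholds matter, here is 
a minimal sketch of a string view. It is a toy model, not the arrow-rs type: 
`View`, `make_view`, `bytes_of` and `compare` are hypothetical names, and it 
assumes a single data buffer. Strings of at most 12 bytes are stored inline; 
longer strings keep a 4-byte prefix and point into the buffer.

```rust
use std::cmp::Ordering;

/// Toy model of a string view (hypothetical, not the arrow-rs type).
#[derive(Debug, Clone, Copy)]
enum View {
    /// Strings of at most 12 bytes live entirely inside the view.
    Inline { len: u8, data: [u8; 12] },
    /// Longer strings keep a 4-byte prefix plus an offset into a data buffer.
    Buffer { len: u32, prefix: [u8; 4], offset: usize },
}

/// Builds a view, appending the bytes to `buffer` only for long strings.
fn make_view(s: &str, buffer: &mut Vec<u8>) -> View {
    let bytes = s.as_bytes();
    if bytes.len() <= 12 {
        let mut data = [0u8; 12];
        data[..bytes.len()].copy_from_slice(bytes);
        View::Inline { len: bytes.len() as u8, data }
    } else {
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&bytes[..4]);
        let offset = buffer.len();
        buffer.extend_from_slice(bytes);
        View::Buffer { len: bytes.len() as u32, prefix, offset }
    }
}

/// Returns the full bytes of a view, reading the buffer for long strings.
fn bytes_of<'a>(v: &'a View, buffer: &'a [u8]) -> &'a [u8] {
    match v {
        View::Inline { len, data } => &data[..*len as usize],
        View::Buffer { len, offset, .. } => &buffer[*offset..*offset + *len as usize],
    }
}

/// Compares two views. The bool reports whether the data buffer was read.
fn compare(a: &View, b: &View, buffer: &[u8]) -> (Ordering, bool) {
    if let (View::Buffer { prefix: pa, .. }, View::Buffer { prefix: pb, .. }) = (a, b) {
        if pa != pb {
            // Different prefixes decide the order without touching the buffer.
            return (pa.cmp(pb), false);
        }
        // Same prefix: the comparison has to follow both views into the buffer.
        return (bytes_of(a, buffer).cmp(bytes_of(b, buffer)), true);
    }
    // At least one side is inline; the buffer is read only for a long side.
    let touched = matches!(a, View::Buffer { .. }) || matches!(b, View::Buffer { .. });
    (bytes_of(a, buffer).cmp(bytes_of(b, buffer)), touched)
}
```

   In this model, two long strings with the same first 4 bytes always need a 
buffer read to compare, which is the Q3 pattern described above. Short 
strings never reach the buffer at all.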
   
   The other StringView sort tests are not affected much. See Q11: it also 
exercises the StringView sort merge, but with short strings of less than 
12 bytes. The result shows some overhead due to gc(), but the impact is 
small.
   
   ```text
   Q11          │  414.58ms │    427.22ms │     no change
   ```
   
   
   Our code also already runs gc() at the batch level, for example when we 
call organize_stringview_arrays for spilling:
   
   
https://github.com/apache/datafusion/blob/18feb8b2702b96a8a77ec4bc52fb67571e857d4d/datafusion/physical-plan/src/sorts/sort.rs#L493
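
   As a rough picture of what a batch-level gc() buys, here is a toy 
compaction sketch. It is not the arrow-rs or DataFusion implementation: 
`SliceView` and `compact` are hypothetical names, and the model assumes one 
shared data buffer. It copies only the bytes that the surviving views 
reference into a fresh buffer and rewrites the offsets.

```rust
/// Toy view: an (offset, len) window into a shared data buffer (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
struct SliceView {
    offset: usize,
    len: usize,
}

/// Hypothetical compaction in the spirit of gc(): copy only the bytes that
/// `views` reference into a new buffer and point the new views at it.
fn compact(views: &[SliceView], buffer: &[u8]) -> (Vec<SliceView>, Vec<u8>) {
    // Reserve exactly the number of bytes that are still referenced.
    let mut new_buffer = Vec::with_capacity(views.iter().map(|v| v.len).sum());
    let new_views = views
        .iter()
        .map(|v| {
            let offset = new_buffer.len();
            new_buffer.extend_from_slice(&buffer[v.offset..v.offset + v.len]);
            SliceView { offset, len: v.len }
        })
        .collect();
    (new_views, new_buffer)
}
```

   The cost is one copy of the referenced bytes per batch. In exchange, a 
batch that keeps only a few views no longer holds the whole original buffer 
alive.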
   
   I think it will not cause too much overhead. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

