JigaoLuo commented on issue #8358:
URL: https://github.com/apache/arrow-rs/issues/8358#issuecomment-3311178080

   I’ll continue collecting observations here gradually. 
   
   One thing I’ve noticed: after sorting a column for statistics-based pruning, 
the sorted integer keys become quite dense, and the encoded size is often under 
1MB. In such cases, compression doesn’t seem to provide much benefit.
   - This suggests a potential heuristic: if **the encoded size** falls below a 
certain threshold, it might be better to skip compression altogether. 
Compressing in these cases could add overhead without meaningful space savings.
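
To make the heuristic concrete, here is a minimal sketch of what a size-based check might look like. Everything in it is hypothetical: `CodecChoice`, `SKIP_COMPRESSION_BELOW_BYTES`, `choose_codec`, and `maybe_compress` are illustrative names, not existing arrow-rs or parquet APIs, and the 1 MiB cutoff is only borrowed from the observation above, not a measured optimum.

```rust
/// Hypothetical outcome of the heuristic; not an arrow-rs or parquet type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecChoice {
    Uncompressed,
    Compressed,
}

/// Assumed cutoff (1 MiB), taken from the "under 1MB" observation.
/// A real threshold would need benchmarking.
pub const SKIP_COMPRESSION_BELOW_BYTES: usize = 1024 * 1024;

/// Decide whether to compress based only on the already-encoded size.
pub fn choose_codec(encoded_len: usize, threshold: usize) -> CodecChoice {
    if encoded_len < threshold {
        CodecChoice::Uncompressed
    } else {
        CodecChoice::Compressed
    }
}

/// Apply the heuristic: small buffers are passed through untouched,
/// larger ones are handed to the caller-supplied `compress` closure.
pub fn maybe_compress<F>(
    encoded: Vec<u8>,
    threshold: usize,
    compress: F,
) -> (CodecChoice, Vec<u8>)
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    match choose_codec(encoded.len(), threshold) {
        CodecChoice::Uncompressed => (CodecChoice::Uncompressed, encoded),
        CodecChoice::Compressed => (CodecChoice::Compressed, compress(&encoded)),
    }
}
```

The decision uses only the encoded size, so it costs nothing beyond a length comparison; whether a fixed threshold is the right signal (as opposed to, say, a measured compression ratio) is an open question.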
   
   ---
   
   Example:
   
   <img width="1948" height="332" alt="Image" 
src="https://github.com/user-attachments/assets/cc65aa57-0f42-4f47-99dd-48d49fa4383e"
 />

