kbendick commented on pull request #3784:
URL: https://github.com/apache/iceberg/pull/3784#issuecomment-1058822726


   cc @dongjoon-hyun 
   
   This is a proposed PR for estimating file size with ORC files, to support 
rolling file writers using ORC in Iceberg.
   
   Right now, the feature is disabled entirely because of the inability to estimate 
the file size for an open ORC file that’s still being written to. Adding this 
would bring ORC much closer to parity with Parquet in Iceberg.
   
   @openinx has summarized their thoughts and the current situation pretty well 
here: https://lists.apache.org/thread/g6yo7m46mr86ov1vkm9wnmshgw7hcl6b
   
   If you have time, could you or somebody from the ORC community provide any 
feedback on a better approach to estimating file size, so that ORC might 
have equivalent support to Parquet in this regard?
   
   I was hoping you or somebody else from the ORC community might chime in, 
given @openinx’s summary of the situation (on the dev list here 
https://lists.apache.org/thread/g6yo7m46mr86ov1vkm9wnmshgw7hcl6b).
   
   Thanks in advance for any guidance you might be able to provide 🙂 
   
   Also cc @marton-bod and other ORC developers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


