alkis commented on issue #8441:
URL: https://github.com/apache/arrow-rs/issues/8441#issuecomment-3335021843

   > I remember at the start of the V3 discussions someone saying Parquet 
doesn't have to be the best, it just has to be good enough to keep people from 
switching to something else. Here we're exploring the "good enough" path.
   
   Absolutely (that was me btw).
   
   > At the end of the day, what this all boils down to is to realize the 
benefits of flatbuffers we would have to touch virtually every use of metadata 
in our reader as well as downstream consumers of it.
   
   This is why the flatbuf prototype and our shadow analysis in the fleet 
compare Thrift against flatbuf plus a transform into Thrift's `FileMetadata` 
object, not against raw flatbuf. This keeps the bar for adoption very low: we 
encode flatbuf but read out a Thrift object, as if we had decoded Thrift. That 
is still 10x faster at the p999. For virtually all engines this can be plugged 
in with very little effort. Then anyone who wants the absolute best performance 
can remove the flatbuf-to-Thrift translation to realize the full benefit.
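
   To make the shape of that adapter concrete, here is a minimal sketch in Rust. Everything in it is hypothetical: `FooterView`, `encode_footer`, `to_thrift_style` and the three-field `FileMetaData` are stand-ins invented for illustration, and the fixed little-endian layout is a simplification, not the real flatbuffers wire format or the prototype's schema. It only shows the idea of reading fields in place and then materializing the struct that existing readers already consume.

```rust
// Hypothetical stand-in for the Thrift-decoded struct that readers consume today.
#[derive(Debug, PartialEq)]
struct FileMetaData {
    version: i32,
    num_rows: i64,
    created_by: String,
}

// Copy N bytes starting at `at` into a fixed-size array.
fn read_array<const N: usize>(buf: &[u8], at: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&buf[at..at + N]);
    out
}

// Assumed toy layout (NOT real flatbuffers):
// [version: i32 LE][num_rows: i64 LE][len: u32 LE][created_by: len bytes]
const HEADER_LEN: usize = 16;

// Hypothetical encoder for the toy layout above.
fn encode_footer(meta: &FileMetaData) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + meta.created_by.len());
    buf.extend_from_slice(&meta.version.to_le_bytes());
    buf.extend_from_slice(&meta.num_rows.to_le_bytes());
    buf.extend_from_slice(&(meta.created_by.len() as u32).to_le_bytes());
    buf.extend_from_slice(meta.created_by.as_bytes());
    buf
}

// Borrowed view that reads fields in place, without allocating.
struct FooterView<'a> {
    buf: &'a [u8],
}

impl<'a> FooterView<'a> {
    // Validate the lengths once so the accessors below cannot go out of bounds.
    fn new(buf: &'a [u8]) -> Option<Self> {
        if buf.len() < HEADER_LEN {
            return None;
        }
        let len = u32::from_le_bytes(read_array(buf, 12)) as usize;
        if buf.len() - HEADER_LEN < len {
            return None;
        }
        Some(Self { buf })
    }

    fn version(&self) -> i32 {
        i32::from_le_bytes(read_array(self.buf, 0))
    }

    fn num_rows(&self) -> i64 {
        i64::from_le_bytes(read_array(self.buf, 4))
    }

    fn created_by(&self) -> &'a str {
        let len = u32::from_le_bytes(read_array(self.buf, 12)) as usize;
        // Fall back to an empty string if the bytes are not valid UTF-8.
        std::str::from_utf8(&self.buf[HEADER_LEN..HEADER_LEN + len]).unwrap_or("")
    }
}

// The translation step: materialize the struct existing code expects.
// Callers after peak performance would skip this and use `FooterView` directly.
fn to_thrift_style(view: &FooterView<'_>) -> FileMetaData {
    FileMetaData {
        version: view.version(),
        num_rows: view.num_rows(),
        created_by: view.created_by().to_owned(),
    }
}
```

   In this sketch an engine keeps all of its `FileMetaData` call sites unchanged and only swaps the decode step for `FooterView::new(bytes).map(|v| to_thrift_style(&v))`. Dropping `to_thrift_style` later is the opt-in step for the full benefit.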


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]