alkis commented on issue #8441: URL: https://github.com/apache/arrow-rs/issues/8441#issuecomment-3335021843
> I remember at the start of the V3 discussions someone saying Parquet doesn't have to be the best, it just has to be good enough to keep people from switching to something else. Here we're exploring the "good enough" path.

Absolutely (that was me btw).

> At the end of the day, what this all boils down to is to realize the benefits of flatbuffers we would have to touch virtually every use of metadata in our reader as well as downstream consumers of it.

This is why the prototype for flatbufs and our shadow analysis in the fleet compare Thrift against flatbufs plus a transform to Thrift's `FileMetadata` object, and not against raw flatbuf. This makes the bar for adoption very low. We encode flatbuf but read out a Thrift object as if we had decoded Thrift. This is still 10x faster at the p999. For virtually all engines this can be plugged in with very little effort. Then, if one wants the absolute best performance, they can remove the flatbuf-to-Thrift translation to realize the full benefit.
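To make the "flatbuf in, Thrift object out" idea concrete, here is a minimal Rust sketch of the shim pattern. Everything in it is an assumption for illustration: `FooterView`, `ThriftFileMetadata`, `ThriftRowGroup`, `encode_footer`, `to_thrift_metadata`, and the toy byte layout are hypothetical. It uses neither the real flatbuffers schema from the prototype nor any arrow-rs or Thrift API. It only shows the shape of the approach: a cheap view over the footer bytes, plus one translation function that produces the struct existing readers already consume.

```rust
/// Hypothetical stand-in for the struct existing readers already consume
/// (the role Thrift's `FileMetadata` plays in the comment above).
#[derive(Debug, PartialEq)]
pub struct ThriftFileMetadata {
    pub num_rows: i64,
    pub row_groups: Vec<ThriftRowGroup>,
}

#[derive(Debug, PartialEq)]
pub struct ThriftRowGroup {
    pub num_rows: i64,
    pub total_byte_size: i64,
}

/// Hypothetical view over a footer buffer, standing in for a flatbuf accessor.
/// Toy layout (an assumption, not the real schema): a little-endian u32 row
/// group count, then (i64 num_rows, i64 total_byte_size) per row group.
pub struct FooterView<'a> {
    buf: &'a [u8],
    count: usize,
}

impl<'a> FooterView<'a> {
    /// Validates the length once so later field reads cannot go out of bounds.
    pub fn new(buf: &'a [u8]) -> Option<Self> {
        if buf.len() < 4 {
            return None;
        }
        let mut head = [0u8; 4];
        head.copy_from_slice(&buf[..4]);
        let count = u32::from_le_bytes(head) as usize;
        let needed = count.checked_mul(16)?.checked_add(4)?;
        if buf.len() < needed {
            return None;
        }
        Some(FooterView { buf, count })
    }

    pub fn row_group_count(&self) -> usize {
        self.count
    }

    fn read_i64(&self, offset: usize) -> i64 {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&self.buf[offset..offset + 8]);
        i64::from_le_bytes(bytes)
    }

    /// Reads one row group's fields directly from the buffer.
    pub fn row_group(&self, index: usize) -> Option<(i64, i64)> {
        if index >= self.count {
            return None;
        }
        let base = 4 + index * 16;
        Some((self.read_i64(base), self.read_i64(base + 8)))
    }
}

/// The translation step: materialize the legacy struct from the view, so
/// downstream code keeps working unchanged.
pub fn to_thrift_metadata(view: &FooterView) -> ThriftFileMetadata {
    let row_groups: Vec<ThriftRowGroup> = (0..view.row_group_count())
        .filter_map(|i| view.row_group(i))
        .map(|(num_rows, total_byte_size)| ThriftRowGroup { num_rows, total_byte_size })
        .collect();
    let num_rows = row_groups.iter().map(|rg| rg.num_rows).sum();
    ThriftFileMetadata { num_rows, row_groups }
}

/// Writer side of the toy layout, included only so the sketch round-trips.
pub fn encode_footer(row_groups: &[(i64, i64)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + row_groups.len() * 16);
    out.extend_from_slice(&(row_groups.len() as u32).to_le_bytes());
    for &(num_rows, total_byte_size) in row_groups {
        out.extend_from_slice(&num_rows.to_le_bytes());
        out.extend_from_slice(&total_byte_size.to_le_bytes());
    }
    out
}
```

In this sketch an engine would call `to_thrift_metadata` and change nothing else. A reader that wants the full benefit would skip that call and read fields through `FooterView` directly, which is the second step described above.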
