IIUC a flatbuffer-aware decoder would read the last ~36 bytes of the file and look for a known UUID along with size information; with this it could then read only the flatbuffer bytes. I think this would work about as well as current systems that prefetch some number of bytes in an attempt to get the whole footer in a single GET.
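To make the idea concrete, here is a minimal sketch of that tail-read pattern. The layout, the UUID value, and the field sizes below are all hypothetical (the proposal doesn't pin them down, and this ignores the thrift footer's own length/magic bookkeeping for brevity); it only illustrates "read a small fixed-size tail, check a marker, then fetch exactly the flatbuffer payload":

```python
import io
import struct

# Hypothetical 16-byte marker identifying a flatbuffer footer; a real
# proposal would define its own UUID. This value is illustrative only.
FLATBUF_UUID = bytes.fromhex("b8e3c3f0a1d24c5e9f7b0123456789ab")

def read_flatbuffer_footer(f):
    """Read only the flatbuffer bytes from the tail of a file.

    Assumed tail layout (illustrative, not an actual spec):
      ... flatbuffer bytes ... | uint32 size (LE) | 16-byte UUID | b"PAR1"
    Returns the flatbuffer bytes, or None if the marker is absent.
    """
    f.seek(0, io.SEEK_END)
    file_len = f.tell()
    # One small ranged read covers the fixed-size trailer (4 + 16 + 4 bytes).
    trailer_len = 4 + 16 + 4
    if file_len < trailer_len:
        return None
    f.seek(file_len - trailer_len)
    trailer = f.read(trailer_len)
    if trailer[-4:] != b"PAR1" or trailer[4:20] != FLATBUF_UUID:
        return None  # no flatbuffer trailer: fall back to the thrift footer
    (fb_size,) = struct.unpack("<I", trailer[:4])
    # Second ranged read fetches exactly the flatbuffer payload.
    f.seek(file_len - trailer_len - fb_size)
    return f.read(fb_size)
```

An old reader never takes this path: it stops at the trailing PAR1 magic and parses the thrift footer as before, while a new reader needs at most two small ranged reads.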
Old readers, however, will have to fetch both footers, but won't have any additional decoding work because the new footer is a binary field that can be easily skipped.

On 2025/10/20 15:59:14 Adrian Garcia Badaracco wrote:
> If we embed both a flatbuffer footer and a thrift footer, will readers be
> able to completely skip the thrift footer to read the flatbuffer footer? Or
> will they have to download / read both? Especially if they have to download
> the bytes for both I'm not sure how big the win will be, on object storage
> slow IO can be what dominates.
>
> On Oct 20, 2025, at 9:49 AM, Raphael Taylor-Davies
> <[email protected]> wrote:
>
> > I don't disagree that two files is much harder than one file, but is that
> > the use-case that the flatbuffer format is solving for, or is that
> > adequately serviced by the existing thrift-based footer? I had interpreted
> > the flatbuffer more as a way to accelerate larger datasets consisting of
> > many files, and of less utility for the single-file use-case.
> >
> > That being said I misread the proposal, I thought it was proposing
> > replacing the thrift based footer with a flatbuffer, which would be very
> > disruptive. However, it looks like instead the (new?) proposal is to just
> > create a duplicate flatbuffer footer embedded within the thrift footer,
> > which can just be ignored by readers. The proposal is a bit vague when it
> > comes to whether all information would be duplicated, or whether some
> > information would only be embedded in the flatbuffer payload, but presuming
> > it is a true duplicate, many of my points don't apply.
> >
> > Kind Regards,
> >
> > Raphael
> >
> > On 20/10/2025 15:28, Antoine Pitrou wrote:
> >> I don't think it's a "small price to pay". Parquet files are widely
> >> used to share or transfer data of all kinds (in a way, they replace CSV
> >> with much better characteristics).
> >> Sharing a single file is easy,
> >> sharing two related files while keeping their relationship intact is an
> >> order of magnitude more difficult.
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >> On Mon, 20 Oct 2025 12:23:20 +0100
> >> Personal
> >> <[email protected]>
> >> wrote:
> >>> Apologies if this has already been discussed, but have we considered
> >>> simply storing these flatbuffers as separate files alongside existing
> >>> parquet files. I think this would have a number of quite compelling
> >>> advantages:
> >>>
> >>> - no breaking format changes, all readers can continue to still read all
> >>> parquet files
> >>> - people can generate these "index" files for existing datasets without
> >>> having to rewrite all their files
> >>> - older and newer readers can coexist - no stop the world migrations
> >>> - can potentially combine multiple flatbuffers into a single file for
> >>> better IO when scanning collections of files - potentially very valuable
> >>> for object stores, and would also help for people on HDFS and other
> >>> systems that struggle with small files
> >>> - could potentially even inline these flatbuffers into catalogs like
> >>> iceberg
> >>> - can continue to iterate at a faster rate, without the constraints of
> >>> needing to move in lockstep with parquet versioning
> >>> - potentially less confusing for users, parquet files are still the same,
> >>> they just can be accelerated by these new index files
> >>>
> >>> This would mean some data duplication, but that seems a small price to
> >>> pay, and would be strictly opt-in for users with use-cases that justify
> >>> it?
> >>>
> >>> Kind Regards,
> >>>
> >>> Raphael
> >>>
> >>> On 20 October 2025 11:08:59 BST, Alkis Evlogimenos
> >>> <[email protected]> wrote:
> >>>>> Thank you, these are interesting. Can you share instructions on how to
> >>>>> reproduce the reported numbers?
> >>>>> I am interested to review the code used to
> >>>>> generate these results (esp the C++ thrift code)
> >>>>
> >>>> The numbers are based on internal code (Photon). They are not very far off
> >>>> from https://github.com/apache/arrow/pull/43793. I will update that PR in
> >>>> the coming weeks so that we can repro the same benchmarks with open source
> >>>> code too.
> >>>>
> >>>> On Fri, Oct 17, 2025 at 5:52 PM Andrew Lamb <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Thanks Alkis, that is interesting data.
> >>>>>
> >>>>>> We found that the reported numbers were not reproducible on AWS
> >>>>>> instances
> >>>>> I just updated the benchmark results[1] to include results from an
> >>>>> AWS m6id.8xlarge instance (and they are indeed about 2x slower than when
> >>>>> run on my 2023 Mac laptop)
> >>>>>
> >>>>>> You can find the summary of our findings in a separate tab in the
> >>>>>> proposal document:
> >>>>>
> >>>>> Thank you, these are interesting. Can you share instructions on how to
> >>>>> reproduce the reported numbers? I am interested to review the code used to
> >>>>> generate these results (esp the C++ thrift code)
> >>>>>
> >>>>> Thanks
> >>>>> Andrew
> >>>>>
> >>>>> [1]:
> >>>>> https://github.com/alamb/parquet_footer_parsing?tab=readme-ov-file#results-on-linux
> >>>>>
> >>>>> On Fri, Oct 17, 2025 at 10:20 AM Alkis Evlogimenos
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> Thank you Andrew for putting the code in open source so that we can
> >>>>>> repro it.
> >>>>>>
> >>>>>> We have run the rust benchmarks and also run the flatbuf proposal with our
> >>>>>> C++ thrift parser, the flatbuf footer with Thrift conversion, the
> >>>>>> flatbuf footer without Thrift conversion, and the flatbuf footer
> >>>>>> without Thrift conversion and without verification.
> >>>>>> You can find the
> >>>>>> summary of our findings in a separate tab in the proposal document:
> >>>>>>
> >>>>>> https://docs.google.com/document/d/1kZS_DM_J8n6NKff3vDQPD1Y4xyDdRceYFANUE0bOfb0/edit?tab=t.ve65qknb3sq1#heading=h.3uwb5liauf1s
> >>>>>>
> >>>>>> The TLDR is that flatbuf is 5x faster with the Thrift conversion vs the
> >>>>>> optimized Thrift parsing. It also remains faster than the Thrift parser
> >>>>>> even if the Thrift parser skips statistics. Furthermore if Thrift
> >>>>>> conversion is skipped, the speedup is 50x, and if verification is skipped
> >>>>>> it goes beyond 150x.
> >>>>>>
> >>>>>> On Tue, Sep 30, 2025 at 5:56 PM Andrew Lamb <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> I did some benchmarking for the new parser[2] we are working on in
> >>>>>>> arrow-rs.
> >>>>>>>
> >>>>>>> This benchmark achieves nearly an order of magnitude improvement (7x)
> >>>>>>> parsing Parquet metadata with no changes to the Parquet format, by simply
> >>>>>>> writing a more efficient thrift decoder (which can also skip statistics).
> >>>>>>> While we have not implemented a similar decoder in other languages such as
> >>>>>>> C/C++ or Java, given the similarities in the existing thrift libraries and
> >>>>>>> usage, we expect similar improvements are possible in those languages as
> >>>>>>> well.
> >>>>>>>
> >>>>>>> Here are some inline images:
> >>>>>>> [image: image.png]
> >>>>>>> [image: image.png]
> >>>>>>>
> >>>>>>> You can find full details here [1]
> >>>>>>>
> >>>>>>> Andrew
> >>>>>>>
> >>>>>>> [1]: https://github.com/alamb/parquet_footer_parsing
> >>>>>>> [2]: https://github.com/apache/arrow-rs/issues/5854
> >>>>>>>
> >>>>>>> On Wed, Sep 24, 2025 at 5:59 PM Ed Seidl <[email protected]> wrote:
> >>>>>>>
> >>>>>>>>> Concerning Thrift optimization, while a 2-3x improvement might be
> >>>>>>>>> achievable, Flatbuffers are currently demonstrating a 10x improvement.
> >>>>>>>>> Andrew, do you have a more precise estimate for the speedup we could
> >>>>>>>>> expect in C++?
> >>>>>>>>
> >>>>>>>> Given my past experience on cuDF, I'd estimate about 2X there as well.
> >>>>>>>> cuDF has its own metadata parser that I once benchmarked against the
> >>>>>>>> thrift generated parser.
> >>>>>>>>
> >>>>>>>> And I'd point out that beyond the initial 2X improvement, rolling your
> >>>>>>>> own parser frees you of having to parse out every structure in the
> >>>>>>>> metadata.
