Thank you Micah. Will follow up on the PR.

On Sun, Feb 8, 2026 at 8:31 PM Micah Kornfield <[email protected]> wrote:

> Just wanted to follow-up. I did a first pass review on the
> flatbuf definitions.
>
> Cheers,
> Micah
>
> On Thu, Dec 11, 2025 at 11:58 PM Alkis Evlogimenos via dev
> <[email protected]> wrote:
>
>> PR for linking proposal here:
>> https://github.com/apache/parquet-format/pull/543
>> PR for parquet footer flatbuf definition:
>> https://github.com/apache/parquet-format/pull/544
>>
>> On Tue, Dec 9, 2025 at 1:26 AM Julien Le Dem <[email protected]> wrote:
>>
>> > Hello Alkis,
>> > Do you think you could add your footer proposal to the proposals page?
>> >
>> > https://github.com/apache/parquet-format/tree/master/proposals#active-proposals
>> > That way it gets more visibility.
>> > Cheers
>> > Julien
>> >
>> > On Tue, Oct 21, 2025 at 11:49 AM Steve Loughran
>> > <[email protected]> wrote:
>> >
>> > > On Mon, 20 Oct 2025 at 18:24, Ed Seidl <[email protected]> wrote:
>> > >
>> > > > IIUC a flatbuffer aware decoder would read the last 36 bytes or so
>> > > > of the file and look for a known UUID along with size information.
>> > > > With this it could then read only the flatbuffer bytes. I think
>> > > > this would work as well as current systems that prefetch some
>> > > > number of bytes in an attempt to get the whole footer in a single
>> > > > get.
>> > > >
>> > > > Old readers, however, will have to fetch both footers, but won't
>> > > > have any additional decoding work because the new footer is a
>> > > > binary field that can be easily skipped.
>> > > >
>> > >
>> > > really depends what the readers do with footer prefetching. For the
>> > > java clients
>> > >
>> > >    1. s3a classic stream: the backwards seek() switches it to random
>> > >       IO mode, next read() from base of thrift will pull in
>> > >       fs.s3a.readahead.range of data. No penalty
>> > >    2. google gs://. There's a footer cache option which will need to
>> > >       be set to a larger value
>> > >    3. azure abfs:// there's a footer cache option which will need to
>> > >       be set to a larger value
>> > >    4. s3a + amazon analytics stream. This stream is *parquet aware*
>> > >       and actually parses the footer to know what to predictively
>> > >       prefetch. The AWS developers do know of this work -moving to
>> > >       support the new footer would be the ideal strategy here.
>> > >    5. Iceberg classic input. no idea.
>> > >    6. iceberg + amazon analytics. same as S3A though without some of
>> > >       the tuning we've been doing for vector reads.
>> > >
>> > > I wouldn't worry too much about the impact of that footer size
>> > > increase. Some extra footer prefetch options should compensate, and
>> > > once apps move to a parquet v3 reader they've got a faster parse
>> > > time. Of course, ironically, read time then may dominate even more
>> > > there -it'll be important to do that read as efficiently as possible
>> > > (use a readFully() into a buffer, not lots of single byte read()
>> > > calls)
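
For reference, the "prefetch the tail, then read the footer in bulk" pattern discussed in the quoted thread could look roughly like the Python sketch below. It relies only on the standard Parquet trailer (a 4-byte little-endian footer length followed by the PAR1 magic). The names read_fully and read_footer and the 64 KiB TAIL_GUESS are hypothetical, chosen for illustration and not taken from any reader. The UUID and size lookup for the flatbuffer footer is left out because its byte layout is not spelled out in this thread.

```python
import io
import struct

PARQUET_MAGIC = b"PAR1"  # trailing magic of a plain (unencrypted-footer) Parquet file
TAIL_GUESS = 64 * 1024   # hypothetical prefetch size, not a value from the thread


def read_fully(stream, n):
    """Read exactly n bytes into one buffer, looping because a single
    readinto() call may legally return fewer bytes than requested."""
    buf = bytearray(n)
    view = memoryview(buf)
    got = 0
    while got < n:
        k = stream.readinto(view[got:])
        if not k:
            raise EOFError(f"wanted {n} bytes, got {got}")
        got += k
    return bytes(buf)


def read_footer(stream, file_size, tail_guess=TAIL_GUESS):
    """Return the raw Thrift footer bytes using at most two bulk reads."""
    tail_len = min(file_size, tail_guess)
    stream.seek(file_size - tail_len)
    tail = read_fully(stream, tail_len)
    if tail_len < 8 or tail[-4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 trailer")
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    if footer_len + 8 > file_size:
        raise ValueError("footer length exceeds file size")
    if footer_len + 8 <= tail_len:
        # The guess covered the whole footer: no second read needed.
        return tail[tail_len - 8 - footer_len:tail_len - 8]
    # The guess was too small: one more bulk read for the footer itself.
    stream.seek(file_size - 8 - footer_len)
    return read_fully(stream, footer_len)
```

A larger footer only costs a second read when it outgrows the guess, which is why raising the prefetch or footer cache setting compensates for the size increase.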
