Flatbuf parsing is trivial compared to Thrift. Thrift walks the bytes of the serialized form and picks fields out of it one by one. Flatbuf instead uses the offsets already embedded in the serialized form to extract fields from it directly. In other words, no parsing is done.

We have 3 ways to use the flatbuf, each of which adds more overhead:

1. Raw flatbuf without postprocessing: +150x speedup.
2. Verified flatbuf: +50x speedup. Verified flatbuf means that before any flatbuf access, all offsets are bounds checked so that they cannot cause memory accesses outside the flatbuf-encoded blob.
3. Verified flatbuf + conversion to the FileMetadata struct generated by the Thrift compiler: +5x speedup. This is the easy migration path for most engines: we take the flatbuf, verify it, and then put together the same FileMetadata struct that the Thrift parser would have produced had we parsed the Thrift representation.
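To make the difference concrete, here is a minimal toy sketch in Python. It does not use the real Thrift or FlatBuffers wire formats: the two layouts, and the names `encode_sequential`, `read_sequential`, `encode_offsets`, `read_offset` and `verify`, are all made up for illustration. It only shows the general idea of walking records one by one versus jumping through an embedded offset, plus an up-front bounds check.

```python
import struct


def encode_sequential(fields):
    # Hypothetical tag-length-value layout: [u16 field id][u32 length][payload]...
    out = bytearray()
    for fid, payload in fields.items():
        out += struct.pack("<HI", fid, len(payload)) + payload
    return bytes(out)


def read_sequential(buf, wanted):
    # Walk every record from the start until the wanted field id shows up.
    pos = 0
    while pos < len(buf):
        fid, length = struct.unpack_from("<HI", buf, pos)
        pos += 6  # size of the u16 id + u32 length header
        if fid == wanted:
            return buf[pos:pos + length]
        pos += length  # skip the payload of a field we do not want
    return None


def encode_offsets(fields, num_slots):
    # Hypothetical offset layout: [u32 offset per slot][u32 length + payload]...
    # An offset of 0 marks an absent field.
    header_size = 4 * num_slots
    offsets = [0] * num_slots
    body = bytearray()
    for fid, payload in fields.items():
        offsets[fid] = header_size + len(body)
        body += struct.pack("<I", len(payload)) + payload
    return struct.pack(f"<{num_slots}I", *offsets) + bytes(body)


def read_offset(buf, slot):
    # Jump straight to the field through its slot; no other field is touched.
    (off,) = struct.unpack_from("<I", buf, 4 * slot)
    if off == 0:
        return None
    (length,) = struct.unpack_from("<I", buf, off)
    return buf[off + 4:off + 4 + length]


def verify(buf, num_slots):
    # Bounds check every offset and length once, before any field access.
    header_size = 4 * num_slots
    if len(buf) < header_size:
        return False
    for slot in range(num_slots):
        (off,) = struct.unpack_from("<I", buf, 4 * slot)
        if off == 0:
            continue
        if off < header_size or off + 4 > len(buf):
            return False
        (length,) = struct.unpack_from("<I", buf, off)
        if off + 4 + length > len(buf):
            return False
    return True


fields = {0: b"schema", 2: b"row_groups"}
seq_blob = encode_sequential(fields)
off_blob = encode_offsets(fields, num_slots=3)
```

In this sketch, `read_sequential(seq_blob, 2)` has to step over field 0 first, while `read_offset(off_blob, 2)` does two fixed-position reads regardless of how many fields precede it. `verify(off_blob, 3)` corresponds loosely to option 2 above: a one-time pass over the offsets, after which the accessors can skip bounds checks.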
On Fri, Oct 17, 2025 at 5:19 PM Andrew Bell <[email protected]> wrote:

> On Fri, Oct 17, 2025 at 10:20 AM Alkis Evlogimenos
> <[email protected]> wrote:
>
> > Thank you Andrew for putting the code in open source so that we can repro
> > it.
> >
> > The TLDR is that flatbuf is 5x faster with the Thrift conversion vs the
> > optimized Thrift parsing. It also remains faster than the Thrift parser
> > even if the Thrift parser skips statistics. Furthermore if Thrift
> > conversion is skipped, the speedup is 50x, and if verification is skipped
> > it goes beyond 150x.
> >
>
> Can you explain a bit the differences/changes in the parser that provides
> such a speedup?
>
> --
> Andrew Bell
> [email protected]
>
