The big win in v2 pages (if I remember correctly) is that the variable-length encoding is no longer interleaved. That would provide a big performance lift when reading into Arrow vectors, since variable-length decoding typically dominates total read processing time. On average I've seen a 5-10x increase in per-cell CPU cost for variable-length reads over scalar reads. AFAIK, there is still no option for that in v1.
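
To make "interleaved" concrete, here is a rough Python sketch of the two layouts as I understand them. It is not taken from any Parquet reader. The function names are made up for illustration, and the "separated" layout is a simplification: it stores the lengths as plain int32 values, whereas the actual DELTA_LENGTH_BYTE_ARRAY encoding delta-encodes them. Both decoders produce an Arrow-style offsets list plus one contiguous data buffer.

```python
import struct
from itertools import accumulate


def encode_interleaved(values):
    """PLAIN-style layout: a 4-byte little-endian length, then the bytes, per value."""
    return b"".join(struct.pack("<I", len(v)) + v for v in values)


def decode_interleaved(buf):
    """Walk the buffer value by value.

    Each length has to be read before the start of the next value is known,
    so the value bytes are copied out one small slice at a time.
    """
    offsets, data, pos = [0], bytearray(), 0
    while pos < len(buf):
        (n,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        data += buf[pos:pos + n]
        pos += n
        offsets.append(len(data))
    return offsets, bytes(data)


def encode_separated(values):
    """Hypothetical simplified layout: value count, all lengths, then all bytes.

    Assumption: lengths are stored as plain int32 here for brevity; the real
    encoding delta-encodes them.
    """
    lengths = [len(v) for v in values]
    header = struct.pack("<I", len(values))
    header += struct.pack(f"<{len(values)}I", *lengths)
    return header + b"".join(values)


def decode_separated(buf):
    """Read all lengths in one pass, prefix-sum them into offsets,
    and take the value bytes as a single contiguous slice."""
    (count,) = struct.unpack_from("<I", buf, 0)
    lengths = struct.unpack_from(f"<{count}I", buf, 4)
    offsets = [0, *accumulate(lengths)]
    data = bytes(buf[4 + 4 * count:])
    return offsets, data
```

With the separated layout the offsets come from a single prefix sum and the data buffer is one bulk copy, which is roughly the shape an Arrow variable-size binary vector wants. The interleaved decoder needs a data-dependent loop with one small copy per value.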
On Thu, Oct 8, 2020 at 12:59 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> Thanks for the quick reply Ryan.
>
>> We only use v1 and it still works well. That said, I'd love to make some
>> progress on better encodings and finalizing v2 so we can use them!
>
> Are there JIRAs or other documentation that is tracking this work?
>
> Thanks,
> Micah
>
> On Thu, Oct 8, 2020 at 12:55 PM Ryan Blue <rb...@netflix.com> wrote:
>
>> While there isn't anything wrong with it, the same challenges have been
>> solved in different ways with v1 pages. The main difference is that v2
>> pages are broken at record boundaries, and v1 pages weren't guaranteed to
>> be. But, in order to write page indexes near the footer, breaking pages at
>> record boundaries is required. So you know if you have page indexes, you
>> can actually use them to skip through pages safely. That removes much of
>> the need for v2 pages.
>>
>> The main drawback to using v2 pages is that the v2 spec is unfinished, and
>> I don't think there is a way to use just the new pages. So you'd possibly
>> end up pulling in other beta features that probably shouldn't be used if
>> you want to stick with what is required for compatibility across
>> implementations.
>>
>> We only use v1 and it still works well. That said, I'd love to make some
>> progress on better encodings and finalizing v2 so we can use them!
>>
>> On Thu, Oct 8, 2020 at 12:44 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>>> What is the current status of support for Data Page V2? Is it recommended
>>> for production workloads?
>>>
>>> Thanks,
>>> Micah
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix