Re: High performance vectorized reader meeting notes

Brock Noland Tue, 07 Oct 2014 17:35:45 -0700

Hi,

The Hive + Parquet community is very interested in improving performance of
Hive + Parquet and Parquet generally. We are very interested in
contributing to the Parquet vectorization and lazy materialization effort.
Please add me to any future meetings on this topic.


BTW here is the JIRA tracking this effort from the Hive side:
https://issues.apache.org/jira/browse/HIVE-8120

Brock

On Tue, Oct 7, 2014 at 2:04 PM, Zhenxiao Luo <[email protected]>
wrote:

> Thanks Jason.
>
> Yes, Netflix is using Presto and Parquet for our Big Data Platform:
> http://techblog.netflix.com/2014/10/using-presto-in-our-big-data-platform.html
>
> The fastest format currently in Presto is ORC (Parquet is fast, but not as
> fast as ORC). To be clear, we are referring to ORC itself, not Facebook's
> DWRF implementation.
>
> We already have Parquet working in Presto. We would definitely like to get
> it as fast as ORC.
>
> Facebook has added native support for ORC in Presto, which does not use the
> ORCRecordReader at all. They parse the ORC footer and do Predicate
> Pushdown by skipping row groups, Vectorization by introducing Type Specific
> Vectors, and Lazy Materialization by introducing LazyVectors (their code has
> not been committed yet; I am referring to their pull request). We are
> planning to do similar optimizations for Parquet in Presto.
>
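To make the row-group skipping idea concrete, here is a minimal sketch in Java. It assumes the footer exposes a min and a max per row group for a single long column and that the predicate is a closed range. `RowGroupStats` and `RowGroupPruner` are hypothetical names used only for illustration; they are not parquet-mr, ORC, or Presto classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-row-group statistics for one long column (assumed to come
// from the file footer); not an actual parquet-mr or ORC class.
final class RowGroupStats {
    final int rowGroupIndex;
    final long min;
    final long max;

    RowGroupStats(int rowGroupIndex, long min, long max) {
        this.rowGroupIndex = rowGroupIndex;
        this.min = min;
        this.max = max;
    }
}

final class RowGroupPruner {
    // Returns the indexes of row groups that may contain a value in [lo, hi].
    // A row group is skipped only when its [min, max] range cannot overlap
    // the predicate range, so no matching row is ever dropped.
    static List<Integer> selectRowGroups(List<RowGroupStats> footerStats, long lo, long hi) {
        List<Integer> selected = new ArrayList<>();
        for (RowGroupStats s : footerStats) {
            boolean disjoint = s.max < lo || s.min > hi;
            if (!disjoint) {
                selected.add(s.rowGroupIndex);
            }
        }
        return selected;
    }
}
```

The selected row groups still need the filter applied to their rows; the statistics only prove which groups cannot match.
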
> For the ParquetRecordReader, we need additional APIs to read the next Batch
> of values, and read in a vector of values. For example, here are the
> related APIs in the ORC code:
>
>   /**
>    * Read the next row batch. The size of the batch to read cannot be controlled
>    * by the callers. Callers need to look at VectorizedRowBatch.size of the returned
>    * object to know the batch size read.
>    * @param previousBatch a row batch object that can be reused by the reader
>    * @return the row batch that was read
>    * @throws java.io.IOException
>    */
>   VectorizedRowBatch nextBatch(VectorizedRowBatch previousBatch) throws IOException;
>
> And here is the related API in the Presto code, which is used for ORC
> support in Presto:
>
> public void readVector(int columnIndex, Object vector);
>
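A rough sketch of what a reader-controlled, reusable batch could look like, written in Java. Everything here is an assumption for illustration: `LongBatch`, `LongBatchReader`, and `ArrayBackedReader` are hypothetical names, the batch holds a single long column, and a returned size of 0 is used to signal end of data. It is not the ORC or Presto API quoted above, only the same calling pattern.

```java
// Hypothetical batch container for a single long column; loosely modeled on
// the idea of a row batch with a size field, not on any real class definition.
final class LongBatch {
    final long[] values;
    int size; // number of valid entries in values after a read

    LongBatch(int capacity) {
        this.values = new long[capacity];
    }
}

// Hypothetical reader interface: the reader, not the caller, decides how many
// values each batch holds.
interface LongBatchReader {
    // Fills and returns a batch, reusing previousBatch when it is non-null.
    // In this sketch a returned size of 0 means there is no more data.
    LongBatch nextBatch(LongBatch previousBatch);
}

// In-memory stand-in for a column chunk, used only to make the sketch runnable.
final class ArrayBackedReader implements LongBatchReader {
    private final long[] column;
    private final int batchCapacity;
    private int position;

    ArrayBackedReader(long[] column, int batchCapacity) {
        this.column = column;
        this.batchCapacity = batchCapacity;
    }

    @Override
    public LongBatch nextBatch(LongBatch previousBatch) {
        LongBatch batch = (previousBatch != null) ? previousBatch : new LongBatch(batchCapacity);
        int n = Math.min(batch.values.length, column.length - position);
        System.arraycopy(column, position, batch.values, 0, n);
        position += n;
        batch.size = n;
        return batch;
    }
}
```

A caller would loop on `nextBatch`, passing the previous batch back in so that no new arrays are allocated per batch, and read only the first `size` entries.
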
> For lazy materialization, we may also consider adding LazyVectors or
> LazyBlocks, so that values are not materialized until they are accessed
> by the Operator.
>
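One way to picture the lazy vector idea is the Java sketch below. `LazyLongVector` is a hypothetical name, and the `Supplier` stands in for whatever would actually decode a column chunk; neither is taken from the Presto pull request mentioned above.

```java
import java.util.function.Supplier;

// Hypothetical lazy column vector: decoding is deferred until the first access.
final class LazyLongVector {
    private final Supplier<long[]> loader; // stand-in for decoding a column chunk
    private long[] values;                 // null until materialized

    LazyLongVector(Supplier<long[]> loader) {
        this.loader = loader;
    }

    boolean isLoaded() {
        return values != null;
    }

    long get(int index) {
        if (values == null) {
            values = loader.get(); // materialize once, on first access
        }
        return values[index];
    }
}
```

If a filter on another column rejects every row in the batch, `get` is never called and the decoding cost for this column is never paid.
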
> Any comments and suggestions are appreciated.
>
> Thanks,
> Zhenxiao
>
>
> On Tue, Oct 7, 2014 at 1:05 PM, Jason Altekruse <[email protected]>
> wrote:
>
> > Hello All,
> >
> > No updates from me yet, just sending out another message for some of the
> > Netflix engineers that were still just subscribed to the google group mail.
> > This will allow them to respond directly with their research on the
> > optimized ORC reader for consideration in the design discussion.
> >
> > -Jason
> >
> > On Mon, Oct 6, 2014 at 10:51 PM, Jason Altekruse <[email protected]>
> > wrote:
> >
> > > Hello Parquet team,
> > >
> > > I wanted to report the results of a discussion between the Drill team and
> > > the engineers at Netflix working to make Parquet run faster with Presto.
> > > As we have said in the last few hangouts, we both want to make contributions
> > > back to parquet-mr to add features and performance. We thought it would be
> > > good to sit down and speak directly about our real goals and the best next
> > > steps to get an engineering effort started to accomplish these goals.
> > >
> > > Below is a summary of the meeting.
> > >
> > > - Meeting notes
> > >
> > >    - Attendees:
> > >        - Netflix: Eva Tse, Daniel Weeks, Zhenxiao Luo
> > >        - MapR (Drill Team): Jacques Nadeau, Jason Altekruse, Parth Chandra
> > >
> > > - Minutes
> > >
> > >    - Introductions / Background
> > >
> > >    - Netflix
> > >        - Working on providing interactive SQL querying to users
> > >        - Have chosen Presto as the query engine and Parquet as the
> > >          high-performance data storage format
> > >        - Presto is providing the needed speed in some cases, but others are
> > >          missing optimizations that could avoid reads
> > >        - Have already started some development and investigation, and have
> > >          identified key goals
> > >        - Some initial benchmarks with a modified ORC reader, DWRF, written
> > >          by the Presto team, show that such gains are possible with a
> > >          different reader implementation
> > >        - Goals
> > >            - filter pushdown
> > >                - skipping reads based on filter evaluation on one or more
> > >                  columns
> > >                - this can happen at several granularities: row group, page,
> > >                  record/value
> > >            - late/lazy materialization
> > >                - for columns not involved in a filter, avoid materializing
> > >                  them entirely until they are known to be needed after
> > >                  evaluating a filter on other columns
> > >
> > >    - Drill
> > >        - the Drill engine uses an in-memory vectorized representation of
> > >          records
> > >        - for scalar and repeated types we have implemented a fast vectorized
> > >          reader that is optimized to transform between Parquet's on-disk
> > >          format and our in-memory format
> > >        - this is currently producing performant table scans, but has no
> > >          facility for filter pushdown
> > >        - Major goals going forward
> > >            - filter pushdown
> > >                - decide the best implementation for incorporating filter
> > >                  pushdown into our current implementation, or figure out a
> > >                  way to leverage existing work in the parquet-mr library to
> > >                  accomplish this goal
> > >            - late/lazy materialization
> > >                - see above
> > >            - contribute existing code back to parquet
> > >                - the Drill parquet reader has a very strong emphasis on
> > >                  performance; a clear interface to consume it that is
> > >                  sufficiently separated from Drill could prove very useful
> > >                  for other projects
> > >
> > >    - First steps
> > >        - Netflix team will share some of their thoughts and research from
> > >          working with the DWRF code
> > >            - we can have a discussion based off of this: which aspects are
> > >              done well, and any opportunities they may have missed that we
> > >              can incorporate into our design
> > >            - do further investigation and ask the existing community for
> > >              guidance on existing parquet-mr features or planned APIs that
> > >              may provide desired functionality
> > >        - We will begin a discussion of an API for the new functionality
> > >            - some outstanding thoughts for down the road
> > >                - The Drill team has an interest in very late materialization
> > >                  for data stored in dictionary encoded pages, such as
> > >                  running a join or filter on the dictionary and then going
> > >                  back to the reader to grab all of the values in the data
> > >                  that match the needed members of the dictionary
> > >                    - this is a later consideration, but it is part of the
> > >                      reason we are opening up the design discussion early,
> > >                      so that the API can be flexible enough to allow this in
> > >                      the future, even if it is not implemented soon
> > >
> >
>
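For the dictionary-based late materialization mentioned at the end of the meeting notes, a minimal Java sketch of the filter case might look like the following. It assumes a string dictionary and a per-row array of dictionary ids already decoded from a dictionary-encoded page; `DictionaryFilter` and `matchingRows` are hypothetical names, not existing parquet-mr or Drill APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class DictionaryFilter {
    // Evaluates the predicate once per dictionary entry, then selects row
    // positions by looking at dictionary ids only. The predicate is never
    // evaluated per row, and row values are never decoded here.
    static List<Integer> matchingRows(String[] dictionary, int[] ids, Predicate<String> predicate) {
        boolean[] entryMatches = new boolean[dictionary.length];
        for (int d = 0; d < dictionary.length; d++) {
            entryMatches[d] = predicate.test(dictionary[d]);
        }
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < ids.length; row++) {
            if (entryMatches[ids[row]]) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

The returned row positions could then be handed back to the reader so that other columns are materialized only for those rows.
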
