Hi everyone,

Is there a JIRA issue tracking the vectorized reader API? Brock and I have been working through how we would integrate this with Hive and have a few questions and comments. Thanks!

rb

On 11/01/2014 01:03 PM, Brock Noland wrote:
Hi,

Great! I will take a look soon.

Cheers!
Brock

On Mon, Oct 27, 2014 at 11:18 PM, Zhenxiao Luo <[email protected]> wrote:


Thanks Jacques.

Here is the gist:
https://gist.github.com/zhenxiao/2728ce4fe0a7be2d3b30

Comments and Suggestions are appreciated.

Thanks,
Zhenxiao

On Mon, Oct 27, 2014 at 10:55 PM, Jacques Nadeau <[email protected]>
wrote:

You can't send attachments. Can you post it as a Google Doc or gist?

On Mon, Oct 27, 2014 at 7:41 PM, Zhenxiao Luo <[email protected]>
wrote:


Thanks Brock and Jason.

I just drafted a proposed API for a vectorized Parquet reader (attached to
this email). Any comments and suggestions are appreciated.

Thanks,
Zhenxiao

On Tue, Oct 7, 2014 at 5:34 PM, Brock Noland <[email protected]>
wrote:

Hi,

The Hive + Parquet community is very interested in improving the performance
of Hive + Parquet and Parquet generally. We are very interested in
contributing to the Parquet vectorization and lazy materialization effort.
Please add me to any future meetings on this topic.

BTW, here is the JIRA tracking this effort from the Hive side:
https://issues.apache.org/jira/browse/HIVE-8120

Brock

On Tue, Oct 7, 2014 at 2:04 PM, Zhenxiao Luo <[email protected]> wrote:

Thanks Jason.

Yes, Netflix is using Presto and Parquet for our Big Data Platform
(http://techblog.netflix.com/2014/10/using-presto-in-our-big-data-platform.html).

The fastest format currently in Presto is ORC, not DWRF (Parquet is fast,
but not as fast as ORC). We are referring to ORC, not Facebook's DWRF
implementation.

We already have Parquet working in Presto. We definitely would like to get
it as fast as ORC.

Facebook has added native support for ORC in Presto, which does not use the
ORCRecordReader at all. They parse the ORC footer and do predicate pushdown
by skipping row groups, vectorization by introducing type-specific vectors,
and lazy materialization by introducing LazyVectors (their code has not been
committed yet; I mean their pull request). We are planning to do similar
optimizations for Parquet in Presto.

For the ParquetRecordReader, we need additional APIs to read the next batch
of values and to read in a vector of values. For example, here are the
related APIs in the ORC code:

/**
 * Read the next row batch. The size of the batch to read cannot be
 * controlled by the callers. Callers need to look at
 * VectorizedRowBatch.size of the returned object to know the batch
 * size read.
 *
 * @param previousBatch a row batch object that can be reused by the reader
 * @return the row batch that was read
 * @throws java.io.IOException
 */
VectorizedRowBatch nextBatch(VectorizedRowBatch previousBatch) throws IOException;
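
For context, the caller-side loop this implies would look roughly like the
following. This is a sketch only; the reader type and the exact end-of-data
convention are assumptions on my part, not quoted from the ORC code:

    // Sketch of the caller-side loop implied by the javadoc above: reuse
    // one batch object across calls and check its size after each call.
    static void scanAll(RecordReader reader) throws IOException {
      VectorizedRowBatch batch = null;
      while (true) {
        batch = reader.nextBatch(batch);
        if (batch == null || batch.size == 0) {
          break; // no more rows
        }
        for (int i = 0; i < batch.size; i++) {
          // process row i of the batch
        }
      }
    }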

And here is the related API in the Presto code, which is used for ORC
support in Presto:

public void readVector(int columnIndex, Object vector);
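
For illustration, an analogous column-at-a-time API on the Parquet side
might look like the sketch below. The interface and method names here are
hypothetical, not existing parquet-mr APIs:

    import java.io.Closeable;
    import java.io.IOException;

    // Hypothetical sketch of a vectorized Parquet reader interface.
    public interface VectorizedParquetReader extends Closeable {

      // Advance to the next batch of rows; returns the number of rows
      // in the batch, or 0 when the file is exhausted.
      int nextBatch() throws IOException;

      // Fill a type-specific vector with the current batch's values for
      // one column, mirroring Presto's readVector(int, Object).
      void readVector(int columnIndex, Object vector) throws IOException;
    }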

For lazy materialization, we may also consider adding LazyVectors or
LazyBlocks, so that values are not materialized until they are accessed by
the Operator.
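
As a rough illustration of that idea (the names here are hypothetical, not
the actual Presto LazyVector/LazyBlock code): a lazy vector holds a loader
and only decodes the column's values the first time any of them is accessed.

    import java.io.IOException;
    import java.io.UncheckedIOException;

    // Hypothetical sketch of a lazily materialized long-column vector.
    public final class LazyLongVector {
      /** Decodes the column's values from the underlying pages. */
      public interface Loader {
        long[] load() throws IOException;
      }

      private final Loader loader;
      private long[] values; // decoded on first access only

      public LazyLongVector(Loader loader) {
        this.loader = loader;
      }

      /** Materializes the column the first time any value is read. */
      public long get(int index) {
        if (values == null) {
          try {
            values = loader.load();
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        }
        return values[index];
      }
    }

If a filter on other columns rejects every row in a batch, get() is never
called and the decode cost is never paid.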

Any comments and suggestions are appreciated.

Thanks,
Zhenxiao


On Tue, Oct 7, 2014 at 1:05 PM, Jason Altekruse <[email protected]> wrote:

Hello All,

No updates from me yet, just sending out another message for some of the
Netflix engineers that were still just subscribed to the Google group mail.
This will allow them to respond directly with their research on the
optimized ORC reader for consideration in the design discussion.

-Jason

On Mon, Oct 6, 2014 at 10:51 PM, Jason Altekruse <[email protected]> wrote:

Hello Parquet team,

I wanted to report the results of a discussion between the Drill team and
the engineers at Netflix working to make Parquet run faster with Presto. As
we have said in the last few hangouts, we both want to make contributions
back to parquet-mr to add features and performance. We thought it would be
good to sit down and speak directly about our real goals and the best next
steps to get an engineering effort started to accomplish these goals.

Below is a summary of the meeting.

- Meeting notes

    - Attendees:

        - Netflix: Eva Tse, Daniel Weeks, Zhenxiao Luo

        - MapR (Drill Team): Jacques Nadeau, Jason Altekruse, Parth Chandra

- Minutes

    - Introductions / Background

    - Netflix

        - Working on providing interactive SQL querying to users

        - Have chosen Presto as the query engine and Parquet as the
          high-performance data storage format

        - Presto is providing the needed speed in some cases, but other
          cases are missing optimizations that could avoid reads

        - Have already started some development and investigation, and
          have identified key goals

        - Some initial benchmarks with a modified ORC (DWRF) reader,
          written by the Presto team, show that such gains are possible
          with a different reader implementation

        - Goals

            - filter pushdown

                - skipping reads based on filter evaluation on one or
                  more columns

                - this can happen at several granularities: row group,
                  page, record/value

            - late/lazy materialization

                - for columns not involved in a filter, avoid
                  materializing them entirely until they are known to be
                  needed after evaluating a filter on other columns

    - Drill

        - the Drill engine uses an in-memory vectorized representation
          of records

        - for scalar and repeated types we have implemented a fast
          vectorized reader that is optimized to transform between
          Parquet's on-disk format and our in-memory format

        - this is currently producing performant table scans, but has
          no facility for filter pushdown

        - Major goals going forward

            - filter pushdown

                - decide on the best implementation for incorporating
                  filter pushdown into our current implementation, or
                  figure out a way to leverage existing work in the
                  parquet-mr library to accomplish this goal

            - late/lazy materialization

                - see above

            - contribute existing code back to Parquet

                - the Drill Parquet reader has a very strong emphasis
                  on performance and a clear interface to consume; one
                  that is sufficiently separated from Drill could prove
                  very useful for other projects

    - First steps

        - The Netflix team will share some of their thoughts and
          research from working with the DWRF code

            - we can have a discussion based on this: which aspects are
              done well, and any opportunities they may have missed that
              we can incorporate into our design

            - do further investigation and ask the existing community
              for guidance on existing parquet-mr features or planned
              APIs that may provide the desired functionality

        - We will begin a discussion of an API for the new functionality

            - some outstanding thoughts for down the road

                - The Drill team has an interest in very late
                  materialization for data stored in dictionary-encoded
                  pages, such as running a join or filter on the
                  dictionary and then going back to the reader to grab
                  all of the values in the data that match the needed
                  members of the dictionary (a rough sketch follows
                  these notes)

                    - this is a later consideration, but it is part of
                      the reason we are opening up the design discussion
                      early: so that the API can be flexible enough to
                      allow this in the future, even if it is not
                      implemented soon
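
To make the dictionary idea above concrete, here is a minimal sketch (all
names are illustrative; this is not existing parquet-mr or Drill code). The
filter is evaluated once per distinct dictionary entry, and rows are then
selected by dictionary id without decoding their values:

    import java.util.Arrays;
    import java.util.function.Predicate;

    // Hypothetical sketch of dictionary-based late materialization.
    public final class DictionaryFilterSketch {

      /** Returns the indexes of rows whose dictionary entry matches. */
      public static int[] selectRows(String[] dictionary, int[] dictionaryIds,
                                     Predicate<String> filter) {
        // Step 1: evaluate the filter once per dictionary entry, not per row.
        boolean[] matches = new boolean[dictionary.length];
        for (int i = 0; i < dictionary.length; i++) {
          matches[i] = filter.test(dictionary[i]);
        }

        // Step 2: select rows by id; values for the selected rows are
        // materialized later, and only if actually needed.
        int[] selected = new int[dictionaryIds.length];
        int count = 0;
        for (int row = 0; row < dictionaryIds.length; row++) {
          if (matches[dictionaryIds[row]]) {
            selected[count++] = row;
          }
        }
        return Arrays.copyOf(selected, count);
      }
    }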

--
Ryan Blue
Software Engineer
Cloudera, Inc.
