Re: Batch Sizing for Parquet Flat Reader

Paul Rogers Sun, 11 Feb 2018 13:37:16 -0800

Hi All,
Perhaps this topic needs just a bit more thought and discussion to avoid 
working at cross purposes. I've outlined the issues, and a possible path 
forward, in a comment to DRILL-6147.
Quick summary: creating a second batch size implementation just for Parquet 
will be very difficult once we handle all the required use cases as spelled out 
in the comment. We'd want to be very sure that we do, indeed, want to duplicate 
this effort before we head down that route. Duplicating the effort means 
repeating all the work done over the last six months to make the original 
result set loader work, and then taking on the future work needed to maintain
two parallel systems. This is not a decision to make by default.
Thanks,
- Paul


    On Sunday, February 11, 2018, 12:10:58 AM PST, Parth Chandra 
<par...@apache.org> wrote:  
 
Thanks, Salim.
Can you add this to the JIRA/design doc? Also, I would venture to suggest
that the section on predicate pushdown can be made clearer.
Also, since you're proposing the average batch size approach with overflow
handling, some detail on the proposed changes to the framework would be
useful in the design doc. (Perhaps pseudo code and affected classes.)
Essentially, some guarantees provided by the framework will change, and this
may affect (or not) the existing usage. These should be enumerated in the
design doc.
