Hi All,
Perhaps this topic needs just a bit more thought and discussion to avoid
working at cross purposes. I've outlined the issues, and a possible path
forward, in a comment to DRILL-6147.
Quick summary: creating a second batch size implementation just for Parquet
will be very difficult once we handle all the required use cases as spelled out
in the comment. We'd want to be very sure that we do, indeed, want to duplicate
this effort before we head down that route. Duplicating the effort means
repeating all the work done over the last six months to make the original
result set loader work, and then taking on the future work needed to maintain
two parallel systems. This is not a decision to make by default.
Thanks,
- Paul
On Sunday, February 11, 2018, 12:10:58 AM PST, Parth Chandra
<[email protected]> wrote:
Thanks Salim.
Can you add this to the JIRA/design doc? Also, I would venture to suggest
that the section on predicate pushdown can be made clearer.
Also, since you're proposing the average batch size approach with overflow
handling, some detail on the proposed changes to the framework would be
useful in the design doc. (Perhaps pseudo code and affected classes.)
Essentially, some guarantees provided by the framework will change, and this
may or may not affect existing usage. These changes should be enumerated in
the design doc.