[ https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360194#comment-16360194 ]

Paul Rogers commented on DRILL-6147:
------------------------------------

Salim Says:

Predicate Pushdown
- The reason I invoked predicate pushdown within the document is to help the 
analysis:
  o Notice how record batch materialization could involve many more pages
  o A solution that relies mainly on the current set of pages (one per column) 
might pay a heavy IO price without much to show for it:
      + The reader waits for all columns to have at least one page loaded so 
that upfront stats can be gathered
      + Batch memory is then divided optimally across columns and the current 
batch size is computed
      + Unfortunately, such logic will fail if more pages are involved than the 
ones taken into consideration
  o Example (see the sketch after this list) -
      + Two variable-length columns c1 and c2
      + The reader waits for two pages, P1-1 and P2-1, so that we can a) 
allocate memory optimally across c1 and c2 and b) compute a batch size that 
will minimize overflow logic
      + Assume, because of data length skew or predicate pushdown, that more 
pages are involved in loading the batch
      + For c1: {P1-1, P1-2, P1-3, P1-4}; for c2: {P2-1, P2-2}
      + It is now quite possible that the overflow logic will not be optimal, 
since only two pages' statistics were considered instead of six

- I have added new logic to the ScanBatch to log (on demand) extra batch 
statistics that will help us assess the efficiency of the batch sizing 
strategy; I will add this information to the document when this sub-task is 
done
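
To make the example concrete, below is a minimal sketch of that sizing logic 
(class and method names are hypothetical, not Drill's actual reader code). It 
estimates a batch row count from only the first loaded page of each column, 
which is exactly why skew in later pages breaks the estimate:

    import java.util.List;

    class PageStats {
        final long pageBytes;   // uncompressed size of the page
        final long valueCount;  // number of values in the page

        PageStats(long pageBytes, long valueCount) {
            this.pageBytes = pageBytes;
            this.valueCount = valueCount;
        }

        double avgValueWidth() {
            return (double) pageBytes / valueCount;
        }
    }

    class BatchSizer {
        // One PageStats per column: the first loaded page of each column
        // (P1-1 and P2-1 in the example above).
        static int computeBatchRowCount(List<PageStats> firstPages,
                                        long memoryBudget) {
            double bytesPerRow = 0;
            for (PageStats p : firstPages) {
                bytesPerRow += p.avgValueWidth();
            }
            // This estimate holds only if later pages (P1-2 through P1-4
            // above) have the same value-width distribution; otherwise the
            // overflow logic must absorb the error.
            return (int) Math.max(1, (long) (memoryBudget / bytesPerRow));
        }
    }

If c1's later pages carry much wider values than P1-1 suggested, the computed 
row count is too large and the batch overflows its budget, which is the 
failure mode described in the example.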

Paul's comment:

Perhaps we are conflating a number of things here; matters are not, in fact, 
this simple. Predicates must be applied per row, not per column or per page. 
If the predicate is "X = 5", then we must exclude the values of every column 
in each row where X != 5. This means we cannot do independent bulk operations 
per column if we are also doing per-row predicate filtering. Said another way, 
predicate push-down forces row-by-row processing, even though the underlying 
storage format is columnar. (This is why the Filter operator works row by 
row.)

The result set loader has built-in support for predicate push-down. Once a row 
is loaded, but before it is saved, we can apply a predicate to that row. If we 
don't want the row, we don't save it, and the result set loader overwrites 
that row with the next one. That code works and is tested. (The predicate 
evaluation logic would have to be added; the row-handling parts are done. Here 
again, why not build on the existing work, adding additional value such as 
that filter evaluation logic?)
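
For illustration, here is a minimal sketch of that load-then-maybe-save 
pattern; the interface is hypothetical, loosely modeled on the description 
above, not the actual result set loader API:

    import java.util.function.Predicate;

    interface RowWriter {
        void startRow();                   // position on the next row slot
        void setColumn(int col, Object v);
        void saveRow();                    // commit the row into the batch
        // If saveRow() is never called, the next startRow() reuses the
        // same slot, effectively overwriting the rejected row.
    }

    class FilteringLoader {
        private final RowWriter writer;

        FilteringLoader(RowWriter writer) {
            this.writer = writer;
        }

        // Load one decoded row, then apply the predicate before saving.
        void loadRow(Object[] values, Predicate<Object[]> predicate) {
            writer.startRow();
            for (int i = 0; i < values.length; i++) {
                writer.setColumn(i, values[i]);
            }
            if (predicate.test(values)) {
                writer.saveRow();  // keep the row
            }
            // else: no save; the slot is overwritten by the next row
        }
    }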

The discussion above is about optimizing I/O. But predicates work row by row. 
I am not sure how page buffering impacts per-row predicates: one has to load 
and decode every page needed by a batch even if we need only a single value 
from a page. (Parquet does not allow random access to values within a page, 
due to encoding, if I understand correctly.) One optimization would be to read 
just the predicate column and apply the predicate to those values (X in my 
example above). Then, if no X value matches the predicate, we can skip reading 
the other columns' pages. But I'd have thought we could get the same result 
from the footer metadata...
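
As a sketch of that optimization (the reader interface is a placeholder, not 
the real Parquet API): decode only the predicate column's pages first, and 
touch the remaining columns only if at least one value matches:

    class PredicateFirstScan {
        interface ColumnReader {
            // Decode all of one column's pages for the current row group.
            long[] readColumnValues(String column);
        }

        // Returns true if the remaining columns' pages actually need to be
        // read and decoded for this row group.
        static boolean rowGroupMayMatch(ColumnReader reader,
                                        String predicateColumn,
                                        long target) {
            for (long v : reader.readColumnValues(predicateColumn)) {
                if (v == target) {
                    return true;   // some row survives "X = target"
                }
            }
            // No match: skip reading and decoding every other column's
            // pages for this row group. As noted above, footer min/max
            // statistics can often prune the row group without reading
            // any pages at all.
            return false;
        }
    }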

> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
>                 Key: DRILL-6147
>                 URL: https://issues.apache.org/jira/browse/DRILL-6147
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.13.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows) 
> when creating scan batches; there is neither a parameter nor any logic for 
> controlling the amount of memory used. This enhancement will allow Drill to 
> take an extra input parameter to control direct memory usage.


