Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic 
pruning
......................................................................


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@179
PS4, Line 179:   // A collection type is "required" if it is checked to be not 
empty. Collection
> Shrink? The following seems sufficient.
went with just referring to not-empty instead of not-empty and required.


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@184
PS4, Line 184:   // Analysis adds IsNotEmptyPredicates where needed to enforce 
required collections.
> I think this could be misunderstood. IsNotEmptyPredicates are not required,
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@436
PS4, Line 436:   // Checks if slot refers to an array "pos" pseudo-column.
> nit: we typically use /** */ style class and function comments
whoops, Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@440
PS4, Line 440:   private boolean isArrayPosReference(SlotRef slot) {
> Move to SlotRef. I generally prefer 'slotRef' instead of 'slot' because the
Done. sync'd with similar uses below as well.


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@442
PS4, Line 442:     if (parent != null) {
> reverse? if (parent == null) return false;
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@533
PS4, Line 533:    * Adds nested collections that are required from 'conjuncts' 
to requiredCollections_.
> Populates 'requiredCollections_' based on IsNotEmptyPredicates in the given
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@537
PS4, Line 537:    * Example: table T has field A which is of type 
array<array<int>>.
> I don't think examples really belong here because this is not where we are
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@548
PS4, Line 548:         
Preconditions.checkArgument(ref.getDesc().getType().isComplexType());
> checkState
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@549
PS4, Line 549:         
Preconditions.checkArgument(ref.getDesc().getItemTupleDesc() != null);
> checkState
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@574
PS4, Line 574:       // Note: it is conservative (but correct) to depend on a 
collection to be required.
> I think we should state what we mean more directly. We rely on analysis to
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@586
PS4, Line 586:    * collection-typed slot. As conjuncts are seen, mark nested 
collections that are
> mark -> collect
Done


http://gerrit.cloudera.org:8080/#/c/8480/4/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
File 
testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/8480/4/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test@33
PS4, Line 33: aggregation(SUM, NumRowGroups): 2
> we should also have planner tests to see how the explain looks
Done



--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 4
Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 00:33:26 +0000
Gerrit-HasComments: Yes

Reply via email to