Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20133 )

Change subject: IMPALA-12089: Be able to skip pushing down a subset of the 
predicates
......................................................................


Patch Set 4:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/20133/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20133/4//COMMIT_MSG@24
PS4, Line 24: residual expression backtracking
This 'residual expression backtracking' sounds pretty mysterious. Might be 
better this way: 'Iceberg residual expression handling'


http://gerrit.cloudera.org:8080/#/c/20133/4//COMMIT_MSG@37
PS4, Line 37: If the expression backtracking
'If Impala is unable to match all residual expression to Impala conjuncts then 
all the conjunct are pushed dow to Impala scanner' ?


http://gerrit.cloudera.org:8080/#/c/20133/4//COMMIT_MSG@53
PS4, Line 53: iceberg_predicate_subsetting
I think it's in fact 'iceberg_predicate_pushdown_subsetting'


http://gerrit.cloudera.org:8080/#/c/20133/4//COMMIT_MSG@57
PS4, Line 57: Tests:
Did you have the chance to make some perf measurements when you start 
increasing the number of predicates so that we can be sure converting Iceberg 
preds back to Impala preds doesn't get too expensive with scale?


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/analysis/PredicateVisitor.java
File fe/src/main/java/org/apache/impala/analysis/PredicateVisitor.java:

http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/analysis/PredicateVisitor.java@33
PS4, Line 33: PredicateVisitor
>From Impala's perspective this class is only a Visitor for Iceberg 
>predicates/expressions, right? IcebergPredicateVisitor would be a more 
>suitable name, then.
But being even more precise, this class is actually a collector of Iceberg 
predicates (using the visitor pattern). So something like 
IcebergPredicateCollector or IcebergExpressionCollector would be even more 
precise in my opinion.


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@118
PS4, Line 118: translatedExpressions_
I think something like 'impalaIcebergPredicateMapping' would be more verbose.


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@375
PS4, Line 375:              IcebergUtil.planFiles(getIceTable(),
indentation is off with these 2 lines.


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@418
PS4, Line 418:       conjuncts_.removeAll(translatedExpressions_.values());
Might be personal preference, but if you returned here then you wouldn't need 
to put the rest of the function in an else block.


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@421
PS4, Line 421:         return;
Doesn't this fit into the line above?


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@430
PS4, Line 430:           if (expr == null) {
Could you please add some comment here for more clarity, like "If we fail to 
translate any of the Iceberg residual expressions then we skip filtering the 
predicates to be pushed down to Impala scanner."


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@431
PS4, Line 431:             return;
merge into the line above


http://gerrit.cloudera.org:8080/#/c/20133/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@437
PS4, Line 437:       expressionsToRetain.forEach(skippedExpressions_::remove);
I think something the other way around would be more readable if feasible. Like 
skippedExpressions_.stream().filter(some condition) ?


http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates-disabled-subsetting.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates-disabled-subsetting.test:

http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates-disabled-subsetting.test@1
PS4, Line 1: # If one or more non-partition predicates are present, then all 
predicates are pushed down
Please mention in this comment that the pred subsetting feature is off. I know 
this info is in the file name but you can't emphasise this enough.


http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test:

http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test@77
PS4, Line 77: ====
I think this iceberg-predicates.test file is the place where we should test 
this feature more heavily. e.g. include the use case I mentioned in another 
test file where some of the residuals can't be matched to Impala predicates so 
we fallback to push down everything.

Also some further combinations of predicates, and also predicates vs partition 
transformed table would be nice here. You can go wild and come up with all kind 
of tricky predicates to see how this feature handle them.


http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test:

http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test@1005
PS4, Line 1005: user = null
user is null ?


http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test@1025
PS4, Line 1025: untranslated
What is an 'untranslated expression' and it's not 100% clear for me why 'action 
= null' is untranslated.


http://gerrit.cloudera.org:8080/#/c/20133/4/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test@1035
PS4, Line 1035: ====
Would be nice to have a test case where there are residual expression but not 
all of them could be matched to an Impala predicate so we backtrack to push 
down all the predicates to Impala.



--
To view, visit http://gerrit.cloudera.org:8080/20133
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I597f69ad03ecaf9e304613ef934654e3d9614ae8
Gerrit-Change-Number: 20133
Gerrit-PatchSet: 4
Gerrit-Owner: Peter Rozsa <pro...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pro...@cloudera.com>
Gerrit-Comment-Date: Tue, 01 Aug 2023 12:06:07 +0000
Gerrit-HasComments: Yes

Reply via email to