Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19534 )

Change subject: IMPALA-11701 Part1: Don't push down predicates to scanner if 
already applied by Iceberg
......................................................................


Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@16
PS1, Line 16: t (ass
> Nit: filters.
Done


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@17
PS1, Line 17: ).
> Wouldn't "assuming" be better?
Done


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@17
PS1, Line 17:
> Nit: closing parenthesis should come at the end of the sentence.
Done


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@20
PS1, Line 20: scanner is that we can potentially materialize les
> Am I right that here we mean pushing down predicates to the scanner, as opp
Done


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@20
PS1, Line 20: tentiall
> not pushing down
Done


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@28
PS1, Line 28:
> We could include a Testing section describing what tests have been added/mo
Done


http://gerrit.cloudera.org:8080/#/c/19534/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19534/3//COMMIT_MSG@20
PS3, Line 20: iall
> nit: pushing down?
Done


http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1390
PS3, Line 1390: to
> 'is' is not needed here?
Done


http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2027
PS3, Line 2027: indentPrefix
> 'explainLevel' may not be the best name as it could be confused with the ve
Done


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@64
PS1, Line 64:
> The comment on 'nonIdentityConjuncts_' says that it is a subset of 'conjunc
'nonIdentityConjuncts_' is in fact a subset of 'conjuncts_' that doesn't 
include conjuncts on identity partitioned columns.
'skippedConjuncts_' indeed contains identity conjuncts but not necessarily all 
of them just the ones that won't filter any further rows.

OTOH As I see 'nonIdentityConjuncts_' is used for cardinality estimates. 
However, I wonder if from this point on can we drop use 'conjuncts_' - 
'skippedConjuncts_' for this purpose.
@Zoltan, do you have a thought on this?


http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@115
PS3, Line 115: has
> Nit: has
Done


http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@241
PS3, Line 241:
> I don't think we need this as this is the DELETE SCAN node which wouldn't e
Done


http://gerrit.cloudera.org:8080/#/c/19534/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@388
PS3, Line 388: : %s", getIceTable().
> nit: could be 'Collections.emptyList()'
Done


http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/datasets/functional/functional_schema_template.sql@3265
PS3, Line 3265: ---- DEPENDENT_LOAD
              : # We can use 'date_string_col' as it is once IMPALA-11954 is 
done.
              : INSERT INTO {db_name}{db_suffix}.iceberg_partition_evolution
> Since we are writing the table via Impala we probably don't need to set the
Indeed, done


http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test:

http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test@49
PS3, Line 49: could no
> Could not be?
Indeed :)


http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test:

http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test@990
PS3, Line 990: event_time='2020-01-01 11:00:00';
> iceberg_partitioned is partitioned by HOUR(event_time), how can we skip pus
I had to check, but apparently we push down predicates of an HOUR() partition 
to Iceberg. Found this in the logs:
"Push down the predicate: ref(name="event_time") == 1577872800000000 to iceberg"
I'm wondering if this is because the partition value contains date+hour and not 
just hour. "event_time_hour=2020-01-01-08"

In this particular test (due to some timezone mismatch) Iceberg filters out all 
the files using this pushed down filter, hence it shows up in "skipped Iceberg 
predicates".


http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test:

http://gerrit.cloudera.org:8080/#/c/19534/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test@51
PS3, Line 51: predicat
> Nit: predicate
Done



--
To view, visit http://gerrit.cloudera.org:8080/19534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfa80ce469cecfcfbcd0dcb595a6b04b7027285b
Gerrit-Change-Number: 19534
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Mar 2023 15:49:20 +0000
Gerrit-HasComments: Yes

Reply via email to