[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 17:00:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 12:34:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 5: (1 comment) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20460/4/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java File fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java: http://gerrit.cloudera.org:8080/#/c/20460/4/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@94 PS4, Line 94: Please note that left side's cardinality already takes the selectivity into : // account > We could mention this also in the commit message. Done -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 12:32:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Hello Tamas Mate, Daniel Becker, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20460 to look at the new patch set (#5). Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes Currently IcebergDeleteNode's cardinality is the same as the LHS's cardinality, i.e. we don't take the RHS into account. The RHS contains the position delete records, so it is a fair assumption that all records at RHS remove a record from LHS (duplicated delete records should be extremely rare). If there are conjuncts on the Iceberg table we can assume that they have the same selectivity on the data records and on the delete records. With the above assumptions this change updates the cardinality of the IcebergDeleteNode with basically the following formula: Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS); Please note that left side's cardinality already takes the selectivity into account. To deal with edge cases when there are lots of duplicated delete records (shouldn't happen in normal usage), we return at least 1 cardinality, so the actual formula is: Card(IcebergDeleteNode) = Max( 1, Card(LHS) - Selectivity(LHS) * Card(RHS) ); Testing: * updated the planner tests Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 --- M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 3 files changed, 97 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/5 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9764/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 12:34:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 4: (1 comment) Just a small thing, I can +2 after this. http://gerrit.cloudera.org:8080/#/c/20460/4/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java File fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java: http://gerrit.cloudera.org:8080/#/c/20460/4/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@94 PS4, Line 94: Please note that left side's cardinality already takes the selectivity into : // account We could mention this also in the commit message. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 12:27:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 4: Code-Review+1 LGTM! Thanks Zoltan! -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Sep 2023 12:16:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 4: Code-Review+1 Nice change, looks good to me. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 26 Sep 2023 12:30:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/20460/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20460/3//COMMIT_MSG@12 PS3, Line 12: LHS > Is it LHS? Every position delete record removes a data record from the LHS, Yes, thank you! -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 26 Sep 2023 09:54:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Hello Tamas Mate, Daniel Becker, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20460 to look at the new patch set (#4). Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes Currently IcebergDeleteNode's cardinality is the same as the LHS's cardinality, i.e. we don't take the RHS into account. The RHS contains the position delete records, so it is a fair assumption that all records at RHS remove a record from LHS (duplicated delete records should be extremely rare). If there are conjuncts on the Iceberg table we can assume that they have the same selectivity on the data records and on the delete records. With the above assumptions this change updates the cardinality of the IcebergDeleteNode with basically the following formula: Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS); To deal with edge cases when there are lots of duplicated delete records (shouldn't happen in normal usage), we return at least 1 cardinality, so the actual formula is: Card(IcebergDeleteNode) = Max( 1, Card(LHS) - Selectivity(LHS) * Card(RHS) ); Testing: * updated the planner tests Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 --- M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 3 files changed, 97 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/4 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 3: Code-Review+1 (1 comment) LGTM http://gerrit.cloudera.org:8080/#/c/20460/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20460/3//COMMIT_MSG@12 PS3, Line 12: RHS Is it LHS? Every position delete record removes a data record from the LHS, right? -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Sep 2023 14:03:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 3: Code-Review+1 Thanks, looks good to me. Someone more knowledgeable about Iceberg should also take a look. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 14 Sep 2023 15:24:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13988/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Sep 2023 16:44:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 3: (2 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java File fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java: http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@98 PS2, Line 98: have non-ze > Thanks, I like the idea about a precondition check and using 1 in the Math. Done http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@100 PS2, Line 100: ondition > Thanks. It may be worth mentioning it either here or in the commit message. Done -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Sep 2023 16:17:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Hello Daniel Becker, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20460 to look at the new patch set (#3). Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes Currently IcebergDeleteNode's cardinality is the same as the LHS's cardinality, i.e. we don't take the RHS into account. The RHS contains the position delete records, so it is a fair assumption that all records at RHS remove a record from RHS (duplicated delete records should be extremely rare). If there are conjuncts on the Iceberg table we can assume that they have the same selectivity on the data records and on the delete records. With the above assumptions this change updates the cardinality of the IcebergDeleteNode with basically the following formula: Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS); To deal with edge cases when there are lots of duplicated delete records (shouldn't happen in normal usage), we return at least 1 cardinality, so the actual formula is: Card(IcebergDeleteNode) = Max( 1, Card(LHS) - Selectivity(LHS) * Card(RHS) ); Testing: * updated the planner tests Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 --- M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 3 files changed, 97 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/3 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java File fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java: http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@98 PS2, Line 98: is not zero > If there are input files to scan, leftCard will always be greater than 0, s Thanks, I like the idea about a precondition check and using 1 in the Math.math() call. http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@100 PS2, Line 100: leftCard > Yes. Thanks. It may be worth mentioning it either here or in the commit message. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Sep 2023 11:19:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 2: (2 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java File fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java: http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@98 PS2, Line 98: is not zero > What happens if it is zero? We don't reach this code then? If there are input files to scan, leftCard will always be greater than 0, so we always report at least one cardinality. This is in line with: https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/fe/src/main/java/org/apache/impala/planner/PlanNode.java#L747 Actually, if the leftCard is 0 (no input files), then the planner shouldn't even create this node. So alternatively we could probably use a Preconditions.checkState() and a bit simpler formula: cardinality_ = Math.max( 1, leftCard - (long)(leftSelectivity * rightCard)); http://gerrit.cloudera.org:8080/#/c/20460/2/fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java@100 PS2, Line 100: leftCard > For 'leftCard' selectivity has already been taken into account, that's why Yes. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 08 Sep 2023 14:28:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20460 ) Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13929/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 06 Sep 2023 14:32:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20460 to look at the new patch set (#2). Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes Currently IcebergDeleteNode's cardinality is the same as the LHS's cardinality, i.e. we don't take the RHS into account. The RHS contains the position delete records, so it is a fair assumption that all records at RHS remove a record from RHS (duplicated delete records should be extremely rare). If there are conjuncts on the Iceberg table we can assume that they have the same selectivity on the data records and on the delete records. With the above assumptions this change updates the cardinality of the IcebergDeleteNode with basically the following formula: Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS); To deal with edge cases when there are lots of duplicated delete records (shouldn't happen in normal usage), we actually use a slightly more complex formula: Card(IcebergDeleteNode) = Max( Min(1, Card(LHS))), Card(LHS) - Selectivity(LHS) * Card(RHS) ); Testing: * updated the planner tests Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 --- M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 3 files changed, 94 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/2 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20460 Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes .. IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with deletes Currently IcebergDeleteNode's cardinality is the same as the LHS's cardinality, i.e. we don't take the RHS into account. The RHS contains the position delete records, so it is a fair assumption that all records at RHS remove a record from RHS (duplicated delete records should be extremely rare). If there are conjuncts on the Iceberg table we can assume that they have the same selectivity on the data records and on the delete records. With the above assumptions this change updates the cardinality of the IcebergDeleteNode with the basically the following formula: Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS); To deal with edge cases when there are lots of duplicated delete records, we actually use a slightly more complex formula: Card(IcebergDeleteNode) = Max( Min(1, Card(LHS))), Card(LHS) - Selectivity(LHS) * Card(RHS) ); Testing: * updated the planner tests Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 --- M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 3 files changed, 94 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/1 -- To view, visit http://gerrit.cloudera.org:8080/20460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5 Gerrit-Change-Number: 20460 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy