[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions.

It is especially important for queries that prune partitions via
runtime filters (e.g. due to a JOIN), because it doesn't matter that
we schedule the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

  inv_date_sk=d_date_sk and
  d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and 1199 + 11'.
This means that we will only need a range of partitions from the
Inventory table, but that range will only be revealed during runtime.
Scheduling neighbouring partitions to different executors means that
the surviving partitions are spread across executors more evenly.

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Reviewed-on: http://gerrit.cloudera.org:8080/20973
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M tests/query_test/test_iceberg.py
2 files changed, 64 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
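
The scheduling steps described in the commit message can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Impala's scheduler: the function names, the way candidates are derived from the path, the equal range sizes, and the file paths are all hypothetical. It compares processing scan ranges in path order against a shuffled order and reports how many ranges from a window of consecutive partitions end up on the same executor.

```python
import random
from collections import Counter


def schedule(scan_ranges, num_executors, num_candidates):
    """Greedy sketch: assign each (path, size) to its least-loaded candidate."""
    assigned_bytes = [0] * num_executors
    placement = {}
    for path, size in scan_ranges:
        # Assumption: candidates are derived deterministically from the path.
        # The thread does not say how the real scheduler selects them.
        rng = random.Random(path)
        candidates = rng.sample(range(num_executors), num_candidates)
        # Pick the candidate with the fewest assigned bytes (ties: lowest id).
        chosen = min(candidates, key=lambda e: (assigned_bytes[e], e))
        assigned_bytes[chosen] += size
        placement[path] = chosen
    return placement


def max_cluster(placement, surviving_paths):
    """Largest number of surviving ranges that landed on a single executor."""
    return max(Counter(placement[p] for p in surviving_paths).values())


# Hypothetical layout: one equally sized file per inv_date_sk partition.
paths = ["inv_date_sk=%04d/data.parquet" % i for i in range(40)]
ranges = [(p, 100) for p in paths]
# Pretend a runtime filter keeps only 8 consecutive partitions.
surviving = paths[12:20]

# Path order, which is what the patch introduces for Iceberg scans.
ordered = schedule(sorted(ranges), num_executors=4, num_candidates=4)

# Random order, standing in for the previous Iceberg behaviour.
shuffled_ranges = ranges[:]
random.Random(7).shuffle(shuffled_ranges)
shuffled = schedule(shuffled_ranges, num_executors=4, num_candidates=4)

print("ordered :", max_cluster(ordered, surviving))
print("shuffled:", max_cluster(shuffled, surviving))
```

With every executor a candidate and equal range sizes, path order degenerates to round-robin, so the 8 surviving partitions land 2 per executor. The shuffled run can never do better than that and may cluster more of them on one executor.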
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 31 Jan 2024 00:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10216/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 20:05:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 20:05:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15113/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 18:40:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 18:23:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20973/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20973/5//COMMIT_MSG@41
PS5, Line 41:
> nit: that
Done

http://gerrit.cloudera.org:8080/#/c/20973/5/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/20973/5/tests/query_test/test_iceberg.py@1025
PS5, Line 1025: splits = [l.strip() for l in profile.splitlines() if "Hdfs split stats" in l]
> nit: impala-flake8 catch 1 issue here:
Done

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 30 Jan 2024 18:15:38 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Riza Suminto, Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20973

to look at the new patch set (#6).

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions.

It is especially important for queries that prune partitions via
runtime filters (e.g. due to a JOIN), because it doesn't matter that
we schedule the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

  inv_date_sk=d_date_sk and
  d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and 1199 + 11'.
This means that we will only need a range of partitions from the
Inventory table, but that range will only be revealed during runtime.
Scheduling neighbouring partitions to different executors means that
the surviving partitions are spread across executors more evenly.

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M tests/query_test/test_iceberg.py
2 files changed, 64 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/6

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 18:00:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15104/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 13:34:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15103/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 13:33:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10215/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 13:27:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15102/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 13:25:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 30 Jan 2024 13:12:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20973/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20973/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55
PS3, Line 55: need
> Nit: "that need to be".
Went with "needed".

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 30 Jan 2024 13:09:56 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Riza Suminto, Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20973

to look at the new patch set (#5).

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions.

It is especially important for queries that prune partitions via
runtime filters (e.g. due to a JOIN), because it doesn't matter that
we schedule the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

  inv_date_sk=d_date_sk and
  d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and 1199 + 11'.
This means the we will only need a range of partitions from the
Inventory table, but that range will only be revealed during runtime.
Scheduling neighbouring partitions to different executors means that
the surviving partitions are spread across executors more evenly.

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M tests/query_test/test_iceberg.py
2 files changed, 63 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/5

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Riza Suminto, Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20973

to look at the new patch set (#4).

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions.

It is especially important for queries that prune partitions via
runtime filters (e.g. due to a JOIN), because it doesn't matter that
we schedule the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

  inv_date_sk=d_date_sk and
  d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and 1199 + 11'.
This means the we will only need a range of partitions from the
Inventory table, but that range will only be revealed during runtime.
Scheduling neighbouring partitions to different executors means that
the surviving partitions are spread across executors more evenly.

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M tests/query_test/test_iceberg.py
2 files changed, 63 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/4

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

Patch Set 3: Code-Review+1

(1 comment)

Thanks.

http://gerrit.cloudera.org:8080/#/c/20973/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20973/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55
PS3, Line 55: need
Nit: "that need to be".

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 30 Jan 2024 13:05:51 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py@1071
PS3, Line 1071: o
flake8: E501 line too long (92 > 90 characters)

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py@1083
PS3, Line 1083: \
flake8: W605 invalid escape sequence '\d'

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py@1083
PS3, Line 1083: \
flake8: W605 invalid escape sequence '\('

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py@1083
PS3, Line 1083: \
flake8: W605 invalid escape sequence '\d'

http://gerrit.cloudera.org:8080/#/c/20973/3/tests/query_test/test_iceberg.py@1083
PS3, Line 1083: \
flake8: W605 invalid escape sequence '\)'

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 30 Jan 2024 13:00:58 +
Gerrit-HasComments: Yes
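
The W605 warnings above are typically resolved by writing the regular expression as a raw string. The following is a minimal sketch only: the pattern and the sample profile line are hypothetical, since the actual expression on line 1083 of the test is not shown in this thread.

```python
import re

# Hypothetical profile line; the text the real test matches is not shown here.
profile_line = "Files rejected: 12 (12)"

# The r"" prefix leaves \d, \( and \) for the regex engine to interpret.
# Without it, Python sees them as invalid string escapes, which is what
# flake8 reports as W605.
pattern = re.compile(r"Files rejected: (\d+) \((\d+)\)")

match = pattern.search(profile_line)
files_rejected = int(match.group(1))
print(files_rejected)
```

Doubling every backslash in a normal string would also silence the warning, but a raw string keeps the pattern readable.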
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 )

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

Patch Set 3:

(5 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG@25
PS2, Line 25: e
> Nit: chance.
Done

http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG@27
PS2, Line 27: With this patch, IcebergScanNode orders its file descriptors based on
> Could you elaborate why it is beneficial to assign neighbouring partitions t
Added some details and examples.

http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55
PS1, Line 55: // List of files need to be scanned by t
> It is only sorted if the table is partitioned, isn't it?
Currently yes, because there's no need to sort ranges of unpartitioned tables. OTOH, sorting them anyway probably wouldn't add much overhead, and the code would become simpler.

http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@210
PS1, Line 210: verride
: protected Map
> It is only sorted if the table is partitioned, isn't it?
Yes, I added a condition to the sort.

http://gerrit.cloudera.org:8080/#/c/20973/2/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/20973/2/tests/query_test/test_iceberg.py@1086
PS2, Line 1086: for files_rejected_str in files_rejected_array:
> Optional: I find 'continue' to be a bit more difficult to follow than a con
Done

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 30 Jan 2024 13:00:57 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Riza Suminto, Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20973

to look at the new patch set (#3).

Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables
..

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions.

It is especially important for queries that prune partitions via
runtime filters (e.g. due to a JOIN), because it doesn't matter that
we schedule the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

  inv_date_sk=d_date_sk and
  d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and 1199 + 11'.
This means the we will only need a range of partitions from the
Inventory table, but that range will only be revealed during runtime.
Scheduling neighbouring partitions to different executors means that
the surviving partitions are spread across executors more evenly.

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M tests/query_test/test_iceberg.py
2 files changed, 62 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/3

--
To view, visit http://gerrit.cloudera.org:8080/20973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Gerrit-Change-Number: 20973
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG@25 PS2, Line 25: s Nit: chance. http://gerrit.cloudera.org:8080/#/c/20973/2//COMMIT_MSG@27 PS2, Line 27: With this patch, IcebergScanNode orders its file descriptors based on Could you elaborate why it is beneficial to assign neighbouring partitions to different executors? http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55 PS1, Line 55: private List fileDescs_; > Put comment that this is always ordered. It is only sorted if the table is partitioned, isn't it? http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@210 PS1, Line 210: List orderedFds = Lists.newArrayList(fileDescs_); : Collections.sort(orderedFds); > Now that fileDescs_ is always sorted, is this still needed? It is only sorted if the table is partitioned, isn't it? http://gerrit.cloudera.org:8080/#/c/20973/2/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/20973/2/tests/query_test/test_iceberg.py@1086 PS2, Line 1086: if files_rejected == 0: continue Optional: I find 'continue' to be a bit more difficult to follow than a conditional, especially since there is only one line after it. 
-- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Jan 2024 12:14:44 + Gerrit-HasComments: Yes
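The optional style comment on test_iceberg.py line 1086 compares two loop shapes. The sketch below shows them side by side; the function names, the list input and the single statement that follows the check are hypothetical, since the surrounding test code is not quoted in the review.

```python
def collect_with_continue(rejected_counts):
    """Guard form: skip zero counts with an early 'continue'."""
    nonzero = []
    for files_rejected in rejected_counts:
        if files_rejected == 0:
            continue
        nonzero.append(files_rejected)
    return nonzero


def collect_with_conditional(rejected_counts):
    """Conditional form: the single follow-up statement moves under the 'if'."""
    nonzero = []
    for files_rejected in rejected_counts:
        if files_rejected != 0:
            nonzero.append(files_rejected)
    return nonzero
```

Both functions return the same result for any input. When only one statement follows the check, the conditional form avoids the inverted test and is one line shorter, which appears to be the point of the comment.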
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15092/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 18:48:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55 PS1, Line 55: private List fileDescs_; Put comment that this is always ordered. http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@210 PS1, Line 210: List orderedFds = Lists.newArrayList(fileDescs_); : Collections.sort(orderedFds); Now that fileDescs_ is always sorted, is this still needed? -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 18:30:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20973 to look at the new patch set (#2). Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. IMPALA-12765: Balance consecutive partitions better for Iceberg tables During remote read scheduling Impala does the following: Non-Iceberg tables * The scheduler processes the scan ranges in partition key order * The scheduler selects N executors as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions are more likely to be assigned to different executors Iceberg tables * The scheduler processes the scan ranges in random order * The scheduler selects N executors as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions (by partition key order) are assigned randomly, i.e. there's a higher chances of clustering With this patch, IcebergScanNode orders its file descriptors based on their paths, so we will have a more balanced scheduling for consecutive partitions. Queries that operate on a range of partitions are quite common, so it makes sense to optimize this case. Testing: * e2e test Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 --- M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M tests/query_test/test_iceberg.py 2 files changed, 50 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/2 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1071 PS1, Line 1071: o flake8: E501 line too long (92 > 90 characters) http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\d' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\(' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\d' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\)' -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 29 Jan 2024 18:19:58 + Gerrit-HasComments: Yes
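The W605 warnings above are usually resolved by writing the regular expression as a raw string, so that backslashes reach the regex engine unchanged. The sketch below assumes a pattern of the shape suggested by the warnings (a number followed by a parenthesised number); the profile line and the pattern are made up for illustration and are not taken from the test.

```python
import re

# Hypothetical profile line; the real format in the test may differ.
line = "Files rejected: 12 (3)"

# Raw string literal: '\d', '\(' and '\)' are passed to 're' as written,
# so flake8 no longer reports W605 for them.
pattern = re.compile(r"Files rejected: (\d+) \((\d+)\)")

match = pattern.search(line)
files_rejected = int(match.group(1))
in_parentheses = int(match.group(2))
```

Doubling every backslash in a normal string literal would also silence the warning, but the raw string keeps the pattern readable.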
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. IMPALA-12765: Balance consecutive partitions better for Iceberg tables During remote read scheduling Impala does the following: Non-Iceberg tables * The scheduler processes the scan ranges in partition key order * The scheduler selects N replicas as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions are more likely to be assigned to different executors Iceberg tables * The scheduler processes the scan ranges in random order * The scheduler selects N replicas as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions (by partition key order) are assigned randomly, i.e. there's a higher chances of clustering With this patch, IcebergScanNode orders its file descriptors based on their paths, so we will have a more balanced scheduling for consecutive partitions. Queries that operate on a range of partitions are quite common, so it makes sense to optimize this case. Testing: * e2e test Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 --- M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M tests/query_test/test_iceberg.py 2 files changed, 50 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/1 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy