[Impala-ASF-CR] IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause
Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/20968 ) Change subject: IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20968/1/be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc File be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/20968/1/be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc@198 PS1, Line 198: *eos = true; I think if we exit the loop because of row_batch->AtCapacity(), *eos should not be set to true, because we haven't reached the limit yet and there are more rows to scan. -- To view, visit http://gerrit.cloudera.org:8080/20968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 Gerrit-Change-Number: 20968 Gerrit-PatchSet: 1 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 30 Jan 2024 07:53:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml File docs/topics/impala_identifiers.xml: http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml@54 PS2, Line 54: The maximum length of an identifier is currently 128 characters except for column names which > It's 767 characters for column names, the same is applied in hive. Ack. Double checked in Hive codes: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/resources/package.jdo#L24-L25 It's added in HIVE-1364. http://gerrit.cloudera.org:8080/#/c/20950/4/docs/topics/impala_identifiers.xml File docs/topics/impala_identifiers.xml: http://gerrit.cloudera.org:8080/#/c/20950/4/docs/topics/impala_identifiers.xml@62 PS4, Line 62: can contain unicode characters nit: This is talking about the first charactor. Let's use "can start with any unicode character". BTW, is it truth that column names can start with whitespaces? -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 07:47:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12744: Support for regr count() aggregate/analytic function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20936 ) Change subject: IMPALA-12744: Support for regr_count() aggregate/analytic function .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15098/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I773d1e0edc8a9c8ee003f75721f4844685b2eb38 Gerrit-Change-Number: 20936 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 30 Jan 2024 07:24:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12744: Support for regr count() aggregate/analytic function
pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/20936 ) Change subject: IMPALA-12744: Support for regr_count() aggregate/analytic function .. Patch Set 2: (10 comments) > Uploaded patch set 2. http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@7 PS1, Line 7: aggregate > If it can be used as both an aggregate and an analytical function, having " Done http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@9 PS1, Line 9: regr_count > Nit: "The regr_count() functin..." Done http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@9 PS1, Line 9: as > Nit: "as an" Done http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@9 PS1, Line 9: and > Nit: "and an" Done http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@10 PS1, Line 10: function > Nit: function (singular). Done http://gerrit.cloudera.org:8080/#/c/20936/1//COMMIT_MSG@10 PS1, Line 10: s c > Nit: is. Done http://gerrit.cloudera.org:8080/#/c/20936/1/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/20936/1/be/src/exprs/aggregate-functions-ir.cc@315 PS1, Line 315: CHECK(!dst->is_null); : if (!src1.is_null && !src2.is_null) { : --dst->val; > Is this different here than in the non-timestamp version, i.e. on L300? Removed http://gerrit.cloudera.org:8080/#/c/20936/1/be/src/exprs/aggregate-functions.h File be/src/exprs/aggregate-functions.h: http://gerrit.cloudera.org:8080/#/c/20936/1/be/src/exprs/aggregate-functions.h@74 PS1, Line 74: > We don't use 'ctx' and in the above two functions this variable was unnamed Done http://gerrit.cloudera.org:8080/#/c/20936/1/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: http://gerrit.cloudera.org:8080/#/c/20936/1/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@2713 PS1, Line 2713: 1 in > This doesn't return null, it returns 1. Ack http://gerrit.cloudera.org:8080/#/c/20936/1/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@2802 PS1, Line 2802: rand() > Is there a specific reason for using rand() and not a simple constant? No, not really, it was brought up in corr() patch I think and since then it's kept in the same way to maintain uniformity. -- To view, visit http://gerrit.cloudera.org:8080/20936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I773d1e0edc8a9c8ee003f75721f4844685b2eb38 Gerrit-Change-Number: 20936 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 30 Jan 2024 07:01:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12744: Support for regr count() aggregate/analytic function
pranav.lo...@cloudera.com has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/20936 ) Change subject: IMPALA-12744: Support for regr_count() aggregate/analytic function .. IMPALA-12744: Support for regr_count() aggregate/analytic function regr_count() function can be used both as aggregate and analytic function and is commonly used in regression analysis. regr_count(y, x) returns an integer that is the number of non-null number pairs. It indicates how many observations are included in the analysis. Testing: The functions are extensively tested and cross-checked with Hive. The tests can be found in aggregation.test. Change-Id: I773d1e0edc8a9c8ee003f75721f4844685b2eb38 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/aggregation.test 4 files changed, 303 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/20936/2 -- To view, visit http://gerrit.cloudera.org:8080/20936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I773d1e0edc8a9c8ee003f75721f4844685b2eb38 Gerrit-Change-Number: 20936 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/747/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 06:45:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 3: Verified-1 Build Failed https://jenkins.impala.io/job/gerrit-docs-auto-test/746/ : Doc tests failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 06:39:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 4: (4 comments) > Uploaded patch set 4. http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml File docs/topics/impala_identifiers.xml: http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml@54 PS2, Line 54: The maximum length of an identifier is currently 128 characters except for column names which > Could you double check this? What's the longest length we can used when usi It's 767 characters for column names, the same is applied in hive. http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml@60 PS2, Line 60: > Add "except for column names which can start with unicode characters" Done http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml@67 PS2, Line 67: > Add "except for column names which can contain unicode characters" Done http://gerrit.cloudera.org:8080/#/c/20950/2/docs/topics/impala_identifiers.xml@124 PS2, Line 124: > I think we'd better merge this into the previous items. Done -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 06:39:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/747/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 06:39:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
pranav.lo...@cloudera.com has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. IMPALA-12756: [DOCS] Unicode column name support documentation The patch focuses on documenting that Impala supports unicode column names, consistent with Hive's current support (as we use Hive MetaStore to store table metadata). Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe --- M docs/topics/impala_identifiers.xml 1 file changed, 6 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/20950/4 -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
pranav.lo...@cloudera.com has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. IMPALA-12756: [DOCS] Unicode column name support documentation The patch focuses on documenting that Impala supports unicode column names, consistent with Hive's current support (as we use Hive MetaStore to store table metadata). Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe --- M docs/topics/impala_identifiers.xml 1 file changed, 6 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/20950/3 -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-12756: [DOCS] Unicode column name support documentation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20950 ) Change subject: IMPALA-12756: [DOCS] Unicode column name support documentation .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/746/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/20950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe Gerrit-Change-Number: 20950 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 06:34:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20972 ) Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 This patch upgrade Guava version from 31.1-jre to 32.0.1-jre to address CVE-2023-2976. Testing: - Run and pass full build ./buildall.sh -skiptests -notests Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Reviewed-on: http://gerrit.cloudera.org:8080/20972 Tested-by: Impala Public Jenkins Reviewed-by: Joe McDonnell --- M bin/impala-config.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Joe McDonnell: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20972 ) Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Jan 2024 05:31:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20916 ) Change subject: IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibac5335c65a860963ef0ccd890a926af80585ef3 Gerrit-Change-Number: 20916 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 04:43:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20972 ) Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Jan 2024 03:47:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12762: Fix cmake error in package building
Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/20965 ) Change subject: IMPALA-12762: Fix cmake error in package building .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20965 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57 Gerrit-Change-Number: 20965 Gerrit-PatchSet: 1 Gerrit-Owner: Yifan Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 30 Jan 2024 01:49:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15097/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 14 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Jan 2024 01:05:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12503: Support date data type for predicates for external data source table
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/20915 ) Change subject: IMPALA-12503: Support date data type for predicates for external data source table .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/20915/10/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20915/10/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@219 PS10, Line 219: dateCol+ 1 Need to set the time zone for the formatter as 'UDC' to avoid +1: formatter.setTimeZone(TimeZone.getTimeZone("UDC")); -- To view, visit http://gerrit.cloudera.org:8080/20915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 Gerrit-Change-Number: 20915 Gerrit-PatchSet: 10 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Tue, 30 Jan 2024 00:51:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/20770/14/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/14/tests/custom_cluster/test_query_log.py@31 PS14, Line 31: from tests.util.assert_time import assert_time_str flake8: F401 'tests.util.assert_time.assert_time_str' imported but unused -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 14 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Jan 2024 00:39:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 14: (18 comments) http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/impala-server.h@1261 PS13, Line 1261: /// The QueryDriverMap maps query ids to QueryDrivers. The QueryDrivers are owned by the : /// ImpalaServer and QueryDriverMap references them using shared_ptr to allow : /// asynchronous deletion. : /// > Mention in comment if they are synchronize using completed_queries_lock_. Done http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/impala-server.h@1653 PS13, Line 1653: }; > nit: unnecessary added whitespace Done http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h File be/src/service/query-state-record.h: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@183 PS13, Line 183: class EventsTimelineIterator { > Missing code comments outlining the purpose of this class, and what each fu Holding on this as we are moving towards defining individual fields for specific events in the timeline and only storing those specific events. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@192 PS13, Line 192: EventsTimelineIterator(const std::vector* labels, > These would make more sense as const&. Passing nullptr is invalid, and we'r Holding on this as we are moving towards defining individual fields for specific events in the timeline and only storing those specific events. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@206 PS13, Line 206: iter_t operator*() const; > This class is serving two purposes that might be better served as two separ Holding on this as we are moving towards defining individual fields for specific events in the timeline and only storing those specific events. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@212 PS13, Line 212: EventsTimelineIterator cbegin(); > If you called these begin/end, I think you could use this iterator as part Putting this on hold for now as we are moving towards a defined list of events store in individual fields instead of storing the entire events timeline. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@287 PS13, Line 287: }; // struct QueryStateExpanded > nit: it looks strange to prefix things with C just to say they're const. Holding on this as we are moving towards defining individual fields for specific events in the timeline and only storing those specific events. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.cc File be/src/service/query-state-record.cc: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.cc@182 PS13, Line 182: ->GetInfoString(AdmissionController::PROFILE_INFO_KEY_EXECUTOR_GROUP); > This and compute_scan_range_assignment are only initialized if summary_prof Both these fields are being dropped from this table. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.cc@252 PS13, Line 252: > nit: you don't need to use cbegin/cend on const objects. See https://en.cpp Holding on this as we are moving towards defining individual fields for specific events in the timeline and only storing those specific events. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/workload-management.cc@461 PS13, Line 461: > Can we remove these when we store it instead? Done http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/workload-management.cc@615 PS13, Line 615: || !FLAGS_enable_workload_mgmt){ : return; : } : > This might not catch query such as: Good catch. This code does miss use/show sql statements that start with comments. I can use the existing catalog_op_type to ignore the use/show sqls. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/util/CMakeLists.txt File be/src/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/util/CMakeLists.txt@272 PS13, Line 272: ADD_UNIFIED_BE_LSAN_TEST(string-util-test "TruncateDownTest.*:TruncateUpTest.*:CommaSeparatedContainsTest.*:FindUtf8PosForwardTest.*:FindUtf8PosBackwardTest.*:RandomFindUtf8PosTest.*:ToSnakeCaseTest.*:StringStreamPopTest.*") > typo: StringStreamPopTest Done http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/util/string-util-test.cc File be/src/util/string-util-test.cc: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/util/string-util-test.cc@319 PS13, Line 3
[Impala-ASF-CR] IMPALA-12426: Query History Table
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20770 to look at the new patch set (#14). Change subject: IMPALA-12426: Query History Table .. IMPALA-12426: Query History Table Adds the ability for users to specify that Impala will create and maintain an internal Iceberg table that contains data about all completed queries. This table is automatically created at startup by each coordinator if it does not exist. Then, most completed queries are queued in memory and flushed to the query history table at a set interval (either minutes or number of records). Set, use, and show queries are not written to this table. This commit leverages the InternalServer class to maintain the query history table. Ctest unit tests have been added to assert the various pieces of code. New custom cluster tests have been added to assert the query history table is properly populated with completed queries. Negative testing consists of attempting sql injection attacks and syntactically incorrect queries. Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 --- M be/src/benchmarks/expr-benchmark.cc M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/CMakeLists.txt M be/src/service/client-request-state.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/internal-server-test.cc M be/src/service/internal-server.cc M be/src/service/internal-server.h A be/src/service/query-state-record.cc A be/src/service/query-state-record.h A be/src/service/workload-management.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h A be/src/util/network-util-test.cc M be/src/util/network-util.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/string-util-test.cc M be/src/util/string-util.cc M be/src/util/string-util.h A be/src/util/ticker.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M bin/run_clang_tidy.sh M common/function-registry/impala_functions.py M common/thrift/BackendGflags.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java M tests/beeswax/impala_beeswax.py M tests/common/custom_cluster_test_suite.py A tests/custom_cluster/test_query_log.py A tests/util/assert_time.py A tests/util/memory.py A tests/util/retry.py 47 files changed, 3,309 insertions(+), 336 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/14 -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 14 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20916 ) Change subject: IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15096/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibac5335c65a860963ef0ccd890a926af80585ef3 Gerrit-Change-Number: 20916 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 00:33:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 35: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10210/ -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 35 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 30 Jan 2024 00:30:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/20916 ) Change subject: IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE .. Patch Set 2: (6 comments) Hi all, I have addressed Csaba's and Quanlong's comments on patch set 1. Please let me know if there is any additional suggestion. Thank you very much for your help! http://gerrit.cloudera.org:8080/#/c/20916/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20916/1//COMMIT_MSG@33 PS1, Line 33: the principal type could be : a group or a role. > Will this be supported in future tasks? I.e. a user belongs to the owner gr Thanks Quanlong! Taking a look at the definition of the class GrantRevokeRequest at https://github.com/apache/ranger/blob/master/agents-common/src/main/java/org/apache/ranger/plugin/util/GrantRevokeRequest.java, I found that there is no field that denotes the type of the owner. The only field related to the owner of a resource is 'ownerUser' and thus it seems Ranger implicitly assumes that the provided owner corresponds to a user. With the above being said, it looks like Impala would not be able to support allowing a user belonging to an owner group to execute GRANT and REVOKE unless Ranger adds an additional field in GrantRevokeRequest to distinguish an owner user from an owner group. For easy reference, we provide the class definition in the following. public class GrantRevokeRequest implements Serializable { private static final long serialVersionUID = 1L; private String grantor; private Set grantorGroups; private Map resource; private Set users; private Set groups; private Set roles; private Set accessTypes; private ListforwardedAddresses; private String remoteIPAddress; private Boolean delegateAdmin = Boolean.FALSE; private Boolean enableAudit= Boolean.TRUE; private Boolean replaceExistingPermissions = Boolean.FALSE; private Boolean isRecursive= Boolean.FALSE; private String clientIPAddress; private String clientType; private String requestData; private String sessionId; private String clusterName; private String zoneName; private String ownerUser; ... } http://gerrit.cloudera.org:8080/#/c/20916/1/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java File fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java: http://gerrit.cloudera.org:8080/#/c/20916/1/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java@236 PS1, Line 236: analyzeTargetDatabase(analyzer); : break; : case URI: : Preconditions.checkNotNull(uri_); : if (privilegeLevel_ != TPrivilegeLevel.ALL) { : throw new AnalysisException("Only 'ALL' privilege may be applied at " + : "URI scope in privilege spec."); : } > nit: We can move codes of the precondition check and the try-catch clause i Done http://gerrit.cloudera.org:8080/#/c/20916/1/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/20916/1/tests/authorization/test_ranger.py@1191 PS1, Line 1191: revoke_table_stmt, set_table_owner_group_stmt, resource_owner_group, > +1. This test has ~300 lines. It'd be better to refactor it. Thanks Csaba and Quanlong! Yeah. Indeed. I will move the code for each case to its respective sub-function in the next patch. http://gerrit.cloudera.org:8080/#/c/20916/1/tests/authorization/test_ranger.py@1198 PS1, Line 1198: admin_client, revoke_column_stmt, set_table_owner_group_stmt, > nit: do we really need this refresh? I thought we only need it when we add/ Thanks Quanlong! I am actually not sure whether or not after a GRANT statement we really need a REFRESH AUTHORIZATION before a subsequent SHOW GRANT statement. I added the REFRESH statements because I found that there are several places (but not every place) in this test file that have the same pattern (GRANT/REVOKE, REFRESH, SHOW GRANT). For instance, in _test_grant_revoke(), we have the following. self.execute_query_expect_success(admin_client, "grant select on database {0} to {1} {2}" .format(unique_database, kw, ident), user=ADMIN) self._refresh_authorization(admin_client, refresh_stmt)
[Impala-ASF-CR] IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE
Hello Quanlong Huang, Aman Sinha, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20916 to look at the new patch set (#2). Change subject: IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE .. IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE After RANGER-1200, Ranger allows the owner user of a resource to grant/revoke a privilege to/from a grantee/revokee, which requires the client of the Ranger server to provide the ownership information in the requests for granting and revoking accesses. Before this patch, Impala did not provide its Ranger plug-in with the owner user of resource in the GRANT and REVOKE statements and thus the owner user of a resource was not able to grant/revoke a privilege to/from other principals. This patch passes to the Ranger server the owner user of resource in the GRANT and REVOKE statements when the resource is a database, a table, or a column. This way the owner user does not have to be explicitly granted additional privileges on the resource to execute the GRANT and REVOKE statements for the aforementioned resource types. For user-defined functions, we will deal with this resource type in IMPALA-12685 in that it depends on IMPALA-11743 where we will have to make Impala load from Hive MetaStore the owner user of a user-defined function. The patch also fixes a potential bug in getOwnerUser() of Db, LocalDb, Table, and LocalTable. Specifically, before this patch, when determining the owner user of a database or a table, Impala returned the owner name without verifying the corresponding principal type is indeed a user. This was problematic because the principal type could be a group or a role. In addition, we note that Ranger assumes implicitly that the provided owner is a user. This could be seen from the definition of GrantRevokeRequest. Before Ranger adds an additional field in GrantRevokeRequest to distinguish an owner user from an owner group, Impala will not be able to support allowing a user in an owner group to grant or revoke privileges on the resources owned by the owner group. Testing: - Added an end-to-end test to verify that the owner user of a resource is able to execute the GRANT/REVOKE statements without being granted additional privileges if the resource is a database, a table, or a column. Change-Id: Ibac5335c65a860963ef0ccd890a926af80585ef3 --- M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M tests/authorization/test_ranger.py 10 files changed, 371 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/20916/2 -- To view, visit http://gerrit.cloudera.org:8080/20916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibac5335c65a860963ef0ccd890a926af80585ef3 Gerrit-Change-Number: 20916 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20916 ) Change subject: IMPALA-12578: Pass owner user of database and table to Ranger in GRANT/REVOKE .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10212/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibac5335c65a860963ef0ccd890a926af80585ef3 Gerrit-Change-Number: 20916 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 30 Jan 2024 00:09:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. IMPALA-12463: Batch non-consecutive table events in the event processor The current batching of events requires events to be consecutive. When there are multiple tables being modified, events can be interleaved such that each batch is very small. If the batching criteria can be relaxed, the non-consecutive events could be batched and processed more efficiently. This implements batching for non-consecutive events by keeping state on each table individually. Different tables can continue to accumulate batchable events independently unless they hit a condition that cuts the batch. The batching can ignore some events on unrelated tables, but the same rules apply about the batching of events on an individual table. For example, for a particular table, any non-INSERT event between two INSERT events on that table continues to cut the batching. In addition, there are certain cross-table events that need to cut batches across multiple tables: 1. Drop database / alter database cuts any batches on tables in the affected database. 2. Alter table rename cuts any batches on the source or destination table. This emits events in monotonically increasing order by maintaining the resulting events in a sorted map. All non-batchable events will be emitted in the original order. Batchable events are emitted based on the ending Event ID of the batch. This means that batchable events can move later in the sequence, but they cannot move earlier. This is based on the original design by Wenzhe Zhou. Testing: - MetastoreEventsProcessorTest has new tests for interleaved events on two tables as well as tests for events that cut batches across tables (alter table, drop database, alter database). - A core job showed no other test failures. Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Reviewed-on: http://gerrit.cloudera.org:8080/20533 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 2 files changed, 379 insertions(+), 66 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 12 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages When using bin/dump_breakpad_symbols.py to dump symbols for RPM/DEB packages, the script extracts the packages to a temporary directory and relies on keeping that directory around until the processing is finished. The parallel processing added in IMPALA-11511 breaks the logic that keeps the temporary directory around, so the script generates errors like: Found debugging info in /tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug Failed to open ELF file '/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug': No such file or directory Failed to write symbol file. This turns off parallelism for bin/dump_breakpad_symbols.py when processing RPM/DEB packages (i.e. -r/--pkg). This also avoids using a ThreadPool when num_processes <= 1. Testing: - Hand tested with Redhat 7 RPMs Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Reviewed-on: http://gerrit.cloudera.org:8080/20943 Reviewed-by: Impala Public Jenkins Tested-by: Joe McDonnell --- M bin/dump_breakpad_symbols.py 1 file changed, 21 insertions(+), 11 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved Joe McDonnell: Verified -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 3: Verified+1 The test failure is unrelated -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 23:33:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Joe McDonnell has removed a vote on this change. Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10208/ -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 23:28:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20972 ) Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10211/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 23:17:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 29 Jan 2024 23:07:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 13: (18 comments) http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1256 PS12, Line 1256: std::unique_ptr admission_heartbeat_thr > Yes. This boolean is used to guard against spurious wakeups of the complet Ack http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1281 PS12, Line 1281: : /// Default query options in the form of TQueryOptions and b > Done Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1797 PS12, Line 1797: : : > Done Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc File be/src/service/internal-server.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc@117 PS12, Line 117: internal_query_id.hi != 0 && internal_query_id.lo != 0 > Done Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc@121 PS12, Line 121: session_id > The CloseSession function does all the necessary checks on its own. Ack http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/internal-server.cc File be/src/service/internal-server.cc: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/internal-server.cc@117 PS13, Line 117: internal_query_id.hi != 0 && internal_query_id.lo != 0 Use the new function in uid-util.h http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.cc File be/src/service/query-state-record.cc: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.cc@216 PS13, Line 216: if (labels_iter_->find("Ready to start on ", 0) == 0) { : it.first = READY_TO_START; : } else if(labels_iter_->find("All ", 0) == 0) { : it.first = ALL_EXEC_BACKENDS; Can you make the corresponding event string in coordinator.cc into constant and then use that constant for find argument here? http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/query-state.cc File be/src/service/query-state.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/query-state.cc@95 PS12, Line 95: > I would prefer to leave this code as-is since it was directly moved from ht Ack http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@50 PS12, Line 50: : DEFINE_bool(enable_workload_mgmt, false, : "Specifies if Impala will automatically write completed queries in the query log " : "table. If this value is set to true and then later removed, the query log table " : "will remain intact and accessible."); : > I've thought about this some, and I think we should go with a bool flag for Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@63 PS12, Line 63: return false; : }); : > Done Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@186 PS12, Line 186: /// table. : static std::list> fields_; : > I added comments in impala-server.h for the functions implemented in worklo Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@494 PS12, Line 494: fields_.push_back(MakeTuple("network_address", _clientNetworkAddress)); : fields_.push_back(MakeTuple("start_time_utc", _queryStartTime, "TIMESTAMP")); : fields_.push_back(MakeTuple("total_time_us", _queryDuration, "BIGINT")); : fields_.push_back(MakeTuple("s > No, field order is not important except to the custom cluster python test ( Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@598 PS12, Line 598: : void ImpalaServer::ShutdownWorkloadManagement() { : unique_lock l(completed_queries_threadstate_mu_); : if (completed_queries_thread_state_ == RUNNING) { : completed_queries_thread_state_ = SHUTDOWN_REQUESTED; : completed_queries_cv_.notify_all(); : completed_queries_shutdown_cv_.wait_for(l, : std::chrono::seconds(FLAGS_query_log_shutdown_deadline_s), : []{ return completed_queries_thread_state_ == SHUTDOWN; }); : } > Modifications to the completed queries table will be handled in the future. Ack http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/util/network-util.h File be/
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 21:48:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 13: (31 comments) http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/CMakeLists.txt File be/src/service/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/CMakeLists.txt@46 PS12, Line 46: query-state-re > I recommend to rename this to query-state-record.cc to not confuse with be/ Good idea. Renamed. http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1256 PS12, Line 1256: std::unique_ptr admission_heartbeat_thr > Is this protected by completed_queries_lock_ as well? Yes. This boolean is used to guard against spurious wakeups of the completed_queries_thread_. I added a separate comment on the completed_queries_ticker_ and completed_queries_ready_ variables. http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1281 PS12, Line 1281: : /// Default query options in the form of TQueryOptions and b > Mention constant/flag that represent maximum number of attempts in this doc Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1797 PS12, Line 1797: : : > Is this also immutable like QueryStateRecord? Please clarify in comment. Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1861 PS12, Line 1861: > You'd need to std::move the value into place in the constructor. Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.h@1861 PS12, Line 1861: > Making this move-only is unnecessarily specific. It could take by value and Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.cc@3192 PS12, Line 3192: > nit: redundant "this->" here and below? Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/impala-server.cc@3433 PS12, Line 3433: return conn_id; : } : } // namespace impala > nit: unnecessary whitespace. Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.h File be/src/service/internal-server.h: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.h@96 PS12, Line 96: /// `persist_in_db` Optional boolean indicating if the query data should be : /// written to the completed queries table afte > nit: Clarify in the doc that `persist_in_db` is optional and default to Tru Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc File be/src/service/internal-server.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc@117 PS12, Line 117: internal_query_id.hi != 0 && internal_query_id.lo != 0 > Move this check to uid-util.h Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/internal-server.cc@121 PS12, Line 121: session_id > Should this went through the same check as internal_query_id before closing The CloseSession function does all the necessary checks on its own. The reason the query id had to be checked is because query execution could fail early enough that a query id was never assigned. http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h File be/src/service/query-state-record.h: http://gerrit.cloudera.org:8080/#/c/20770/13/be/src/service/query-state-record.h@161 PS13, Line 161: private: > nit: private should be indented differently than the other lines. Previousl Done http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/query-state.cc File be/src/service/query-state.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/query-state.cc@95 PS12, Line 95: > nit: other place use this from debug-util.h: I would prefer to leave this code as-is since it was directly moved from https://github.com/jasonmfehr/impala/blob/cdac777c51febc99500b8426c2b3aabc7e9addd7/be/src/service/impala-server.cc#L2572 http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/12/be/src/service/workload-management.cc@50 PS12, Line 50: : DEFINE_bool(enable_workload_mgmt, false, : "Specifies if Impala will automatically write completed queries in the query log " : "table. If this value is set to true and then later removed, the query log table " : "will remain intact and accessible."); : > Since it only support iceberg table now, shall this turn to bool flag? I've thought about this some, and
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 34: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10209/ -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 34 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 20:39:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 35: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15095/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 35 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 20:24:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#35). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate/insert table operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M be/src/catalog/catalog-server.cc M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/__init__.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py M tests/metadata/test_recover_partitions.py M tests/metadata/test_reset_metadata.py M tests/util/event_processor_utils.py 26 files changed, 1,619 insertions(+), 755 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/35 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 35 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong H
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 35: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10210/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 35 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 19:59:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 34: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/15094/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 34 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 19:22:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 34: (5 comments) http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1401 PS33, Line 1401: trace > Let's use trace() since this will be logged for each new loaded partition. Ack http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3357 PS33, Line 3357: : } : if (!isTableBeingReplicated(metaStoreClient.getHiveClient(), hdfsTable) && : !syncToLatestEventId) { : // when table is replicated we let the HMS API handle the file deletion logic : // otherwise we delete the files. : Collection parts = FeCatalogUtils : .loadAllPartitions(hdfsTable); : for (FeFsPartition part : parts) { : FileSystemUtil.deleteAllVisibleFiles(new Path(part.getLocation())); : } > It'd be better to keep these unchanged since previously they are outside th Ack http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@958 PS33, Line 958: result = cls.__execute_query(impalad_client, query, query_options, user) > Let's check success first, i.e. like what we does in execute_query_expect_s Ack http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@1046 PS33, Line 1046: > nit: use 'cls' for classmethod Ack http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@1052 PS33, Line 1052: > nit: use 'cls' for classmethod Ack -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 34 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 19:02:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#34). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate/insert table operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M be/src/catalog/catalog-server.cc M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/__init__.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py M tests/metadata/test_recover_partitions.py M tests/metadata/test_reset_metadata.py M tests/util/event_processor_utils.py 26 files changed, 1,620 insertions(+), 755 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/34 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 34 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong H
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 34: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10209/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 34 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 19:02:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
gaurav singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 18:53:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. Patch Set 11: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 29 Jan 2024 18:53:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15093/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 29 Jan 2024 18:56:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 18:53:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10208/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 18:53:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15092/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 18:48:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20972 ) Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15091/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 18:42:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12503: Support date data type for predicates for external data source table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20915 ) Change subject: IMPALA-12503: Support date data type for predicates for external data source table .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15090/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 Gerrit-Change-Number: 20915 Gerrit-PatchSet: 10 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 18:39:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Hello Quanlong Huang, Daniel Becker, Zoltan Borok-Nagy, Sai Hemanth Gantasala, Csaba Ringhofer, Wenzhe Zhou, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20533 to look at the new patch set (#11). Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. IMPALA-12463: Batch non-consecutive table events in the event processor The current batching of events requires events to be consecutive. When there are multiple tables being modified, events can be interleaved such that each batch is very small. If the batching criteria can be relaxed, the non-consecutive events could be batched and processed more efficiently. This implements batching for non-consecutive events by keeping state on each table individually. Different tables can continue to accumulate batchable events independently unless they hit a condition that cuts the batch. The batching can ignore some events on unrelated tables, but the same rules apply about the batching of events on an individual table. For example, for a particular table, any non-INSERT event between two INSERT events on that table continues to cut the batching. In addition, there are certain cross-table events that need to cut batches across multiple tables: 1. Drop database / alter database cuts any batches on tables in the affected database. 2. Alter table rename cuts any batches on the source or destination table. This emits events in monotonically increasing order by maintaining the resulting events in a sorted map. All non-batchable events will be emitted in the original order. Batchable events are emitted based on the ending Event ID of the batch. This means that batchable events can move later in the sequence, but they cannot move earlier. This is based on the original design by Wenzhe Zhou. Testing: - MetastoreEventsProcessorTest has new tests for interleaved events on two tables as well as tests for events that cut batches across tables (alter table, drop database, alter database). - A core job showed no other test failures. Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 2 files changed, 379 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/20533/11 -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/20533/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/20533/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@31 PS10, Line 31: import java.util.Iterator; > nit: unused import Done http://gerrit.cloudera.org:8080/#/c/20533/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@440 PS10, Line 440: Map dbMap = pendingTableEventsMap.get(dbName); : if (dbMap == null) { : dbMap = new HashMap<>(); : pendingTableEventsMap.put(dbName, dbMap); : } > nit: Can be replaced with single 'Map.computeIfAbsent' method call Done -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 10 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 29 Jan 2024 18:34:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12463: Batch non-consecutive table events in the event processor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 ) Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10207/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 11 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 29 Jan 2024 18:34:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@55 PS1, Line 55: private List fileDescs_; Put comment that this is always ordered. http://gerrit.cloudera.org:8080/#/c/20973/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@210 PS1, Line 210: List orderedFds = Lists.newArrayList(fileDescs_); : Collections.sort(orderedFds); Now that fileDescs_ is always sorted, is this still needed? -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 18:30:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15089/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 18:28:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10204/ -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 29 Jan 2024 18:26:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20973 to look at the new patch set (#2). Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. IMPALA-12765: Balance consecutive partitions better for Iceberg tables During remote read scheduling Impala does the following: Non-Iceberg tables * The scheduler processes the scan ranges in partition key order * The scheduler selects N executors as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions are more likely to be assigned to different executors Iceberg tables * The scheduler processes the scan ranges in random order * The scheduler selects N executors as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions (by partition key order) are assigned randomly, i.e. there's a higher chances of clustering With this patch, IcebergScanNode orders its file descriptors based on their paths, so we will have a more balanced scheduling for consecutive partitions. Queries that operate on a range of partitions are quite common, so it makes sense to optimize this case. Testing: * e2e test Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 --- M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M tests/query_test/test_iceberg.py 2 files changed, 50 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/2 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20973 ) Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1071 PS1, Line 1071: o flake8: E501 line too long (92 > 90 characters) http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\d' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\(' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\d' http://gerrit.cloudera.org:8080/#/c/20973/1/tests/query_test/test_iceberg.py@1081 PS1, Line 1081: \ flake8: W605 invalid escape sequence '\)' -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 29 Jan 2024 18:19:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12765: Balance consecutive partitions better for Iceberg tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20973 Change subject: IMPALA-12765: Balance consecutive partitions better for Iceberg tables .. IMPALA-12765: Balance consecutive partitions better for Iceberg tables During remote read scheduling Impala does the following: Non-Iceberg tables * The scheduler processes the scan ranges in partition key order * The scheduler selects N replicas as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions are more likely to be assigned to different executors Iceberg tables * The scheduler processes the scan ranges in random order * The scheduler selects N replicas as candidates * The scheduler chooses the executor from the candidates based on minimum number of assigned bytes * So consecutive partitions (by partition key order) are assigned randomly, i.e. there's a higher chances of clustering With this patch, IcebergScanNode orders its file descriptors based on their paths, so we will have a more balanced scheduling for consecutive partitions. Queries that operate on a range of partitions are quite common, so it makes sense to optimize this case. Testing: * e2e test Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 --- M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M tests/query_test/test_iceberg.py 2 files changed, 50 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/20973/1 -- To view, visit http://gerrit.cloudera.org:8080/20973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578 Gerrit-Change-Number: 20973 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20972 Change subject: IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 .. IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976 This patch upgrade Guava version from 31.1-jre to 32.0.1-jre to address CVE-2023-2976. Testing: - Run and pass full build ./buildall.sh -skiptests -notests Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 --- M bin/impala-config.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/20972/1 -- To view, visit http://gerrit.cloudera.org:8080/20972 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8 Gerrit-Change-Number: 20972 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-12503: Support date data type for predicates for external data source table
gaurav singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/20915 ) Change subject: IMPALA-12503: Support date data type for predicates for external data source table .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java@36 PS9, Line 36: dateCol) > nit: rename to dateCol Done http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@217 PS9, Line 217: dateCol > rename to dateCol Done http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@219 PS9, Line 219: dateToString = > Naming convention, rename variable dateToString Done -- To view, visit http://gerrit.cloudera.org:8080/20915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 Gerrit-Change-Number: 20915 Gerrit-PatchSet: 10 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 18:12:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12503: Support date data type for predicates for external data source table
Hello Abhishek Rawat, Wenzhe Zhou, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20915 to look at the new patch set (#10). Change subject: IMPALA-12503: Support date data type for predicates for external data source table .. IMPALA-12503: Support date data type for predicates for external data source table This patch adds support for datatype date as predicates for external data sources. Testing: - Manually tested binary date predicates with operators: '=', '>', '<', 'BETWEEN'. - Modified the existing schema and add test cases for date predicate for postgres, mysql and impala databases. alltypes_impala and AllTypesCaseSensitiveNames_impala Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 --- M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java M testdata/bin/create-ext-data-source-table.sql M testdata/bin/load-ext-data-sources.sh M testdata/bin/setup-mysql-env.sh M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test M testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test 11 files changed, 218 insertions(+), 81 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/20915/10 -- To view, visit http://gerrit.cloudera.org:8080/20915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 Gerrit-Change-Number: 20915 Gerrit-PatchSet: 10 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 18:15:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20971/1/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20971/1/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@285 PS1, Line 285: FileSystemUtil.copyFileFromUriToLocal(driverUrl); > Remove this copy ? No, this is jar file of jdbc driver, not the jar file of jdbc data source library. -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 18:14:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
gaurav singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20971/1/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20971/1/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@285 PS1, Line 285: FileSystemUtil.copyFileFromUriToLocal(driverUrl); Remove this copy ? -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 18:09:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20943 ) Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20943/1/bin/dump_breakpad_symbols.py File bin/dump_breakpad_symbols.py: http://gerrit.cloudera.org:8080/#/c/20943/1/bin/dump_breakpad_symbols.py@369 PS1, Line 369: status = 1 > Do we need a break here when something happens? Good point, let's have the behavior match the parallel case. Before the parallelism, this was the old behavior. I filed IMPALA-12766 to add a flag to get the old behavior that keeps going. -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 18:01:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
Hello Yida Wu, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20943 to look at the new patch set (#2). Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages .. IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages When using bin/dump_breakpad_symbols.py to dump symbols for RPM/DEB packages, the script extracts the packages to a temporary directory and relies on keeping that directory around until the processing is finished. The parallel processing added in IMPALA-11511 breaks the logic that keeps the temporary directory around, so the script generates errors like: Found debugging info in /tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug Failed to open ELF file '/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug': No such file or directory Failed to write symbol file. This turns off parallelism for bin/dump_breakpad_symbols.py when processing RPM/DEB packages (i.e. -r/--pkg). This also avoids using a ThreadPool when num_processes <= 1. Testing: - Hand tested with Redhat 7 RPMs Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 --- M bin/dump_breakpad_symbols.py 1 file changed, 21 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/20943/2 -- To view, visit http://gerrit.cloudera.org:8080/20943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3 Gerrit-Change-Number: 20943 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols
Yida Wu has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20944 ) Change subject: IMPALA-12125: Support for dumping symbols from RPMs without separate symbols .. IMPALA-12125: Support for dumping symbols from RPMs without separate symbols Some RPMs contain binaries with debug symbols with no separate debuginfo package needed. bin/dump_breakpad_symbols.py does not allow this combination, as it expects a corresponding symbol package. This adds a --no_symbol_pkg option to dump_breakpad_symbols.py to turn off the requirement that --pkg be combined with --symbol_pkg. Testing: - Tested with an RPM package with an unstripped impalad binary - Tested with the usual RPM + debuginfo RPM combination Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b Reviewed-on: http://gerrit.cloudera.org:8080/20944 Reviewed-by: Impala Public Jenkins Tested-by: Yida Wu --- M bin/dump_breakpad_symbols.py 1 file changed, 22 insertions(+), 13 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved Yida Wu: Verified -- To view, visit http://gerrit.cloudera.org:8080/20944 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b Gerrit-Change-Number: 20944 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20944 ) Change subject: IMPALA-12125: Support for dumping symbols from RPMs without separate symbols .. Patch Set 2: Verified+1 Irrelevant failure at TestEventProcessingCustomConfigs.test_skipping_older_events -- To view, visit http://gerrit.cloudera.org:8080/20944 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b Gerrit-Change-Number: 20944 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 29 Jan 2024 17:32:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols
Yida Wu has removed a vote on this change. Change subject: IMPALA-12125: Support for dumping symbols from RPMs without separate symbols .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/20944 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b Gerrit-Change-Number: 20944 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10205/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 29 Jan 2024 17:10:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20971 ) Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15088/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 29 Jan 2024 17:06:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20922 ) Change subject: IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15087/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20922 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaffddd70c2da8376ca6c40f65606bbac46c34de7 Gerrit-Change-Number: 20922 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 16:58:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12378: Add jar file of JdbcDataSource library to classpath
Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20971 Change subject: IMPALA-12378: Add jar file of JdbcDataSource library to classpath .. IMPALA-12378: Add jar file of JdbcDataSource library to classpath This patch adds the jar file of JdbcDataSource library to classpath. With the change, We can get the instance of JdbcDataSourceData class in current ClassLoader and don't need to download the jar file from the given HDFS location to local file system. Testing: - Passed core test - Passed dockerised-tests Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f --- M CMakeLists.txt M be/src/exec/external-data-source-executor.cc M bin/set-classpath.sh M docker/setup_build_context.py M fe/src/main/java/org/apache/impala/extdatasource/ExternalDataSourceExecutor.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java M testdata/bin/copy-ext-data-sources.sh 8 files changed, 63 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/20971/1 -- To view, visit http://gerrit.cloudera.org:8080/20971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0daff8db6231f161ec27b45b51d78e21733d9b1f Gerrit-Change-Number: 20971 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20922 ) Change subject: IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/20922/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: http://gerrit.cloudera.org:8080/#/c/20922/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java@100 PS1, Line 100: > nit: "tables with their" Done http://gerrit.cloudera.org:8080/#/c/20922/1/fe/src/test/java/org/apache/impala/planner/TpcdsCpuCostPlannerTest.java File fe/src/test/java/org/apache/impala/planner/TpcdsCpuCostPlannerTest.java: http://gerrit.cloudera.org:8080/#/c/20922/1/fe/src/test/java/org/apache/impala/planner/TpcdsCpuCostPlannerTest.java@53 PS1, Line 53: / Granular scan limit that will injected into individual ScanNode of tables. : private static Map< > Looks like some dim tables are also scaled but not linearly. I'll check TPC ps2 fully inject all stats based on actual dataset instead of just scaling them. -- To view, visit http://gerrit.cloudera.org:8080/20922 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaffddd70c2da8376ca6c40f65606bbac46c34de7 Gerrit-Change-Number: 20922 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 16:36:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12503: Support date data type for predicates for external data source table
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/20915 ) Change subject: IMPALA-12503: Support date data type for predicates for external data source table .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/DatabaseAccessor.java@36 PS9, Line 36: date_col nit: rename to dateCol http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java File java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java: http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@217 PS9, Line 217: date_co rename to dateCol http://gerrit.cloudera.org:8080/#/c/20915/9/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@219 PS9, Line 219: date_to_string Naming convention, rename variable dateToString -- To view, visit http://gerrit.cloudera.org:8080/20915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf13cbefaad812a0f78755c5791d82b24a3395e4 Gerrit-Change-Number: 20915 Gerrit-PatchSet: 9 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Mon, 29 Jan 2024 16:36:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20922 to look at the new patch set (#2). Change subject: IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest .. IMPALA-12726: Simulate large-scale query in TpcdsCpuCostPlannerTest Querying against large-scale databases is good way for testing Impala. However, it is impractical to do in a single-node development machine. Frontend testing does not run the test query in the backend executor and can benefit from simulated large-scale test cases. This patch attempts to do it by instrumenting the CatalogD metadata loading code to scale tpcds_partitioned_parquet_snap by injecting column stats from a 3TB TPC-DS dataset TpcdsCpuCostPlannerTest. The large-scale column stats are expressed in stats-3TB.json, taken by running "SHOW COLUMN STATS" and "DESCRIBE FORMATTED" queries on a 3TB dataset loaded using impala-tpcds-kit. It is parsed and then piggy-backed through RuntimeEnv. Code that populates stats metadata (caller of FeCatalogUtils.getRowCount(), FeCatalogUtils.getTotalSize(), and FeCatalogUtils.injectColumnStats()) is instrumented to populate stats from RuntimeEnv instead of Metastore. Scaled-up tables are invalidated before a test run to reload them with new high-scale stats. This patch also add a scan range limit injection to force ScanNode over a single file table to act as if it scans a muti-files table. tpcds_partitioned_schema_template.sql is modified to match column names and types from impala-tpcds-kit). After this patch, the test files under PlannerTest/tpcds_cpu_cost/ have matching query plan shapes with the actual impala-tpcds-kit queries run against the 3TB TPC-DS dataset (https://github.com/cloudera/impala-tpcds-kit/tree/master/queries), except for a few mismatches due to different SQL and hard limit on number of files. Below are 16 queries out of 103 that still does not have matching shape and the reasons. +-+--+ | Q | Reason | +-+--+ | 6 | extra limit 1| | 10a | different num files in customer_demographics | | 23b | different frequent_ss_items CTE | | 22 | extra warehouse table| | 27 | different predicate for store table | | 34 | extra limit 10 | | 36 | different predicate for store table | | 53 | missing avg_quarterly_sales | | 66 | different SQL| | 68 | different predicate for data_dim table | | 69 | different num files in customer | | 73 | different order by, extra limit 1000 | | 74 | different num files in customer | | 84 | missing customer_demographics table | | 96 | missing limit 100| | 98 | extra limit 1000 | +-+--+ Testing: - Scale tables of tpcds_partitioned_parquet_snap in TpcdsCpuCostPlannerTest to simulate 3TB TPC-DS. The number of executors is raised from 3 to 10, and REPLICA_PREFERENCE=REMOTE to ignore data locality. - Pass core tests. Change-Id: Iaffddd70c2da8376ca6c40f65606bbac46c34de7 --- M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java A fe/src/main/java/org/apache/impala/catalog/SideloadTableStats.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/planner/TpcdsCpuCostPlannerTest.java A fe/src/test/java/org/apache/impala/testutil/StatsJsonParser.java M testdata/datasets/tpcds_partitioned/tpcds_partitioned_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/stats-3TB.json M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries
[Impala-ASF-CR] IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20968 ) Change subject: IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20968/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test: http://gerrit.cloudera.org:8080/#/c/20968/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@300 PS1, Line 300: select snapshot_id from functional_parquet.iceberg_query_metadata.snapshots limit 2; Add check with low BATCH_SIZE option? https://impala.apache.org/docs/build/html/topics/impala_batch_size.html -- To view, visit http://gerrit.cloudera.org:8080/20968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 Gerrit-Change-Number: 20968 Gerrit-PatchSet: 1 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 16:13:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20968 ) Change subject: IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20968/1/be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc File be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/20968/1/be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc@198 PS1, Line 198: *eos = true; Is GetNext() will be called again if *eos is true and last struct_like_row is not nullptr? -- To view, visit http://gerrit.cloudera.org:8080/20968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 Gerrit-Change-Number: 20968 Gerrit-PatchSet: 1 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Jan 2024 16:12:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12763: Union with string struct crashes in ASAN
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20970 ) Change subject: IMPALA-12763: Union with string struct crashes in ASAN .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15086/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20970 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id728f1254b74636be594a33313a478b0b77c7ae4 Gerrit-Change-Number: 20970 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 29 Jan 2024 15:50:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12763: Union with string struct crashes in ASAN
Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20970 Change subject: IMPALA-12763: Union with string struct crashes in ASAN .. IMPALA-12763: Union with string struct crashes in ASAN In ASAN builds, if we UNION ALL an array containing a struct of a string with itself, Impala crashes. This is how to reproduce it: In Hive: create table su (arr ARRAY>) stored as parquet; insert into su values (array(named_struct("s", "A"))); In Impala: select 1, arr from su union all select 2, arr from su; The ASAN error message indicates a heap-use-after-free. Normally, UNIONs of structs are not supported yet (see IMPALA-10752), but if the struct is inside an array it is allowed now. This was probably not intentional and it leads to the above error, so this change disables structs in unions completely, including embedded structs. Testing: - adjusted existing tests - added a query that tests that types with embedded structs are not allowed in a UNION statement, in mixed-collections-and-structs.test Change-Id: Id728f1254b74636be594a33313a478b0b77c7ae4 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test 6 files changed, 38 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20970/1 -- To view, visit http://gerrit.cloudera.org:8080/20970 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id728f1254b74636be594a33313a478b0b77c7ae4 Gerrit-Change-Number: 20970 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker
[Impala-ASF-CR] IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20968 ) Change subject: IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15085/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 Gerrit-Change-Number: 20968 Gerrit-PatchSet: 1 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 29 Jan 2024 14:34:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15084/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 29 Jan 2024 14:25:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15083/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 29 Jan 2024 14:18:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause
Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20968 Change subject: IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause .. IMPALA-12764: Fix Iceberg metadata table scan's LIMIT clause Earlier the ReachedLimit() condition was not evaluated during IcebergMetadataScanNode::GetNext(). This missing condition has been added and LIMIT is now being evaluated. Testing: - Added E2E test Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 --- M be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test 2 files changed, 13 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/20968/1 -- To view, visit http://gerrit.cloudera.org:8080/20968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iea73716fe475c8b063235d2ae971a4074b8e2a20 Gerrit-Change-Number: 20968 Gerrit-PatchSet: 1 Gerrit-Owner: Tamas Mate
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. IMPALA-12695: Crash with UNION with complex types If we unnest an array coming from a UNION ALL, we read invalid memory and in ASAN builds we crash. Example: with v as (select arr1 from complextypes_arrays union all select arr1 from complextypes_arrays) select am.item from v, v.arr1 am; The problem seems to be that in the item tuple of the collections, the item slots are present twice. This is because both the inline view analyzer and the main analyzer add slots with the same path to the tuple. This is possible because - the target tuple is determined based on the path via Path.getRootDesc(), so it will be the same both in the inline view and in the main scope AND - the inline view analyzer and the main one do not share 'slotPathMap_', so the analyzer cannot recognise that a slot for the path has already been added. This commit solves the problem by checking the target tuple whether a slot with the same path already exists in it, and if it does, we reuse that slot. Note, however, that when Analyzer.registerSlotRef() is called with 'duplicateIfCollections=true', a separate slot is added for collections which should not be reused. This commit adds a set, 'duplicateCollectionSlots', in Analyzer.GlobalState to keep track of such collection slots, and these slots are never reused. Note that there is another bug, IMPALA-12753, that a predicate on the collection item in the above query is only enforced on the first child of the union. Therefore this commit disallows placing a predicate on a collection item when the unnested collection comes from a union. Testing: - added test queries in nested-array-in-select-list.test, nested-map-in-select-list.test, zipping-unnest-in-from-clause.test and zipping-unnest-in-select-list.test Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test 7 files changed, 373 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/20953/3 -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20953/1/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/20953/1/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@156 PS1, Line 156: item > Can you also add tests with unnest() in select list /from clause,e.g in zip The two queries you mentioned produce the same results on master as on this patch and unfortunately I don't have capacity to fix them now. But if we remove the outer view from the first query it executes correctly, added it as a test in both zipping unnest in select list and from clause. -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 29 Jan 2024 13:51:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12695: Crash with UNION with complex types
Daniel Becker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/20953 ) Change subject: IMPALA-12695: Crash with UNION with complex types .. IMPALA-12695: Crash with UNION with complex types If we unnest an array coming from a UNION ALL, we read invalid memory and in ASAN builds we crash. Example: with v as (select arr1 from complextypes_arrays union all select arr1 from complextypes_arrays) select am.item from v, v.arr1 am; The problem seems to be that in the item tuple of the collections, the item slots are present twice. This is because both the inline view analyzer and the main analyzer add slots with the same path to the tuple. This is possible because - the target tuple is determined based on the path via Path.getRootDesc(), so it will be the same both in the inline view and in the main scope AND - the inline view analyzer and the main one do not share 'slotPathMap_', so the analyzer cannot recognise that a slot for the path has already been added. This commit solves the problem by checking the target tuple whether a slot with the same path already exists in it, and if it does, we reuse that slot. Note, however, that when Analyzer.registerSlotRef() is called with 'duplicateIfCollections=true', a separate slot is added for collections which should not be reused. This commit adds a set, 'duplicateCollectionSlots', in Analyzer.GlobalState to keep track of such collection slots, and these slots are never reused. Note that there is another bug, IMPALA-12753, that a predicate on the collection item in the above query is only enforced on the first child of the union. Therefore this commit disallows placing a predicate on a collection item when the unnested collection comes from a union. Testing: - added test queries in nested-array-in-select-list.test, nested-map-in-select-list.test, zipping-unnest-in-from-clause.test and zipping-unnest-in-select-list.test Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test 7 files changed, 373 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/20953/2 -- To view, visit http://gerrit.cloudera.org:8080/20953 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I340adc50e6d7cda6f59dacd7a46b6adc31635d46 Gerrit-Change-Number: 20953 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs
[Impala-ASF-CR] IMPALA-12762: Fix cmake error in package building
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20965 ) Change subject: IMPALA-12762: Fix cmake error in package building .. Patch Set 1: Code-Review+1 (2 comments) Thanks for fixing this so quickly! http://gerrit.cloudera.org:8080/#/c/20965/1/bin/jenkins/build-all-flag-combinations.sh File bin/jenkins/build-all-flag-combinations.sh: http://gerrit.cloudera.org:8080/#/c/20965/1/bin/jenkins/build-all-flag-combinations.sh@42 PS1, Line 42: skiptests Let's change this to "notests" so new changes won't break the build again. http://gerrit.cloudera.org:8080/#/c/20965/1/bin/jenkins/build-all-flag-combinations.sh@50 PS1, Line 50: skiptests Let's change this to "notests" as well. -- To view, visit http://gerrit.cloudera.org:8080/20965 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57 Gerrit-Change-Number: 20965 Gerrit-PatchSet: 1 Gerrit-Owner: Yifan Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Mon, 29 Jan 2024 13:07:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 33: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10203/ -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 33 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 13:01:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12762: Fix cmake error in package building
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20965 ) Change subject: IMPALA-12762: Fix cmake error in package building .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15082/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20965 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57 Gerrit-Change-Number: 20965 Gerrit-PatchSet: 1 Gerrit-Owner: Yifan Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Mon, 29 Jan 2024 10:38:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12762: Fix cmake error in package building
Yifan Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20965 Change subject: IMPALA-12762: Fix cmake error in package building .. IMPALA-12762: Fix cmake error in package building This patch adds extra processing of option 'BUILD_WITH_NO_TESTS' in be/src/exec/json/CMakeLists.txt, so test targets will not be generated by the CMake when building Impala with -package and -notests. Testing: - Run './buildall.sh -noclean -notests -package' with no error Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57 --- M be/src/exec/json/CMakeLists.txt 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/20965/1 -- To view, visit http://gerrit.cloudera.org:8080/20965 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57 Gerrit-Change-Number: 20965 Gerrit-PatchSet: 1 Gerrit-Owner: Yifan Zhang
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 33: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10203/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 33 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 08:29:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 ) Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. Patch Set 33: (5 comments) > Patch Set 31: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10201/ There are still test failures. Retriggered the job on the latest patch set. http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1401 PS33, Line 1401: debug Let's use trace() since this will be logged for each new loaded partition. For a table with 10K partitions, when it's loaded, we'll log 10K lines for this.. Use parameters so the string won't be constructed when TRACE logging is off, i.e. LOG.trace("Previous id for the partition object is not set. Partition info: {}", obj.hdfs_partition); http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/33/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3357 PS33, Line 3357: else { : // when table is replicated we let the HMS API handle the file deletion logic : // otherwise we delete the files. : Collection parts = FeCatalogUtils : .loadAllPartitions(hdfsTable); : for (FeFsPartition part : parts) { : FileSystemUtil.deleteAllVisibleFiles(new Path(part.getLocation())); : } : LOG.trace("Time elapsed after deleting files for table {}: {} msec", : table.getFullName(), sw.elapsed(TimeUnit.MILLISECONDS)); : } It'd be better to keep these unchanged since previously they are outside the try-block so don't need to acquire a MetaStoreClient. http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@958 PS33, Line 958: result = cls.__execute_query(impalad_client, query, query_options, user) Let's check success first, i.e. like what we does in execute_query_expect_success(): assert result.success http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@1046 PS33, Line 1046: self nit: use 'cls' for classmethod http://gerrit.cloudera.org:8080/#/c/20367/33/tests/common/impala_test_suite.py@1052 PS33, Line 1052: self nit: use 'cls' for classmethod -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 33 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Jan 2024 08:28:27 + Gerrit-HasComments: Yes