[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. IMPALA-10168: Expose JSON catalog objects in catalogd's debug page Catalogd has a debug page at '/catalog_object' that shows catalog objects as Thrift debug strings. Parsing those strings in tests to extract information is inconvenient. This patch extends the page to also return JSON results, which makes it easier for tests to extract complex information from catalog objects, e.g. the partition ids of an HDFS table. As with other debug pages, JSON output is requested by adding a 'json' argument to the URL, e.g. http://localhost:25020/catalog_object?json&object_type=TABLE&object_name=db1.tbl1 Implementation: Csaba found that Thrift has a protocol, TSimpleJSONProtocol, which converts Thrift objects to human-readable JSON strings. This simplifies the implementation considerably. However, TSimpleJSONProtocol is not implemented in C++ yet (THRIFT-2476), so the conversion is done in the FE using the Java implementation. Tests: - Add tests to verify that the expected JSON fields exist.
Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Reviewed-on: http://gerrit.cloudera.org:8080/16449 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M fe/src/main/java/org/apache/impala/service/JniCatalog.java M tests/webserver/test_web_pages.py 5 files changed, 96 insertions(+), 6 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 14 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
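A minimal sketch of how a test might request the JSON form of a catalog object, based only on the example URL in the commit message above. The helper names are hypothetical, and the shape of the returned JSON document is not described in this thread, so the sketch returns the parsed document unchanged.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_catalog_object_url(host, port, object_type, object_name):
    """Build the catalogd debug-page URL that requests JSON output.

    Hypothetical helper: it only mirrors the example URL from the commit message.
    """
    # 'json' is passed as a bare flag argument, as in the example URL.
    return "http://{0}:{1}/catalog_object?json&object_type={2}&object_name={3}".format(
        host, port, quote(object_type, safe=""), quote(object_name, safe=""))


def fetch_catalog_object(host, port, object_type, object_name, timeout_s=10):
    """Fetch and parse the JSON document. Requires a running catalogd."""
    url = build_catalog_object_url(host, port, object_type, object_name)
    with urlopen(url, timeout=timeout_s) as resp:
        # The layout of the document is an open question here; return it as parsed.
        return json.loads(resp.read().decode("utf-8"))
```

For the example in the commit message, build_catalog_object_url("localhost", 25020, "TABLE", "db1.tbl1") reproduces the URL shown there.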
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 21 Oct 2020 06:31:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 21 Oct 2020 05:20:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. IMPALA-6628: Use unqualified table references in .test files run from test_queries.py This fix modifies the following tests launched from test_queries.py by removing references to the database 'functional' wherever possible: empty.test inline-view-limit.test inline-view.test limit.test misc.test sort.test subquery-single-node.test subquery.test top-n.test union.test with-clause.test The objective of the change is to allow broader test coverage with databases other than the single 'functional' database. No new tables were added and no expected results were altered. The other tests in testdata/workloads/functional-query/queries/QueryTest either do not refer to 'functional' or require the qualified references. Testing: Ran query_tests on the changed tests with the exhaustive exploration strategy.
Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Reviewed-on: http://gerrit.cloudera.org:8080/16603 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/empty.test M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test M testdata/workloads/functional-query/queries/QueryTest/inline-view.test M testdata/workloads/functional-query/queries/QueryTest/limit.test M testdata/workloads/functional-query/queries/QueryTest/misc.test M testdata/workloads/functional-query/queries/QueryTest/sort.test M testdata/workloads/functional-query/queries/QueryTest/subquery-single-node.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test M testdata/workloads/functional-query/queries/QueryTest/top-n.test M testdata/workloads/functional-query/queries/QueryTest/union.test M testdata/workloads/functional-query/queries/QueryTest/with-clause.test M tests/query_test/test_queries.py 13 files changed, 269 insertions(+), 253 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10166 (part 1): ALTER TABLE for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 (part 1): ALTER TABLE for Iceberg tables .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7502/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 21 Oct 2020 03:54:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10166 (part 1): ALTER TABLE for Iceberg tables
wangsheng has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 (part 1): ALTER TABLE for Iceberg tables .. IMPALA-10166 (part 1): ALTER TABLE for Iceberg tables This patch implements ALTER TABLE for Iceberg tables. The following statements are currently supported: * ADD COLUMNS * RENAME TABLE * SET TBL_PROPERTIES * SET OWNER DROP COLUMN/REPLACE COLUMNS/ALTER COLUMN are forbidden in this patch, since these statements may make Iceberg tables unreadable. Column resolution by field id may be supported in the near future; after that, DROP COLUMN/REPLACE COLUMNS/ALTER COLUMN will be supported for Iceberg tables. Points that still need attention: 1. RENAME TABLE is not supported for HadoopCatalog/HadoopTables, even though the 'RENAME TABLE' statement is implemented. 2. ADD/DROP PARTITION is not possible yet because Iceberg has no corresponding API; related work is already in progress in Iceberg. Testing: - Iceberg table alter test in test_iceberg.py - Iceberg table negative test in test_scanners.py Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc --- M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableRecoverPartitionsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java M
fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M tests/query_test/test_iceberg.py 19 files changed, 529 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16606/4 -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 21 Oct 2020 01:09:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6592/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 21 Oct 2020 01:09:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 12: Code-Review+2 Carry on Csaba's +2. -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 21 Oct 2020 01:08:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 23:59:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6591/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 21 Oct 2020 00:00:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 21 Oct 2020 00:00:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 45: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 45 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 23:30:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. IMPALA-10178 Run-time profile shall report skews This fix addresses a current limitation of the runtime profile: skew in certain operator counters, such as the rows read counter (RowsRead) in the scan operators, is not reported. A skew condition exists when the number of rows processed by each operator instance is not about the same; it can be detected through the coefficient of variation (CoV). A high CoV (say > 1.0) usually implies the existence of skew. With the fix, such skew is detected for the following counters: 1. RowsRead in HDFS_SCAN_NODE and KUDU_SCAN_NODE 2. ProbeRows and BuildRows in HASH_JOIN_NODE 3. RowsReturned in GroupingAggregator, EXCHANGE and SORT_NODE and reported as follows: 1. In the execution profile, a new skew summary that lists the names of the operators with skews; 2. In the averaged profile for the corresponding operator, the list of values of the counter across all fragment instances in the backend processes; 3. Skew detection formula: CoV > limit and mean > 5,000 4. A new query option 'report_skew_limit': < 0 disables skew reporting; >= 0 enables skew reporting and supplies the CoV limit. Examples of skews reported for a hash join and an HDFS scan. In the execution profile: ... ... skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0) Per Node Peak Memory Usage: ... ... ... In averaged profiles: HDFS_SCAN_NODE (id=2): ... Skew details: RowsRead ([2004992,1724693,2001351], CoV=0.07, mean=1910345) Testing: 1. Added test_skew_reporting_in_runtime_profile in test_observability.py to verify that the skews are reported. 2. Ran Core tests successfully.
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Reviewed-on: http://gerrit.cloudera.org:8080/16474 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/runtime/coordinator.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/stat-util.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_hash_join_timer.py M tests/query_test/test_observability.py 11 files changed, 249 insertions(+), 12 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 46 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
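The detection rule quoted in the commit message above (CoV > limit and mean > 5,000) can be sketched as follows. This is an illustration, not the backend code: the function names are hypothetical, and the use of the population standard deviation is an assumption, chosen because it reproduces the CoV=0.07 shown for the RowsRead example.

```python
from statistics import fmean, pstdev

# Minimum mean from the commit message: "CoV > limit and mean > 5,000".
MIN_MEAN = 5000


def coefficient_of_variation(values):
    """Standard deviation divided by the mean.

    Assumption: population standard deviation (pstdev), not the sample one.
    """
    mean = fmean(values)
    if mean == 0:
        return 0.0
    return pstdev(values) / mean


def is_skewed(values, limit):
    """Apply the rule from the commit message; a negative limit disables reporting."""
    if limit < 0 or not values:
        return False
    return fmean(values) > MIN_MEAN and coefficient_of_variation(values) > limit


# Per-instance RowsRead values taken from the example in the commit message.
rows_read = [2004992, 1724693, 2001351]
print(round(coefficient_of_variation(rows_read), 2), round(fmean(rows_read)))
```

With these values the sketch yields a CoV of about 0.07 and a mean of 1910345, matching the "Skew details" line in the example.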
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7501/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:55:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10219: Expose DEBUG_ACTION query option in catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16548 ) Change subject: IMPALA-10219: Expose DEBUG_ACTION query option in catalog .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7500/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 Gerrit-Change-Number: 16548 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:53:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10219: Expose DEBUG_ACTION query option in catalog
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16548 ) Change subject: IMPALA-10219: Expose DEBUG_ACTION query option in catalog .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 Gerrit-Change-Number: 16548 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:49:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/16549/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16549/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1358 PS4, Line 1358: "Unable to update the pending version of the table " + tbl.getFullName(), ex); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py File tests/metadata/test_topic_update_frequency.py: http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@15 PS4, Line 15: import threading flake8: F401 'threading' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@18 PS4, Line 18: from time import sleep flake8: F401 'time.sleep' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@20 PS4, Line 20: from tests.beeswax.impala_beeswax import ImpalaBeeswaxException flake8: F401 'tests.beeswax.impala_beeswax.ImpalaBeeswaxException' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@54 PS4, Line 54: l flake8: E501 line too long (146 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@113 PS4, Line 113: l flake8: E501 line too long (145 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@181 PS4, Line 181: f flake8: E126 continuation line over-indented for hanging indent -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: 
Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:41:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10219: Expose DEBUG_ACTION query option in catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16548 ) Change subject: IMPALA-10219: Expose DEBUG_ACTION query option in catalog .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/16548/6/tests/metadata/test_catalogd_debug_actions.py File tests/metadata/test_catalogd_debug_actions.py: http://gerrit.cloudera.org:8080/#/c/16548/6/tests/metadata/test_catalogd_debug_actions.py@21 PS6, Line 21: class TestDebugActions(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/16548/6/tests/metadata/test_catalogd_debug_actions.py@24 PS6, Line 24: @ flake8: E303 too many blank lines (2) http://gerrit.cloudera.org:8080/#/c/16548/6/tests/metadata/test_catalogd_debug_actions.py@46 PS6, Line 46: d flake8: E303 too many blank lines (2) -- To view, visit http://gerrit.cloudera.org:8080/16548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 Gerrit-Change-Number: 16548 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:41:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. IMPALA-6671: Skip locked tables from topic updates This change adds a mechanism for the topic-update thread to skip a table from the topic updates when the table has been locked for more than a configurable interval. This is especially useful in scenarios where long-running operations on a locked table (refresh, recover partitions, compute stats) block the topic update thread. Such blocking causes unrelated queries that are waiting on metadata via topic updates (catalog-v1 mode) to block unnecessarily. The ideal solution to this problem would be to make HdfsTable immutable so that there is no need for a table lock. But that is a large change and not easily portable to older releases of Impala. It will be taken up as a separate patch. This change introduces 2 new configurations for catalogd: 1. topic_update_tbl_max_wait_time_ms: This defines the maximum time in msecs the topic update thread waits on a locked table before skipping the table from that iteration of topic updates. The default value is 500. 2. catalog_max_lock_skipped_topic_updates: This defines the maximum number of distinct lock operations which are skipped by the topic update thread due to lock contention. Once this limit is reached, the topic update thread will block until it acquires the table lock and adds the table to the updates. Testing: 1. Added a test case which introduces a simulated delay in a few potentially long-running statements. This causes the table to be locked for a long time. The topic update thread skips that table from updates and unrelated queries are unblocked since they receive the required metadata from updates. 2. Added a test where multiple threads run blocking statements in a loop to stress the table lock.
It makes sure that the topic update thread is not starved and eventually blocks on the table lock by hitting the limit defined by catalog_max_lock_skipped_topic_updates. 3. Ran exhaustive tests with default configurations. Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/metadata/test_topic_update_frequency.py 10 files changed, 461 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16549/4 -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
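The skip-then-block behaviour described in the commit message above can be sketched roughly as below. This is a simplified model and not the catalogd implementation: the class and method names are hypothetical, and the two constructor parameters merely stand in for topic_update_tbl_max_wait_time_ms and catalog_max_lock_skipped_topic_updates.

```python
import threading


class TableLockSkipper:
    """Hypothetical model of a topic-update thread that skips locked tables."""

    def __init__(self, max_wait_ms, max_skipped):
        self.max_wait_s = max_wait_ms / 1000.0
        self.max_skipped = max_skipped
        self.skipped = {}  # table name -> number of consecutive skips

    def try_collect(self, table_name, table_lock, collect):
        """Return True if the table was added to the update, False if skipped."""
        if self.skipped.get(table_name, 0) < self.max_skipped:
            # Wait for the lock only up to the configured interval.
            if not table_lock.acquire(timeout=self.max_wait_s):
                self.skipped[table_name] = self.skipped.get(table_name, 0) + 1
                return False
        else:
            # Skip limit reached: block until the lock is available.
            table_lock.acquire()
        try:
            collect(table_name)
            self.skipped.pop(table_name, None)  # reset the skip counter
            return True
        finally:
            table_lock.release()
```

In this model a table held by a long-running statement is skipped at most max_skipped times in a row; after that the updater waits for the lock, so the table is not left out of the updates indefinitely.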
[Impala-ASF-CR] IMPALA-10219: Expose DEBUG_ACTION query option in catalog
Vihang Karajgaonkar has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/16548 ) Change subject: IMPALA-10219: Expose DEBUG_ACTION query option in catalog .. IMPALA-10219: Expose DEBUG_ACTION query option in catalog This patch enables DEBUG_ACTION in the catalog service's Java code. Specifically, the DEBUG_ACTION query option is now exposed to TResetMetadataRequest and TExecDdlRequest so that we can inject delays while executing refresh or DDL statements. For example: 1. To inject a delay of 100ms per HDFS list operation during a refresh statement, set the following query option: set debug_action=catalogd_refresh_hdfs_listing_delay:SLEEP@100; 2. To inject a delay of 100ms in an alter table recover partitions statement: set debug_action=catalogd_table_recover_delay:SLEEP@100; 3. To inject a delay of 100ms in a compute stats statement: set debug_action=catalogd_update_stats_delay:SLEEP@100; Note that this option only adds the delay during the update_stats phase of the compute stats execution. Testing: 1. Added a test which sets the query option and makes sure that the command takes more time than without the query option. 2. Added unit tests for the debugAction implementation logic.
Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 --- M be/src/exec/catalog-op-executor.cc M be/src/util/debug-util.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/main/java/org/apache/impala/util/DebugUtils.java A fe/src/test/java/org/apache/impala/util/DebugUtilsTest.java M tests/common/impala_test_suite.py A tests/metadata/test_catalogd_debug_actions.py 15 files changed, 389 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/16548/6 -- To view, visit http://gerrit.cloudera.org:8080/16548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 Gerrit-Change-Number: 16548 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
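As an illustration of the option values listed in the commit message above, the sketch below builds and parses a single action of the form label:SLEEP@millis. The helper names are hypothetical, and only the form shown in the examples is handled; the actual parsing in DebugUtils.java may accept more than this.

```python
def debug_action_option(label, sleep_ms):
    """Build a value such as 'catalogd_table_recover_delay:SLEEP@100'."""
    return "{0}:SLEEP@{1}".format(label, int(sleep_ms))


def parse_debug_action(spec):
    """Split 'label:SLEEP@millis' into (label, action, millis).

    Assumption: exactly one action in the form used by the commit message examples.
    """
    label, colon, action = spec.partition(":")
    if not colon or not label:
        raise ValueError("expected 'label:ACTION@param', got %r" % spec)
    name, at, param = action.partition("@")
    if name != "SLEEP" or not at:
        raise ValueError("only SLEEP@millis is handled in this sketch: %r" % spec)
    return label, name, int(param)
```

A test could use debug_action_option("catalogd_update_stats_delay", 100) to produce the value passed in "set debug_action=...;" and then compare the statement's run time with and without the option, as the Testing note describes.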
[Impala-ASF-CR] IMPALA-10219: Expose DEBUG_ACTION query option in catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16548 ) Change subject: IMPALA-10219: Expose DEBUG_ACTION query option in catalog .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2284 PS5, Line 2284: public TCatalogObject reloadTable(Table tbl, TResetMetadataRequest request, String reason) > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1152 PS5, Line 1152: storageMetadataLoadTime_ += updateUnpartitionedTableFileMd(client, debugAction); > line too long (94 > 90) Done http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@945 PS5, Line 945: reloadFileMetadata, reloadTableSchema, false, partitionsToUpdate, null, reason); > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3702 PS5, Line 3702: List> partitionsNotInHms = hdfsTable.getPathsWithoutPartitions(debugAction); > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/util/DebugUtils.java File fe/src/main/java/org/apache/impala/util/DebugUtils.java: http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/util/DebugUtils.java@29 PS5, Line 29: backend 
> I think that it would be useful to add a reference in the BE and FE impleme good point. I updated the code documentation to a pointer from be to fe and vice-versa. http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/util/DebugUtils.java@72 PS5, Line 72: if (components.get(0).equalsIgnoreCase(label)) { > maybe use continue in the !equal case to reduce nesting? Done http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/main/java/org/apache/impala/util/DebugUtils.java@78 PS5, Line 78: Preconditions.checkState(actionParams.size() > 1, : "Illegal debug action format found in " + debugActions + " for label" : + label); > This seems consistent with BE, but I don't understand why some format error yes, make sense to validate the debug action when it is set instead of when it is evaluated. Created IMPALA-10268 for this. http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/test/java/org/apache/impala/util/DebugUtilsTest.java File fe/src/test/java/org/apache/impala/util/DebugUtilsTest.java: http://gerrit.cloudera.org:8080/#/c/16548/5/fe/src/test/java/org/apache/impala/util/DebugUtilsTest.java@32 PS5, Line 32: DebugUtils.executeDebugAction("TEST_SLEEP_ACTION:SLEEP@100|SOME_OTHER_ACTION:SLEEP@10", > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/16548/5/tests/metadata/test_catalogd_debug_actions.py File tests/metadata/test_catalogd_debug_actions.py: http://gerrit.cloudera.org:8080/#/c/16548/5/tests/metadata/test_catalogd_debug_actions.py@21 PS5, Line 21: class TestDebugActions(ImpalaTestSuite): > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/16548/5/tests/metadata/test_catalogd_debug_actions.py@48 PS5, Line 48: > flake8: E203 whitespace before ':' Done http://gerrit.cloudera.org:8080/#/c/16548/5/tests/metadata/test_catalogd_debug_actions.py@50 PS5, Line 50: > flake8: W391 blank line at end of file Done -- To view, visit http://gerrit.cloudera.org:8080/16548 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7196b1ce76415a5faf3fa8575a26d22b2bf50b1 Gerrit-Change-Number: 16548 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:40:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7499/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:37:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/16549/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16549/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1358 PS3, Line 1358: "Unable to update the pending version of the table " + tbl.getFullName(), ex); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py File tests/metadata/test_topic_update_frequency.py: http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@15 PS3, Line 15: import threading flake8: F401 'threading' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@18 PS3, Line 18: from time import sleep flake8: F401 'time.sleep' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@20 PS3, Line 20: from tests.beeswax.impala_beeswax import ImpalaBeeswaxException flake8: F401 'tests.beeswax.impala_beeswax.ImpalaBeeswaxException' imported but unused http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@54 PS3, Line 54: l flake8: E501 line too long (146 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@113 PS3, Line 113: l flake8: E501 line too long (145 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16549/3/tests/metadata/test_topic_update_frequency.py@181 PS3, Line 181: f flake8: E126 continuation line over-indented for hanging indent -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: 
Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:22:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9990: Support SET OWNER for Kudu tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16273 ) Change subject: IMPALA-9990: Support SET OWNER for Kudu tables .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7498/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16273 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29d641efc8db314964bc5ee9828a86d4a44ae95c Gerrit-Change-Number: 16273 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 21:21:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Skip locked tables from topic updates
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16549 ) Change subject: IMPALA-6671: Skip locked tables from topic updates .. IMPALA-6671: Skip locked tables from topic updates This change adds a mechanism for the topic-update thread to skip a table which is locked for more than a configurable interval from the topic updates. This is especially useful in scenarios where long running operations on a locked table (refresh, recover partitions, compute stats) block the topic update thread. This causes unrelated queries which are waiting on metadata via topic updates (catalog-v1 mode) to unnecessarily block. The ideal solution to this problem would be to make HdfsTable immutable so that there is no need for a table lock. But that is a large change and not easily portable to older releases of Impala. It would be taken up as a separate patch. This change introduces 2 new configurations for catalogd: 1. topic_update_tbl_max_wait_time_ms: This defines the maximum time in msecs the topic update thread waits on a locked table before skipping the table from that iteration of topic updates. The default value is 500. 2. catalog_max_lock_skipped_topic_updates: This defines the maximum number of distinct lock operations which are skipped by the topic update thread due to lock contention. Once this limit is reached, the topic update thread will block until it acquires the table lock and adds it to the updates. Testing: 1. Added a test case which introduces a simulated delay in a few potentially long running statements. This causes the table to be locked for a long time. The topic update thread skips that table from updates and unrelated queries are unblocked since they receive the required metadata from updates. 2. Added a test where multiple threads run blocking statements in a loop to stress the table lock. 
It makes sure that the topic update thread is not starved and eventually blocks on the table lock by hitting the limit defined by catalog_max_lock_skipped_topic_updates. 3. Ran exhaustive tests with default configurations. Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/metadata/test_topic_update_frequency.py 10 files changed, 461 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16549/3 -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
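The skip-then-block behaviour described in this commit message can be illustrated with a small Python sketch. This is not the catalogd implementation (which is in Java, in CatalogServiceCatalog): the class name, method name and per-table skip counter are hypothetical, the counter resetting after a successful collection is an assumption, and max_skips=2 is an arbitrary illustrative value because the message does not state the default of catalog_max_lock_skipped_topic_updates. The 500 ms wait mirrors the stated default of topic_update_tbl_max_wait_time_ms.

```python
import threading


class TableLockSkipper:
    """Hypothetical model of a topic-update thread that skips locked tables."""

    def __init__(self, max_wait_ms=500, max_skips=2):
        self.max_wait_s = max_wait_ms / 1000.0  # bounded wait per attempt
        self.max_skips = max_skips              # assumed limit, not Impala's default
        self.skip_counts = {}                   # table name -> skips so far

    def try_collect(self, table_name, lock, collect):
        """Add the table to the update if its lock can be taken.

        Returns True if `collect` ran, False if the table was skipped in
        this iteration because the lock stayed busy for the whole wait.
        """
        skipped = self.skip_counts.get(table_name, 0)
        if skipped >= self.max_skips:
            # Skip limit reached: block until the lock is available so the
            # table cannot be starved out of topic updates indefinitely.
            lock.acquire()
        elif not lock.acquire(timeout=self.max_wait_s):
            self.skip_counts[table_name] = skipped + 1
            return False
        try:
            collect(table_name)
            self.skip_counts.pop(table_name, None)  # assumption: reset on success
            return True
        finally:
            lock.release()
```

The bounded wait keeps unrelated tables flowing to coordinators while one table is busy, and the skip limit is what prevents a frequently locked table from never being published.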
[Impala-ASF-CR] IMPALA-9990: Support SET OWNER for Kudu tables
Fang-Yu Rao has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16273 ) Change subject: IMPALA-9990: Support SET OWNER for Kudu tables .. IMPALA-9990: Support SET OWNER for Kudu tables KUDU-3090 adds the support for table ownership and exposes the APIs for setting the owner when creating and altering tables, which allows Impala to also pass to Kudu the new owner of the Kudu table for the ALTER TABLE SET OWNER statement. Specifically, based on the API of AlterTableOptions#setOwner(), this patch stores the ownership information of the Kudu table in the corresponding instance of AlterTableOptions, which will then be passed to Kudu via a KuduClient. Testing: - Added a FE test in AnalyzeKuduDDLTest.java to verify the statement could be correctly analyzed. - Added an E2E test in kudu_alter.test to verify the statement could be correctly executed when the integration between Kudu and HMS is not enabled. - Added an E2E test in kudu_hms_alter.test and verified that the statement could be correctly executed when the integration between Kudu and HMS is enabled after manually re-enabling TestKuduHMSIntegration::test_kudu_alter_table(). Note that this was not possible before IMPALA-10092 was resolved due to a bug in the class of CustomClusterTestSuite. In addition, we may need to delete the Kudu table 'simple' via a Kudu-Python client if the E2E test complains that the Kudu table already exists, which may be related to IMPALA-8751. - Manually verified that the views of Kudu server and HMS are consistent for a synchronized Kudu table after the ALTER TABLE SET OWNER statement even though the Kudu table was once an external and non-synchronized table, meaning that the owner from Kudu's perspective could be different than that from HMS' perspective. The test is performed manually because currently the Kudu-Python client adopted in Impala's E2E tests is not up to date so that the field of 'owner' cannot be accessed in the E2E tests. 
On the other hand, to verify the owner of a Kudu table from Kudu's perspective, we used the latest Kudu-Python client as provided at github.com/apache/kudu/tree/master/examples/python/basic-python-example. - Verified that the patch could pass the exhaustive tests in the DEBUG mode. Change-Id: I29d641efc8db314964bc5ee9828a86d4a44ae95c --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test 5 files changed, 67 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/16273/4 -- To view, visit http://gerrit.cloudera.org:8080/16273 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I29d641efc8db314964bc5ee9828a86d4a44ae95c Gerrit-Change-Number: 16273 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10210: Skip Authentication for connection from a trusted domain
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16542 ) Change subject: IMPALA-10210: Skip Authentication for connection from a trusted domain .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/rpc/authentication-util.h File be/src/rpc/authentication-util.h: http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/rpc/authentication-util.h@40 PS2, Line 40: a list comma separated http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/rpc/authentication-util.h@40 PS2, Line 40: picks the first one on the : // list what's the rationale for only checking the first one? http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/rpc/authentication-util.h@48 PS2, Line 48: false nit: an error http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/util/webserver.h File be/src/util/webserver.h: http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/util/webserver.h@201 PS2, Line 201: std::string const&? http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/util/webserver.cc File be/src/util/webserver.cc: http://gerrit.cloudera.org:8080/#/c/16542/2/be/src/util/webserver.cc@610 PS2, Line 610: // Connections originating from trusted domains should not require authentication. Might be good to mention that we're doing this after checking cookies to avoid the reverse DNS lookup, here and in THttpServer http://gerrit.cloudera.org:8080/#/c/16542/2/common/thrift/metrics.json File common/thrift/metrics.json: http://gerrit.cloudera.org:8080/#/c/16542/2/common/thrift/metrics.json@1299 PS2, Line 1299: Failure I wonder if "Failure" is too strong of a word here and below, since this will be a normal thing any time a regular user tries to connect. Maybe just "...Connections from Non-Trusted Domains" Or might even be fine to just drop the "failure" metrics, since these connections will end up counted in the SPNEGO/BASIC auth success/failure metrics too. 
-- To view, visit http://gerrit.cloudera.org:8080/16542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09234078e2314dbc3177d0e869ae028e216ca699 Gerrit-Change-Number: 16542 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 20 Oct 2020 20:53:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Tim Armstrong has removed a vote on this change. Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Removed Code-Review+2 by Tim Armstrong -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 19:13:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/16610/3/docs/topics/impala_retry_failed_queries.xml File docs/topics/impala_retry_failed_queries.xml: http://gerrit.cloudera.org:8080/#/c/16610/3/docs/topics/impala_retry_failed_queries.xml@42 PS3, Line 42: node is blacklisted by the Impala Coordinator. Should we mention that we only retry once in current implementation? -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 20 Oct 2020 19:06:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/605/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:36:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Shajini Thayasingh has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:30:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7097 Print EC info in the query plan and profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16587 ) Change subject: IMPALA-7097 Print EC info in the query plan and profile .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16587 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ea378914624a714fde820d290b3b9c43325c6a1 Gerrit-Change-Number: 16587 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 20 Oct 2020 18:23:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Hello Thomas Tauber-Marshall, Shajini Thayasingh, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16610 to look at the new patch set (#3). Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. IMPALA-9910: [DOCS] Add fault tolerance docs Adds a few basic docs for fault tolerance in Impala. Covers the following topics: * Transparent query retries * Node blacklisting * Statestore heartbeats This commit only adds a high-level explanation of the aforementioned fault tolerance concepts. The docs should be expanded on in a future commit. Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 --- M docs/impala.ditamap M docs/impala_keydefs.ditamap A docs/topics/impala_fault_tolerance.xml A docs/topics/impala_node_blacklisting.xml A docs/topics/impala_retry_failed_queries.xml M docs/topics/impala_spool_query_results.xml A docs/topics/impala_transparent_query_retries.xml 7 files changed, 209 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/16610/3 -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml File docs/topics/impala_fault_tolerance.xml: http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml@51 PS2, Line 51: impalads periodically sent heartbeats (RPCs) to the statestored : process. If an impalad stops sending heartbeats, the statestored will : consider the impalad as failed > nit: the statestore sends the heartbeat messages, an impalad is considered Done http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml@55 PS2, Line 55: coordindators > We also now process the updates at executors (to cancel queries when their Done -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:20:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/605/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:20:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 45: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 45 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:13:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 45: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6590/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 45 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:13:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10178 Run-time profile shall report skews
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16474 ) Change subject: IMPALA-10178 Run-time profile shall report skews .. Patch Set 44: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1 Gerrit-Change-Number: 16474 Gerrit-PatchSet: 44 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 18:13:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 16:51:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9910: [DOCS] Add fault tolerance docs
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16610 ) Change subject: IMPALA-9910: [DOCS] Add fault tolerance docs .. Patch Set 2: (2 comments) Obviously these are big subjects and a lot more could be written about them, but it's a good start, so thanks for this. http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml File docs/topics/impala_fault_tolerance.xml: http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml@51 PS2, Line 51: impalads periodically sent heartbeats (RPCs) to the statestored : process. If an impalad stops sending heartbeats, the statestored will : consider the impalad as failed nit: the statestore sends the heartbeat messages, an impalad is considered failed if it doesn't respond http://gerrit.cloudera.org:8080/#/c/16610/2/docs/topics/impala_fault_tolerance.xml@55 PS2, Line 55: coordindators We also now process the updates at executors (to cancel queries when their coordinator fails) -- To view, visit http://gerrit.cloudera.org:8080/16610 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d178b21a9654bbed8b814ccadca95703ffacb62 Gerrit-Change-Number: 16610 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shajini Thayasingh Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 16:29:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7497/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 15:56:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16545 ) Change subject: IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet) .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/16545/4/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/16545/4/be/src/runtime/coordinator.cc@772 PS4, Line 772: HdfsTableDescriptor* hdfs_table; Uninitialized variable. -- To view, visit http://gerrit.cloudera.org:8080/16545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e Gerrit-Change-Number: 16545 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 15:46:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6628: Use unqualified table references in .test files run from test_queries.py
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/16603 ) Change subject: IMPALA-6628: Use unqualified table references in .test files run from test_queries.py .. IMPALA-6628: Use unqualified table references in .test files run from test_queries.py This fix modifies the following tests launched from test_queries.py by removing references to the 'functional' database wherever possible. The objective of the change is to allow broader test coverage with databases other than the single 'functional' database. The fix neither adds new tables nor alters expected results. Modified tests: empty.test, inline-view-limit.test, inline-view.test, limit.test, misc.test, sort.test, subquery-single-node.test, subquery.test, top-n.test, union.test, with-clause.test. The other tests in testdata/workloads/functional-query/queries/QueryTest either do not refer to 'functional' or require the reference for some reason. Testing: Ran query_tests on the changed tests with the exhaustive exploration strategy. 
Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 --- M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/empty.test M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test M testdata/workloads/functional-query/queries/QueryTest/inline-view.test M testdata/workloads/functional-query/queries/QueryTest/limit.test M testdata/workloads/functional-query/queries/QueryTest/misc.test M testdata/workloads/functional-query/queries/QueryTest/sort.test M testdata/workloads/functional-query/queries/QueryTest/subquery-single-node.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test M testdata/workloads/functional-query/queries/QueryTest/top-n.test M testdata/workloads/functional-query/queries/QueryTest/union.test M testdata/workloads/functional-query/queries/QueryTest/with-clause.test M tests/query_test/test_queries.py 13 files changed, 269 insertions(+), 253 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/16603/9 -- To view, visit http://gerrit.cloudera.org:8080/16603 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972 Gerrit-Change-Number: 16603 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16545 ) Change subject: IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet) .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/7496/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e Gerrit-Change-Number: 16545 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 15:09:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10166 ALTER TABLE for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 ALTER TABLE for Iceberg tables .. Patch Set 3: Thanks for the clarification. Seems good to me! -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 14:41:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16545 ) Change subject: IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet) .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/16545/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/16545/3/be/src/runtime/coordinator.cc@786 PS3, Line 786: is_hive_acid) { > is_transactional became a bit misleading IMO, as Iceberg is also kind of tr I agree, renamed it to is_hive_acid. http://gerrit.cloudera.org:8080/#/c/16545/3/be/src/runtime/coordinator.cc@800 PS3, Line 800: } else { > Won't we incorrectly call this in case of Iceberg tables? Done http://gerrit.cloudera.org:8080/#/c/16545/3/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/16545/3/be/src/service/client-request-state.cc@1213 PS3, Line 1213: if (per_part_map.size() != 1) return ret; > If this is always the case than wouldn't be a DCHECK better? We call this function for every table kind, not just for Iceberg tables. What we know is that if the table has multiple partitions, then it's definitely not an Iceberg table. 
http://gerrit.cloudera.org:8080/#/c/16545/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java: http://gerrit.cloudera.org:8080/#/c/16545/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java@96 PS3, Line 96: String.format("Failed to load Iceberg table with id: %s", tableId)); > nit: +2 indentation Done http://gerrit.cloudera.org:8080/#/c/16545/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java: http://gerrit.cloudera.org:8080/#/c/16545/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java@101 PS3, Line 101: String.format("Failed to load Iceberg table at location: %s", tableLocation)); > nit: +2 indentation Done http://gerrit.cloudera.org:8080/#/c/16545/3/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/16545/3/tests/query_test/test_iceberg.py@21 PS3, Line 21: > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/16545/3/tests/query_test/test_iceberg.py@50 PS3, Line 50: > flake8: W292 no newline at end of file Done http://gerrit.cloudera.org:8080/#/c/16545/3/tests/stress/test_insert_stress.py File tests/stress/test_insert_stress.py: http://gerrit.cloudera.org:8080/#/c/16545/3/tests/stress/test_insert_stress.py@50 PS3, Line 50: while insert_cnt < num_ > Isn't it a problem that this affect ACID tests too? There could be a parame In this file there are non-ACID inserts, but yeah, it will affect them. It would mean the test becomes a bit more flaky, but also faster. Anyway, I switched to using a parameter. http://gerrit.cloudera.org:8080/#/c/16545/3/tests/stress/test_insert_stress.py@114 PS3, Line 114: self.client.execute("""create table {0} (wid int, i int) > Is this really needed? 
We should never use the same unique_database twice. Done http://gerrit.cloudera.org:8080/#/c/16545/3/tests/stress/test_insert_stress.py@124 PS3, Line 124: writers = [Task(self._impala_role_concurrent_writer, tbl_name, i, inserts, counter) > Yeah, it's good to see that the infrastructure for ACID inserts can be reus Yeah, it was a nice surprise how easy it was :) -- To view, visit http://gerrit.cloudera.org:8080/16545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e Gerrit-Change-Number: 16545 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 14:39:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16545 to look at the new patch set (#4). Change subject: IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet) .. IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet) This commit adds support for INSERT INTO statements against Iceberg tables when the table is non-partitioned and the underlying file format is Parquet. We still use Impala's HdfsParquetTableWriter to write the data files, though it needed some modifications to conform to the Iceberg spec, namely: * write Iceberg/Parquet 'field_id' for the columns * TIMESTAMPs are encoded as INT64 micros (without time zone) We use DmlExecState to transfer information from the table sink operators to the coordinator, then updateCatalog() invokes the AppendFiles API to add files atomically. DmlExecState is encoded in protobuf, while communication with the Frontend uses Thrift. Therefore, to avoid defining the Iceberg DataFile multiple times, DataFiles are stored in FlatBuffers. The commit also makes some corrections to the Impala type <-> Iceberg type mapping: * Impala TIMESTAMP is Iceberg TIMESTAMP (without time zone) * Impala CHAR is Iceberg FIXED Testing: * Added INSERT tests to iceberg-insert.test * Added negative tests to iceberg-negative.test * I also did some manual testing with Spark. Spark is able to read Iceberg tables written by Impala as long as they don't contain TIMESTAMPs. If they do, Spark rejects the data files because it only accepts TIMESTAMPs with time zone. 
* Added concurrent INSERT tests to test_insert_stress.py Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h A be/src/exec/output-partition.h M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/exec/parquet/parquet-metadata-utils.cc M be/src/exec/parquet/parquet-metadata-utils.h M be/src/runtime/coordinator.cc M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/fbs/CMakeLists.txt A common/fbs/IcebergObjects.fbs M common/protobuf/control_service.proto M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M common/thrift/Descriptors.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/IcebergColumn.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java A 
testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M tests/common/skip.py M tests/query_test/test_iceberg.py M tests/stress/test_insert_stress.py 41 files changed, 856 insertions(+), 192 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/16545/4 -- To view, visit http://gerrit.cloudera.org:8080/16545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e Gerrit-Change-Number: 16545 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
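The "INT64 micros (without time zone)" encoding mentioned in the commit message above can be illustrated with a small conversion. This is a minimal sketch, assuming the value is the number of microseconds since 1970-01-01 00:00:00 with no time zone adjustment; the function names are hypothetical and this is not the code in HdfsParquetTableWriter.

```python
from datetime import datetime, timedelta

# Reference point for the encoding; a naive datetime (no tzinfo attached).
EPOCH = datetime(1970, 1, 1)


def timestamp_to_micros(ts: datetime) -> int:
    """Encode a zone-less timestamp as microseconds since the epoch."""
    if ts.tzinfo is not None:
        raise ValueError("expected a timestamp without time zone")
    delta = ts - EPOCH
    # Integer arithmetic only, so no floating-point rounding is involved.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def micros_to_timestamp(micros: int) -> datetime:
    """Decode microseconds since the epoch back into a zone-less timestamp."""
    return EPOCH + timedelta(microseconds=micros)
```

Because no zone is attached, the same wall-clock value always maps to the same integer, which is why a reader that only accepts timestamps with time zone (as the commit message reports for Spark) may reject such files.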
[Impala-ASF-CR] IMPALA-10166 ALTER TABLE for Iceberg tables
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 ALTER TABLE for Iceberg tables .. Patch Set 3: > (1 comment) Sorry I didn't make it clear. I mean 'RENAME TABLE' here, not rename column. According to your advice, I will keep ADD COLUMNS/RENAME TABLE/SET TBL_PROPERTIES in this patch. -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 14:26:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10166 ALTER TABLE for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 ALTER TABLE for Iceberg tables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/16606/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16606/3//COMMIT_MSG@11 PS3, Line 11: * ADD COLUMNS : * DROP COLUMN : * REPLACE COLUMNS : * ALTER COLUMN CHANGE COLUMN > Thanks Zoltan, it indeed a problem. If you insist supporting for ALTER TABL Yeah, it makes sense to me. I think we cannot support both RENAME and DROP because if those are combined, we cannot find the affected columns by name or by position. DROP also interacts weirdly with ADD COLUMN. E.g. DROP + ADD column with the same name is a troublesome edge case. So I think we should add support for DROP after step 2. -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 13:17:02 + Gerrit-HasComments: Yes
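The "DROP + ADD column with the same name" edge case from the comment above can be made concrete with a toy model. This is only a sketch: schemas are modeled as lists of hypothetical `(field_id, name)` pairs, the two resolver functions are illustrative names, and none of this is Impala's actual column resolution code.

```python
from typing import Dict, List, Optional, Tuple

Schema = List[Tuple[int, str]]  # (field_id, column name) pairs


def resolve_by_name(table_cols: Schema, file_cols: Schema) -> Dict[str, Optional[int]]:
    """Map each table column to the file column that has the same name."""
    by_name = {name: fid for fid, name in file_cols}
    return {name: by_name.get(name) for _, name in table_cols}


def resolve_by_field_id(table_cols: Schema, file_cols: Schema) -> Dict[str, Optional[int]]:
    """Map each table column to the file column that has the same field id."""
    file_ids = {fid for fid, _ in file_cols}
    return {name: (fid if fid in file_ids else None) for fid, name in table_cols}


# A data file written before the schema change: columns a (id 1) and b (id 2).
old_file: Schema = [(1, "a"), (2, "b")]
# Table schema after DROP COLUMN b followed by ADD COLUMN b (assumed new id 3).
new_table: Schema = [(1, "a"), (3, "b")]

print(resolve_by_name(new_table, old_file))
print(resolve_by_field_id(new_table, old_file))
```

With name-based resolution the re-added column `b` is matched to the dropped column's data in the old file, while id-based resolution finds no match for it (`None`, i.e. the column would read as missing for that file).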
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7495/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 13:05:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10166 ALTER TABLE for Iceberg tables
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16606 ) Change subject: IMPALA-10166 ALTER TABLE for Iceberg tables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/16606/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16606/3//COMMIT_MSG@11 PS3, Line 11: * ADD COLUMNS : * DROP COLUMN : * REPLACE COLUMNS : * ALTER COLUMN CHANGE COLUMN > Iceberg assigns a unique field id to each fields to support schema evolutio Thanks Zoltan, it is indeed a problem. If you insist on supporting only ALTER TABLE statements that don't make the table unreadable to Impala, I will split this patch into these sub-tasks: 1. Keep ADD COLUMNS/RENAME/SET TBLPROPERTIES in this patch as part 1. Maybe we can also include the DROP COLUMN statement. 2. Implement a new option such as ‘ICEBERG_FIELD’ in TParquetFallbackSchemaResolution in another patch as part 2, so we can use the field id to resolve columns. 3. After we finish task 2, we can support REPLACE/ALTER in a new patch as part 3. Maybe we can also merge task 2 and task 3 into one task. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/16606 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5104cc47c7b42dacdb52983f503cd263135d6bfc Gerrit-Change-Number: 16606 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Oct 2020 12:45:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16449 to look at the new patch set (#11). Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. IMPALA-10168: Expose JSON catalog objects in catalogd's debug page Catalogd has a debug page at '/catalog_object' showing catalog objects in thrift debug strings. It's inconvenient for tests to parse the thrift string and get interesting infos. This patch extends this page to support returning JSON results, which eases tests to extract complex infos from the catalog objects, e.g. partition ids of a hdfs table. Just like getting json results from other pages, the usage is adding a ‘json’ argument in the URL, e.g. http://localhost:25020/catalog_object?json&object_type=TABLE&object_name=db1.tbl1 Implementation: Csaba helped to find that Thrift has a protocol, TSimpleJSONProtocol, which can convert thrift objects to human readable JSON strings. This simplifies the implementation a lot. However, TSimpleJSONProtocol is not implemented in cpp yet (THRIFT-2476). So we do the conversion in FE to use its java implementation. Tests: - Add tests to verify json fields existence. 
Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M fe/src/main/java/org/apache/impala/service/JniCatalog.java M tests/webserver/test_web_pages.py 5 files changed, 96 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16449/11 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
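The URL form given in the commit message above could be used from a test roughly as follows. This is a minimal sketch: the helper names `catalog_object_json_url` and `fetch_catalog_object` are hypothetical, the default host and port are simply those of the example URL, and it assumes the response body is a single JSON document. The layout of the returned JSON is not described in this thread, so no field names are assumed here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def catalog_object_json_url(object_type: str, object_name: str,
                            host: str = "localhost", port: int = 25020) -> str:
    """Build the '/catalog_object' debug page URL with the 'json' argument."""
    return ("http://{0}:{1}/catalog_object?json&object_type={2}&object_name={3}"
            .format(host, port, quote(object_type), quote(object_name)))


def fetch_catalog_object(object_type: str, object_name: str,
                         host: str = "localhost", port: int = 25020):
    """Fetch and parse the JSON result; needs a running catalogd to succeed."""
    url = catalog_object_json_url(object_type, object_name, host, port)
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(catalog_object_json_url("TABLE", "db1.tbl1"))
```

Only the URL builder runs without a cluster; `fetch_catalog_object` is shown to indicate how a test might consume the page once a catalogd is listening on the given address.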
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7494/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 11:45:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 11:27:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6589/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 11:27:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 9: Code-Review+2 (3 comments) Thanks for Csaba's help on this! Carrying over your +2. http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG@15 PS8, Line 15: interestin > nit: interesting Done http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG@24 PS8, Line 24: helpe > nit: helped Done http://gerrit.cloudera.org:8080/#/c/16449/8/be/src/catalog/catalog.h File be/src/catalog/catalog.h: http://gerrit.cloudera.org:8080/#/c/16449/8/be/src/catalog/catalog.h@75 PS8, Line 75: Like > nit: Like Done -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 11:26:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16449 to look at the new patch set (#9). Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. IMPALA-10168: Expose JSON catalog objects in catalogd's debug page Catalogd has a debug page at '/catalog_object' showing catalog objects in thrift debug strings. It's inconvenient for tests to parse the thrift string and get interested infos. Catalogd has a debug page at '/catalog_object' showing catalog objects in thrift debug strings. It's inconvenient for tests to parse the thrift string and get interesting infos. This patch extends this page to support returning JSON results, which eases tests to extract complex infos from the catalog objects, e.g. partition ids of a hdfs table. Just like getting json results from other pages, the usage is adding a ‘json’ argument in the URL, e.g. http://localhost:25020/catalog_object?json&object_type=TABLE&object_name=db1.tbl1 Implementation: Csaba helped to find that Thrift has a protocol, TSimpleJSONProtocol, which can convert thrift objects to human readable JSON strings. This simplifies the implementation a lot. However, TSimpleJSONProtocol is not implemented in cpp yet (THRIFT-2476). So we do the conversion in FE to use its java implementation. Tests: - Add tests to verify json fields existence. 
Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M fe/src/main/java/org/apache/impala/service/JniCatalog.java M tests/webserver/test_web_pages.py 5 files changed, 96 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16449/9 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 8: Code-Review+2 (3 comments) Just a few nits. http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG@15 PS8, Line 15: interested nit: interesting http://gerrit.cloudera.org:8080/#/c/16449/8//COMMIT_MSG@24 PS8, Line 24: helps nit: helped http://gerrit.cloudera.org:8080/#/c/16449/8/be/src/catalog/catalog.h File be/src/catalog/catalog.h: http://gerrit.cloudera.org:8080/#/c/16449/8/be/src/catalog/catalog.h@75 PS8, Line 75: Likes nit: Like -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 11:13:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10256: Skip test_disable_incremental_metadata_updates on S3 tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16616 ) Change subject: IMPALA-10256: Skip test_disable_incremental_metadata_updates on S3 tests .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7493/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16616 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0b922de84cff0a1e0771d5a8470bdd9f153f85f0 Gerrit-Change-Number: 16616 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 20 Oct 2020 07:46:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16449 ) Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7491/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 20 Oct 2020 07:44:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10075: Reuse unchanged partition instances
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16392 ) Change subject: IMPALA-10075: Reuse unchanged partition instances .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7492/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2dd645c260d271291021e52fdac4b74924df1170 Gerrit-Change-Number: 16392 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 20 Oct 2020 07:44:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10256: Skip test disable incremental metadata updates on S3 tests
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16616 ) Change subject: IMPALA-10256: Skip test_disable_incremental_metadata_updates on S3 tests .. IMPALA-10256: Skip test_disable_incremental_metadata_updates on S3 tests IMPALA-10113 adds a test for disabling the incremental_metadata_updates flag to verify that metadata propagation still works correctly. The test invokes two test files which are used in metadata/test_ddl.py. One of them covers HDFS caching and should only be run on the HDFS file system, so we should mark the test with "SkipIf.not_hdfs". Tests: - Run CORE test on S3 build. Change-Id: I0b922de84cff0a1e0771d5a8470bdd9f153f85f0 --- M tests/custom_cluster/test_disable_features.py 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/16616/1 -- To view, visit http://gerrit.cloudera.org:8080/16616 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0b922de84cff0a1e0771d5a8470bdd9f153f85f0 Gerrit-Change-Number: 16616 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang
[Impala-ASF-CR] IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16449 to look at the new patch set (#8). Change subject: IMPALA-10168: Expose JSON catalog objects in catalogd's debug page .. IMPALA-10168: Expose JSON catalog objects in catalogd's debug page Catalogd has a debug page at '/catalog_object' showing catalog objects in thrift debug strings. It's inconvenient for tests to parse the thrift string and get interested infos. This patch extends this page to support returning JSON results, which eases tests to extract complex infos from the catalog objects, e.g. partition ids of a hdfs table. Just like getting json results from other pages, the usage is adding a ‘json’ argument in the URL, e.g. http://localhost:25020/catalog_object?json&object_type=TABLE&object_name=db1.tbl1 Implementation: Csaba helps to find that Thrift has a protocol, TSimpleJSONProtocol, which can convert thrift objects to human readable JSON strings. This simplifies the implementation a lot. However, TSimpleJSONProtocol is not implemented in cpp yet (THRIFT-2476). So we do the conversion in FE to use its java implementation. Tests: - Add tests to verify json fields existence. 
Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M fe/src/main/java/org/apache/impala/service/JniCatalog.java M tests/webserver/test_web_pages.py 5 files changed, 96 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16449/8 -- To view, visit http://gerrit.cloudera.org:8080/16449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15f256b4e3f5206c7140746694106e03b0a4ad92 Gerrit-Change-Number: 16449 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
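For readers who want to try the JSON variant of the page described in the commit message above, the following is a minimal sketch of how a test might build the request URL and fetch the result. It is not code from the patch: the helper names catalog_object_url and fetch_catalog_object are hypothetical, the host and port are taken from the example URL in the message, and the layout of the returned JSON is deliberately not assumed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def catalog_object_url(host, port, object_type, object_name):
    """Build the /catalog_object URL with the bare 'json' argument.

    'json' is a flag without a value, so it is written by hand rather
    than passed through urlencode (which would emit 'json=').
    """
    query = urlencode({"object_type": object_type, "object_name": object_name})
    return "http://{0}:{1}/catalog_object?json&{2}".format(host, port, query)


def fetch_catalog_object(host, port, object_type, object_name):
    """Fetch the page and decode the response body as JSON.

    Hypothetical helper: requires a running catalogd, and the structure
    of the decoded object is whatever the server returns.
    """
    url = catalog_object_url(host, port, object_type, object_name)
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; no network access is attempted here.
    print(catalog_object_url("localhost", 25020, "TABLE", "db1.tbl1"))
```

Calling fetch_catalog_object("localhost", 25020, "TABLE", "db1.tbl1") against a live catalogd would return the decoded JSON, from which a test could then pick out the fields it cares about.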
[Impala-ASF-CR] IMPALA-10075: Reuse unchanged partition instances
Hello Qifan Chen, Vihang Karajgaonkar, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16392 to look at the new patch set (#9). Change subject: IMPALA-10075: Reuse unchanged partition instances .. IMPALA-10075: Reuse unchanged partition instances Currently, we always update the partition instance when we reload a partition. If a partition remains the same after reloading, we should reuse the old partition instance. So we won't send redundant updates on these partitions. This reduces the size of the catalog topic update. When a huge table is REFRESHed, catalogd only propagates the changed partitions. Tests: - Add tests to verify that partition instances are reused after some DDL/DMLs. Change-Id: I2dd645c260d271291021e52fdac4b74924df1170 --- M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java A tests/metadata/test_reuse_partitions.py 5 files changed, 198 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16392/9 -- To view, visit http://gerrit.cloudera.org:8080/16392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2dd645c260d271291021e52fdac4b74924df1170 Gerrit-Change-Number: 16392 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
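The idea in the commit message above (keep the old partition instance when a reload produces an identical one, so that only changed partitions are propagated) can be illustrated with a small sketch. This is not the Java logic from HdfsTable or HdfsPartition; the Partition fields and the merge_reloaded helper are hypothetical and exist only to show the reuse-by-equality pattern.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Partition:
    # Hypothetical, simplified stand-in for a partition's metadata.
    name: str
    location: str
    file_names: Tuple[str, ...]


def merge_reloaded(
    old: Dict[str, Partition], reloaded: Dict[str, Partition]
) -> Tuple[Dict[str, Partition], List[str]]:
    """Return the merged partition map and the names that actually changed.

    If a reloaded partition equals the previous one, the previous
    instance is kept, so it does not count as an update.
    """
    merged: Dict[str, Partition] = {}
    changed: List[str] = []
    for name, fresh in reloaded.items():
        previous = old.get(name)
        if previous is not None and previous == fresh:
            merged[name] = previous  # reuse the unchanged instance
        else:
            merged[name] = fresh  # new or modified partition
            changed.append(name)
    return merged, changed
```

In this sketch only the names in the returned changed list would need to be sent in an update, which mirrors the stated goal of shrinking the catalog topic update after a REFRESH of a large table.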