[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/17244/11/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/17244/11/tests/common/impala_test_suite.py@156 PS11, Line 156: create_hive_client nit: It may be more informative to change the name of this method to create_hms_client(). -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 24 Apr 2021 19:49:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Reviewed-on: http://gerrit.cloudera.org:8080/17244 Reviewed-by: Quanlong Huang Tested-by: Vihang Karajgaonkar --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,398 insertions(+), 22 deletions(-) Approvals: Quanlong Huang: Looks good to me, approved Vihang Karajgaonkar: Verified -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Verified+1 Carrying forward the Verified +1 vote from PS9 since there was only a new comment added to a test between PS9 and PS10. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 01:01:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:46:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:39:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8517/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 20:13:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/17244/10/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/10/tests/custom_cluster/test_metastore_service.py@215 PS10, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:53:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,398 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/10 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS8, Line 157: + " is transactional but it was requested without providing validWriteIdList"); > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py@277 PS6, Line 277: catalog_hms_client.create_table(self.__get_test_tbl(new_db_name, new_tbl_name, > It'd be helpful to leave a comment that "this won't trigger table metadata Done. Yes, DDL handling needs more thought. I think it would be cleaner to rely on the events to update the catalogd instead of out of band add/remove of the tables which could cause failures when same table is added or removed at a high frequency. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:52:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8516/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:14:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8515/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:10:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7051/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS8, Line 157: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/8/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/8/tests/custom_cluster/test_metastore_service.py@215 PS8, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: Latest patch fixes the test failure which was due to unavailable port to start the catalogd's HMS endpoint. The test was modified to use a free port instead of a hard-coded one. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,395 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/8 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/7/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/7/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS7, Line 157: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/7/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/7/tests/custom_cluster/test_metastore_service.py@215 PS7, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:50:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,395 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/7 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7047/ One of the tests which failed was fixed in https://gerrit.cloudera.org/#/c/17248/. I rebase to that change. Looking into the other failure. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 17:40:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7047/ -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 03:59:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: Code-Review+2 (1 comment) LGTM. Thanks for addressing the comments! http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py@277 PS6, Line 277: catalog_hms_client.create_table(self.__get_test_tbl(new_db_name, new_tbl_name, It'd be helpful to leave a comment that "this won't trigger table metadata loading in catalogd since it just forward the DDL to HMS". BTW, should this be an enhancement? i.e. also create an IncompleteTable for it in catalog after the forwarded DDL succeeds? -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 01:02:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8509/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 22:21:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7047/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 22:16:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/6/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/6/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS6, Line 157: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py@215 PS6, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 22:01:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: gerrit-verify-dryrun-external -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 22:00:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,328 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/6 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8508/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 21:39:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 5: Code-Review+1 Thanks for making the changes. Updated patch set LGTM. Doing a +1 from my side and will let Quanlong take another look for his review comments. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 21:28:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/17244/5/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/5/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@155 PS5, Line 155: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 21:21:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,334 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/5 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 3: (23 comments) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141 PS3, Line 141: deepCopy > I think this won't have significant impact on performance. But if we do so, I think what you are saying makes sense. May be a better way for now is to just get rid of the deepCopy and add a comment here to make sure in case this assumption changes in the future, we should make a copy of the table/partition before modifying it here. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: > I am curious about this value..should this depend on the number of files fo Yes, ideally it should be proportional to the number of directories. We don't have good heuristics on how much do we typically take to load a given number of directories. I guess it also would be different for different filesystems. Right now this is only for informational purposes. I can add a TODO for this as of now if that is okay with you. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: > nit: usually we append _MS if the time units is ms. It is done for the exte renamed to _MS http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@126 PS4, Line 126: > The comment above says to check the processor engine is set to Impala but t Added a check to make sure that the engine is set to Impala. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@135 PS4, Line 135: > For transactional tables, should we have a precondition that the ValidWrite Good point. However, at this point we don't know if the table is transactional or not. I added a check later down below to make sure that if the table is transactional, we have a ValidWriteIdList for it in the request. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@205 PS4, Line 205: > nit: the defaultCatalog is missing in the params. Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@225 PS4, Line 225: > nit: move this TODO to line 275. This method is different way to get a list of partitions which match a given partition expression. In case of getPartitionsByNames the client is expected to provide list of names while in this case the client provides a expression like partKey != null && partKey > 10. The method first retrieves all the partitions and then returns the ones which match the expression below. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@299 PS4, Line 299: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@312 PS4, Line 312: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@365 PS4, Line 365: > Need to update this as well. Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@414 PS4, Line 414: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@491 PS4, Line 491: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3318 PS4, Line 3318: } > nit: keep the blank line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java@50 PS4, Line 50: > nit: remove 'the' Done ht
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141 PS3, Line 141: > Thanks for catching this. Yes, it doesn't look like this is necessary. I wa I think this won't have significant impact on performance. But if we do so, it makes sense to do the same for partitions in getPartitionsByNames() as well. Deep copying all partitions could impact performance for large tables. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@225 PS4, Line 225: // TODO add file-metadata to Partitions nit: move this TODO to line 275. Is the common pattern is using this method to get the filtered partition names, and then use getPartitionsByNames to get he file metadata? Or are there other use cases that we don't need the file-metadata? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@299 PS4, Line 299: if (condition) { : return; : } nit: one line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@312 PS4, Line 312: if (catalogName == null) { : return; : } nit: one line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@365 PS4, Line 365: //TODO add table id here when client passes it (CDPD-14901) Need to update this as well. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@414 PS4, Line 414: if (networkAddresses.isEmpty()) { : return result; : } nit: one line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@491 PS4, Line 491: if (getPartsResult.getPartitionsSize() == 0) { : return; : } nit: one line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3318 PS4, Line 3318: } nit: keep the blank line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@105 PS4, Line 105: public void preServe() { : : } nit: one line http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@182 PS4, Line 182: TODO Add SASL and ssl support File an upstream JIRA for this? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@205 PS4, Line 205: //TODO add config for this? : boolean useCompactProtocol = false; Will setting this to true help on large tables? Do you plan to file an upstream JIRA? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@265 PS4, Line 265:* @return nit: remove this http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141 PS3, Line 141: import org.apache.hadoop.hive.metastore.api.LockRequest; > My IDE tells me that this is not a unused import. Some of the methods below hmm, I simply search "InvalidPartitionException" but can't find one... -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Num
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: FALLBACK_FILE_MD_TIME_WARN_THRESHOLD nit: usually we append _MS if the time units is ms. It is done for the external config options but it is a good convention for internal constants as well. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: 100 I am curious about this value..should this depend on the number of files for which the metadata needs to be loaded ? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@126 PS4, Line 126: checkCondition(getTableRequest.getEngine() != null, "Column stats are requested " The comment above says to check the processor engine is set to Impala but this is only checking that engine is non-null. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@135 PS4, Line 135: if (getTableRequest.isSetValidWriteIdList()) { For transactional tables, should we have a precondition that the ValidWriteIdList is non-null ? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@205 PS4, Line 205:* @param catalog The catalog instance which caches the table. nit: the defaultCatalog is missing in the params. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java@50 PS4, Line 50: the nit: remove 'the' http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@57 PS4, Line 57: //TODO add It says TODO here but if --enable_catalogd_hms_cache is true, this patch does serve certain requests from the cache, right ? http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@250 PS4, Line 250: metaserver nit: At other places you have used 'metastore server' or 'metastore service'. Would be good to use the same here. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@146 PS4, Line 146: return super.get_partitions_by_names_req(getPartitionsByNamesRequest); This doesn't do any logging of the request (unlike the get_partition_by_expr) ? Would be good to have the INFO log entry. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 00:53:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8502/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 05 Apr 2021 22:55:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1342 PS4, Line 1342: CatalogHmsAPIHelper.loadAndSetFileMetadataFromFs(validTxnList, validWriteIdList, result); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java@25 PS4, Line 25: import static org.apache.impala.catalog.metastore.CatalogHmsFileMetadataTest.assertFdsAreSame; line too long (94 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 05 Apr 2021 22:35:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 23 files changed, 5,313 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/4 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 3: (30 comments) http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG@7 PS3, Line 7: > nit: no whitespace Done http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc@94 PS3, Line 94: hms_port_number > hms_port Done http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc@352 PS3, Line 352: " it will be redirected to HMS."); > nit: add a blank line here Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@44 PS3, Line 44: import org.apache.hadoop.hive.common.ValidReaderWriteIdList; > nit: unused import Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@87 PS3, Line 87: CatalogHMSAPIHelper > nit: I think our naming rules prefer "CatalogHmsApiHelper" Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@96 PS3, Line 96: maxNonHdfsPartsParallelLoad > Should this be maxHdfsPartsParallelLoad(), or we just choose this because i Yes, I thought of just reusing the larger size of the 2 pools since this is a static pool which is shared by all the API invocations in case of cache misses. I have a TODO here to have it separately configurable. Perhaps I should just use a constant of 20 to avoid confusion. Any thoughts? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@121 PS3, Line 121: GetPartialCatalogObjectRequestBuilder reqBuilder = new GetPartialCatalogObjectRequestBuilder() > line too long (98 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@138 PS3, Line 138: CDPD-14901 > File an upstream JIRA for this? Thanks for spotting this. Updated the comment with the latest observations about this. http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141 PS3, Line 141: deepCopy > Could you add a comment for deepCopy() here? We already deep copy the hms t Thanks for catching this. Yes, it doesn't look like this is necessary. I was making a copy a bit pessimistically since that would mean that we are using implementation knowledge of the API. The copy was necessary since the table was being modified later down below. I added a comment explaining the same but I would be okay to remove the deepCopy too if you think that is an overkill. http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@225 PS3, Line 225: GetPartialCatalogObjectRequestBuilder catalogReq = new GetPartialCatalogObjectRequestBuilder() > line too long (98 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@519 PS3, Line 519: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@523 PS3, Line 523: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java@46 PS3, Line 46: CatalogHMSClientUtils > nit: CatalogHmsClientUtils? Done http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141 PS3, Line 141: import org.apache.hadoop.hive.metastore.api.InvalidPartitionException; > nit: unused import My IDE tells me that this is not a unused import. Some of the methods below throw this exception
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@44 PS3, Line 44: import org.apache.hadoop.hive.common.ValidReaderWriteIdList; nit: unused import http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141 PS3, Line 141: deepCopy Could you add a comment for deepCopy() here? We already deep copy the hms table object in Table.getPartialInfo(): https://github.com/apache/impala/blob/22d1f8c7fe14e636d9186f69cf156851879f37b3/fe/src/main/java/org/apache/impala/catalog/Table.java#L661 Maybe we don't need this? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141 PS3, Line 141: import org.apache.hadoop.hive.metastore.api.InvalidPartitionException; nit: unused import http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@222 PS3, Line 222: import org.apache.hadoop.hive.metastore.api.UnknownPartitionException; nit: unused import http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@264 PS3, Line 264: import org.apache.impala.catalog.CatalogException; nit: unused import http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@751 PS3, Line 751: info nit: Will this too verbose? Consider "trace"? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@759 PS3, Line 759: Table tbl = result.getTable(); : boolean isTransactional = tbl.getParameters() != null && AcidUtils : .isTransactionalTable(tbl.getParameters()); nit: move these after the following if-clause http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@778 PS3, Line 778: setFileMetadata At the first glance, I thought this is setting the file metadata using those cached in catalogd. However, it always triggers file metadata loading. Can we rename this to "loadFileMetadata" or "loadAndSetFileMetadata"? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1256 PS3, Line 1256: If the hive-exec jar is not present in the classpath, we fall-back to HMS since :* Catalog has no way to deserialize the expression sent over by the client. It seems we just forward the call. Are we missing some codes here? EDIT: Seems copied from CatalogMetastoreServiceHandler.get_partitions_by_expr(). Maybe we can remove these comments. http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1300 PS3, Line 1300: if (fallBackToHMSOnErrors_) { : return; : } nit: Many simple if-clauses in this patch are not in our code style. I think we prefer if (fallBackToHMSOnErrors_) return; It'd be better to keep the same code style. But if these codes are copied from Hive, please ignore this. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 05 Apr 2021 10:14:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG@7 PS3, Line 7: nit: no whitespace http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc@94 PS3, Line 94: hms_port_number hms_port http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc@352 PS3, Line 352: " it will be redirected to HMS."); nit: add a blank line here http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@87 PS3, Line 87: CatalogHMSAPIHelper nit: I think our naming rules prefer "CatalogHmsApiHelper" http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@96 PS3, Line 96: maxNonHdfsPartsParallelLoad Should this be maxHdfsPartsParallelLoad(), or we just choose this because it defaults to 20? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@138 PS3, Line 138: CDPD-14901 File an upstream JIRA for this? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java@46 PS3, Line 46: CatalogHMSClientUtils nit: CatalogHmsClientUtils? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java@50 PS3, Line 50: CatalogHMSFileMetadataTest nit: CatalogHmsFileMetadataTest? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java@50 PS3, Line 50: EnableCatalogdHMSCacheFlagTest nit: EnableCatalogdHmsCacheFlagTest? http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@67 PS3, Line 67: startHMS nit: startHms? -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 02 Apr 2021 13:33:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8472/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 30 Mar 2021 23:51:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@121 PS3, Line 121: GetPartialCatalogObjectRequestBuilder reqBuilder = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@225 PS3, Line 225: GetPartialCatalogObjectRequestBuilder catalogReq = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@519 PS3, Line 519: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@523 PS3, Line 523: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1237 PS3, Line 1237: public GetPartitionsPsWithAuthResponse get_partitions_ps_with_auth_req(GetPartitionsPsWithAuthRequest req) line too long (107 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1632 PS3, Line 1632: throws NoSuchObjectException, InvalidObjectException, MetaException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1643 PS3, Line 1643: throws NoSuchObjectException, MetaException, InvalidObjectException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1654 PS3, Line 1654: throws NoSuchObjectException, MetaException, InvalidObjectException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1664 PS3, Line 1664: throws AlreadyExistsException, InvalidObjectException, MetaException, NoSuchObjectException, TException { line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@2435 PS3, Line 2435: public WMCreateOrDropTriggerToPoolMappingResponse create_or_drop_wm_trigger_to_pool_mapping( line too long (94 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 30 Mar 2021 23:22:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. IMPALA-10613 : Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift A fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 23 files changed, 5,542 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/3 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8471/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 30 Mar 2021 23:13:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@121 PS2, Line 121: GetPartialCatalogObjectRequestBuilder reqBuilder = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@225 PS2, Line 225: GetPartialCatalogObjectRequestBuilder catalogReq = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@519 PS2, Line 519: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@523 PS2, Line 523: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1600 PS2, Line 1600: throws NoSuchObjectException, InvalidObjectException, MetaException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1611 PS2, Line 1611: throws NoSuchObjectException, MetaException, InvalidObjectException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1622 PS2, Line 1622: throws NoSuchObjectException, MetaException, InvalidObjectException, InvalidInputException, TException { line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1632 PS2, Line 1632: throws AlreadyExistsException, InvalidObjectException, MetaException, NoSuchObjectException, TException { line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17244/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@2403 PS2, Line 2403: public WMCreateOrDropTriggerToPoolMappingResponse create_or_drop_wm_trigger_to_pool_mapping( line too long (94 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 30 Mar 2021 23:00:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. IMPALA-10613 : Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift A fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 23 files changed, 5,510 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/2 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 1: (14 comments) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3397 PS1, Line 3397: TGetPartialCatalogObjectRequest req, String tableLoadReason) throws CatalogException { > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@745 PS1, Line 745: new ParallelFileMetadataLoader(getFileSystem(), partBuilders, validWriteIds_, validTxnList, > line too long (95 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@746 PS1, Line 746: Utils.shouldRecursivelyListPartitions(this), getHostIndex(), debugActions, logPrefix) > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@199 PS1, Line 199: new Class[]{ThriftHiveMetastore.Iface.class, ICatalogMetastoreServer.class}, > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java@25 PS1, Line 25: import static org.apache.impala.catalog.metastore.CatalogHMSFileMetadataTest.assertFdsAreSame; > line too long (94 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/17244/1/tests/common/impala_test_suite.py@156 PS1, Line 156: @ > flake8: E303 too many blank lines (2) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@19 PS1, Line 19: import logging > flake8: F401 'logging' imported but unused Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@25 PS1, Line 25: from hive_metastore.ttypes import GetValidWriteIdsRequest > flake8: F401 'hive_metastore.ttypes.GetValidWriteIdsRequest' imported but u Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@230 PS1, Line 230: @ > flake8: E303 too many blank lines (2) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@294 PS1, Line 294: " > flake8: E501 line too long (91 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@405 PS1, Line 405: # > flake8: E265 block comment should start with '# ' Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@420 PS1, Line 420: t > flake8: E501 line too long (92 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@427 PS1, Line 427: _ > flake8: E501 line too long (96 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/17244/1/tests/custom_cluster/test_metastore_service.py@430 PS1, Line 430: ) > flake8: E501 line too long (92 > 90 characters) Done -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 30 Mar 2021 22:59:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8470/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 30 Mar 2021 22:57:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. Patch Set 1: (81 comments) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@118 PS1, Line 118: GetPartialCatalogObjectRequestBuilder reqBuilder = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@219 PS1, Line 219: GetPartialCatalogObjectRequestBuilder catalogReq = new GetPartialCatalogObjectRequestBuilder() line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@504 PS1, Line 504: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@508 PS1, Line 508: + "fallback path. Time taken: {} msec", getPartsResult.getPartitionsSize(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3397 PS1, Line 3397: TGetPartialCatalogObjectRequest req, String tableLoadReason) throws CatalogException { line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@745 PS1, Line 745: new ParallelFileMetadataLoader(getFileSystem(), partBuilders, validWriteIds_, validTxnList, line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@746 PS1, Line 746: Utils.shouldRecursivelyListPartitions(this), getHostIndex(), debugActions, logPrefix) line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@199 PS1, Line 199: new Class[]{ThriftHiveMetastore.Iface.class, ICatalogMetastoreServer.class}, line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@277 PS1, Line 277: * APIs that should be served from CatalogD must be overridden in {@link CatalogMetastoreServer}. line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@291 PS1, Line 291: protected static final String METAEXCEPTION_MSG_FORMAT = "Unexpected error occurred while" line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@416 PS1, Line 416: return client.getHiveClient().getThriftClient().get_database_req(getDatabaseRequest); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@521 PS1, Line 521: throws AlreadyExistsException, InvalidObjectException, MetaException, NoSuchObjectException, TException { line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@530 PS1, Line 530: throws AlreadyExistsException, InvalidObjectException, MetaException, NoSuchObjectException, TException { line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@544 PS1, Line 544: throws AlreadyExistsException, InvalidObjectException, MetaException, NoSuchObjectException, TException { line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17244/1/fe/src/main/java/org/apache/impala/
[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17244 Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog .. IMPALA-10613 : Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts a HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service. Except for the following 3 HMS APIs which can be used to request a table and its partitions. Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch doesn't break Catalog's HMS service. Testing: 1. Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift A fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHMSFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHMSCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 23 files changed, 5,444 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/1 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar