Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13375 )
Change subject: IMPALA-8504 (part 2): Support CREATE TABLE statement with Kudu/HMS integration ...................................................................... Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1893 PS3, Line 1893: * 1. For managed tables, if Kudu's integration with the Hive Metastore is not enabled, : * the Kudu table is first created in the Kudu storage engine, then in the HMS. : * Failure to add the table in HMS results in the table being dropped from Kudu. : * If enabled, after the table is created in the Kudu, relies on Kudu to create : * the table in the HMS. : * For external tables, only creates the tables in the HMS (regardless Kudu's : * integration with the Hive Metastore enabled or not). : * : * 2. And finally creates the table in the catalog cache. nit: the structure of this felt a bit strange since there are three non-partitioned cases to think about in #1. Perhaps make that distinction a bit clearer as (+ some minor grammar tweaks): For managed tables: - If Kudu's integration with the Hive Metastore is not enabled, the Kudu table is first created in Kudu, then in the HMS. - Otherwise, when the table is created in Kudu, we rely on Kudu to have created the table in the HMS. For external tables: - We only create the table in the HMS (regardless of Kudu's integration with the Hive Metastore). After the above is complete, we create the table in the catalog cache. It'd also be nice if the code followed this structure. http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1909 PS3, Line 1909: // Check if Kudu's integration with the Hive Metastore is enabled, and validate : // the configuration. : String masterHosts = newTable.getParameters().get(KuduTable.KEY_MASTER_HOSTS); : String hmsUris; : try { : hmsUris = MetaStoreUtil.getHiveMetastoreUrisKeyValue( : catalog_.getMetaStoreClient().getHiveClient()); : } catch (Exception e) { : throw new RuntimeException(String.format("Failed to get the Hive Metastore " + : "configuration for table '%s' ", newTable.getTableName()), e); : } : boolean isHMSIntegrationEnabled = : KuduTable.isHMSIntegrationEnabledAndValidate(masterHosts, hmsUris); Shouldn't be not be checking this in the case of an external table? http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@99 PS3, Line 99: (Validation of Kudu is configured to the same : // Hive Metastore as Impala should have been done before this). It's not clear where this is happening. Might be worth noting. Alternatively take this sentence out; not sure how useful it is. And presumably we'll have more of these calls in other DDL, so not worth repeating everywhere. http://gerrit.cloudera.org:8080/#/c/13375/3/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@283 PS3, Line 283: // External table should not have table ID. Not your fault, but it isn't quite obvious that this is an external table. Could you add a precheck for that here? And maybe update the comment explaining that this should only be used for external tables? http://gerrit.cloudera.org:8080/#/c/13375/3/tests/custom_cluster/test_kudu.py File tests/custom_cluster/test_kudu.py: http://gerrit.cloudera.org:8080/#/c/13375/3/tests/custom_cluster/test_kudu.py@116 PS3, Line 116: class TestKuduHMSIntegration(CustomClusterTestSuite, KuduTestSuite): Could you add a brief description of this test, similar to L99? http://gerrit.cloudera.org:8080/#/c/13375/3/tests/custom_cluster/test_kudu.py@128 PS3, Line 128: self.run_test_case('QueryTest/kudu_create', vector, use_db=unique_database) What scenarios is this testing? Hard to tell without being more familiar with the test fixtures. Maybe add a comment? Does this cover managed _and_ external tables? -- To view, visit http://gerrit.cloudera.org:8080/13375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icffe412395f47f5e07d97bad457020770cfa7502 Gerrit-Change-Number: 13375 Gerrit-PatchSet: 3 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Grant Henke <granthe...@apache.org> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Thomas Marshall <tmarsh...@cloudera.com> Gerrit-Comment-Date: Thu, 23 May 2019 18:53:54 +0000 Gerrit-HasComments: Yes