Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/14179 )
Change subject: [hms] Sync external HMS tables with purge property ...................................................................... Patch Set 2: (27 comments) http://gerrit.cloudera.org:8080/#/c/14179/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14179/1//COMMIT_MSG@12 PS1, Line 12: Further, in : Hive 4, non-transactional Hive tables will get auto-migrated : to external tables with `external.table.purge=true`. > So with this patch, HMS sync is compliant with Hive 3 but not with Hive 4? With this patch we should be compliant with Hive 2, 3, and 4. Because HMS sync will now synchronize the managed table type and external table type with the property `external.table.purge=true`. http://gerrit.cloudera.org:8080/#/c/14179/1//COMMIT_MSG@17 PS1, Line 17: synchronize > synchronized Done http://gerrit.cloudera.org:8080/#/c/14179/1//COMMIT_MSG@26 PS1, Line 26: manag > managed Done http://gerrit.cloudera.org:8080/#/c/14179/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14179/2//COMMIT_MSG@13 PS2, Line 13: Hive tables > Does it matter if they are managed or external originally? external tables don't need to be migrated. The migration here is from managed to external tables. http://gerrit.cloudera.org:8080/#/c/14179/1/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java File java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java: http://gerrit.cloudera.org:8080/#/c/14179/1/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@187 PS1, Line 187: oldTableType, newTable. > Should use oldTableType here. Done http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java File java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java: http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@183 PS2, Line 183: // The Kudu master is allowed to make these changes if necessary as it is a trusted user. > Is this in preparation for introducing tooling to migrate managed-->externa Yeah, there are a few reasons for this. The first was to support testing. In versions where the auto-migration code in Hive doesn't exist, pre-hive4, I need to be able to create and alter external purge tables to test the handling. In the future we may want to migrate managed tables to external purge tables using the tools and this will be useful for that too. http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@194 PS2, Line 194: !TableType.MANAGED_TABLE.name().equals(oldTableType) && : !(isExternalTable(oldTable) && > Isn't some or all of this taken care of by isPurgeTable? replaced with `isSynchronizedTable` http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@279 PS2, Line 279: d. > nit: "purging" isn't a well-defined concept in Kudu, and so coming back to Done http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@285 PS2, Line 285: isPurgeTable > nit: Would it make sense to call this "isSynchronizedTable"? We already use This is slightly more generic than checking for synchronized because it doesn't check for "external". I did this in case Hive chooses to use the purge property for other things in the future. This function just indicates tables that are purged, either via being managed or the property. I added an isSynchronizedTable function. http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java File java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java: http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@20 PS2, Line 20: import static org.hamcrest.CoreMatchers.containsString; > nit: not used? Done http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@233 PS2, Line 233: altering > Did this test alter the table? What is the rationale behind the test? I moved this more basic validation up from below. http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@265 PS2, Line 265: > nit: why 3 lines of space here? Done http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@297 PS2, Line 297: a external type > nit: "an external, synchronized table"? It's actually not required that it's synchronized. This is just show its type can be changed. http://gerrit.cloudera.org:8080/#/c/14179/2/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@319 PS2, Line 319: { > Generally this is used to define a namespace so we can use the name `altere Primarily this is used in this test to scope the alteredTable variable within the {}. Without them, variable names would need to be unique or we would need to reuse variables. http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/hms/hms_catalog.cc File src/kudu/hms/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/hms/hms_catalog.cc@215 PS2, Line 215: // Remove the Kudu-specific field 'kudu.table_id'. > Removing the managed table validation should not bring any issues, as it wi Correct, this is always removing the id if it exists. If it doesn't, it's a no-op. http://gerrit.cloudera.org:8080/#/c/14179/1/src/kudu/hms/hms_client.h File src/kudu/hms/hms_client.h: http://gerrit.cloudera.org:8080/#/c/14179/1/src/kudu/hms/hms_client.h@191 PS1, Line 191: // Returns true if the HMS table is a managed table or : // if the HMS is an external tabl > This should be reworded slightly to avoid ambiguity. As written it's not cl Done http://gerrit.cloudera.org:8080/#/c/14179/1/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/14179/1/src/kudu/hms/hms_client.cc@413 PS1, Line 413: (table.tableType == hms::HmsClient::kExternalTable && > Nit: indent by another two chars so the first part of the condition is nice Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/hms/hms_client.cc@410 PS2, Line 410: string > nit: could be a const ref. Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/integration-tests/hms_itest-base.cc File src/kudu/integration-tests/hms_itest-base.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/integration-tests/hms_itest-base.cc@168 PS2, Line 168: RETURN_NOT_OK(hms_client_->GetTable(database_name, table_name, &table)); > nit: spacing. Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/integration-tests/hms_itest-base.cc@175 PS2, Line 175: We also disable the HMS plugin validation to allow a managed table to : // be altered to an external table. > nit: I was a confused at first because I thought we were both pretending to Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/integration-tests/master_hms-itest.cc@438 PS2, Line 438: NO_FATALS(CheckTableDoesNotExist("default", "e")); > nit: spacing. Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc@3694 PS2, Line 3694: external table > nit: external table with purge key to be true. Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc@3694 PS2, Line 3694: external table > nit: or "external, synchronized table." Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc@3757 PS2, Line 3757: external table > nit: external table with purge key to be true (need to be synchronized). Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc@3757 PS2, Line 3757: external table > nit: or "external, synchronized table." Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/kudu-tool-test.cc@3760 PS2, Line 3760: SchemaBuilder().Build())); > We're missing the kExternalType here. Done http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/tool_action_hms.cc File src/kudu/tools/tool_action_hms.cc: http://gerrit.cloudera.org:8080/#/c/14179/2/src/kudu/tools/tool_action_hms.cc@369 PS2, Line 369: external_hms_tables_by_id > nit: maybe label this non_external_hms_tables_by_id? Maybe unsynchronized_hms_tables_by_id? That's sort of wrong too since we don't handle all table types, only managed and external. -- To view, visit http://gerrit.cloudera.org:8080/14179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I942515c015b1a4a63cf08383c331c4e19350c312 Gerrit-Change-Number: 14179 Gerrit-PatchSet: 2 Gerrit-Owner: Grant Henke <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Greg Solovyev <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 09 Sep 2019 14:35:44 +0000 Gerrit-HasComments: Yes
