Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10817 )
Change subject: KUDU-2191: support table-name identifiers with upper case chars
......................................................................

KUDU-2191: support table-name identifiers with upper case chars

Summary: When the HMS integration is enabled, Kudu now preserves table
name casing, but uses case-insensitive lookups to retrieve tables.

Background: The HMS lowercases all database (table) identifiers during
database (table) creation, storing only the lowercased version. On
database and table lookup the HMS automatically does a case-insensitive
compare. During table creation Kudu checks that table names are valid
UTF-8, and does no transformations on identifiers. During table lookups
Kudu requires that the table name match exactly, including case.

As a result of these behavior differences and the design of the
notification log listener, tables with upper-case characters cannot be
altered or deleted when the HMS integration is enabled.

This commit fixes this by changing how the Catalog Manager handles
identifiers when the HMS integration is enabled:

* During table creation, the Catalog Manager preserves the case of
  table names.
* On table lookup, the Catalog Manager does a case-insensitive
  comparison to find the table.

This is implemented by storing the preserved case in the table's
sys-catalog metadata entry, and storing a 'normalized' (down-cased)
identifier in the ephemeral by-name table map. The various parts of the
catalog manager which deal with the by-name map are converted to use the
normalized version of the name.

When the HMS integration is not configured, normalized table names are
equal to the original table name, so the behavior changes that this
patch introduces are entirely opt-in.

There is one edge case that complicates turning on the HMS integration
in rare circumstances: if there are existing (legacy) tables with names
which map to the same normalized form (e.g. differ only in case), the
catalog manager will fail to start up and instruct the operator to
rename the offending tables before trying again. Additionally, this
check only applies to tables that otherwise follow the Hive table naming
rules (matching regex '[\w_/]+\.[\w_/]+').

Change-Id: I18977d6fe7b2999a36681a728ac0d1e54b7f38cd
Reviewed-on: http://gerrit.cloudera.org:8080/10817
Reviewed-by: Adar Dembo <a...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/hms/hms_catalog-test.cc
M src/kudu/hms/hms_catalog.cc
M src/kudu/hms/hms_catalog.h
M src/kudu/hms/hms_client-test.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/master_hms-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
10 files changed, 421 insertions(+), 116 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I18977d6fe7b2999a36681a728ac0d1e54b7f38cd
Gerrit-Change-Number: 10817
Gerrit-PatchSet: 11
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
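
Illustrative note: the commit message describes a by-name map keyed on a normalized (down-cased) identifier, with the case-preserved name stored alongside it. The following is a minimal sketch of that idea only, not the code from the patch. The names NormalizeTableName and ByNameMap are hypothetical, and the sketch assumes ASCII-only down-casing, which may differ from what the Catalog Manager actually does for non-ASCII UTF-8 names.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper: when the HMS integration is disabled the name is
// returned unchanged, so the behavior stays opt-in. When enabled, ASCII
// letters are down-cased (an assumption of this sketch).
std::string NormalizeTableName(const std::string& name, bool hms_enabled) {
  if (!hms_enabled) return name;
  std::string out = name;
  std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return out;
}

// Hypothetical stand-in for the ephemeral by-name table map: keys are
// normalized names, values are the original case-preserved names.
class ByNameMap {
 public:
  explicit ByNameMap(bool hms_enabled) : hms_enabled_(hms_enabled) {}

  // Returns false if another table already maps to the same normalized
  // name, which is the conflict the startup check is described as reporting.
  bool Insert(const std::string& name) {
    return map_.emplace(NormalizeTableName(name, hms_enabled_), name).second;
  }

  // Returns a pointer to the case-preserved name, or nullptr if no table
  // matches. The lookup is case-insensitive only when HMS is enabled.
  const std::string* Lookup(const std::string& name) const {
    auto it = map_.find(NormalizeTableName(name, hms_enabled_));
    return it == map_.end() ? nullptr : &it->second;
  }

 private:
  bool hms_enabled_;
  std::unordered_map<std::string, std::string> map_;
};
```

With the integration enabled in this sketch, inserting "default.MyTable" and then looking up "DEFAULT.MYTABLE" returns the original "default.MyTable", while a second insert of "default.mytable" is rejected as a conflict. With the integration disabled, both names coexist and lookups remain exact-match.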