Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10817 )

Change subject: WIP: KUDU-2191: support table-name identifiers with upper case 
chars
......................................................................


Patch Set 4:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.h
File src/kudu/hms/hms_catalog.h:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.h@131
PS3, Line 131: only the
             :   // characters a-z, A-Z, 0-9, _, and /.
Nit: I think you can get by with "only ascii alphanumerics, _, and /"


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.h@134
PS3, Line 134: Normalizeed
Normalized


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.h@140
PS3, Line 140: table
Nit: "name" maybe? You've used 'table' for a hive::Table* in PopulateTable so I 
don't think it should be used for a std::string here.


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.h@162
PS3, Line 162:   static Status ParseTableName(const std::string& table,
I think StringPiece is a more idiomatic choice for "pointer to part of a 
string". It doesn't have a mutable_data() member (which you'll need for 
ToLowerCase) but you can add that.

Also, please update the function doc to explain what the lifecycles of 
hms_database and hms_table are now.

On second thought, why bother with this optimization? Why not just continue to 
return std::strings? This doesn't seem performance critical, so the extra 
lifecycle considerations don't seem worth it.


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.cc
File src/kudu/hms/hms_catalog.cc:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/hms/hms_catalog.cc@105
PS3, Line 105: '_', and '/'
> The HMS appears to always allow underscores, and allows forward slashes by
Should we at least warn users about how '/' is configurable in the HMS? Or is 
that too much detail?


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/integration-tests/master_hms-itest.cc
File src/kudu/integration-tests/master_hms-itest.cc:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/integration-tests/master_hms-itest.cc@477
PS3, Line 477: vector<string>(
Surprised you needed this. Below too.


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.h@766
PS3, Line 766: 'normalized_table_names_map_', 'table_ids_map_',
             :   // and 'tablet_map_'
Nit: feel free to make this vaguer so it needn't be updated whenever the maps 
change names.


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc@318
PS3, Line 318:       auto* existing = 
InsertOrReturnExisting(&catalog_manager_->normalized_table_names_map_,
I'm a little nervous about this; I think there may have been a good reason for 
why the code overwrites entries in table_names_map_, beyond consistency with 
the insertion into table_ids_map_ and tablet_map_.

I can't think of any now, though. Anything interesting in the git history?


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc@328
PS3, Line 328:             Substitute("$0 or $1 [id=$2]", 
(*existing)->ToString(), l.data().name(), table_id));
Wouldn't it be helpful to show the IDs of both tables?


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc@1513
PS3, Line 1513: );
> When HMS integration is enabled, maybe we should log the normalized  table
If you do that, update the log statement on L1525 and possibly elsewhere here 
too.


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc@2304
PS3, Line 2304:     // reserve the new table name, since it's already reserved 
by way of
Nit: normally I wouldn't advocate for a double negative, but I wonder if the 
condition would be more readable if it matched the comment:

  if (!(table_name != req.new_table_name() && normalized_table_name == 
normalized_new_table_name))


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/master/catalog_manager.cc@4794
PS3, Line 4794:   string normalized_table_name = table_name;
Why ignore_result()? AFAICT every place calling 
CatalogManager::NormalizeTableName could return a bad Status, so why not 
propagate this?


http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/10817/3/src/kudu/mini-cluster/external_mini_cluster.h@322
PS3, Line 322:   // Enable Hive Metastore integration.
             :   // Overrides HMS integration options set by 
ExternalMiniClusterOptions.
             :   void EnableMetastoreIntegration();
             :
             :   // Disable Hive Metastore integration.
             :   // Overrides HMS integration options set by 
ExternalMiniClusterOptions.
             :   void DisableMetastoreIntegration();
These should probably doc that the cluster must be restarted for the change to 
take effect.

Alternatively, perhaps it should be like SetDaemonBinPath and the cluster has 
to be stopped to even call these? Seems simpler for clients.



--
To view, visit http://gerrit.cloudera.org:8080/10817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18977d6fe7b2999a36681a728ac0d1e54b7f38cd
Gerrit-Change-Number: 10817
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 27 Jun 2018 17:43:23 +0000
Gerrit-HasComments: Yes

Reply via email to