Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8312 )

Change subject: KUDU-2191 (4/n): WIP: Hive Metastore catalog manager integration
......................................................................


Patch Set 3:

(11 comments)

curious about timeouts - I dont see any timeouts set here but I'm guessing 
they're necessary (what if you kill -STOP the minihms, what happens to our 
calls?)

http://gerrit.cloudera.org:8080/#/c/8312/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8312/3//COMMIT_MSG@23
PS3, Line 23: master_hms-itest before this is committed:
any plan for fault tests? eg crashing either the HMS or the master at some 
point in the middle of these sequences?


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc@231
PS3, Line 231:               "Addresses of Hive Metastore");
a bit more docs here, eg what the multiple addrs mean, etc. Also does the HMS 
expose more than one port? (eg an HTTP interface?) If so we should clarify that 
it's the Thrift addr


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc@1419
PS3, Line 1419:       SetupError(resp->mutable_error(), 
MasterErrorPB::HIVE_METASTORE_ERROR, s);
I think it's worth logging such errors as well, since it may be an end user 
that gets the response,  and then they'll probably complain to the admin who 
will look in the logs for more details.

We should probably be especially verbose in these logs because some failures at 
this point can cause orphaned pointers in the HMS (eg a timeout may be thrown 
even if the op succeeded)

(same below)


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc
File src/kudu/master/hms_catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@88
PS3, Line 88:                                const Schema& schema) {
worth a CHECK(running_) in these functions? otherwise it seems like they would 
just hang forever


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@198
PS3, Line 198:             "Kudu table name must contain only lower-case ASCII 
characters, "
no digits allowed? that's surprising


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@203
PS3, Line 203:   for (int idx = 0; idx < table.size(); idx++) {
I feel like this code could be written more concisely using some of the stuff 
from gutil/strings/util.h.

For example, you could use Split() on '.' and then ensure that it has 1 or 2 
results, and IsIdentifier() returns true for all parts. Or use 
AdvanceIdentifier up to twice? Or match the whole string against a class that 
includes '.' and then use strcount('.') to  see that there is at most one dot?

I think the 'split' is probably most straight forward


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@228
PS3, Line 228:   for (const HostPort& address : addresses_) {
I'd think we might need to store an index in the class so that we swap to the 
new HMS when one has an issue. Otherwise what if an HMS is "up" (ie accepts 
thrift connections) but times out all calls, or "up" but for some reason is 
unable to talk to its backing RDBMS or whatever? I'd guess we'd want to 
reconnect to the other one.


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@289
PS3, Line 289:         cond_.TimedWait(MonoDelta::FromMilliseconds(wait));
worth some logging here?


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@304
PS3, Line 304:       Status s = rpc.run(&client_.get());
I think wrapping this  in a TRACE_EVENT scope and a   LOG_IF_SLOW and such 
would  be helpful. Perhaps a TRACE message and/or a   TRACE_COUNTER_INCREMENT 
too to keep track of how  much time is spent in calls to HMS for a given 
CreateTable call. That will also require propagating the current Trace object 
into the 'Rpc' object.

If you'd rather defer this work please add a TODO. I'm happy to do it if you 
want to assign me the todo


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@308
PS3, Line 308:         client_ = boost::none;
is  there a more explicit Shutdown we should do on the client? or does calling 
the dtor like this suffice?


http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/util/net/net_util.h
File src/kudu/util/net/net_util.h:

http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/util/net/net_util.h@133
PS3, Line 133: bool ValidateAddressListFlag(const char* flag_name, const 
std::string& addr_list);
doc this? specifically I think this is meant to be used as a gflag VALIDATOR 
and thus the somewhat odd signature



--
To view, visit http://gerrit.cloudera.org:8080/8312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf
Gerrit-Change-Number: 8312
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 24 Oct 2017 21:20:47 +0000
Gerrit-HasComments: Yes

Reply via email to