Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13384 )

Change subject: hms: fix flakiness of master-stress-test from managed tables
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13384/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13384/1//COMMIT_MSG@22
PS1, Line 22: is retried by B
I would think it's being retried by the client that initiated the operation, no?


http://gerrit.cloudera.org:8080/#/c/13384/1/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/13384/1/src/kudu/integration-tests/master-stress-test.cc@279
PS1, Line 279:       if (hms_client_ && s.IsRemoteError()) {
             :         // Check that the table was not created in Kudu.
             :         client::sp::shared_ptr<KuduTable> table;
             :         s = client_->OpenTable(to_create, &table);
             :         CHECK(s.IsNotFound()) << s.ToString();
             :
             :         // Check that the table was not created in the HMS.
             :         Slice db;
             :         Slice table_name;
             :         CHECK_OK(ParseHiveTableIdentifier(to_create, &db, 
&table_name));
             :         hive::Table hms_table;
             :         Status s = hms_client_->GetTable(db.ToString(), 
table_name.ToString(), &hms_table);
             :         CHECK(s.IsNotFound()) << s.ToString();
I'm curious whether this piece might inadvertently hide issues in other 
scenarios.  Is there anything specific in the Status::RemoteError() in this 
case that we could use to make sure this outcome has happened due to the leader 
change and nothing else?


http://gerrit.cloudera.org:8080/#/c/13384/1/src/kudu/integration-tests/master-stress-test.cc@280
PS1, Line 280:         // Check that the table was not created in Kudu.
             :         client::sp::shared_ptr<KuduTable> table;
             :         s = client_->OpenTable(to_create, &table);
             :         CHECK(s.IsNotFound()) << s.ToString();
             :
             :         // Check that the table was not created in the HMS.
             :         Slice db;
             :         Slice table_name;
             :         CHECK_OK(ParseHiveTableIdentifier(to_create, &db, 
&table_name));
             :         hive::Table hms_table;
             :         Status s = hms_client_->GetTable(db.ToString(), 
table_name.ToString(), &hms_table);
             :         CHECK(s.IsNotFound()) << s.ToString();
Another this I'm curious is whether this check catches all the outcomes of the 
race described in the commit message.

For example, is it possible that by the time of this check the entry is still 
in the HMS and hasn't been rolled back by the scoped cleanup in the former 
master, and the table is seen as not-yet-existing in Kudu by the newly elected 
leader master?



--
To view, visit http://gerrit.cloudera.org:8080/13384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1233e5b133035d36f98ce08265c0273570caa454
Gerrit-Change-Number: 13384
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 21 May 2019 06:13:57 +0000
Gerrit-HasComments: Yes

Reply via email to