[kudu-CR] [compaction] KUDU-1400: Improve rowset compaction policy to consider merging small DRSs
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11869 ) Change subject: [compaction] KUDU-1400: Improve rowset compaction policy to consider merging small DRSs .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc File src/kudu/tablet/compaction_policy-test.cc: http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@381 PS4, Line 381: if (divisor > 2) { : ASSERT_EQ(rowsets.size(), picked.size()); > Yep, I understand the general case. That was more about having explicit sc Yeah, what I was trying to say in my previous comment was that the existing test cases exhaust most of the "wiggle room" on the constants. Adding a new test case doesn't specify the behavior more. -- To view, visit http://gerrit.cloudera.org:8080/11869 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b421c6ed77d28ebab9b91a4d6fcb1e825997e6c Gerrit-Change-Number: 11869 Gerrit-PatchSet: 4 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 20 Nov 2018 07:02:36 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11958 ) Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11958/3/src/kudu/tools/ksck_results.h File src/kudu/tools/ksck_results.h: http://gerrit.cloudera.org:8080/#/c/11958/3/src/kudu/tools/ksck_results.h@358 PS3, Line 358: in the latter (default) : // case, only a quartile summary of the counts will be printed. nit: maybe generalize this to just mention that it prints a summary of the replica counts, since the quartiles are somewhat an implementation detail, and this also doesn't mention outliers. Also the fact that PLAIN_CONCISE is the default probably doesn't need to be here (and maybe shouldn't, in case that changes?) -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 3 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 20 Nov 2018 07:44:13 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Hello Fengling Wang, Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Mitch Barnett, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11958 to look at the new patch set (#3). Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. [tools] ksck: Add information about replica counts to plain ksck output This adds some information about replica counts on tablet servers to the output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). It outputs a 5-number summary of the distribution of replicas and lists any outliers: Tablet Replica Count Summary Statistic| Replica Count +--- Minimum| 1646 First Quartile | 3672 Median | 4075 Third Quartile | 4242 Maximum| 4600 Tablet Replica Count Outliers Type | UUID | Host | Replica Count ---+--++--- Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 1646 In PLAIN_FULL mode it additionally outputs the replica count for every tablet server: Tablet Replica Count by Tablet Server UUID | Host | Replica Count --++--- 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76 I also tested it against an empty cluster. There's no unit tests added, just because our current testing setup for ksck makes it really painful to add one for this, and it seemed easy enough to check out manually. Probably, a follow up should straighten out ksck-test to make testing ksck changes easier. Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 --- M src/kudu/tools/ksck_results.cc M src/kudu/tools/ksck_results.h 2 files changed, 131 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/3 -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 3 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11958 ) Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@11 PS1, Line 11: it outputs a 5-number summary of the distribution of : replicas and lists any outliers > Maybe your answer to this lies in what you've implemented, but do you think Yeah, why not? It's the verbose output. http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@26 PS1, Line 26: Small > Hrm, I might've missed this. What is this referring to? Also, how are you q It should be pretty clear what it means, I think. The reason I even include it is thatthe table will be sorted Small before Big and then within Small sorted by most to least replicas and within Big by least to most, so having the Small/Big label helps make the order understandable. http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@28 PS1, Line 28: In PLAIN_FULL mode it outputs the replica count for every tablet server: : : Tablet Replica Count by Tablet Server :UUID | Host | Replica Count : --++--- : 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100 : 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100 : 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101 : 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95 : 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94 : 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96 : a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101 : c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101 : cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76 : : I also tested it against an empty cluster. > Did you test it with the -tables, -tablets configurations? From the impleme Yes. Actually this output comes from using the -tables. http://gerrit.cloudera.org:8080/#/c/11958/1/src/kudu/tools/ksck_results.cc File src/kudu/tools/ksck_results.cc: http://gerrit.cloudera.org:8080/#/c/11958/1/src/kudu/tools/ksck_results.cc@576 PS1, Line 576: tservers_sorted_by_replica_count.emplace_back(entry.first, > warning: 'emplace_back' is called inside a loop; consider pre-allocating th Done -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 1 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 20 Nov 2018 07:22:46 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Hello Fengling Wang, Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Mitch Barnett, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11958 to look at the new patch set (#2). Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. [tools] ksck: Add information about replica counts to plain ksck output This adds some information about replica counts on tablet servers to the output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). It outputs a 5-number summary of the distribution of replicas and lists any outliers: Tablet Replica Count Summary Statistic| Replica Count +--- Minimum| 1646 First Quartile | 3672 Median | 4075 Third Quartile | 4242 Maximum| 4600 Tablet Replica Count Outliers Type | UUID | Host | Replica Count ---+--++--- Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 1646 In PLAIN_FULL mode it additionally outputs the replica count for every tablet server: Tablet Replica Count by Tablet Server UUID | Host | Replica Count --++--- 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76 I also tested it against an empty cluster. There's no unit tests added, just because our current testing setup for ksck makes it really painful to add one for this, and it seemed easy enough to check out manually. Probably, a follow up should straighten out ksck-test to make testing ksck changes easier. Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 --- M src/kudu/tools/ksck_results.cc M src/kudu/tools/ksck_results.h 2 files changed, 130 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/2 -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 2 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [util] more information on fault injection
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11960 ) Change subject: [util] more information on fault injection .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61 Gerrit-Change-Number: 11960 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 20 Nov 2018 07:07:50 + Gerrit-HasComments: No
[kudu-CR] De-flake sentry tests
Hao Hao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11961 Change subject: De-flake sentry tests .. De-flake sentry tests Currently, sentry_client-test and sentry_authz_provider-test are still flaky due to possible port collision. To avoid such race condition, this patch moves the util function that finds the bind address for a daemon based on binding mode to test_util, and updates sentry-test-base to use it to pick up a unique bind address. Change-Id: Id112634ad5e6f0e2c48b7e0e89389faff4f5656b --- M src/kudu/integration-tests/master_migration-itest.cc M src/kudu/integration-tests/master_sentry-itest.cc M src/kudu/integration-tests/security-itest.cc M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/mini-cluster/external_mini_cluster.h M src/kudu/mini-cluster/internal_mini_cluster.cc M src/kudu/mini-cluster/internal_mini_cluster.h M src/kudu/mini-cluster/mini_cluster.cc M src/kudu/mini-cluster/mini_cluster.h M src/kudu/sentry/sentry-test-base.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/util/test_util.cc M src/kudu/util/test_util.h 14 files changed, 158 insertions(+), 152 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11961/1 -- To view, visit http://gerrit.cloudera.org:8080/11961 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id112634ad5e6f0e2c48b7e0e89389faff4f5656b Gerrit-Change-Number: 11961 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11956 to look at the new patch set (#4). Change subject: [sentry] add mini Sentry to the external mini cluster .. [sentry] add mini Sentry to the external mini cluster This commit enables external mini cluster to be able to start with both mini sentry and mini hms. A major challenge is the circular dependency between the HMS and Sentry in terms of configuring each with the other's address, including port which is currently discovered by polling lsof. To work around it, we can do: * Start the Sentry service. Find out which port it's on. * Start the HMS, configured to talk to Sentry's port. Find out which port it's on. * Restart the Sentry service on the same port, reconfigured to talk to the HMS's port. However, there could be a race condition that another program binds to the same port in step 3. One option is to use SO_REUSEPORT socket option, which permits multiple sockets to be bound to an identical socket address. Though both Sentry and HMS are written in Java, and only JDK9 (and above) supports this socket option[1]. Methods such as Java reflection can be used to invoke private methods to set up this option, but is hacky. This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that external servers such as Sentry can pick up a unique bind address, knowing that in doing so there'll never be any danger of a port collision. [1]: https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 --- M src/kudu/hms/mini_hms.cc M src/kudu/hms/mini_hms.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/master_sentry-itest.cc M src/kudu/mini-cluster/CMakeLists.txt M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/mini-cluster/external_mini_cluster.h M src/kudu/mini-cluster/mini_cluster.cc M src/kudu/mini-cluster/mini_cluster.h M src/kudu/security/test/mini_kdc.cc M src/kudu/sentry/mini_sentry.cc M src/kudu/sentry/mini_sentry.h M src/kudu/sentry/sentry-test-base.h M src/kudu/util/test_util.cc M src/kudu/util/test_util.h 15 files changed, 488 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/4 -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 4 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [util] more information on fault injection
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11960 to look at the new patch set (#2). Change subject: [util] more information on fault injection .. [util] more information on fault injection Log more information on injected fault: name of the process and PID. That helps to parse concatenated logs: e.g., that useful in parsing logs from scenarios run via dist-test. Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61 --- M src/kudu/util/fault_injection.cc 1 file changed, 6 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/11960/2 -- To view, visit http://gerrit.cloudera.org:8080/11960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61 Gerrit-Change-Number: 11960 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [util] more information on fault injection
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11960 Change subject: [util] more information on fault injection .. [util] more information on fault injection Log more information on injected fault: name of the process and PID. That helps to parse concatenated logs: e.g., that useful in parsing logs from scenarios run via dist-test. Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61 --- M src/kudu/util/fault_injection.cc 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/11960/1 -- To view, visit http://gerrit.cloudera.org:8080/11960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61 Gerrit-Change-Number: 11960 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11958 ) Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@11 PS1, Line 11: it outputs a 5-number summary of the distribution of : replicas and lists any outliers Maybe your answer to this lies in what you've implemented, but do you think this is useful to include in PLAIN_FULL? http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@26 PS1, Line 26: Small Hrm, I might've missed this. What is this referring to? Also, how are you quantifying what an outlier is? Ah, from the patch it's clear, but mind also noting this in the commit message? Upon googling things, I guess the answer here is obvious... http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@28 PS1, Line 28: In PLAIN_FULL mode it outputs the replica count for every tablet server: : : Tablet Replica Count by Tablet Server :UUID | Host | Replica Count : --++--- : 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100 : 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100 : 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101 : 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95 : 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94 : 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96 : a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101 : c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101 : cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76 : : I also tested it against an empty cluster. Did you test it with the -tables, -tablets configurations? From the implementation, it seems like this should be handled, but it's probably worth a look that it looks ok. -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 1 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 20 Nov 2018 02:01:55 + Gerrit-HasComments: Yes
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11956 ) Change subject: [sentry] add mini Sentry to the external mini cluster .. Patch Set 3: (20 comments) http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@13 PS2, Line 13: we can do nit: this reads a bit like this is what's implemented. Mind changing it to "one approach would be to:" http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.h File src/kudu/hms/mini_hms.h: http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.h@54 PS2, Line 54: Configures the mini HMS to enable the Sentry plugin. nit: doc what 'sentry_service_principal' is? What happens when it's boost::none? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc File src/kudu/hms/mini_hms.cc: http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@81 PS2, Line 81: "Enabling Sentry for HMS"; nit: maybe add the sentry address? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@164 PS2, Line 164: Status wait = WaitForTcpBind(hms_process_->pid(), _, none, nit: Can you add a comment about why `none` is appropriate here? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@318 PS2, Line 318: Configures the HMS to use the Sentry plugin for filtering : // read results. I'm not sure I understand this. What is a "read result" in this context? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@322 PS2, Line 322: Configures the HMS to use the Sentry event listener for : // authorization metadata operations. Just making sure I understand this, this is configured to ensure the HMS will consult Sentry for authorization metadata? What is considered an authorization metadata operation? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@328 PS2, Line 328: service. nit: missing space http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@327 PS2, Line 327: synchronizes the notification logs with the Sentry : //service. Just making sure I understand this, this is synchronizing authz metadata entries in the HMS with changes in Sentry? I'm unsure of where the notification logs come into play. http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@359 PS2, Line 359: HNS HMS http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc File src/kudu/integration-tests/master_sentry-itest.cc: http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@55 PS2, Line 55: const ExternalMiniCluster::BindMode bind_mode; Hrm, where is this used? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@86 PS2, Line 86: { ExternalMiniCluster::BindMode::LOOPBACK, true, }, : { ExternalMiniCluster::BindMode::LOOPBACK, false, }, Mind commenting what the behavior for this test is for mac? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@103 PS2, Line 103: Substitute("$0.$1", kDatabaseName, kTableName) nit: maybe pull this out into a kTableIdentifier? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@117 PS2, Line 117: if (params.enable_kerberos) { : // Create a principal 'kudu' and configure to use it. : ASSERT_OK(cluster_->kdc()->CreateUserPrincipal("kudu")); : ASSERT_OK(cluster_->kdc()->Kinit("kudu")); : ASSERT_OK(cluster_->kdc()->SetKrb5Environment()); : hms_client_opts.enable_kerberos = true; : hms_client_opts.service_principal = "hive"; : } : hms::HmsClient hms_client(cluster_->hms()->address(), hms_client_opts); : ASSERT_OK(hms_client.Start()); : : hive::Table hms_table; : ASSERT_OK(hms_client.GetTable(kDatabaseName, kTableName, _table)); Are there authz metadata entries we should be checking here? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h File src/kudu/mini-cluster/external_mini_cluster.h: http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@159 PS2, Line 159: // If true, set up a Sentry service as part of this ExternalMiniCluster. nit: Is there a default worth mentioning here? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@419 PS2, Line 419: // Index to identify the number of external servers being started. I'm a little confused about this variable. Usually an index is a pointer into some list, but this comment makes it out to be some kind of count? Mind
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11956 ) Change subject: [sentry] add mini Sentry to the external mini cluster .. Patch Set 3: (34 comments) http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@15 PS2, Line 15: * Start the Sentry service. Find out which port it's on. Good explanation, but I'd make two small tweaks to improve readability: 1. Make it a numbered list (i.e 1, 2, 3) in order to emphasize the order of events 2. For the first step, point out that the Sentry service is configured without any knowledge of an HMS service. It'll start up but will be useless otherwise. http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@21 PS2, Line 21: that Nit: where http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@22 PS2, Line 22: same port in step 3 Nit: "same port during the restart in step 3." http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@24 PS2, Line 24: Though both Sentry and HMS are written in Java, and only JDK9 (and above) : supports this socket option[1] Nit: I'd rewrite as "Although both Sentry and HMS are written in Java, only JDK9 (and above) supports this socket option[1]". http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@26 PS2, Line 26: is hacky. Nit: "they are hacky" http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@33 PS2, Line 33: [1]: https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR Wrong link here; didn't you mean the SO_REUSEPORT anchor? http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc File src/kudu/hms/mini_hms.cc: http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@84 PS2, Line 84: sentry_service_principal_ = *sentry_service_principal; If you're going to go through the trouble to wrap sentry_service_principal in an optional to reflect its optionality in the EnableSentry() function, shouldn't that optionality carry over into the sentry_service_principal_ state as well, so that methods like IsAuthorizationEnabled() are more clear? Then you could also make this a simpler optional and std::move it into place. http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@164 PS2, Line 164: Status wait = WaitForTcpBind(hms_process_->pid(), _, none, Should use /*foo=*/ inline comments to help explain what 'none' means. http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@306 PS2, Line 306: : hive.sentry.conf.url : file://$1/hive-sentry-site.xml : Shouldn't this also be conditioned on IsAuthorizationEnabled? Otherwise there will have been no hive-sentry-site.xml file. http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@328 PS2, Line 328: //service. Nit: indentation http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@355 PS2, Line 355: sentry_address_); Where is this substituted? I don't see a $7 in kHiveFileTemplate. http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@359 PS2, Line 359: HNS HMS http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@363 PS2, Line 363: // Set of service users which is safe to be excluded from the : // Sentry authorization checks. Nit: I'd make this a bit more explicit "Set of service users whose access will be excluded from Sentry authorization checks." http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@366 PS2, Line 366: The indentation here is pretty confusing w.r.t. the IsAuthorizationEnabled condition. Maybe shift this block all the way left? http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt@91 PS3, Line 91: ADD_KUDU_TEST(master_sentry-itest RUN_SERIAL true PROCESSORS 4) How did you arrive at the PROCESSORS value? And why use RUN_SERIAL here? http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc File src/kudu/integration-tests/master_sentry-itest.cc: http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@55 PS3, Line 55: const ExternalMiniCluster::BindMode bind_mode; Looks like you're not using this param in SetUp(). Nor do I think you need to; the default BindMode is already taken care of in mini_cluster.h. http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@71 PS3, Line 71: opts.enable_sentry = true; What happens if enable_sentry=true but enable_kerberos=false? The MiniSentry is started but not configured to be used?
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11956 to look at the new patch set (#3). Change subject: [sentry] add mini Sentry to the external mini cluster .. [sentry] add mini Sentry to the external mini cluster This commit enables external mini cluster to be able to start with both mini sentry and mini hms. A major challenge is the circular dependency between the HMS and Sentry in terms of configuring each with the other's address, including port which is currently discovered by polling lsof. To work around it, we can do: * Start the Sentry service. Find out which port it's on. * Start the HMS, configured to talk to Sentry's port. Find out which port it's on. * Restart the Sentry service on the same port, reconfigured to talk to the HMS's port. However, there could be a race condition that another program binds to the same port in step 3. One option is to use SO_REUSEPORT socket option, which permits multiple sockets to be bound to an identical socket address. Though both Sentry and HMS are written in Java, and only JDK9 (and above) supports this socket option[1]. Methods such as Java reflection can be used to invoke private methods to set up this option, but is hacky. This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that external servers such as Sentry can pick up a unique bind address, knowing that in doing so there'll never be any danger of a port collision. [1]: https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 --- M src/kudu/hms/mini_hms.cc M src/kudu/hms/mini_hms.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/master_sentry-itest.cc M src/kudu/mini-cluster/CMakeLists.txt M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/mini-cluster/external_mini_cluster.h M src/kudu/mini-cluster/mini_cluster.cc M src/kudu/mini-cluster/mini_cluster.h M src/kudu/security/test/mini_kdc.cc M src/kudu/sentry/mini_sentry.cc M src/kudu/sentry/mini_sentry.h M src/kudu/sentry/sentry-test-base.h M src/kudu/util/test_util.cc M src/kudu/util/test_util.h 15 files changed, 488 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/3 -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 3 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [location awareness] Register client with location when connect to cluster
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11923 ) Change subject: [location_awareness] Register client with location when connect to cluster .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/client/client-internal.cc@714 PS6, Line 714: location_ = connect_response.client_location(); Does location_ need to be protected by leader_master_lock_? Why or why not? http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/master-test.cc@1635 PS6, Line 1635: ASSERT_OK(proxy_->ConnectToMaster(req, , )); Should also ASSERT that !resp.has_error. http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.h File src/kudu/master/ts_descriptor.h: http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.h@182 PS6, Line 182: class ClientDescriptor { Why is GetClientLocation stored in this class? Do you envision adding more content to it in the near future? http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.cc File src/kudu/master/ts_descriptor.cc: http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.cc@411 PS6, Line 411: string location_temp; I don't think this indirection is needed; GetLocationFromLocationMappingCmd won't modify the third parameter if it fails, so you could feed 'location' directly into it. -- To view, visit http://gerrit.cloudera.org:8080/11923 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0efb327293d86168a30b05305f69d011ad15587a Gerrit-Change-Number: 11923 Gerrit-PatchSet: 6 Gerrit-Owner: Fengling Wang Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 20 Nov 2018 00:22:19 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output
Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11958 Change subject: [tools] ksck: Add information about replica counts to plain ksck output .. [tools] ksck: Add information about replica counts to plain ksck output This adds some information about replica counts on tablet servers to the output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). In PLAIN_CONCISE mode, it outputs a 5-number summary of the distribution of replicas and lists any outliers: Tablet Replica Count Summary Statistic| Replica Count +--- Minimum| 1646 First Quartile | 3672 Median | 4075 Third Quartile | 4242 Maximum| 4600 Tablet Replica Count Outliers Type | UUID | Host | Replica Count ---+--++--- Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 1646 In PLAIN_FULL mode it outputs the replica count for every tablet server: Tablet Replica Count by Tablet Server UUID | Host | Replica Count --++--- 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76 I also tested it against an empty cluster. There's no unit tests added, just because our current testing setup for ksck makes it really painful to add one for this, and it seemed easy enough to check out manually. Probably, a follow up should straighten out ksck-test to make testing ksck changes easier. Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 --- M src/kudu/tools/ksck_results.cc M src/kudu/tools/ksck_results.h 2 files changed, 130 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/1 -- To view, visit http://gerrit.cloudera.org:8080/11958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595 Gerrit-Change-Number: 11958 Gerrit-PatchSet: 1 Gerrit-Owner: Will Berkeley
[kudu-CR] [location awareness] Register client with location when connect to cluster
Hello Will Berkeley, Alexey Serbin, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11923 to look at the new patch set (#6). Change subject: [location_awareness] Register client with location when connect to cluster .. [location_awareness] Register client with location when connect to cluster This patch enables the client to get a location assigned after connected to the cluster. After that, the client can compare its location with the replica locations to select which replica to read. Change-Id: I0efb327293d86168a30b05305f69d011ad15587a --- M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/master/master-test.cc M src/kudu/master/master.proto M src/kudu/master/master_service.cc M src/kudu/master/ts_descriptor.cc M src/kudu/master/ts_descriptor.h 10 files changed, 97 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/11923/6 -- To view, visit http://gerrit.cloudera.org:8080/11923 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0efb327293d86168a30b05305f69d011ad15587a Gerrit-Change-Number: 11923 Gerrit-PatchSet: 6 Gerrit-Owner: Fengling Wang Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Fengling Wang Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley
[kudu-CR] [docs] add Hive Metastore integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11798 ) Change subject: [docs] add Hive Metastore integration .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc File docs/hive_metastore.adoc: http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@46 PS2, Line 46: table identifier names Nit: "table name identifiers" to match the other usages in this doc. http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@67 PS2, Line 67: the Hive database and table : name of Kudu tables Should this be rewritten as "the database and table names of Kudu tables"? http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@107 PS2, Line 107: * Add the `kudu-hive.jar` to the HMS classpath. Using HIVE_AUX_JARS_PATH is Is this enough context for an admin to understand what needs to be done? I don't know anything about HMS so HIVE_AUX_JARS_PATH doesn't mean anything to me, but maybe it will to the average person familiar with HMS? http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@107 PS2, Line 107: Using HIVE_AUX_JARS_PATH is : the most common and straightforward This reads like it's missing a noun at the end. Common and straightforward path maybe? http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@122 PS2, Line 122: ## Upgrading Existing Tables I imagine this section may change depending on what the verdict is on external table support. But that can be addressed in a follow-on. http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@124 PS2, Line 124: When enabling the Hive Metastore integration on a Kudu cluster with existing What happens if you don't run this process? -- To view, visit http://gerrit.cloudera.org:8080/11798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01 Gerrit-Change-Number: 11798 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 19 Nov 2018 21:55:06 + Gerrit-HasComments: Yes
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hao Hao has removed a vote on this change. Change subject: [sentry] add mini Sentry to the external mini cluster .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/11956 ) Change subject: [sentry] add mini Sentry to the external mini cluster .. Patch Set 2: Verified+1 Unrelated flaky TraceEventCallbackTest. -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 19 Nov 2018 21:21:08 + Gerrit-HasComments: No
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11956 to look at the new patch set (#2). Change subject: [sentry] add mini Sentry to the external mini cluster .. [sentry] add mini Sentry to the external mini cluster This commit enables external mini cluster to be able to start with both mini sentry and mini hms. A major challenge is the circular dependency between the HMS and Sentry in terms of configuring each with the other's address, including port which is currently discovered by polling lsof. To work around it, we can do: * Start the Sentry service. Find out which port it's on. * Start the HMS, configured to talk to Sentry's port. Find out which port it's on. * Restart the Sentry service on the same port, reconfigured to talk to the HMS's port. However, there could be a race condition that another program binds to the same port in step 3. One option is to use SO_REUSEPORT socket option, which permits multiple sockets to be bound to an identical socket address. Though both Sentry and HMS are written in Java, and only JDK9 (and above) supports this socket option[1]. Methods such as Java reflection can be used to invoke private methods to set up this option, but is hacky. This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that external servers such as Sentry can pick up a unique bind address, knowing that in doing so there'll never be any danger of a port collision. [1]: https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 --- M src/kudu/hms/mini_hms.cc M src/kudu/hms/mini_hms.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/master_sentry-itest.cc M src/kudu/mini-cluster/CMakeLists.txt M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/mini-cluster/external_mini_cluster.h M src/kudu/mini-cluster/mini_cluster.cc M src/kudu/mini-cluster/mini_cluster.h M src/kudu/security/test/mini_kdc.cc M src/kudu/sentry/mini_sentry.cc M src/kudu/sentry/mini_sentry.h M src/kudu/sentry/sentry-test-base.h M src/kudu/util/test_util.cc M src/kudu/util/test_util.h 15 files changed, 488 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/2 -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [docs] add Hive Metastore integration
Hao Hao has uploaded a new patch set (#2) to the change originally created by Dan Burkert. ( http://gerrit.cloudera.org:8080/11798 ) Change subject: [docs] add Hive Metastore integration .. [docs] add Hive Metastore integration Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01 --- A docs/hive_metastore.adoc 1 file changed, 154 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/11798/2 -- To view, visit http://gerrit.cloudera.org:8080/11798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01 Gerrit-Change-Number: 11798 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [sentry] add mini Sentry to the external mini cluster
Hao Hao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11956 Change subject: [sentry] add mini Sentry to the external mini cluster .. [sentry] add mini Sentry to the external mini cluster This commit enables external mini cluster to be able to start with both mini sentry and mini hms. A major challenge is the circular dependency between the HMS and Sentry in terms of configuring each with the other's address, including port which is currently discovered by polling lsof. To work around it, we can do: * Start the Sentry service. Find out which port it's on. * Start the HMS, configured to talk to Sentry's port. Find out which port it's on. * Restart the Sentry service on the same port, reconfigured to talk to the HMS's port. However, there could be a race condition that another program binds to the same port in step 3. One option is to use SO_REUSEPORT socket option, which permits multiple sockets to be bound to an identical socket address. Though both Sentry and HMS are written in Java, and only JDK9 (and above) supports this socket option[1]. Methods such as Java reflection can be used to invoke private methods to set up this option, but is hacky. This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that external servers such as Sentry can pick up a unique bind address, knowing that in doing so there'll never be any danger of a port collision. [1]: https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 --- M src/kudu/hms/mini_hms.cc M src/kudu/hms/mini_hms.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/master_sentry-itest.cc M src/kudu/mini-cluster/CMakeLists.txt M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/mini-cluster/external_mini_cluster.h M src/kudu/mini-cluster/mini_cluster.cc M src/kudu/mini-cluster/mini_cluster.h M src/kudu/security/test/mini_kdc.cc M src/kudu/sentry/mini_sentry.cc M src/kudu/sentry/mini_sentry.h M src/kudu/util/test_util.cc M src/kudu/util/test_util.h 14 files changed, 484 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/1 -- To view, visit http://gerrit.cloudera.org:8080/11956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59 Gerrit-Change-Number: 11956 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao
[kudu-CR] KUDU-2038: Support bitmap index
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11722 ) Change subject: KUDU-2038: Support bitmap index .. Patch Set 6: > Here's a design doc that LiFu shared with me via Slack. It was originally a > docx, but I've uploaded it as a gdoc to facilitate comments. If there's > trouble accessing it, I can funnel comments from the gdoc to gerrit. > > https://docs.google.com/document/d/1U_RFs8piw30K7fjyl0m08v2f-eiIjOUOkJRGGXrCoVY/edit?usp=sharing Thanks, Andrew. Lifu, I've left some comments on this gdoc. Could we continue this design discussion there? Let us know if you're not able to and we'll find a different approach. -- To view, visit http://gerrit.cloudera.org:8080/11722 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0edaa0ef1dba2dbce85ebf15f0a731e4939a7860 Gerrit-Change-Number: 11722 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 19 Nov 2018 17:51:46 + Gerrit-HasComments: No