[kudu-CR] [compaction] KUDU-1400: Improve rowset compaction policy to consider merging small DRSs

2018-11-19 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11869 )

Change subject: [compaction] KUDU-1400: Improve rowset compaction policy to 
consider merging small DRSs
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc
File src/kudu/tablet/compaction_policy-test.cc:

http://gerrit.cloudera.org:8080/#/c/11869/4/src/kudu/tablet/compaction_policy-test.cc@381
PS4, Line 381: if (divisor > 2) {
 :   ASSERT_EQ(rowsets.size(), picked.size());
> Yep, I understand the general case.  That was more about having explicit sc
Yeah, what I was trying to say in my previous comment was that the existing 
test cases exhaust most of the "wiggle room" on the constants. Adding a new 
test case doesn't specify the behavior more.



--
To view, visit http://gerrit.cloudera.org:8080/11869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b421c6ed77d28ebab9b91a4d6fcb1e825997e6c
Gerrit-Change-Number: 11869
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 20 Nov 2018 07:02:36 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11958 )

Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11958/3/src/kudu/tools/ksck_results.h
File src/kudu/tools/ksck_results.h:

http://gerrit.cloudera.org:8080/#/c/11958/3/src/kudu/tools/ksck_results.h@358
PS3, Line 358:  in the latter (default)
 : // case, only a quartile summary of the counts will be printed.
nit: maybe generalize this to just mention that it prints a summary of the 
replica counts, since the quartiles are somewhat an implementation detail, and 
this also doesn't mention outliers. Also the fact that PLAIN_CONCISE is the 
default probably doesn't need to be here (and maybe shouldn't, in case that 
changes?)



--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mitch Barnett 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 20 Nov 2018 07:44:13 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Will Berkeley (Code Review)
Hello Fengling Wang, Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, 
Andrew Wong, Mitch Barnett,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11958

to look at the new patch set (#3).

Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..

[tools] ksck: Add information about replica counts to plain ksck output

This adds some information about replica counts on tablet servers to the
output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). It
outputs a 5-number summary of the distribution of replicas and lists any
outliers:

Tablet Replica Count Summary
   Statistic| Replica Count
+---
 Minimum| 1646
 First Quartile | 3672
 Median | 4075
 Third Quartile | 4242
 Maximum| 4600

Tablet Replica Count Outliers
 Type  |   UUID   |  Host  | 
Replica Count
---+--++---
 Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 
1646

In PLAIN_FULL mode it additionally outputs the replica count for every
tablet server:

Tablet Replica Count by Tablet Server
   UUID   |  Host  | Replica 
Count
--++---
 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100
 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100
 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101
 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95
 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94
 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96
 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101
 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101
 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76

I also tested it against an empty cluster.

There's no unit tests added, just because our current testing setup for
ksck makes it really painful to add one for this, and it seemed easy
enough to check out manually. Probably, a follow up should straighten
out ksck-test to make testing ksck changes easier.

Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
---
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
2 files changed, 131 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/3
--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mitch Barnett 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11958 )

Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@11
PS1, Line 11: it outputs a 5-number summary of the distribution of
: replicas and lists any outliers
> Maybe your answer to this lies in what you've implemented, but do you think
Yeah, why not? It's the verbose output.


http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@26
PS1, Line 26:  Small
> Hrm, I might've missed this. What is this referring to? Also, how are you q
It should be pretty clear what it means, I think. The reason I even include it 
is thatthe table will be sorted Small before Big and then within Small sorted 
by most to least replicas and within Big by least to most, so having the 
Small/Big label helps make the order understandable.


http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@28
PS1, Line 28: In PLAIN_FULL mode it outputs the replica count for every tablet 
server:
:
: Tablet Replica Count by Tablet Server
:UUID   |  Host 
 | Replica Count
: 
--++---
:  09d6bf7a02124145b43f43cb7a667b3d | 
vc1314.halxg.cloudera.com:7050 | 100
:  23d473f441674d43807fd9e631862bfd | 
vc1308.halxg.cloudera.com:7050 | 100
:  2fb5cdac22b0418bb2df456906e42eb4 | 
vc1306.halxg.cloudera.com:7050 | 101
:  70f7ee61ead54b1885d819f354eb3405 | 
vc1316.halxg.cloudera.com:7050 | 95
:  72fcec63e96f4248ae39d114eb3cd7c9 | 
vc1318.halxg.cloudera.com:7050 | 94
:  86708813b37a44bd8e92c711211c8685 | 
vc1310.halxg.cloudera.com:7050 | 96
:  a662440710624c02bd5612df32cb0235 | 
vc1302.halxg.cloudera.com:7050 | 101
:  c9633273962a4521a32d5e177a118a84 | 
vc1312.halxg.cloudera.com:7050 | 101
:  cc32936bc8594948a04fd4240da36aed | 
vc1304.halxg.cloudera.com:7050 | 76
:
: I also tested it against an empty cluster.
> Did you test it with the -tables, -tablets configurations? From the impleme
Yes. Actually this output comes from using the -tables.


http://gerrit.cloudera.org:8080/#/c/11958/1/src/kudu/tools/ksck_results.cc
File src/kudu/tools/ksck_results.cc:

http://gerrit.cloudera.org:8080/#/c/11958/1/src/kudu/tools/ksck_results.cc@576
PS1, Line 576: tservers_sorted_by_replica_count.emplace_back(entry.first,
> warning: 'emplace_back' is called inside a loop; consider pre-allocating th
Done



--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mitch Barnett 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 20 Nov 2018 07:22:46 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Will Berkeley (Code Review)
Hello Fengling Wang, Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, 
Andrew Wong, Mitch Barnett,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11958

to look at the new patch set (#2).

Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..

[tools] ksck: Add information about replica counts to plain ksck output

This adds some information about replica counts on tablet servers to the
output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). It
outputs a 5-number summary of the distribution of replicas and lists any
outliers:

Tablet Replica Count Summary
   Statistic| Replica Count
+---
 Minimum| 1646
 First Quartile | 3672
 Median | 4075
 Third Quartile | 4242
 Maximum| 4600

Tablet Replica Count Outliers
 Type  |   UUID   |  Host  | 
Replica Count
---+--++---
 Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 
1646

In PLAIN_FULL mode it additionally outputs the replica count for every
tablet server:

Tablet Replica Count by Tablet Server
   UUID   |  Host  | Replica 
Count
--++---
 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100
 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100
 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101
 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95
 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94
 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96
 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101
 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101
 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76

I also tested it against an empty cluster.

There's no unit tests added, just because our current testing setup for
ksck makes it really painful to add one for this, and it seemed easy
enough to check out manually. Probably, a follow up should straighten
out ksck-test to make testing ksck changes easier.

Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
---
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
2 files changed, 130 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/2
--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mitch Barnett 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [util] more information on fault injection

2018-11-19 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11960 )

Change subject: [util] more information on fault injection
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61
Gerrit-Change-Number: 11960
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 20 Nov 2018 07:07:50 +
Gerrit-HasComments: No


[kudu-CR] De-flake sentry tests

2018-11-19 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11961


Change subject: De-flake sentry tests
..

De-flake sentry tests

Currently, sentry_client-test and sentry_authz_provider-test are still
flaky due to possible port collision. To avoid such race condition, this
patch moves the util function that finds the bind address for a daemon
based on binding mode to test_util, and updates sentry-test-base to use
it to pick up a unique bind address.

Change-Id: Id112634ad5e6f0e2c48b7e0e89389faff4f5656b
---
M src/kudu/integration-tests/master_migration-itest.cc
M src/kudu/integration-tests/master_sentry-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster-test.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/mini-cluster/internal_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.cc
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/sentry/sentry-test-base.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
14 files changed, 158 insertions(+), 152 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/11961/1
--
To view, visit http://gerrit.cloudera.org:8080/11961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id112634ad5e6f0e2c48b7e0e89389faff4f5656b
Gerrit-Change-Number: 11961
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11956

to look at the new patch set (#4).

Change subject: [sentry] add mini Sentry to the external mini cluster
..

[sentry] add mini Sentry to the external mini cluster

This commit enables external mini cluster to be able to start with both
mini sentry and mini hms. A major challenge is the circular dependency
between the HMS and Sentry in terms of configuring each with the other's
address, including port which is currently discovered by polling lsof.
To work around it, we can do:

  * Start the Sentry service. Find out which port it's on.
  * Start the HMS, configured to talk to Sentry's port. Find out which
port it's on.
  * Restart the Sentry service on the same port, reconfigured to talk
to the HMS's port.

However, there could be a race condition that another program binds to
the same port in step 3. One option is to use SO_REUSEPORT socket option,
which permits multiple sockets to be bound to an identical socket address.
Though both Sentry and HMS are written in Java, and only JDK9 (and above)
supports this socket option[1]. Methods such as Java reflection can be
used to invoke private methods to set up this option, but is hacky.

This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that
external servers such as Sentry can pick up a unique bind address,
knowing that in doing so there'll never be any danger of a port
collision.

[1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR

Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
---
M src/kudu/hms/mini_hms.cc
M src/kudu/hms/mini_hms.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/master_sentry-itest.cc
M src/kudu/mini-cluster/CMakeLists.txt
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.cc
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/sentry/mini_sentry.cc
M src/kudu/sentry/mini_sentry.h
M src/kudu/sentry/sentry-test-base.h
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
15 files changed, 488 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/4
--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [util] more information on fault injection

2018-11-19 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11960

to look at the new patch set (#2).

Change subject: [util] more information on fault injection
..

[util] more information on fault injection

Log more information on injected fault: name of the process and PID.
That helps to parse concatenated logs: e.g., that useful in parsing
logs from scenarios run via dist-test.

Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61
---
M src/kudu/util/fault_injection.cc
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/11960/2
--
To view, visit http://gerrit.cloudera.org:8080/11960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61
Gerrit-Change-Number: 11960
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [util] more information on fault injection

2018-11-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11960


Change subject: [util] more information on fault injection
..

[util] more information on fault injection

Log more information on injected fault: name of the process and PID.
That helps to parse concatenated logs: e.g., that useful in parsing
logs from scenarios run via dist-test.

Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61
---
M src/kudu/util/fault_injection.cc
1 file changed, 4 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/11960/1
--
To view, visit http://gerrit.cloudera.org:8080/11960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id0f094006a89fd271091d1bde13d301ea359ae61
Gerrit-Change-Number: 11960
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11958 )

Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@11
PS1, Line 11: it outputs a 5-number summary of the distribution of
: replicas and lists any outliers
Maybe your answer to this lies in what you've implemented, but do you think 
this is useful to include in PLAIN_FULL?


http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@26
PS1, Line 26:  Small
Hrm, I might've missed this. What is this referring to? Also, how are you 
quantifying what an outlier is?

Ah, from the patch it's clear, but mind also noting this in the commit message?

Upon googling things, I guess the answer here is obvious...


http://gerrit.cloudera.org:8080/#/c/11958/1//COMMIT_MSG@28
PS1, Line 28: In PLAIN_FULL mode it outputs the replica count for every tablet 
server:
:
: Tablet Replica Count by Tablet Server
:UUID   |  Host 
 | Replica Count
: 
--++---
:  09d6bf7a02124145b43f43cb7a667b3d | 
vc1314.halxg.cloudera.com:7050 | 100
:  23d473f441674d43807fd9e631862bfd | 
vc1308.halxg.cloudera.com:7050 | 100
:  2fb5cdac22b0418bb2df456906e42eb4 | 
vc1306.halxg.cloudera.com:7050 | 101
:  70f7ee61ead54b1885d819f354eb3405 | 
vc1316.halxg.cloudera.com:7050 | 95
:  72fcec63e96f4248ae39d114eb3cd7c9 | 
vc1318.halxg.cloudera.com:7050 | 94
:  86708813b37a44bd8e92c711211c8685 | 
vc1310.halxg.cloudera.com:7050 | 96
:  a662440710624c02bd5612df32cb0235 | 
vc1302.halxg.cloudera.com:7050 | 101
:  c9633273962a4521a32d5e177a118a84 | 
vc1312.halxg.cloudera.com:7050 | 101
:  cc32936bc8594948a04fd4240da36aed | 
vc1304.halxg.cloudera.com:7050 | 76
:
: I also tested it against an empty cluster.
Did you test it with the -tables, -tablets configurations? From the 
implementation, it seems like this should be handled, but it's probably worth a 
look that it looks ok.



--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mitch Barnett 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 20 Nov 2018 02:01:55 +
Gerrit-HasComments: Yes


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11956 )

Change subject: [sentry] add mini Sentry to the external mini cluster
..


Patch Set 3:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@13
PS2, Line 13: we can do
nit: this reads a bit like this is what's implemented. Mind changing it to "one 
approach would be to:"


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.h
File src/kudu/hms/mini_hms.h:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.h@54
PS2, Line 54: Configures the mini HMS to enable the Sentry plugin.
nit: doc what 'sentry_service_principal' is? What happens when it's boost::none?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc
File src/kudu/hms/mini_hms.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@81
PS2, Line 81: "Enabling Sentry for HMS";
nit: maybe add the sentry address?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@164
PS2, Line 164: Status wait = WaitForTcpBind(hms_process_->pid(), _, none,
nit: Can you add a comment about why `none` is appropriate here?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@318
PS2, Line 318: Configures the HMS to use the Sentry plugin for filtering
 : // read results.
I'm not sure I understand this. What is a "read result" in this context?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@322
PS2, Line 322: Configures the HMS to use the Sentry event listener for
 : // authorization metadata operations.
Just making sure I understand this, this is configured to ensure the HMS will 
consult Sentry for authorization metadata? What is considered an authorization 
metadata operation?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@328
PS2, Line 328: service.
nit: missing space


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@327
PS2, Line 327: synchronizes the notification logs with the Sentry
 : //service.
Just making sure I understand this, this is synchronizing authz metadata 
entries in the HMS with changes in Sentry? I'm unsure of where the notification 
logs come into play.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@359
PS2, Line 359: HNS
HMS


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@55
PS2, Line 55:   const ExternalMiniCluster::BindMode bind_mode;
Hrm, where is this used?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@86
PS2, Line 86: { ExternalMiniCluster::BindMode::LOOPBACK, true,  },
:   { ExternalMiniCluster::BindMode::LOOPBACK, false, },
Mind commenting what the behavior for this test is for mac?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@103
PS2, Line 103: Substitute("$0.$1", kDatabaseName, kTableName)
nit: maybe pull this out into a kTableIdentifier?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/integration-tests/master_sentry-itest.cc@117
PS2, Line 117:   if (params.enable_kerberos) {
 : // Create a principal 'kudu' and configure to use it.
 : ASSERT_OK(cluster_->kdc()->CreateUserPrincipal("kudu"));
 : ASSERT_OK(cluster_->kdc()->Kinit("kudu"));
 : ASSERT_OK(cluster_->kdc()->SetKrb5Environment());
 : hms_client_opts.enable_kerberos = true;
 : hms_client_opts.service_principal = "hive";
 :   }
 :   hms::HmsClient hms_client(cluster_->hms()->address(), 
hms_client_opts);
 :   ASSERT_OK(hms_client.Start());
 :
 :   hive::Table hms_table;
 :   ASSERT_OK(hms_client.GetTable(kDatabaseName, kTableName, 
_table));
Are there authz metadata entries we should be checking here?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@159
PS2, Line 159:   // If true, set up a Sentry service as part of this 
ExternalMiniCluster.
nit: Is there a default worth mentioning here?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@419
PS2, Line 419:   // Index to identify the number of external servers being 
started.
I'm a little confused about this variable. Usually an index is a pointer into 
some list, but this comment makes it out to be some kind of count? Mind 

[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11956 )

Change subject: [sentry] add mini Sentry to the external mini cluster
..


Patch Set 3:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@15
PS2, Line 15:   * Start the Sentry service. Find out which port it's on.
Good explanation, but I'd make two small tweaks to improve readability:
1. Make it a numbered list (i.e 1, 2, 3) in order to emphasize the order of 
events
2. For the first step, point out that the Sentry service is configured without 
any knowledge of an HMS service. It'll start up but will be useless otherwise.


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@21
PS2, Line 21: that
Nit: where


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@22
PS2, Line 22: same port in step 3
Nit: "same port during the restart in step 3."


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@24
PS2, Line 24: Though both Sentry and HMS are written in Java, and only JDK9 
(and above)
: supports this socket option[1]
Nit: I'd rewrite as "Although both Sentry and HMS are written in Java, only 
JDK9 (and above) supports this socket option[1]".


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@26
PS2, Line 26: is hacky.
Nit: "they are hacky"


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@33
PS2, Line 33: [1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR
Wrong link here; didn't you mean the SO_REUSEPORT anchor?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc
File src/kudu/hms/mini_hms.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@84
PS2, Line 84: sentry_service_principal_ = *sentry_service_principal;
If you're going to go through the trouble to wrap sentry_service_principal in 
an optional to reflect its optionality in the EnableSentry() function, 
shouldn't that optionality carry over into the sentry_service_principal_ state 
as well, so that methods like IsAuthorizationEnabled() are more clear?

Then you could also make this a simpler optional and std::move it into 
place.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@164
PS2, Line 164:   Status wait = WaitForTcpBind(hms_process_->pid(), _, none,
Should use /*foo=*/ inline comments to help explain what 'none' means.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@306
PS2, Line 306:   
 : hive.sentry.conf.url
 : file://$1/hive-sentry-site.xml
 :   
Shouldn't this also be conditioned on IsAuthorizationEnabled? Otherwise there 
will have been no hive-sentry-site.xml file.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@328
PS2, Line 328: //service.
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@355
PS2, Line 355:  sentry_address_);
Where is this substituted? I don't see a $7 in kHiveFileTemplate.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@359
PS2, Line 359: HNS
HMS


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@363
PS2, Line 363: // Set of service users which is safe to be excluded 
from the
 : // Sentry authorization checks.
Nit: I'd make this a bit more explicit "Set of service users whose access will 
be excluded from Sentry authorization checks."


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@366
PS2, Line 366:   
The indentation here is pretty confusing w.r.t. the IsAuthorizationEnabled 
condition. Maybe shift this block all the way left?


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt@91
PS3, Line 91: ADD_KUDU_TEST(master_sentry-itest RUN_SERIAL true PROCESSORS 4)
How did you arrive at the PROCESSORS value? And why use RUN_SERIAL here?


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@55
PS3, Line 55:   const ExternalMiniCluster::BindMode bind_mode;
Looks like you're not using this param in SetUp(). Nor do I think you need to; 
the default BindMode is already taken care of in mini_cluster.h.


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@71
PS3, Line 71: opts.enable_sentry = true;
What happens if enable_sentry=true but enable_kerberos=false? The MiniSentry is 
started but not configured to be used?



[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11956

to look at the new patch set (#3).

Change subject: [sentry] add mini Sentry to the external mini cluster
..

[sentry] add mini Sentry to the external mini cluster

This commit enables external mini cluster to be able to start with both
mini sentry and mini hms. A major challenge is the circular dependency
between the HMS and Sentry in terms of configuring each with the other's
address, including port which is currently discovered by polling lsof.
To work around it, we can do:

  * Start the Sentry service. Find out which port it's on.
  * Start the HMS, configured to talk to Sentry's port. Find out which
port it's on.
  * Restart the Sentry service on the same port, reconfigured to talk
to the HMS's port.

However, there could be a race condition that another program binds to
the same port in step 3. One option is to use SO_REUSEPORT socket option,
which permits multiple sockets to be bound to an identical socket address.
Though both Sentry and HMS are written in Java, and only JDK9 (and above)
supports this socket option[1]. Methods such as Java reflection can be
used to invoke private methods to set up this option, but is hacky.

This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that
external servers such as Sentry can pick up a unique bind address,
knowing that in doing so there'll never be any danger of a port
collision.

[1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR

Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
---
M src/kudu/hms/mini_hms.cc
M src/kudu/hms/mini_hms.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/master_sentry-itest.cc
M src/kudu/mini-cluster/CMakeLists.txt
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.cc
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/sentry/mini_sentry.cc
M src/kudu/sentry/mini_sentry.h
M src/kudu/sentry/sentry-test-base.h
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
15 files changed, 488 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/3
--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [location awareness] Register client with location when connect to cluster

2018-11-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11923 )

Change subject: [location_awareness] Register client with location when connect 
to cluster
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/client/client-internal.cc@714
PS6, Line 714:   location_ = connect_response.client_location();
Does location_ need to be protected by leader_master_lock_? Why or why not?


http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/master-test.cc
File src/kudu/master/master-test.cc:

http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/master-test.cc@1635
PS6, Line 1635:   ASSERT_OK(proxy_->ConnectToMaster(req, , ));
Should also ASSERT that !resp.has_error.


http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.h
File src/kudu/master/ts_descriptor.h:

http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.h@182
PS6, Line 182: class ClientDescriptor {
Why is GetClientLocation stored in this class? Do you envision adding more 
content to it in the near future?


http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

http://gerrit.cloudera.org:8080/#/c/11923/6/src/kudu/master/ts_descriptor.cc@411
PS6, Line 411: string location_temp;
I don't think this indirection is needed; GetLocationFromLocationMappingCmd 
won't modify the third parameter if it fails, so you could feed 'location' 
directly into it.



--
To view, visit http://gerrit.cloudera.org:8080/11923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0efb327293d86168a30b05305f69d011ad15587a
Gerrit-Change-Number: 11923
Gerrit-PatchSet: 6
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 20 Nov 2018 00:22:19 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] ksck: Add information about replica counts to plain ksck output

2018-11-19 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11958


Change subject: [tools] ksck: Add information about replica counts to plain 
ksck output
..

[tools] ksck: Add information about replica counts to plain ksck output

This adds some information about replica counts on tablet servers to the
output of ksck when ksck is in PLAIN_* mode (i.e. not JSON output). In
PLAIN_CONCISE mode, it outputs a 5-number summary of the distribution of
replicas and lists any outliers:

Tablet Replica Count Summary
   Statistic| Replica Count
+---
 Minimum| 1646
 First Quartile | 3672
 Median | 4075
 Third Quartile | 4242
 Maximum| 4600

Tablet Replica Count Outliers
 Type  |   UUID   |  Host  | 
Replica Count
---+--++---
 Small | cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 
1646

In PLAIN_FULL mode it outputs the replica count for every tablet server:

Tablet Replica Count by Tablet Server
   UUID   |  Host  | Replica 
Count
--++---
 09d6bf7a02124145b43f43cb7a667b3d | vc1314.halxg.cloudera.com:7050 | 100
 23d473f441674d43807fd9e631862bfd | vc1308.halxg.cloudera.com:7050 | 100
 2fb5cdac22b0418bb2df456906e42eb4 | vc1306.halxg.cloudera.com:7050 | 101
 70f7ee61ead54b1885d819f354eb3405 | vc1316.halxg.cloudera.com:7050 | 95
 72fcec63e96f4248ae39d114eb3cd7c9 | vc1318.halxg.cloudera.com:7050 | 94
 86708813b37a44bd8e92c711211c8685 | vc1310.halxg.cloudera.com:7050 | 96
 a662440710624c02bd5612df32cb0235 | vc1302.halxg.cloudera.com:7050 | 101
 c9633273962a4521a32d5e177a118a84 | vc1312.halxg.cloudera.com:7050 | 101
 cc32936bc8594948a04fd4240da36aed | vc1304.halxg.cloudera.com:7050 | 76

I also tested it against an empty cluster.

There's no unit tests added, just because our current testing setup for
ksck makes it really painful to add one for this, and it seemed easy
enough to check out manually. Probably, a follow up should straighten
out ksck-test to make testing ksck changes easier.

Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
---
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
2 files changed, 130 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/11958/1
--
To view, visit http://gerrit.cloudera.org:8080/11958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e5373033ab84c1e34f9519eb9bd4e04a652c595
Gerrit-Change-Number: 11958
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 


[kudu-CR] [location awareness] Register client with location when connect to cluster

2018-11-19 Thread Fengling Wang (Code Review)
Hello Will Berkeley, Alexey Serbin, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11923

to look at the new patch set (#6).

Change subject: [location_awareness] Register client with location when connect 
to cluster
..

[location_awareness] Register client with location when connect to cluster

This patch enables the client to get a location assigned after
connected to the cluster. After that, the client can compare its
location with the replica locations to select which replica to read.

Change-Id: I0efb327293d86168a30b05305f69d011ad15587a
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/master/master-test.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
10 files changed, 97 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/11923/6
--
To view, visit http://gerrit.cloudera.org:8080/11923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0efb327293d86168a30b05305f69d011ad15587a
Gerrit-Change-Number: 11923
Gerrit-PatchSet: 6
Gerrit-Owner: Fengling Wang 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [docs] add Hive Metastore integration

2018-11-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11798 )

Change subject: [docs] add Hive Metastore integration
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc
File docs/hive_metastore.adoc:

http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@46
PS2, Line 46: table identifier names
Nit: "table name identifiers" to match the other usages in this doc.


http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@67
PS2, Line 67: the Hive database and table
: name of Kudu tables
Should this be rewritten as "the database and table names of Kudu tables"?


http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@107
PS2, Line 107: * Add the `kudu-hive.jar` to the HMS classpath. Using 
HIVE_AUX_JARS_PATH is
Is this enough context for an admin to understand what needs to be done? I 
don't know anything about HMS so HIVE_AUX_JARS_PATH doesn't mean anything to 
me, but maybe it will to the average person familiar with HMS?


http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@107
PS2, Line 107: Using HIVE_AUX_JARS_PATH is
 : the most common and straightforward
This reads like it's missing a noun at the end. Common and straightforward path 
maybe?


http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@122
PS2, Line 122: ## Upgrading Existing Tables
I imagine this section may change depending on what the verdict is on external 
table support. But that can be addressed in a follow-on.


http://gerrit.cloudera.org:8080/#/c/11798/2/docs/hive_metastore.adoc@124
PS2, Line 124: When enabling the Hive Metastore integration on a Kudu cluster 
with existing
What happens if you don't run this process?



--
To view, visit http://gerrit.cloudera.org:8080/11798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01
Gerrit-Change-Number: 11798
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 19 Nov 2018 21:55:06 +
Gerrit-HasComments: Yes


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hao Hao has removed a vote on this change.

Change subject: [sentry] add mini Sentry to the external mini cluster
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11956 )

Change subject: [sentry] add mini Sentry to the external mini cluster
..


Patch Set 2: Verified+1

Unrelated flaky TraceEventCallbackTest.


-- 
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 19 Nov 2018 21:21:08 +
Gerrit-HasComments: No


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11956

to look at the new patch set (#2).

Change subject: [sentry] add mini Sentry to the external mini cluster
..

[sentry] add mini Sentry to the external mini cluster

This commit enables external mini cluster to be able to start with both
mini sentry and mini hms. A major challenge is the circular dependency
between the HMS and Sentry in terms of configuring each with the other's
address, including port which is currently discovered by polling lsof.
To work around it, we can do:

  * Start the Sentry service. Find out which port it's on.
  * Start the HMS, configured to talk to Sentry's port. Find out which
port it's on.
  * Restart the Sentry service on the same port, reconfigured to talk
to the HMS's port.

However, there could be a race condition that another program binds to
the same port in step 3. One option is to use SO_REUSEPORT socket option,
which permits multiple sockets to be bound to an identical socket address.
Though both Sentry and HMS are written in Java, and only JDK9 (and above)
supports this socket option[1]. Methods such as Java reflection can be
used to invoke private methods to set up this option, but is hacky.

This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that
external servers such as Sentry can pick up a unique bind address,
knowing that in doing so there'll never be any danger of a port
collision.

[1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR

Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
---
M src/kudu/hms/mini_hms.cc
M src/kudu/hms/mini_hms.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/master_sentry-itest.cc
M src/kudu/mini-cluster/CMakeLists.txt
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.cc
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/sentry/mini_sentry.cc
M src/kudu/sentry/mini_sentry.h
M src/kudu/sentry/sentry-test-base.h
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
15 files changed, 488 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/2
--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [docs] add Hive Metastore integration

2018-11-19 Thread Hao Hao (Code Review)
Hao Hao has uploaded a new patch set (#2) to the change originally created by 
Dan Burkert. ( http://gerrit.cloudera.org:8080/11798 )

Change subject: [docs] add Hive Metastore integration
..

[docs] add Hive Metastore integration

Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01
---
A docs/hive_metastore.adoc
1 file changed, 154 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/11798/2
--
To view, visit http://gerrit.cloudera.org:8080/11798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12939c8f2245450ad46898c2050451b090c7ea01
Gerrit-Change-Number: 11798
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [sentry] add mini Sentry to the external mini cluster

2018-11-19 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11956


Change subject: [sentry] add mini Sentry to the external mini cluster
..

[sentry] add mini Sentry to the external mini cluster

This commit enables external mini cluster to be able to start with both
mini sentry and mini hms. A major challenge is the circular dependency
between the HMS and Sentry in terms of configuring each with the other's
address, including port which is currently discovered by polling lsof.
To work around it, we can do:

  * Start the Sentry service. Find out which port it's on.
  * Start the HMS, configured to talk to Sentry's port. Find out which
port it's on.
  * Restart the Sentry service on the same port, reconfigured to talk
to the HMS's port.

However, there could be a race condition that another program binds to
the same port in step 3. One option is to use SO_REUSEPORT socket option,
which permits multiple sockets to be bound to an identical socket address.
Though both Sentry and HMS are written in Java, and only JDK9 (and above)
supports this socket option[1]. Methods such as Java reflection can be
used to invoke private methods to set up this option, but is hacky.

This patch extends the UNIQUE_LOOPBACK usage in mini cluster, so that
external servers such as Sentry can pick up a unique bind address,
knowing that in doing so there'll never be any danger of a port
collision.

[1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR

Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
---
M src/kudu/hms/mini_hms.cc
M src/kudu/hms/mini_hms.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/master_sentry-itest.cc
M src/kudu/mini-cluster/CMakeLists.txt
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.cc
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/sentry/mini_sentry.cc
M src/kudu/sentry/mini_sentry.h
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
14 files changed, 484 insertions(+), 62 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/11956/1
--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] KUDU-2038: Support bitmap index

2018-11-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11722 )

Change subject: KUDU-2038: Support bitmap index
..


Patch Set 6:

> Here's a design doc that LiFu shared with me via Slack. It was originally a 
> docx, but I've uploaded it as a gdoc to facilitate comments. If there's 
> trouble accessing it, I can funnel comments from the gdoc to gerrit.
>
> https://docs.google.com/document/d/1U_RFs8piw30K7fjyl0m08v2f-eiIjOUOkJRGGXrCoVY/edit?usp=sharing

Thanks, Andrew.

Lifu, I've left some comments on this gdoc. Could we continue this design 
discussion there? Let us know if you're not able to and we'll find a different 
approach.


--
To view, visit http://gerrit.cloudera.org:8080/11722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0edaa0ef1dba2dbce85ebf15f0a731e4939a7860
Gerrit-Change-Number: 11722
Gerrit-PatchSet: 6
Gerrit-Owner: helifu 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: helifu 
Gerrit-Comment-Date: Mon, 19 Nov 2018 17:51:46 +
Gerrit-HasComments: No