Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11956 )

Change subject: [sentry] add mini Sentry to the external mini cluster
......................................................................


Patch Set 3:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@15
PS2, Line 15:   * Start the Sentry service. Find out which port it's on.
Good explanation, but I'd make two small tweaks to improve readability:
1. Make it a numbered list (i.e 1, 2, 3) in order to emphasize the order of 
events
2. For the first step, point out that the Sentry service is configured without 
any knowledge of an HMS service. It'll start up but will be useless otherwise.


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@21
PS2, Line 21: that
Nit: where


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@22
PS2, Line 22: same port in step 3
Nit: "same port during the restart in step 3."


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@24
PS2, Line 24: Though both Sentry and HMS are written in Java, and only JDK9 
(and above)
            : supports this socket option[1]
Nit: I'd rewrite as "Although both Sentry and HMS are written in Java, only 
JDK9 (and above) supports this socket option[1]".


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@26
PS2, Line 26: is hacky.
Nit: "they are hacky"


http://gerrit.cloudera.org:8080/#/c/11956/2//COMMIT_MSG@33
PS2, Line 33: [1]: 
https://docs.oracle.com/javase/9/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR
Wrong link here; didn't you mean the SO_REUSEPORT anchor?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc
File src/kudu/hms/mini_hms.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@84
PS2, Line 84:     sentry_service_principal_ = *sentry_service_principal;
If you're going to go through the trouble to wrap sentry_service_principal in 
an optional to reflect its optionality in the EnableSentry() function, 
shouldn't that optionality carry over into the sentry_service_principal_ state 
as well, so that methods like IsAuthorizationEnabled() are more clear?

Then you could also make this a simpler optional<string> and std::move it into 
place.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@164
PS2, Line 164:   Status wait = WaitForTcpBind(hms_process_->pid(), &port_, none,
Should use /*foo=*/ inline comments to help explain what 'none' means.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@306
PS2, Line 306:   <property>
             :     <name>hive.sentry.conf.url</name>
             :     <value>file://$1/hive-sentry-site.xml</value>
             :   </property>
Shouldn't this also be conditioned on IsAuthorizationEnabled? Otherwise there 
will have been no hive-sentry-site.xml file.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@328
PS2, Line 328:     //    service.
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@355
PS2, Line 355:                                          sentry_address_);
Where is this substituted? I don't see a $7 in kHiveFileTemplate.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@359
PS2, Line 359: HNS
HMS


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@363
PS2, Line 363:     //     Set of service users which is safe to be excluded 
from the
             :     //     Sentry authorization checks.
Nit: I'd make this a bit more explicit "Set of service users whose access will 
be excluded from Sentry authorization checks."


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/hms/mini_hms.cc@366
PS2, Line 366:   <configuration>
The indentation here is pretty confusing w.r.t. the IsAuthorizationEnabled 
condition. Maybe shift this block all the way left?


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/CMakeLists.txt@91
PS3, Line 91: ADD_KUDU_TEST(master_sentry-itest RUN_SERIAL true PROCESSORS 4)
How did you arrive at the PROCESSORS value? And why use RUN_SERIAL here?


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@55
PS3, Line 55:   const ExternalMiniCluster::BindMode bind_mode;
Looks like you're not using this param in SetUp(). Nor do I think you need to; 
the default BindMode is already taken care of in mini_cluster.h.


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@71
PS3, Line 71:     opts.enable_sentry = true;
What happens if enable_sentry=true but enable_kerberos=false? The MiniSentry is 
started but not configured to be used?


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@74
PS3, Line 74:     opts.num_masters = 1;
            :     opts.num_tablet_servers = 1;
These are the defaults; you can omit this.


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@93
PS3, Line 93: TestFoo
Rename to something more useful.


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@111
PS3, Line 111: exist
Nit: exists


http://gerrit.cloudera.org:8080/#/c/11956/3/src/kudu/integration-tests/master_sentry-itest.cc@116
PS3, Line 116:   const auto& params = GetParam();
Only used in one place below, so just inline GetParam().


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@159
PS2, Line 159:   // If true, set up a Sentry service as part of this 
ExternalMiniCluster.
This should probably be reflected in the CLI tool, in the control shell.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@160
PS2, Line 160:   bool enable_sentry;
Will we need a SentryMode to support all three cases? That is, "no sentry", 
"sentry started but no integration", and "sentry started and integration 
enabled"?


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@225
PS2, Line 225:   // Same as above but for a external server.
Nit: provide an example of what an "external server" is (like kdc or whatever).


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.h@419
PS2, Line 419:   // Index to identify the number of external servers being 
started.
             :   std::atomic<uint64_t> external_server_index_;
Seems like overkill for now, given that Sentry (and maybe HMS) are the only 
possible external servers, so you can assign them IDs statically (i.e. HMS is 
0, Sentry is 1, etc.).


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.cc
File src/kudu/mini-cluster/external_mini_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/external_mini_cluster.cc@257
PS2, Line 257:     // Restart Sentry with the HMS address.
You can avoid the restart altogether:
1. Pick a bind IP for HMS. Let it use its default port, or pick a non-ephemeral 
and non-privileged port at random and define it in the code.
2. Repeat step #1 for Sentry.
3. Start HMS, using the Sentry bind IP/port you picked in step 2.
3. Start Sentry, using the HMS bind IP/port you picked in step 1.

Because you're using UNIQUE_LOOPBACK addresses, steps #1 and #2 shouldn't yield 
any port collisions. For systems that don't support UNIQUE_LOOPBACK (like 
macOS) you might have to jump through some hoops though.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/mini_cluster.cc
File src/kudu/mini-cluster/mini_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/mini-cluster/mini_cluster.cc@53
PS2, Line 53:       switch (type) {
            :         case MASTER:
            :           idx = kServersMaxNum - index;
            :           break;
            :         case TSERVER:
            :           idx = index + 1;
            :           break;
            :         case EXTERNAL_SERVER:
            :           idx = kServersMaxNum / 3 + index;
            :           break;
            :         default:
            :           LOG(FATAL) << type;
            :       }
Seems like the goal of this is to partition the 256k "index space" into three 
portions, one for each enum value of DaemonType. If that's true, could you 
describe that in a comment?

Also, what happens if a daemon "crosses over" into the portion reserved for a 
different DaemonType? Seems like a theoretical concern given the size of our 
test clusters, but am curious nonetheless.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/security/test/mini_kdc.cc
File src/kudu/security/test/mini_kdc.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/security/test/mini_kdc.cc@157
PS2, Line 157: none
Doc with inline comment.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/sentry/mini_sentry.h
File src/kudu/sentry/mini_sentry.h:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/sentry/mini_sentry.h@93
PS2, Line 93:   std::string ip_ = "0.0.0.0";
Does this inline initialization in the class declaration work? I thought it's 
only OK for POD types.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/sentry/mini_sentry.cc
File src/kudu/sentry/mini_sentry.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/sentry/mini_sentry.cc@251
PS2, Line 251:     <name>sentry.service.server.rpc-address</name>
Doc this.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.h
File src/kudu/util/test_util.h:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.h@139
PS2, Line 139:
Should doc the new field.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.cc
File src/kudu/util/test_util.cc:

http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.cc@410
PS2, Line 410:   // Subsequent lines come in pairs.
Should probably add that in each pair, we ignore the first half of the pair 
(the file descriptor number).


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.cc@415
PS2, Line 415: use
Nit: "we use", to maintain the same voice as in the beginning of this 
paragraph. Below too.


http://gerrit.cloudera.org:8080/#/c/11956/2/src/kudu/util/test_util.cc@416
PS2, Line 416: match
Nit: matches



--
To view, visit http://gerrit.cloudera.org:8080/11956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f02e6085bd239570d10ec629f48856d37ed6e59
Gerrit-Change-Number: 11956
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 20 Nov 2018 01:15:12 +0000
Gerrit-HasComments: Yes

Reply via email to