Adar Dembo has posted comments on this change.

Change subject: WIP [security] load/store public TSK in the system table
......................................................................


Patch Set 5:

(34 comments)

I did a first pass. I think the most important question to settle is whether 
followers need to maintain an up-to-date TSK soft state. I'd prefer the answer 
to be "no", as it'd simplify this code a lot.

http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

Line 192:     if (s.IsNetworkError() || s.IsServiceUnavailable()) {
Probably doesn't belong in this patch, right? I think you put this in a 
different one?


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/integration-tests/token_signer-itest.cc
File src/kudu/integration-tests/token_signer-itest.cc:

Line 85:   SignedTokenPB MakeToken() {
Declare as static?


Line 125:   gscoped_ptr<MiniCluster> cluster_;
unique_ptr


Line 141:   SleepFor(MonoDelta::FromSeconds(1));
Why is this needed?


PS5, Line 145: have
has


PS5, Line 145: folower
follower


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 204: DEFINE_int64(catalog_manager_sys_table_propagation_seconds, 60,
Not really sure what this means.


Line 208: // TODO(PKI): add flag tags
Indeed.


Line 342:     CHECK(token_signer_);
could use CHECK_NOTNULL in the initializer list above.


Line 345:   // The entry_id corresponds to the sequence number of the TSK key.
But we're ignoring the entry_id, so why is this comment useful?


Line 359:     LOG(INFO) << "Imported TSPK " << tspk.key_seq_num() << " from 
system table";
This may be too noisy. How many non-expired entries would we typically expect?


Line 365:   const set<int64_t>& loaded_key_seq_numbers() const {
We only ever get the first number, so change this method to expose that rather 
than the entire set.

Alternatively, track the earliest one explicitly (i.e. initialize it to -1, use 
std::min() or whatever to update it with each visit) and forgo the set entirely.


PS5, Line 371:  // Epoch time
What does this mean?


PS5, Line 469: firther
further


PS5, Line 489: firther
further


Line 497:           catalog_manager_->LoadTskEntries(&last_seen_tsk_entry_id);
This runs every second. Isn't that super wasteful?


Line 500:     } while (false);
Why this change? We can have an extra scope (for lock acquisition/release) 
without a do/while.


PS5, Line 3269: There's a bit of subtlety here:
Yeah, this is complicated. I'll take another look at this later; maybe ask Dan 
or Todd to review it too.


Line 3290:       RETURN_NOT_OK(signer->GenerateSigningKey(&next_tsk_));
Why are we writing to next_tsk_ before persisting, then clearing it on failure? 
Why not just write to a local variable and promote to next_tsk_ on success? 
Seems more straight-forward.


Line 3296:       const Status s = sys_catalog_->AddTskEntries({sys_entry});
Can we use security::TokenSigningPublicKeyPB in the interface to this function, 
rather than SysTskEntryPB?

Also, if we only ever add one at a time, let's not use a vector-based interface.


Line 3320:   TskEntryLoader loader(signer, WallTime_Now());
A certain amount of imprecision is OK here, right? Masters' clocks may not be 
perfectly synchronized, so on leader election it's possible that a TSK can 
either "lose" or "gain" some expiration time, depending on the clock difference 
between the previous leader master and the new one. That's OK, right?


Line 3322:   const set<int64_t> seq_numbers(loader.loaded_key_seq_numbers());
Could be a cref.


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

Line 565:   //Status LoadTokenSigningInfo(std::vector<SysTskEntryPB>* 
new_entries);
Why is this commented out?


PS5, Line 773: pubic
public


PS5, Line 773: follower master
follower masters


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/master/master.proto
File src/kudu/master/master.proto:

Line 200: // The on-disk entry in the sys.catalog table ("metadata" column) to 
represent
This isn't used (and is commented out); can it be removed? I think it'd be 
simpler to figure out the "newest" SysTskEntryPB by scanning all of them; 
storing a pointer to the latest feels like a premature optimization.


Line 204: // be single entry of this type in the sys.catalog table. 
Trailing whitespace here.


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/master/sys_catalog.cc
File src/kudu/master/sys_catalog.cc:

Line 478: // TSK-related methods
This isn't using the same format as the other boundaries, plus it's splitting 
the "table-related methods" section into multiple pieces.


Line 490: void SysCatalogTable::ReqAddTskEntry(WriteRequestPB* req,
Could be inlined into AddTskEntries(), since it's just used in that one place.


Line 511: void SysCatalogTable::ReqDeleteTskEntry(WriteRequestPB* req,
Not used?


Line 520:   CHECK_OK(row.SetStringNoCopy(
Is NoCopy() correct? Doesn't the string from TskSeqNumberToEntryId() go out of 
scope the moment the call completes?


Line 568:   unique_ptr<Arena> range_arena;
Can't reuse the arena below?


Line 576:     const string key_upper_boundary = key_low_boundary + ".";
Confused by this; if we want to scan from key_low_boundary and up, shouldn't we 
just pass 'nullptr' as upper?

Also, key_util:: has methods for incrementing cell values.


http://gerrit.cloudera.org:8080/#/c/5935/5/src/kudu/master/sys_catalog.h
File src/kudu/master/sys_catalog.h:

PS5, Line 148: If the 'low_entry_id'
             :   // is not empty, it is used as the scan's exclusive low 
boundary for the
             :   // 'entry_id' column.
This feels premature. The number of TSK rows is pretty low, right? Can't we 
just scan them all and filter out the ones we don't care about.


-- 
To view, visit http://gerrit.cloudera.org:8080/5935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie91d4129bda0ca49e81988c28385895a2abcd201
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to