Alexey Serbin has posted comments on this change. Change subject: [security] tailored TokenSigner for system catalog ......................................................................
Patch Set 10: (23 comments) http://gerrit.cloudera.org:8080/#/c/5930/10/src/kudu/master/master.cc File src/kudu/master/master.cc: PS10, Line 116: FLAGS_token_signing_key_validity_seconds, : FLAGS_token_signing_key_rotation_seconds, : FLAGS_token_signing_key_propagation_seconds)); : > nit: weird indentation Done PS10, Line 124: RETURN_NOT_OK(token_signer_->TryAddKey(&key)); : if (key) { : RETURN_NOT_OK(token_signer_->AddKey(std::move(key))); > this looks kind of strange to me. If "TryAddKey" succeeds, then why are we Agree, let me rename TryAddKey ---> CheckNeedKey http://gerrit.cloudera.org:8080/#/c/5930/10/src/kudu/security/token_signer.cc File src/kudu/security/token_signer.cc: PS10, Line 56: DEFINE_int64(token_signing_key_rotation_seconds, 60 * 60 * 24 * 6, : "number of seconds prior of the expiration time of current " : "signing key when next key should be activated"); : T > this config seems a bit "backwards" to me. I would expect if I set this to Since we don't track the creation time of a key, but the expiration time, this seemed to me as more reliable way to describe and track the rotation schedule. Otherwise, in case of multi-master setup there might be extra keys generated when leadership switches from one master to another. I.e., for key rotation I think it's better to rely less on the ephemeral run-time state of master processes. Line 62: DEFINE_int64(token_signing_key_propagation_seconds, 60, > I'm a little confused by this flag. Shouldn't we just check that tsk_valid In essence we need to make sure that the public part of the new key appears in the master-->tserver HB response before we make the key active. I.e., there should be some time delta after a new was generated prior to making it active. As for the relation to the diagram, that diagram assumes rotation interval is equal to the propagation interval. The translation map between the terms of the diagram in the block comment of the header file and these parameters: 'validity period' <==> 'FLAGS_token_signing_key_validity_seconds' 'rotation interval' <==> 'FLAGS_token_signing_key_validity_seconds - FLAGS_token_signing_key_rotaion_seconds' N/A <==> 'FLAGS_token_signing_key_propagation_seconds' Also, varying token_signing_key_propagation_seconds parameter helps in testing scenarios. PS10, Line 51: DEFINE_int64(token_signing_key_validity_seconds, 60 * 60 * 24 * 7, : "number of seconds that a token signing key is valid for"); : TAG_FLAG(token_signing_key_validity_seconds, advanced); : TAG_FLAG(token_signing_key_validity_seconds, experimental); : : DEFINE_int64(token_signing_key_rotation_seconds, 60 * 60 * 24 * 6, : "number of seconds prior of the expiration time of current " : "signing key when next key should be activated"); : TAG_FLAG(token_signing_key_rotation_seconds, advanced); : TAG_FLAG(token_signing_key_rotation_seconds, experimental); : : DEFINE_int64(token_signing_key_propagation_seconds, 60, : "time necessary to propagate public part of TSK " : "via tserver-to-master heartbeats with some margin"); : TAG_FLAG(token_signing_key_propagation_seconds, advanced); : TAG_FLAG(token_signing_key_propagation_seconds, experimental); > I think these should be defined in master.cc where they're used - was surpr It seems I ended up putting it into wrong place -- I started with having them in master.cc. I'll move these, thank you for the clarification. PS10, Line 89: // Reload the TSKs (presumably originated from the system catalog table). : // This method can be called multiple times. > this doc should probably be in the header, not on the impl. Done PS10, Line 91: &k > nit: & goes with type Done PS10, Line 111: if (next_key_seq_num_ <= seq_num) { : // Advance the key sequence number, if needed. : next_key_seq_num_ = seq_num + 1; : } > next_key_seq_num_ = std::max(next_key_seq_num_, seq_num + 1) Done PS10, Line 127: tsk_deque_.clear(); : for (int i = 0; i < 2; ++i) { : auto it = tsk_by_seq.begin(); : if (it != tsk_by_seq.end()) { : tsk_deque_.emplace_back(it->second.release()); : tsk_by_seq.erase(tsk_by_seq.begin()); : } : } > hrm, could this be written more simply as: Sure, that's much better. Probably, that was too late in night -- that's why such crappy piece of code. Line 139: Status TokenSigner::GenerateAuthnToken(string username, > I still think we're going to end up refactoring this back into an AuthnToke ok, sgtm Line 152: lock_guard<RWMutex> l(lock_); > this can be a shared_lock, right? otherwise we are serializing signatures Good catch. Will fix. Line 164: lock_guard<RWMutex> l(lock_); > can be shared Done Line 177: lock_guard<RWMutex> l(lock_); > I think this could work with just the read lock, with the exception of the yep, need to protect next_key_seq_num_. I'll move the RSA key generation out from the lock. Sequence number gap is not an issue, right. Line 187: return Status::OK(); > maybe returning Status::Aborted() or Status::AlreadyPresent() would be a cl It seems you have been confused by the name of the method, sorry. The interface is to return nullptr wrapped into unique_ptr in that case (need to return non-null in if a key is needed). Non-OK status is reserved for other unexpected errors. This helps to keep the call sites cleaner. Also, the methods is renamed into CheckNeedKey() http://gerrit.cloudera.org:8080/#/c/5930/10/src/kudu/security/token_signer.h File src/kudu/security/token_signer.h: PS10, Line 63: TryPushKey > name doesn't match impl woops, sorry -- there have been 2 or 3 revisions of the TokenSigner interface. I will fix this. Line 113: // RETURN_NOT_OK(ts.AddKey(std::move(key))); > One thing that I think we should document in this class is whether it's OK Yep: holes in key sequence numbering is not an issue. Every time you call CheckNeedKey() a new sequence number is assigned to the key output by the method, if any. I will add a note about key sequence numbering. PS10, Line 122: // { : // unique_ptr<TokenSigningPrivateKey> key; : // RETURN_NOT_OK(ts.TryAddKey(&key)); : // if (key) { : // // Store the newly generated key into the system table. : // ... : // : // // Add the key into the queue of the TokenSigner. : // RETURN_NOT_OK(ts.AddKey(std::move(key))); : // } : // } : // // Check and switch to the next key, if it's time. : // RETURN_NOT_OK(ts.TryRotateKey()); > this is the exact same sequence as we have above, isn't it? I think it woul Yep, eventually this ended up like this (and I think it's good). Thank you for pointing at this. PS10, Line 153: eximine > typo Done PS10, Line 158: TryAddKey > per comments elsewhere, I think a better name for this might be something l Yep, good point. How about CheckNeedKey ? Line 188: std::unique_ptr<TokenVerifier> verifier_; > I think the single-ownership model here is going to be a little problematic Yep, that's a point to discuss. From one side, in the case of the master I'd prefer to not allow any other component modify state of TokenVerifier because of concerns over consistency of key sequence numbers. Line 197: std::deque<std::unique_ptr<TokenSigningPrivateKey>> tsk_deque_; > add a doc here? which one is the current one? the head of the queue? Done http://gerrit.cloudera.org:8080/#/c/5930/10/src/kudu/security/token_verifier.h File src/kudu/security/token_verifier.h: PS10, Line 53: static const int64_t kPreStartSeqNum; > doc Done PS10, Line 60: kPreStartSeqNum > is the constant for this actually helping clarity or obfuscating it? I just needed a constant to refer to as a parameter by default in ExportKeys and also in various test scenarios for validation of the key sequence numbers. Do you mean it's better just drop this and go with some implicit constant like 0 or -1? Would it help to add clarity? -- To view, visit http://gerrit.cloudera.org:8080/5930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie2417e2ccba6a1114db366b2f642f95362bf479c Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes