Hi Christopher,

Thanks for all the questions. I want to add some details about the internode mTLS connection and the internode mTLS authenticator that we are adding in this patch.

SSL/TLS configuration for internode connections lives in the “server_encryption_options” section of cassandra.yaml:

server_encryption_options:
  internode_encryption: none
  keystore: conf/.keystore
  keystore_password: cassandra
  outbound_keystore: conf/.keystore
  outbound_keystore_password: cassandra
  truststore: conf/.truststore
  truststore_password: cassandra

server_encryption_options.truststore - contains the trusted root certificates. During the TLS handshake, only certificates signed by these CAs are trusted.

server_encryption_options.keystore - contains the server certificate used by a node when it acts as the server in an internode connection. This certificate is used to create the SSL context for inbound connections.

server_encryption_options.outbound_keystore - contains the client certificate used by a node when it acts as the client in an internode connection. This certificate is used to create the SSL context for outbound connections.

The internode authenticator in this patch trusts certificates whose identity is the same as the identity of the certificate that the node uses for making outbound connections. As Dinesh mentioned in the comment above, this avoids configuring individual trusted identities on each node. Imagine a 1000-node cluster: if a new node joins the ring, we would need to add the new node’s identity to all 1000 existing nodes before the new node could make internode connections. With this internode mTLS authenticator, we don’t need to update the other 1000 nodes, because they will trust the new node as long as its identity is the same as theirs.

During an internode connection between two nodes, node1 -> node2, node1 acts as the client (it creates the outbound connection) and node2 acts as the server (it receives the inbound connection).
In this scenario, node1 uses “server_encryption_options.outbound_keystore” to create the SSL context that contains the client certificate, whereas node2 uses “server_encryption_options.keystore” to create the SSL context that contains the server certificate. The TLS handshake between the two nodes happens first, and then the internode authenticator comes into the picture. On node2 (the server), after the TLS handshake, node2 lists the certificates in its outbound keystore, extracts the identities from them, and trusts only those identities. If node1 presents a certificate whose identity is among the extracted identities, node2 accepts the connection. When node2 makes an outbound connection it uses the certificate in outbound_keystore, which means that node2 trusts other nodes whose identities are the same as its own.

Coming to your questions:

1. server_encryption_options.truststore contains only trusted roots; these are used in the SSL handshake to verify that the presented certificate is signed by one of these trusted CAs. So we should use server_encryption_options.outbound_keystore, which contains client certificates signed by a CA and carries identity information such as a SPIFFE ID in the SAN.
2. Yes, each trusted certificate (not to be confused with root certificates) has to be added to the outbound keystore.
3. Right now, there is no pattern matching, but one can modify the “SpiffeCertificateValidator” to support pattern matching if needed.
4. `cassandra-mtls.yaml` has a reference configuration for both the internode and client mTLS authenticators. There is also documentation in all the authenticators and in cassandra.yaml on how to configure them. We would be happy to add any other documentation that might be helpful.

Thanks,
Jyothsna.

On 2023/06/02 21:28:22 Dinesh Joshi wrote:
>
> On Jun 2, 2023, at 1:56 PM, Christopher Bradford <br...@gmail.com> wrote:
> >
> > I am not sure what you mean by this would be used alongside internode and
> > client TLS?
> > The mutual TLS authentication allows the server to authenticate
> > the client's identity using a client TLS certificate. The authenticators
> > we're adding enable this functionality. There isn't an expectation that the
> > same certificates be used. In fact, clients should not use the same
> > certificates as the internode.
> >
> > My apologies if questions (1 and 2) were a bit convoluted. I'm going to
> > walk through both client and internode below as I perceive the PR.
> >
> > Client TLS connections have the client certificate checked against the
> > trust store (if client_encryption_options.require_client_auth is set to
> > true). It looks like the authenticator in the PR checks the identity in the
> > subject alternative name part of the certificate against identity / roles
> > relationships specified via CQL. This all makes sense to me, but to be
> > clear my question was whether the certificate used by the client is the
> > same as the certificate used to secure the connection. My initial reading
> > here is that yes the certificates are the same, there's only ever one
> > certificate, we're just looking in two locations for trust. We would first
> > be checking the certificate's trust before the request is ever processed
> > (is the CA or this certificate in the trust store). Then the SAN of the
> > certificate is utilized to determine who the request is being performed by
> > which is then matched up with a role and request processing continues as
> > usual.
>
> Correct.
> >
> > Internode communication is where I started to get confused. It wasn't clear
> > where we were authorizing identities as trusted. We still have the
> > server_encryption_options.require_client_auth (similar to
> > client_encryption_options.require_client_auth) boolean to force checking
> > the trust of the provided certificate against our trust store. I was
> > looking for a way to specify either an allowed list of identities or a
> > pattern to match on.
> > Rereading the PR showed me that we are extracting the
> > valid identities from the outbound keystore (reference link). This doesn't
> > seem correct as the associated documentation in cassandra.yaml indicates
> > this is where the public and private key information is stored for a node's
> > outbound (client) connections to other nodes. Should this instead be the
> > server_encryption_options.truststore alongside trusted CA certificates? In
> > either case it seems as though we would need to load the public certificate
> > for all servers in the cluster (including the specified SPIFFE SAN). Is
> > that correct? This means there's no way (yet) to match against a specific
> > pattern of identities, instead they must all be explicitly allowed.
>
> The reason we use the keystore is that the node extracts its own identity and
> expects other nodes in the cluster to share the same identity. This default
> behavior makes it easy to avoid configuring individual identities of nodes in
> the cluster. It's critical to recognize that if we had a separate identity
> for each node in the cluster, then we would need to update all nodes in the
> cluster when a new node is added or removed. This way all nodes in the
> cluster can have a shared identity while simultaneously preventing
> unnecessary operational pain of adding and removing identities each time a
> node is added or removed from the cluster.
> >
> > Back to my original question, it appears as though we are using a single
> > certificate for internode where we first check the trust chain of the
> > certificate then check the subject against a valid list in some store. This
> > is the same behavior for client certificates. The genesis of my question
> > was whether there would be separate certificates for the SPIFFE work on top
> > of the certificates used for the base TLS communication.
> > There are not
> > (which is good IMO) instead a SAN is expected to be included in the
> > certificate which is then used for checking identity. (Note internode and
> > client certificates may still be separate and in most cases should be).
>
> Correct.
>
> > Given I've sorted out questions 1 and 2 in my mental model, question 3 is a
> > little different. I think the question is more how do I manage certificates
> > and trust given these changes? I think this can be answered with:
> >
> > For clients we provide a CA certificate in the client trust store and
> > identities via the new CQL syntax for mapping roles to SPIFFE identities.
> > Easy.
> >
> > Internode communication is handled by adding a CA certificate to the server
> > trust store and outbound node client certificates to a store (which is not
> > clear at the moment per above) which include a SPIFFE identity as part of
> > their SAN.
>
> I hope my explanation above clarifies this confusion.
>
> >
> > Please let me know if this is accurate.
> >
> > To recap open questions from above:
> > • Should we be using server_encryption_options.outbound_keystore or
> > server_encryption_options.truststore for maintaining the list of trusted
> > identities? The current PR uses the former, I propose the latter is more
> > appropriate.
>
> I hope my explanation above clarifies this confusion.
>
> > • Does each trusted certificate (containing an SPIFFE identity SAN)
> > need to be loaded into the location specified in question 1 above?
>
> no.
>
> > • Is there any way to match on a pattern of an identity?
>
> not necessary.
>
> >
> > For example my SPIFFE identity may be
> > spiffe://example.com/payments/cassandra/1, in this example cassandra/1 is
> > the identifier for the node and I want to trust anything that matches
> > spiffe://example.com/payments/cassandra/*. My initial reading here is that
> > it is not possible at this time.
> > We could use a
>
> All nodes in a cluster should have the same SPIFFE as they belong to the same
> cluster. This simplifying assumption helps reduce the operational pain. So
> ideally one would embed spiffe://example.com/payments/billingcluster.
>
> > • From a documentation perspective, how would you describe a reference
> > implementation? I think the project could benefit from a reference
> > configuration including mTLS for clients and nodes with separate
> > certificates for all components / nodes alongside a sample SPIFFE identity.
> > (This question may be a bit outside of scope for the particular ticket /
> > feature, but it's something to noodle on)
>
> Yes, we can add more documentation with an example to help illustrate the use.
>
> Thanks,
>
> Dinesh
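
P.S. The identity check described at the top of this message (node2 lists the certificates in its outbound keystore, extracts their identities, and accepts a peer only if the peer's certificate carries one of those identities) can be sketched in Java roughly as below. This is only an illustration under assumptions, not the code in the patch: the class SharedIdentityCheck and its method names are hypothetical, the identity is assumed to be a SPIFFE URI stored as a URI entry in the certificate's SAN, and the keystore is assumed to be already loaded. It uses only standard JDK calls (KeyStore.aliases, KeyStore.getCertificate, X509Certificate.getSubjectAlternativeNames).

```java
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.security.cert.Certificate;
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; NOT the authenticator class from the patch.
final class SharedIdentityCheck {
    // GeneralName tag for a URI entry in the subjectAltName extension (RFC 5280).
    private static final int SAN_TYPE_URI = 6;

    /**
     * Collects SPIFFE URIs from SAN entries in the shape returned by
     * X509Certificate.getSubjectAlternativeNames(): each entry is [Integer type, value].
     */
    static Set<String> spiffeIds(Collection<List<?>> sanEntries) {
        Set<String> ids = new HashSet<>();
        if (sanEntries == null) {
            return ids; // the certificate has no SAN extension
        }
        for (List<?> entry : sanEntries) {
            if (entry.size() >= 2 && Integer.valueOf(SAN_TYPE_URI).equals(entry.get(0))) {
                String uri = String.valueOf(entry.get(1));
                if (uri.startsWith("spiffe://")) {
                    ids.add(uri);
                }
            }
        }
        return ids;
    }

    /** Identities of every X.509 certificate in the already-loaded outbound keystore. */
    static Set<String> ownIdentities(KeyStore outboundKeystore)
            throws KeyStoreException, CertificateParsingException {
        Set<String> ids = new HashSet<>();
        for (String alias : Collections.list(outboundKeystore.aliases())) {
            Certificate cert = outboundKeystore.getCertificate(alias);
            if (cert instanceof X509Certificate) {
                ids.addAll(spiffeIds(((X509Certificate) cert).getSubjectAlternativeNames()));
            }
        }
        return ids;
    }

    /** True if the peer certificate carries at least one identity this node also uses. */
    static boolean isTrustedPeer(Set<String> ownIds, Collection<List<?>> peerSanEntries) {
        return !Collections.disjoint(ownIds, spiffeIds(peerSanEntries));
    }
}
```

With a shared identity such as spiffe://example.com/payments/billingcluster, a new node is accepted by every existing node without any per-node configuration change, which is the operational benefit discussed in the thread.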