I’m not suggesting enforcing network encryption, just prohibiting 
unauthenticated connections from peers so that we do not effectively offer a 
decrypt-all-the-data endpoint.

If as an operator you know that it is impossible for unauthenticated peers to 
open a connection due to your network configuration, then we can offer some 
special SafeAllowAllInternodeAuthenticator that permits things to proceed as 
normal, but we should definitely ensure operators have considered internode 
authentication in the case we have at rest encryption. It’s far too easy for 
this to be overlooked otherwise, and for an operator to thereby fail to protect 
their data.


From: Bowen Song <bo...@bso.ng.INVALID>
Date: Tuesday, 16 November 2021 at 12:33
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption
I think a warning message is fine, but Cassandra should not enforce
network encryption when on-disk encryption is enabled. It's definitely a
valid use case to have Cassandra over IPSec without enabling TLS.

On 16/11/2021 12:02, bened...@apache.org wrote:
> We already have the facility to authenticate peers, I am suggesting we should 
> e.g. refuse to enable encryption if there is no such facility configured for 
> a replica, or fail to start if there is encrypted data present and no 
> authentication facility configured.
>
> It is in my opinion much more problematic to remove encryption from data and 
> ship it to another node in the network than it is to ship data that is 
> already unencrypted to another node on the network. Either is bad, but it is 
> probably fine to leave the unencrypted case to the cognizance of the operator 
> who may be happy relying on their general expectation that there are no 
> nefarious actors on the network. Encrypting data suggests this is not an 
> acceptable assumption, so I think we should make it harder for users that 
> require encryption to accidentally misconfigure in this way, since they 
> probably have higher security expectations (and compliance requirements) than 
> users that do not encrypt their data at rest.
>
>
> From: Bowen Song <bo...@bso.ng.INVALID>
> Date: Tuesday, 16 November 2021 at 11:56
> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption
> I think authenticating a receiving node is important, but it is perhaps
> not in the scope of this ticket (or CEP if it becomes one). This applies
> to not only encrypted SSTables, but also unencrypted SSTables. A
> malicious node can join the cluster and send bogus requests to other
> nodes is a general problem not specific to the on-disk encryption.
>
> On 16/11/2021 10:50, bened...@apache.org wrote:
>> I assume the key would be decrypted before being streamed, or perhaps 
>> encrypted using a public key provided to you by the receiving node. This 
>> would permit efficient “zero copy” streaming for the data portion, but not 
>> require any knowledge of the recipient node’s master key(s).
>>
>> Either way, we would still want to ensure we had some authentication of the 
>> recipient node before streaming the file as it would effectively be 
>> decrypted to any node that could request this streaming action.
>>
>>
>> From: Stefan Miklosovic <stefan.mikloso...@instaclustr.com>
>> Date: Tuesday, 16 November 2021 at 10:45
>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>> Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption
>> Ok but this also means that Km would need to be the same for all nodes right?
>>
>> If we are rolling in node by node fashion, Km is changed at node 1, we
>> change the wrapped key which is stored on disk and we stream this
>> table to the other node which is still on the old Km. Would this work?
>> I think we would need to rotate first before anything is streamed. Or
>> no?
>>
>> On Tue, 16 Nov 2021 at 11:17, Bowen Song <bo...@bso.ng.invalid> wrote:
>>> Yes, that's correct. The actual key used to encrypt the SSTable will
>>> stay the same once the SSTable is created. This is a widely used
>>> practice in many encrypt-at-rest applications. One good example is the
>>> LUKS full disk encryption, which also supports multiple keys to unlock
>>> (decrypt) the same data. Multiple unlocking keys is only possible
>>> because the actual key used to encrypt the data is randomly generated
>>> and then stored encrypted by (a key derived from) a user chosen key.
>>>
>>> If this approach is adopted, the streaming process can share the Kr
>>> without disclosing the Km, therefore enableling zero-copy streaming.
>>>
>>> On 16/11/2021 08:56, Stefan Miklosovic wrote:
>>>> Hi Bowen, Very interesting idea indeed. So if I got it right, the very
>>>> key for the actual sstable encryption would be always the same, it is
>>>> just what is wrapped would differ. So if we rotate, we basically only
>>>> change Km hence KEK hence the result of wrapping but there would still
>>>> be the original Kr key used.
>>>>
>>>> Jeremiah - I will prepare that branch very soon.
>>>>
>>>> On Tue, 16 Nov 2021 at 01:09, Bowen Song <bo...@bso.ng.invalid> wrote:
>>>>>>        The second question is about key rotation. If an operator needs to
>>>>>>        roll the key because it was compromised or there is some policy 
>>>>>> around
>>>>>>        that, we should be able to provide some way to rotate it. Our 
>>>>>> idea is
>>>>>>        to write a tool (either a subcommand of nodetool (rewritesstables)
>>>>>>        command or a completely standalone one in tools) which would take 
>>>>>> the
>>>>>>        first, original key, the second, new key and dir with sstables as
>>>>>>        input and it would literally took the data and it would rewrite 
>>>>>> it to
>>>>>>        the second set of sstables which would be encrypted with the 
>>>>>> second
>>>>>>        key. What do you think about this?
>>>>>        I would rather suggest that “what key encrypted this” be part of 
>>>>> the sstable metadata, and allow there to be multiple keys in the system.  
>>>>> This way you can just add a new “current key” so new sstables use the new 
>>>>> key, but existing sstables would use the old key.  An operator could then 
>>>>> trigger a “nodetool upgradesstables —all” to rewrite the existing 
>>>>> sstables with the new “current key”.
>>>>>
>>>>> There's a much better approach to solve this issue. You can stored a
>>>>> wrapped key in an encryption info file alone side the SSTable file.
>>>>> Here's how it works:
>>>>> 1. randomly generate a key Kr
>>>>> 2. encrypt the SSTable file with the key Kr, store the encrypted SSTable
>>>>> file on disk
>>>>> 3. derive a key encryption key KEK from the SSTable file's information
>>>>> (e.g.: table UUID + generation) and the user chosen master key Km, so
>>>>> you have KEK = KDF(UUID+GEN, Km)
>>>>> 4. wrap (encrypt) the key Kr with the KEK, so you have WKr = KW(Kr, KEK)
>>>>> 5. hash the Km, the hash will used as a key ID to identify which master
>>>>> key was used to encrypt the key Kr if the server has multiple master
>>>>> keys in use
>>>>> 6. store the the WKr and the hash of Km in a separate file alone side
>>>>> the SSTable file
>>>>>
>>>>> In the read path, the Kr should be kept in memory to help improve
>>>>> performance and this will also allow zero-downtime master key rotation.
>>>>>
>>>>> During a key rotation:
>>>>> 1. derive the KEK in the same way: KEK = KDF(UUID+GEN, Km)
>>>>> 2. read the WKr from the encryption information file, and unwrap
>>>>> (decrypt) it using the KEK to get the Kr
>>>>> 3. derive a new KEK' from the new master key Km' in the same way as above
>>>>> 4. wrap (encrypt) the key Kr with KEK' to get WKr' = KW(Kr, KEK')
>>>>> 5. hash the new master key Km', and store it together with the WKr' in
>>>>> the encryption info file
>>>>>
>>>>> Since the key rotation only involves rewriting the encryption info file,
>>>>> the operation should take only a few milliseconds per SSTable file, it
>>>>> will be much faster than decrypting and then re-encrypting the SSTable 
>>>>> data.
>>>>>
>>>>>
>>>>>
>>>>> On 15/11/2021 18:42, Jeremiah D Jordan wrote:
>>>>>>> On Nov 14, 2021, at 3:53 PM, Stefan 
>>>>>>> Miklosovic<stefan.mikloso...@instaclustr.com>  wrote:
>>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> there are two points we are not completely sure about.
>>>>>>>
>>>>>>> The first one is streaming. If there is a cluster of 5 nodes, each
>>>>>>> node has its own unique encryption key. Hence, if a SSTable is stored
>>>>>>> on a disk with the key for node 1 and this is streamed to node 2 -
>>>>>>> which has a different key - it would not be able to decrypt that. Our
>>>>>>> idea is to actually send data over the wire _decrypted_ however it
>>>>>>> would be still secure if internode communication is done via TLS. Is
>>>>>>> this approach good with you?
>>>>>>>
>>>>>> So would you fail startup if someone enabled sstable encryption but did 
>>>>>> not have TLS for internode communication?  Another concern here is 
>>>>>> making sure zero copy streaming does not get triggered for this case.
>>>>>> Have you considered having some way to distribute the keys to all nodes 
>>>>>> such that you don’t need to decrypt on the sending side?  Having to do 
>>>>>> this will mean a lot more overhead for the sending side of a streaming 
>>>>>> operation.
>>>>>>
>>>>>>> The second question is about key rotation. If an operator needs to
>>>>>>> roll the key because it was compromised or there is some policy around
>>>>>>> that, we should be able to provide some way to rotate it. Our idea is
>>>>>>> to write a tool (either a subcommand of nodetool (rewritesstables)
>>>>>>> command or a completely standalone one in tools) which would take the
>>>>>>> first, original key, the second, new key and dir with sstables as
>>>>>>> input and it would literally took the data and it would rewrite it to
>>>>>>> the second set of sstables which would be encrypted with the second
>>>>>>> key. What do you think about this?
>>>>>> I would rather suggest that “what key encrypted this” be part of the 
>>>>>> sstable metadata, and allow there to be multiple keys in the system.  
>>>>>> This way you can just add a new “current key” so new sstables use the 
>>>>>> new key, but existing sstables would use the old key.  An operator could 
>>>>>> then trigger a “nodetool upgradesstables —all” to rewrite the existing 
>>>>>> sstables with the new “current key”.
>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> On Sat, 13 Nov 2021 at 19:35,<sc...@paradoxica.net>  wrote:
>>>>>>>> Same reaction here - great to have traction on this ticket. Shylaja, 
>>>>>>>> thanks for your work on this and to Stefan as well! It would be 
>>>>>>>> wonderful to have the feature complete.
>>>>>>>>
>>>>>>>> One thing I’d mention is that a lot’s changed about the project’s 
>>>>>>>> testing strategy since the original patch was written. I see that the 
>>>>>>>> 2016 version adds a couple round-trip unit tests with a small amount 
>>>>>>>> of static data. It would be good to see randomized tests fleshed out 
>>>>>>>> that exercise more of the read/write path; or which add variants of 
>>>>>>>> existing read/write path tests that enable encryption.
>>>>>>>>
>>>>>>>> – Scott
>>>>>>>>
>>>>>>>>> On Nov 13, 2021, at 7:53 AM, Brandon Williams<dri...@gmail.com>  
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> We already have a ticket and this predated CEPs, and being an
>>>>>>>>> obviously good improvement to have that many have been asking for for
>>>>>>>>> some time now, I don't see the need for a CEP here.
>>>>>>>>>
>>>>>>>>> On Sat, Nov 13, 2021 at 5:01 AM Stefan Miklosovic
>>>>>>>>> <stefan.mikloso...@instaclustr.com>  wrote:
>>>>>>>>>> Hi list,
>>>>>>>>>>
>>>>>>>>>> an engineer from Intel - Shylaja Kokoori (who is watching this list
>>>>>>>>>> closely) has retrofitted the original code from CASSANDRA-9633 work 
>>>>>>>>>> in
>>>>>>>>>> times of 3.4 to the current trunk with my help here and there, mostly
>>>>>>>>>> cosmetic.
>>>>>>>>>>
>>>>>>>>>> I would like to know if there is a general consensus about me going 
>>>>>>>>>> to
>>>>>>>>>> create a CEP for this feature or what is your perception on this. I
>>>>>>>>>> know we have it a little bit backwards here as we should first 
>>>>>>>>>> discuss
>>>>>>>>>> and then code but I am super glad that we have some POC we can
>>>>>>>>>> elaborate further on and CEP would just cement  and summarise the
>>>>>>>>>> approach / other implementation aspects of this feature.
>>>>>>>>>>
>>>>>>>>>> I think that having 9633 merged will fill quite a big operational gap
>>>>>>>>>> when it comes to security. There are a lot of enterprises who desire
>>>>>>>>>> this feature so much. I can not remember when I last saw a ticket 
>>>>>>>>>> with
>>>>>>>>>> 50 watchers which was inactive for such a long time.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
>>>>>>>>>> For additional commands, e-mail:dev-h...@cassandra.apache.org
>>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
>>>>>>>>> For additional commands, e-mail:dev-h...@cassandra.apache.org
>>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
>>>>>>>> For additional commands, e-mail:dev-h...@cassandra.apache.org
>>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
>>>>>>> For additional commands, e-mail:dev-h...@cassandra.apache.org
>>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail:dev-unsubscr...@cassandra.apache.org
>>>>>> For additional commands, e-mail:dev-h...@cassandra.apache.org
>>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

Reply via email to