Re: [DISCUSS] Best way to re-encrypt existing data (TDE cache key rotation).

Nikolay Izhikov Mon, 25 May 2020 01:08:54 -0700

Hello, Alexei.

I think we want to implement this feature without restarting nodes.
In the ideal scenario, all nodes stay alive and keep responding to user
requests.


> On 24 May 2020, at 15:24, Alexei Scherbakov <[email protected]> wrote:
> 
> Pavel Pereslegin,
> 
> I see another opportunity.
> We can use rebalancing to re-encrypt node data with a new key.
> The procedure seems trivial to me: stop a node, clear its database, change
> the key, start the node, and wait for rebalancing to complete.
> Data will be re-encrypted during rebalancing.
> 
> Did I miss something?
> 
> On Fri, 22 May 2020 at 16:14, Ivan Rakov <[email protected]> wrote:
> 
>> Folks,
>> 
>> Just keeping you informed: my colleagues and I are highly interested in
>> TDE in general and key rotation specifically, but we haven't had enough
>> time so far.
>> We'll dive into this feature and participate in reviews next month.
>> 
>> --
>> Best Regards,
>> Ivan Rakov
>> 
>> On Sun, May 17, 2020 at 10:51 PM Pavel Pereslegin <[email protected]>
>> wrote:
>> 
>>> Hello, Alexey.
>>> 
>>>> is the encryption key for the data the same on all nodes in the cluster?
>>> Yes, each encrypted cache group has its own encryption key, the key is
>>> the same on all nodes.
>>> 
>>>> Clearly, during the re-encryption there will exist pages
>>>> encrypted with both new and old keys at the same time.
>>> Yes, there will be pages encrypted with different keys at the same time.
>>> Currently, we store only one key per cache group. To rotate a key,
>>> we need to support several keys for some period of time (at least
>>> for reading the WAL).
>>> For the "in place" strategy, we'll store the encryption key identifier
>>> on each encrypted page (we currently have some unused space on an
>>> encrypted page, so I don't expect any memory overhead here). Thus, we
>>> will have several keys for reading and one key for writing. I assume
>>> that the old key will be automatically deleted when a specific WAL
>>> segment is deleted (and re-encryption is finished).
>>> 
>>>> Will a node continue to re-encrypt the data after it restarts?
>>> Yes.
>>> 
>>>> If a node goes down during the re-encryption, but the rest of the
>>>> cluster finishes re-encryption, will we consider the procedure
>>>> complete?
>>> I'm not sure, but it looks like the key rotation is complete when we
>>> set the new key on all nodes so that the updates will be encrypted
>>> with the new key (as required by PCI DSS).
>>> The status of re-encryption can be obtained separately (locally or
>>> cluster-wide).
>>> 
>>> I forgot to mention that with "in place" re-encryption it will be
>>> impossible to quickly cancel re-encryption, because canceling means
>>> re-encrypting the data again with the old key.
>>> 
>>>> How do you see the whole key rotation procedure will work?
>>> The initial design for re-encryption with "partition copying" is
>>> described here [1]. I'll prepare a detailed design for "in place"
>>> re-encryption if we go this way. In short: the new encryption key is
>>> sent cluster-wide, and each node adds the new key and starts background
>>> re-encryption.
>>> 
>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
>>> 
>>> On Sun, 17 May 2020 at 18:35, Alexey Goncharuk <[email protected]> wrote:
>>>> 
>>>> Pavel, Anton,
>>>> 
>>>> How do you see the whole key rotation procedure will work? Clearly,
>>>> during the re-encryption there will exist pages encrypted with both
>>>> new and old keys at the same time. Will a node continue to re-encrypt
>>>> the data after it restarts? If a node goes down during the
>>>> re-encryption, but the rest of the cluster finishes re-encryption,
>>>> will we consider the procedure complete? By the way, is the
>>>> encryption key for the data the same on all nodes in the cluster?
>>>> 
>>>> On Thu, 14 May 2020 at 11:30, Anton Vinogradov <[email protected]> wrote:
>>>> 
>>>>> +1 to "In place re-encryption".
>>>>> 
>>>>> - It has a simple design.
>>>>> - On a cluster under load, the regular update load alone may be
>>>>> enough to re-encrypt the data (load-friendly).
>>>>> - Easy to throttle.
>>>>> - Easy to continue.
>>>>> - The design is compatible with the multi-key architecture.
>>>>> - It can be optimized to use its own WAL buffer and to re-encrypt
>>>>> pages without restoring them on-heap.
>>>>> 
>>>>> On Thu, May 14, 2020 at 1:54 AM Pavel Pereslegin <[email protected]> wrote:
>>>>> 
>>>>>> Hello Igniters.
>>>>>> 
>>>>>> Recently, master key rotation for Apache Ignite Transparent Data
>>>>>> Encryption was implemented [1], but some security standards (PCI DSS
>>>>>> at least) require rotation of all encryption keys [2]. Currently,
>>>>>> encryption occurs when pages are read from and written to disk, and
>>>>>> cache encryption keys are stored in the metastore.
>>>>>> 
>>>>>> I'm going to contribute cache encryption key rotation and want to
>>>>>> ask what the best way to re-encrypt existing data is. I see two
>>>>>> different strategies.
>>>>>> 
>>>>>> 1. In place re-encryption:
>>>>>> Using the old key, sequentially read all the pages from the
>>>>>> datastore, mark them as dirty and log them into the WAL. After a
>>>>>> checkpoint, the pages will be stored to disk encrypted with the new
>>>>>> key (as usual, along with updates). This strategy requires storing
>>>>>> the identifier (number) of the encryption key in the encrypted page.
>>>>>> pros:
>>>>>> - can work in the background with minimal performance impact (this
>>>>>> impact can be managed).
>>>>>> cons:
>>>>>> - page duplication in the WAL may affect performance and historical
>>>>>> rebalance.
>>>>>> 
>>>>>> 2. Copy partition with re-encryption:
>>>>>> This strategy is similar to partition snapshotting [3]: create a
>>>>>> partition copy encrypted with the new key and then replace the
>>>>>> original partition file with the new one (see details [4]).
>>>>>> pros:
>>>>>> - should work faster than "in place" re-encryption.
>>>>>> cons:
>>>>>> - re-encryption in an active cluster (and on an unstable topology)
>>>>>> can be difficult to implement.
>>>>>> 
>>>>>> (See more detailed comparison [5])
>>>>>> 
>>>>>> Re-encryption of existing data is a long and rare procedure (it is
>>>>>> recommended to change the key every 6 months, but at least once
>>>>>> every 2 years). Thus, re-encryption can be implemented for
>>>>>> maintenance mode (for example, on a stable topology in a read-only
>>>>>> cluster), and in that case the approach with partition copying seems
>>>>>> simpler and faster.
>>>>>> 
>>>>>> So, what do you think: do we need "online" re-encryption, and which
>>>>>> of the proposed options is best suited for this?
>>>>>> 
>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-12186
>>>>>> [2] https://www.pcisecuritystandards.org/documents/PCI_DSS_v3-2-1.pdf
>>>>>> [3] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Partitionscopystrategy
>>>>>> [4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
>>>>>> [5] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Comparison
>>>>>> 
>>>>> 
>>> 
>> 
> 
> 
> -- 
> 
> Best regards,
> Alexei Scherbakov
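
For illustration only, the "several keys for reading and one key for
writing" idea from Pavel's reply quoted above could be sketched roughly as
follows. This is a toy model in Python, not Ignite code: the names
CacheGroupStore, Page, toy_cipher, reencrypt_step and drop_unused_keys are
all hypothetical, the XOR keystream is a stand-in and not real encryption,
and pages are rewritten directly instead of going through the WAL and a
checkpoint as the thread describes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream (symmetric, NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class Page:
    key_id: int     # identifier of the key that encrypted this page
    payload: bytes  # encrypted page body


@dataclass
class CacheGroupStore:
    """Hypothetical per-cache-group store: many read keys, one write key."""
    keys: Dict[int, bytes]
    active_key_id: int
    pages: List[Page] = field(default_factory=list)

    def append(self, plaintext: bytes) -> int:
        # Writes always use the single active key.
        key = self.keys[self.active_key_id]
        self.pages.append(Page(self.active_key_id, toy_cipher(key, plaintext)))
        return len(self.pages) - 1

    def read(self, index: int) -> bytes:
        # Reads pick the key by the identifier stored on the page.
        page = self.pages[index]
        return toy_cipher(self.keys[page.key_id], page.payload)

    def rotate(self, new_key_id: int, new_key: bytes) -> None:
        # Add the new key and make it the write key; old keys stay readable.
        if new_key_id in self.keys:
            raise ValueError("key id already in use")
        self.keys[new_key_id] = new_key
        self.active_key_id = new_key_id

    def reencrypt_step(self, batch: int) -> int:
        # Background step: re-encrypt up to `batch` pages still on an old key.
        done = 0
        for index, page in enumerate(self.pages):
            if done >= batch:
                break
            if page.key_id != self.active_key_id:
                plaintext = self.read(index)
                new_key = self.keys[self.active_key_id]
                self.pages[index] = Page(
                    self.active_key_id, toy_cipher(new_key, plaintext)
                )
                done += 1
        return done

    def drop_unused_keys(self, keys_needed_by_wal: Set[int]) -> None:
        # Assumption: the caller knows which key ids the WAL still needs.
        in_use = {p.key_id for p in self.pages}
        in_use |= keys_needed_by_wal | {self.active_key_id}
        for key_id in list(self.keys):
            if key_id not in in_use:
                del self.keys[key_id]
```

Calling reencrypt_step with a small batch size on a timer is one simple way
to throttle the background work; an old key can only be dropped once no
page and no WAL segment still refers to it.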
