Re: [DISCUSS] Best way to re-encrypt existing data (TDE cache key rotation).

Alexei Scherbakov Mon, 25 May 2020 01:46:31 -0700

For me, the one big disadvantage of offline re-encryption is the
possibility of running out of WAL history.
If re-encryption takes a long time, we will get full rebalancing with
partition eviction.
This will take us back to re-encryption using full rebalancing, which I
proposed earlier.
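
To make the concern concrete, here is a minimal sketch of the decision I have in mind. It is a simplification under stated assumptions, not Ignite's actual rebalancing code: the class, the method and both counter parameters are hypothetical names used only for illustration.

```java
/** Hypothetical illustration: which rebalance mode a returning node would get. */
class RebalanceChoice {
    enum Mode { HISTORICAL, FULL }

    /**
     * @param demanderCounter last update counter applied by the node that was offline (assumed).
     * @param oldestCounterInWal oldest update counter still retained in the supplier's WAL (assumed).
     */
    static Mode choose(long demanderCounter, long oldestCounterInWal) {
        // The missed updates start at demanderCounter + 1. If the supplier's WAL
        // no longer reaches that far back, only a full rebalance is possible.
        return oldestCounterInWal <= demanderCounter + 1 ? Mode.HISTORICAL : Mode.FULL;
    }
}
```

The longer the node stays offline for re-encryption, the further oldestCounterInWal moves ahead of demanderCounter, until the result flips to FULL.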




Mon, 25 May 2020 at 11:27, Nikolay Izhikov <[email protected]>:

> > And definitely this approach is much simpler to implement
>
> I agree.
>
> If we allow nodes to be taken offline for re-encryption, then we can
> implement a fully offline procedure:
>
> 1. Stop the node.
> 2. Execute some control.sh command that will re-encrypt all data without
> starting the node.
> 3. Start the node.
>
> Pavel, could you please restate the disadvantages of the offline
> procedure?
>
> > On 25 May 2020, at 11:20, Alexei Scherbakov <[email protected]> wrote:
> >
> > And definitely this approach is much simpler to implement because all
> > corner cases are handled by rebalancing code.
> >
> > Mon, 25 May 2020 at 11:16, Alexei Scherbakov <[email protected]>:
> >
> >> I mean: serving supply requests.
> >>
> >> Mon, 25 May 2020 at 11:15, Alexei Scherbakov <[email protected]>:
> >>
> >>> Nikolay,
> >>>
> >>> Can you explain why such a restriction is necessary?
> >>> Most likely, having a node that is currently re-encrypting serve only
> >>> demand requests will have the least performance impact on the grid.
> >>>
> >>> Mon, 25 May 2020 at 11:08, Nikolay Izhikov <[email protected]>:
> >>>
> >>>> Hello, Alexei.
> >>>>
> >>>> I think we want to implement this feature without node restarts.
> >>>> In the ideal scenario, all nodes will stay alive and respond to
> >>>> user requests.
> >>>>
> >>>>> On 24 May 2020, at 15:24, Alexei Scherbakov <[email protected]> wrote:
> >>>>>
> >>>>> Pavel Pereslegin,
> >>>>>
> >>>>> I see another opportunity.
> >>>>> We can use rebalancing to re-encrypt node data with a new key.
> >>>>> It's a trivial procedure for me: stop a node, clear its database,
> >>>>> change the key, start the node and wait for rebalancing to complete.
> >>>>> Data will be re-encrypted during rebalancing.
> >>>>>
> >>>>> Did I miss something?
> >>>>>
> >>>>> Fri, 22 May 2020 at 16:14, Ivan Rakov <[email protected]>:
> >>>>>
> >>>>>> Folks,
> >>>>>>
> >>>>>> Just keeping you informed: my colleagues and I are highly interested
> >>>>>> in TDE in general and key rotation specifically, but we haven't had
> >>>>>> enough time so far.
> >>>>>> We'll dive into this feature and participate in reviews next month.
> >>>>>>
> >>>>>> --
> >>>>>> Best Regards,
> >>>>>> Ivan Rakov
> >>>>>>
> >>>>>> On Sun, May 17, 2020 at 10:51 PM Pavel Pereslegin <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hello, Alexey.
> >>>>>>>
> >>>>>>>> is the encryption key for the data the same on all nodes in the
> >>>>>>>> cluster?
> >>>>>>> Yes, each encrypted cache group has its own encryption key, and the
> >>>>>>> key is the same on all nodes.
> >>>>>>>
> >>>>>>>> Clearly, during the re-encryption there will exist pages
> >>>>>>>> encrypted with both new and old keys at the same time.
> >>>>>>> Yes, there will be pages encrypted with different keys at the same
> >>>>>>> time.
> >>>>>>> Currently, we only store one key per cache group. To rotate a key,
> >>>>>>> we need to support several keys for some period of time (at least
> >>>>>>> for reading the WAL).
> >>>>>>> For the "in place" strategy, we'll store the encryption key
> >>>>>>> identifier on each encrypted page (we currently have some unused
> >>>>>>> space on an encrypted page, so I don't expect any memory overhead
> >>>>>>> here). Thus, we will have several keys for reading and one key for
> >>>>>>> writing. I assume that the old key will be automatically deleted
> >>>>>>> when a specific WAL segment is deleted (and re-encryption is
> >>>>>>> finished).
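
A minimal sketch of the "several keys for reading, one key for writing" idea described above. Everything in it is an assumption made for illustration: the class name, the page layout with a one-byte key identifier in front, and the use of AES/GCM from the standard JDK. It is not the actual Ignite page format or encryption SPI.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical per-cache-group key ring: many keys for reading, one for writing. */
class GroupKeyRing {
    private static final int IV_LEN = 12;
    private static final int TAG_BITS = 128;

    private final Map<Byte, SecretKey> keys = new HashMap<>();
    private final SecureRandom rnd = new SecureRandom();
    private byte activeId;

    /** Adds a key and makes it the only key used for writing; older keys stay readable. */
    void rotate(byte id, byte[] rawAesKey) {
        keys.put(id, new SecretKeySpec(rawAesKey, "AES"));
        activeId = id;
    }

    /** Drops an old key, e.g. once re-encryption is finished and no WAL segment needs it. */
    void remove(byte id) {
        if (id == activeId)
            throw new IllegalArgumentException("Cannot remove the active key");
        keys.remove(id);
    }

    /** Assumed layout: [key id: 1 byte][IV: 12 bytes][ciphertext + GCM tag]. */
    byte[] encryptPage(byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            rnd.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, keys.get(activeId), new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = c.doFinal(plain);
            return ByteBuffer.allocate(1 + IV_LEN + body.length)
                .put(activeId).put(iv).put(body).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Picks the decryption key by the identifier stored on the page itself. */
    byte[] decryptPage(byte[] page) {
        SecretKey key = keys.get(page[0]);
        if (key == null)
            throw new IllegalStateException("Unknown key id: " + page[0]);
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, page, 1, IV_LEN));
            return c.doFinal(page, 1 + IV_LEN, page.length - 1 - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the key identifier stored on an encrypted page. */
    static byte keyId(byte[] page) {
        return page[0];
    }
}
```

After rotate() is called with a new key, pages written earlier remain readable through their stored identifier, while every new write carries the new identifier.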
> >>>>>>>
> >>>>>>>> Will a node continue to re-encrypt the data after it restarts?
> >>>>>>> Yes.
> >>>>>>>
> >>>>>>>> If a node goes down during the re-encryption, but the rest of the
> >>>>>>>> cluster finishes re-encryption, will we consider the procedure
> >>>>>>>> complete?
> >>>>>>> I'm not sure, but it looks like the key rotation is complete when
> >>>>>>> we set the new key on all nodes, so that updates will be encrypted
> >>>>>>> with the new key (as required by PCI DSS).
> >>>>>>> The status of re-encryption can be obtained separately (locally or
> >>>>>>> cluster-wide).
> >>>>>>>
> >>>>>>> I forgot to mention that with “in place” re-encryption it will be
> >>>>>>> impossible to quickly cancel re-encryption, because canceling
> >>>>>>> means re-encrypting the data again with the old key.
> >>>>>>>
> >>>>>>>> How do you see the whole key rotation procedure will work?
> >>>>>>> The initial design for re-encryption with "partition copying" is
> >>>>>>> described here [1]. I'll prepare a detailed design for "in place"
> >>>>>>> re-encryption if we go this way. In short: send the new encryption
> >>>>>>> key cluster-wide; each node adds the new key and starts background
> >>>>>>> re-encryption.
> >>>>>>>
> >>>>>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
> >>>>>>>
> >>>>>>> Sun, 17 May 2020 at 18:35, Alexey Goncharuk <[email protected]>:
> >>>>>>>>
> >>>>>>>> Pavel, Anton,
> >>>>>>>>
> >>>>>>>> How do you see the whole key rotation procedure will work? Clearly,
> >>>>>>>> during the re-encryption there will exist pages encrypted with both
> >>>>>>>> new and old keys at the same time. Will a node continue to
> >>>>>>>> re-encrypt the data after it restarts? If a node goes down during
> >>>>>>>> the re-encryption, but the rest of the cluster finishes
> >>>>>>>> re-encryption, will we consider the procedure complete? By the way,
> >>>>>>>> is the encryption key for the data the same on all nodes in the
> >>>>>>>> cluster?
> >>>>>>>>
> >>>>>>>> Thu, 14 May 2020 at 11:30, Anton Vinogradov <[email protected]>:
> >>>>>>>>
> >>>>>>>>> +1 to "In place re-encryption".
> >>>>>>>>>
> >>>>>>>>> - It has a simple design.
> >>>>>>>>> - Clusters under load may need nothing more than that load to
> >>>>>>>>>   re-encrypt the data (load-friendly).
> >>>>>>>>> - Easy to throttle.
> >>>>>>>>> - Easy to continue.
> >>>>>>>>> - The design is compatible with the multi-key architecture.
> >>>>>>>>> - It can be optimized to use its own WAL buffer and to re-encrypt
> >>>>>>>>>   pages without restoring them on-heap.
> >>>>>>>>>
> >>>>>>>>> On Thu, May 14, 2020 at 1:54 AM Pavel Pereslegin <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello Igniters.
> >>>>>>>>>>
> >>>>>>>>>> Recently, master key rotation for Apache Ignite Transparent Data
> >>>>>>>>>> Encryption was implemented [1], but some security standards (PCI
> >>>>>>>>>> DSS at least) require rotation of all encryption keys [2].
> >>>>>>>>>> Currently, encryption occurs when reading/writing pages to disk,
> >>>>>>>>>> and cache encryption keys are stored in the metastore.
> >>>>>>>>>>
> >>>>>>>>>> I'm going to contribute cache encryption key rotation and want to
> >>>>>>>>>> discuss the best way to re-encrypt existing data. I see two
> >>>>>>>>>> different strategies.
> >>>>>>>>>>
> >>>>>>>>>> 1. In place re-encryption:
> >>>>>>>>>> Using the old key, sequentially read all the pages from the
> >>>>>>>>>> datastore, mark them as dirty and log them into the WAL. After a
> >>>>>>>>>> checkpoint, the pages will be stored to disk encrypted with the
> >>>>>>>>>> new key (as usual, along with updates). This strategy requires
> >>>>>>>>>> storing the identifier (number) of the encryption key in the
> >>>>>>>>>> encrypted page.
> >>>>>>>>>> pros:
> >>>>>>>>>> - can work in the background with minimal performance impact
> >>>>>>>>>>   (this impact can be managed).
> >>>>>>>>>> cons:
> >>>>>>>>>> - page duplication in the WAL may affect performance and
> >>>>>>>>>>   historical rebalance.
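
The background pass for strategy 1 could look roughly like the toy model below. It is only a sketch under stated assumptions: there is no real encryption, WAL or checkpointing here, the class and method names are hypothetical, and skipping pages that already carry the new key identifier is an assumed optimization rather than part of the proposal.

```java
/** Toy model of a throttled, resumable in-place re-encryption pass. */
class InPlaceReencryption {
    /** Key identifier stored on each page (hypothetical layout). */
    private final int[] pageKeyIds;
    private final int newKeyId;
    /** Next page to scan; persisting this value would let a restarted node continue. */
    private int cursor;

    InPlaceReencryption(int[] pageKeyIds, int newKeyId) {
        this.pageKeyIds = pageKeyIds.clone();
        this.newKeyId = newKeyId;
    }

    /** A regular user update: the page is written with the new key at the next checkpoint. */
    void onUserUpdate(int pageIdx) {
        pageKeyIds[pageIdx] = newKeyId;
    }

    /**
     * Scans at most batchSize pages; the batch size is the throttling knob.
     * Returns how many pages had to be marked dirty.
     */
    int scanBatch(int batchSize) {
        int dirtied = 0;
        int end = Math.min(cursor + batchSize, pageKeyIds.length);
        for (; cursor < end; cursor++) {
            if (pageKeyIds[cursor] != newKeyId) {
                // Stands in for: read with old key, mark dirty, log to WAL,
                // let the checkpoint write the page with the new key.
                pageKeyIds[cursor] = newKeyId;
                dirtied++;
            }
        }
        return dirtied;
    }

    boolean finished() {
        return cursor == pageKeyIds.length;
    }
}
```

Pages already rewritten by regular updates cost nothing extra in this model, and every page the pass does touch adds one more page record to the WAL, which is the source of the "con" listed above.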
> >>>>>>>>>>
> >>>>>>>>>> 2. Copy partition with re-encryption.
> >>>>>>>>>> This strategy is similar to partition snapshotting [3] - create
> >>>>>>>>>> partition copy encrypted with the new key and then replace the
> >>>>>>>>>> original partition file with the new one (see details [4]).
> >>>>>>>>>> pros:
> >>>>>>>>>> - should work faster than "in place" re-encryption.
> >>>>>>>>>> cons:
> >>>>>>>>>> - re-encryption in an active cluster (and on an unstable topology)
> >>>>>>>>>>   can be difficult to implement.
> >>>>>>>>>>
> >>>>>>>>>> (See more detailed comparison [5])
> >>>>>>>>>>
> >>>>>>>>>> Re-encryption of existing data is a long and rare procedure (it is
> >>>>>>>>>> recommended to change the key every 6 months, but at least once
> >>>>>>>>>> every 2 years). Thus, re-encryption can be implemented for
> >>>>>>>>>> maintenance mode (for example, on a stable topology in a read-only
> >>>>>>>>>> cluster), and in such a case the approach with partition copying
> >>>>>>>>>> seems simpler and faster.
> >>>>>>>>>>
> >>>>>>>>>> So, what do you think: do we need "online" re-encryption, and
> >>>>>>>>>> which of the proposed options is best suited for this?
> >>>>>>>>>>
> >>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-12186
> >>>>>>>>>> [2] https://www.pcisecuritystandards.org/documents/PCI_DSS_v3-2-1.pdf
> >>>>>>>>>> [3] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Partitionscopystrategy
> >>>>>>>>>> [4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
> >>>>>>>>>> [5] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Comparison
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Best regards,
> >>>>> Alexei Scherbakov
> >>>>
> >>>>
> >>>
> >>> --
> >>>
> >>> Best regards,
> >>> Alexei Scherbakov
> >>>
> >>
> >>
> >> --
> >>
> >> Best regards,
> >> Alexei Scherbakov
> >>
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
>
>

-- 

Best regards,
Alexei Scherbakov
