+1 to "In place re-encryption".

- It has a simple design.
- On a cluster under load, the regular load itself may be enough to re-encrypt the data (load-friendly).
- It is easy to throttle.
- It is easy to resume after an interruption.
- The design is compatible with the multi-key architecture.
- It can be optimized to use its own WAL buffer and to re-encrypt pages without restoring them on-heap.
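
To illustrate the "easy to throttle" and "easy to continue" points, here is a rough sketch of a background page scan. All names are hypothetical and none of them are Ignite APIs: marking a page dirty is modelled with a BitSet, the throttle is a fixed pause between batches, and the resume point is a plain page index that a real implementation would have to persist somewhere durable.

```java
import java.util.BitSet;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch of a throttled, resumable re-encryption scan (not Ignite code). */
final class ReencryptionScan {
    private final int totalPages;
    private final int batchSize;
    private final long pauseNanos;
    /** Stands in for "page marked dirty"; such pages would be written with the new key at checkpoint. */
    private final BitSet dirty;
    /** Index of the next page to process; this is the resume point. */
    private int nextPage;

    ReencryptionScan(int totalPages, int batchSize, long pauseNanos, int resumeFrom) {
        this.totalPages = totalPages;
        this.batchSize = batchSize;
        this.pauseNanos = pauseNanos;
        this.dirty = new BitSet(totalPages);
        this.nextPage = resumeFrom;
    }

    /** Processes at most maxPages pages in batches and returns the index to resume from. */
    int run(int maxPages) {
        int done = 0;
        while (nextPage < totalPages && done < maxPages) {
            // Batch size is capped by the pages left in the store and by the caller's budget.
            int n = Math.min(batchSize, Math.min(totalPages - nextPage, maxPages - done));
            for (int p = nextPage; p < nextPage + n; p++)
                dirty.set(p);
            nextPage += n;
            done += n;
            // Throttle: pause between batches so regular load is not starved.
            if (nextPage < totalPages && done < maxPages && pauseNanos > 0)
                LockSupport.parkNanos(pauseNanos);
        }
        return nextPage;
    }

    boolean finished() {
        return nextPage >= totalPages;
    }

    int dirtyCount() {
        return dirty.cardinality();
    }

    public static void main(String[] args) {
        ReencryptionScan first = new ReencryptionScan(100, 8, 1_000_000L, 0);
        int resumeFrom = first.run(30); // interrupted after 30 pages
        ReencryptionScan second = new ReencryptionScan(100, 8, 1_000_000L, resumeFrom);
        second.run(Integer.MAX_VALUE);  // continues from the saved index
        System.out.println("resumed from " + resumeFrom + ", finished: " + second.finished());
    }
}
```

Lowering batchSize or raising pauseNanos reduces the impact on regular load at the cost of a longer scan.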

On Thu, May 14, 2020 at 1:54 AM Pavel Pereslegin <xxt...@gmail.com> wrote:

> Hello Igniters.
>
> Recently, master key rotation for Apache Ignite Transparent Data
> Encryption was implemented [1], but some security standards (PCI DSS
> at least) require rotation of all encryption keys [2]. Currently,
> encryption occurs when reading/writing pages to disk, and cache
> encryption keys are stored in the metastore.
>
> I'm going to contribute cache encryption key rotation and want to
> consult on the best way to re-encrypt existing data. I see two
> different strategies.
>
> 1. In place re-encryption:
> Using the old key, sequentially read all the pages from the datastore,
> mark them as dirty and log them into the WAL. After a checkpoint, pages
> will be stored to disk encrypted with the new key (as usual, along with
> updates). This strategy requires storing the identifier (number) of the
> encryption key in the encrypted page.
> pros:
> - can work in the background with minimal performance impact (this
> impact can be managed).
> cons:
> - page duplication in the WAL may affect performance and historical
> rebalance.
>
> 2. Copy partition with re-encryption.
> This strategy is similar to partition snapshotting [3] - create a
> partition copy encrypted with the new key and then replace the
> original partition file with the new one (see details [4]).
> pros:
> - should work faster than "in place" re-encryption.
> cons:
> - re-encryption in an active cluster (and on an unstable topology) can
> be difficult to implement.
>
> (See more detailed comparison [5])
>
> Re-encryption of existing data is a long and rare procedure (it is
> recommended to change the key every 6 months, but at least once every
> 2 years). Thus, re-encryption can be implemented for maintenance mode
> (for example, on a stable topology in a read-only cluster) and in such
> a case the approach with partition copying seems simpler and faster.
>
> So, what do you think - do we need "online" re-encryption and which of
> the proposed options is best suited for this?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12186
> [2] https://www.pcisecuritystandards.org/documents/PCI_DSS_v3-2-1.pdf
> [3] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Partitionscopystrategy
> [4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
> [5] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Comparison
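
The in-place strategy quoted above depends on storing the key identifier in each encrypted page, so that pages written with the old and the new key can coexist until the scan completes. A minimal sketch of that idea follows. The class name, the page layout (1-byte key id, 12-byte IV, then ciphertext) and the choice of AES-GCM are assumptions made for illustration only; they are not taken from the Ignite code base or from the design pages linked above.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical sketch: pages tagged with the id of the key that encrypted them. */
final class KeyTaggedPages {
    private static final int IV_LEN = 12;
    private static final int TAG_BITS = 128;
    private static final SecureRandom RND = new SecureRandom();

    /** All known keys by id; old keys stay available until no page references them. */
    private final Map<Integer, SecretKey> keys = new HashMap<>();
    private int activeKeyId = -1;

    /** Generates a new key, makes it the active one and returns its id. */
    int addKeyAndActivate() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            keys.put(++activeKeyId, gen.generateKey());
            return activeKeyId;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts with the active key. Assumed layout: [key id (1 byte)][IV][ciphertext + tag]. */
    byte[] encrypt(byte[] plainPage) {
        try {
            byte[] iv = new byte[IV_LEN];
            RND.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, keys.get(activeKeyId), new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = c.doFinal(plainPage);
            return ByteBuffer.allocate(1 + IV_LEN + body.length)
                .put((byte) activeKeyId).put(iv).put(body).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the key id from the page header and decrypts with the matching key. */
    byte[] decrypt(byte[] stored) {
        SecretKey key = keys.get(keyIdOf(stored));
        if (key == null)
            throw new IllegalStateException("Unknown key id " + keyIdOf(stored));
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 1, IV_LEN));
            return c.doFinal(stored, 1 + IV_LEN, stored.length - 1 - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static int keyIdOf(byte[] stored) {
        return stored[0] & 0xFF;
    }

    /** Re-encryption of one page: decrypt with whichever key wrote it, encrypt with the active key. */
    byte[] reencrypt(byte[] stored) {
        return encrypt(decrypt(stored));
    }

    public static void main(String[] args) {
        KeyTaggedPages store = new KeyTaggedPages();
        store.addKeyAndActivate();
        byte[] oldPage = store.encrypt(new byte[] {1, 2, 3});
        store.addKeyAndActivate();
        byte[] newPage = store.reencrypt(oldPage);
        System.out.println("key id before: " + keyIdOf(oldPage) + ", after: " + keyIdOf(newPage));
    }
}
```

Because the id travels with the page, a reader never has to guess which key applies, and the old key can be dropped once no stored page carries its id.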