Hi Robert, I've taken some time to think over your PR and writeup, and have the following comments:

Benefits of the PR

I like this idea of native encryption a lot. While lower layers can offer encryption, I think there are many more situations where the lower layer has been delegated, through cloud hosting etc., and one cannot be sure it provides the expected capabilities without some unexpected caveat. I think native encryption would be very appropriate in a situation where the main system volume can be small and carefully protected but data volumes need to be cheap, large, and easy to back up.

Expunging uncertainty and manual shared key management

I like mechanisms such as regularly recycling the per-shard key, which try to limit the damage from momentary full-system read access: such access at one moment should not inherently allow snooping through old data that could have been expunged, or through all future data (after rekeys etc.). I can understand why, performance- and design-wise, the per-shard key is best wrapped and stored in the shard itself, but I find it a bit unfortunate if one wants to directly trust that data is expunged while having low trust that the data volumes are not snapshotting, accidentally sharing backups, caching raw blocks, etc. Naturally, the current PR leaves choices open for managing the wrapping keys, so that access to the production db's active keys and backups doesn't always have to mean the ability to snoop through all past history, etc., but a site trying to balance the risk of data loss against insufficient key rotation has to consider a fairly complex set of constraints manually.

Completeness of shared key design

For symmetric encryption, I feel the design for the wrapping is as complete as it could be. I struggled to think of a reason for providing multiple wrapping slots in the shard header, to give more than one wrapping key access to the current shard key, but I don't see a lot of utility in multiple active shared wrapping keys.

Multiple shared key slots might function as a safety net for the backup process: the system would always write the current key and a future key, then advance and generate a new future key once one is sure the current key is not just on a filesystem that may fail, etc. But handling that constraint could also be left to a site to design at its own risk.

Limits of shared keys

A. While hardware acceleration could in theory be used, I don't think there is ever a point in combining this with an HSM/cryptoki/etc. hardware keystore. The relevant wrapping keys are always going to be loaded into Erlang's memory in production.
B. There are manual rekeying choices to make in managing the different risks and the changing purpose of a key as it ages, which may be difficult for most sites to get right.

Benefits of adding asymmetric keys

I think an additional asymmetric keying slot in the shard, and corresponding encrypt/decrypt references, could allow a number of nicer scenarios, for example:

1. Not having a private key in production that can read backups, but encrypting everything for both the current symmetric key and a backup asymmetric key. This would allow getting down to very little access to older symmetric keys in production, and would avoid the risk of data loss through loss of shorter-lived symmetric keys by using the offline backup key as the insurance.
2. Having a handle to a hardware (HSM) key as the private key and encrypting the shard keys to the public key of this token. This removes the ability to break in and steal the wrapping keys for later use. The attacker could ask the hardware token to decrypt each current shard key, or any backups they already have access to, but can't squirrel away keys to combine with future access to backups, etc.

Asymmetric keys left as a future capability

I don't think there's any reason why the current design could not proceed, with asymmetric keys handled as a future addition.
The only things I see to consider in easing future compatibility with asymmetric keys are:

A. It would be helpful if the data format for encrypted shards were flexible enough not to need a newer encrypted shard format later.
B. Looking at the configuration, I don't think anything in the shared key configuration would need to change if an asymmetric slot were added later and asymmetric keys were configured in new sections. For example, only an asymmetric key section similar to "encryption_keys" would need support for indicating crypto:engine_key_ref() contents and similar.

Kind Regards,
Will
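
P.S. On point A above, here is a rough sketch of the kind of flexibility I mean. It is purely illustrative and is not the PR's actual header layout: the slot type tags, the field widths, and the function names are all my own assumptions, and I have written it in Python only to keep it short. The idea is that each wrapped copy of the shard key is stored as a type/length/value record, so a reader that only knows about symmetric slots can skip an asymmetric slot by its length instead of needing a new shard format.

```python
import struct

# Hypothetical slot type tags (assumptions, not taken from the PR).
SLOT_SYMMETRIC = 1   # shard key wrapped by a shared wrapping key
SLOT_ASYMMETRIC = 2  # shard key encrypted to a public key (possible future slot)

_RECORD_HEADER = ">BBH"  # slot type, key-id length, wrapped-key length (big-endian)


def encode_slots(slots):
    """Serialise (slot_type, key_id, wrapped_key) triples as length-prefixed records."""
    out = bytearray()
    for slot_type, key_id, wrapped in slots:
        kid = key_id.encode("utf-8")
        out += struct.pack(_RECORD_HEADER, slot_type, len(kid), len(wrapped))
        out += kid + wrapped
    return bytes(out)


def decode_slots(buf, known_types=(SLOT_SYMMETRIC,)):
    """Return the slots this reader understands; unknown types are skipped by length."""
    slots, pos = [], 0
    header_size = struct.calcsize(_RECORD_HEADER)
    while pos < len(buf):
        slot_type, kid_len, wrapped_len = struct.unpack_from(_RECORD_HEADER, buf, pos)
        pos += header_size
        key_id = buf[pos:pos + kid_len].decode("utf-8")
        pos += kid_len
        wrapped = buf[pos:pos + wrapped_len]
        pos += wrapped_len
        if slot_type in known_types:
            slots.append((slot_type, key_id, wrapped))
    return slots


# Placeholder bytes stand in for real wrapped keys; no cryptography happens here.
header = encode_slots([
    (SLOT_SYMMETRIC, "wrapping-key-1", b"\x01" * 40),
    (SLOT_ASYMMETRIC, "backup-key", b"\x02" * 256),
])
old_reader = decode_slots(header)  # only understands symmetric slots
new_reader = decode_slots(header, known_types=(SLOT_SYMMETRIC, SLOT_ASYMMETRIC))
```

With something along these lines, a release that predates asymmetric slots would still open the shard through its symmetric slot, while a later release could also see the backup slot. Whether the real header can carry this is of course for you to judge.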