Hi Robert,

I think it is best to clear up the matter of non-extractable keys first, since that is where I am confused about what is possible without asymmetric keys. The setup I am used to seeing for non-extractable keys looks like crypto's OpenSSL engine references, where Erlang says it supports only RSA and the underlying OpenSSL supports RSA or some EC variants. I think that is pretty much in line with smartcard-chip-style PKCS#11 tokens like YubiKeys. For example, I have an old Feitian ePass2003 which says it supports some symmetric key algorithms, but in practice it only supports generating keypairs in pkcs11-tool, plus some form of acceleration of shared keys.

Looking at AWS CloudHSM, OTOH, I see that it supports symmetric non-extractable keys, but also backup and restore, so you can end up with clones that are copies of one original HSM. I see how that could work with only symmetric keys, but I find it a little scary, and I think clonability is supposed to never be possible for the portable non-extractable tokens I actually use.

If a non-clonable, non-extractable device is used, I don't think it is practical to do correct key management with only non-extractable symmetric keys. All of the tokens have to be present to be encrypted to, which makes them all subject to mishap in production at the same time, and they can have no restorable backups, since a backup is indistinguishable from a future clone. Thus I arrived at an asymmetric setup for tokens, so that encryption can occur to additional offline tokens.

I would expect to need to pass an engine reference to keep access to the private key open in order to read a wrapped shard key. While theoretically one could blindly encrypt to public keys, I think there is probably a need to effectively do a sign/encrypt with the local private key to each public key. I.e., a header would need at least 2 slots in a pure asymmetric setup:

    to-keyid:local-token  encrypted-shard-KEY
    to-keyid:backups      encrypted-shard-KEY

In compaction one would encrypt the new shard key to whichever public keys one wants, but would naturally want to include at least one that belongs to the local token or private key, so as not to put the shard offline.

But as you alluded to, one doesn't really want a keypair for the node itself. One could mix one or more asymmetric keys, which solve the management problems of trusting specific non-clonable tokens, with a node-local symmetric key for rewrapping to itself. As long as there is a trustworthy offline key, the admin could be much more confident that local keys can safely be rotated out and destroyed without permanently losing access to data or backups by mistake, so keys can actually get deleted from production.
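
To make the header idea a bit more concrete, here is a rough Python sketch of the slot layout and the "keep at least one local recipient" check during compaction. Python is used only for brevity; none of this is an existing CouchDB or Erlang crypto API. The names Slot, build_header, recipients and local_ids are all hypothetical, and the wrap callables stand in for whatever engine, token or node-local key actually does the wrapping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical: a function that wraps (encrypts) a shard key for one recipient.
# In practice this would be backed by an engine reference, a token's public
# key, or a node-local symmetric key; the sketch does not care which.
WrapFn = Callable[[bytes], bytes]


@dataclass(frozen=True)
class Slot:
    key_id: str         # e.g. "local-token" or "backups"
    wrapped_key: bytes  # the shard key wrapped for that recipient


def build_header(
    shard_key: bytes,
    recipients: Dict[str, WrapFn],
    local_ids: Set[str],
) -> List[Slot]:
    """Wrap a new shard key once per recipient, e.g. at compaction time.

    Refuses to build a header with no locally usable recipient, since that
    would put the shard offline on this node.
    """
    if not set(recipients) & set(local_ids):
        raise ValueError("no local recipient: shard would go offline")
    return [Slot(key_id, wrap(shard_key)) for key_id, wrap in recipients.items()]
```

The point of the check is only that the choice of recipients is free at each compaction, as long as one of them can be unwrapped locally; the offline "backups" slot never needs its private key present.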

Being able to rotate out and destroy local keys puts a limit on the value of current production wrapper keys, similar to what frequent shard key rotation does, but applying to any access to shards in backups of the data volume. Even if there is no local HSM, there is still not much opportunity to snoop on future backups by copying the symmetric keys that are in use today.

If we are talking only about supporting "enterprise" HSMs that can be cloned to various backup clusters, etc., I'm not really sure whether they fix the problems I was concerned about or only punt to managing risk with online clones. In many cases I think they actually make it more likely that a very complete HSM configuration, holding every key ever used, gets cloned around to every host, which creates a lot of internal access for looking through all old backups and so on. I would also feel uncomfortable using an HSM that only works on one cloud without signing to an outside asymmetric key, to ensure continuity if something went wrong with the agreements to use that cloud.

So really, I view the ability to encrypt to an offline asymmetric key, which may or may not be an HSM key, as fairly important to eventually feeling comfortable using encryption in production and to tuning toward good security practice without fear of data loss.

At any rate, I hope that makes the direction I have been thinking in a little clearer.

Thanks,
Will