Hi
I have tried the attached kms_v3 patch and have some comments: 1) In the comments, I think you meant hmac + iv + encrypted data instead of iv + hmac + encrypted data? ---> in kmgr_wrap_key( ): + /* + * Assemble the wrapped key. The order of the wrapped key is iv, hmac and + * encrypted data. + */ 2) I see that create_keywrap_ctx function in kmgr_utils.c and regular cipher context init will both call ossl_aes256_encrypt_init to initialise context for encryption and key wrapping. In ossl_aes256_encrypt_init, the cipher method always initialises to aes-256-cbc, which is ok for keywrap because under CBC block cipher mode, we only had to supply one unique IV as initial value. But for actual WAL and buffer encryption that will come in later, I think the discussion is to use CTR block cipher mode, which requires one unique IV for each block, and the sequence id from WAL and buffer can be used to derive unique IV for each block for better security? I think it would be better to allow caller to decide which EVP_CIPHER to initialize? EVP_aes_256_cbc(), EVP_aes_256_ctr() or others? +ossl_aes256_encrypt_init(pg_cipher_ctx *ctx, uint8 *key) +{ + if (!EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, NULL, NULL)) + return false; + if (!EVP_CIPHER_CTX_set_key_length(ctx, PG_AES256_KEY_LEN)) + return false; + if (!EVP_EncryptInit_ex(ctx, NULL, NULL, key, NULL)) + return false; + + /* + * Always enable padding. We don't need to check the return + * value as EVP_CIPHER_CTX_set_padding always returns 1. + */ + EVP_CIPHER_CTX_set_padding(ctx, 1); + + return true; +} 3) Following up point 2), I think we should enhance the enum to include not only the Encryption algorithm and key size, but also the block cipher mode as well because having all 3 pieces of information can describe exactly how KMS is performing the encryption and decryption. So when we call "ossl_aes256_encrypt_init", we can include the new enum as input parameter and it will initialise the EVP_CIPHER_CTX with either EVP_aes_256_cbc() or EVP_aes_256_ctr() for different purposes (key wrapping, or WAL encryption..etc). ---> kmgr.h +/* Value of key_management_cipher */ +enum +{ + KMGR_CIPHER_OFF = 0, + KMGR_CIPHER_AES256 +}; + so it would become +enum +{ + KMGR_CIPHER_OFF = 0, + KMGR_CIPHER_AES256_CBC = 1, + KMGR_CIPHER_AES256_CTR = 2 +}; If you agree with this change, several other places will need to be changed as well, such as "kmgr_cipher_string", "kmgr_cipher_value" and the initdb code.... 4) the pg_wrap_key and pg_unwrap_key SQL functions defined in kmgr.c I tried these new SQL functions and found that the pg_unwrap_key will produce the original key with 4 bytes less. This is because the result length is not set long enough to accommodate the 4 byte VARHDRSZ header used by the multi-type variable. the len variable in SET_VARSIZE(res, len) should include also the variable header VARHDRSZ. Now it is 4 byte short so it will produce incomplete output. ---> pg_unwrap_key function in kmgr.c + if (!kmgr_unwrap_key(UnwrapCtx, (uint8 *) VARDATA_ANY(data), datalen, + (uint8 *) VARDATA(res), &len)) + ereport(ERROR, + (errmsg("could not unwrap the given secret"))); + + /* + * The size of unwrapped key can be smaller than the size estimated + * before unwrapping since the padding is removed during unwrapping. + */ + SET_VARSIZE(res, len); + PG_RETURN_BYTEA_P(res); I am only testing their functionalities with random key as input data. It is currently not possible for a user to obtain the wrapped key from the server in order to use these wrap/unwrap functions. I personally don't think it is a good idea to expose these functions to user thank you Cary Huang ------------- HighGo Software Inc. (Canada) mailto:cary.hu...@highgo.ca http://www.highgo.ca ---- On Fri, 14 Feb 2020 08:00:45 -0800 Robert Haas <robertmh...@gmail.com> wrote ---- On Thu, Feb 6, 2020 at 9:19 PM Masahiko Sawada <mailto:masahiko.saw...@2ndquadrant.com> wrote: > This feature protects data from disk thefts but cannot protect data > from attackers who are able to access PostgreSQL server. In this > design application side still is responsible for managing the wrapped > secret in order to protect it from attackers. This is the same as when > we use pgcrypto now. The difference is that data is safe even if > attackers steal the wrapped secret and the disk. The data cannot be > decrypted either without the passphrase which can be stored to other > key management systems or without accessing postgres server. IOW for > example, attackers can get the data if they get the wrapped secret > managed by application side then run pg_kmgr_unwrap() to get the > secret and then steal the disk. If you only care about protecting against the theft of the disk, you might as well just encrypt the whole filesystem, which will probably perform better and probably be a lot harder to break since you won't have short encrypted strings but instead large encrypted blocks of data. Moreover, I think a lot of people who are interested in these kinds of features are hoping for more than just protecting against the theft of the disk. While some people may be hoping for too much in this area, setting your sights only on encryption at rest seems like a fairly low bar. It also doesn't seem very likely to actually provide any security. You're talking about sending the encryption key in the query string, which means that there's a good chance it's going to end up in a log file someplace. One way that could happen is if the user has configured log_statement=all or log_min_duration_statement, but it could also happen any time the query throws an error. In theory, you might arrange for the log messages to be sent to another server that is protected by separate layers of security, but a lot of people are going to just log locally. And, even if you do have a separate server, do you really want to have the logfile over there be full of passwords? I know I can be awfully negative some times, but that it seems like a weakness so serious as to make this whole thing effectively useless. One way to plug this hole is to use new protocol messages for key exchanges. For example, suppose that after authentication is complete, you can send the server a new protocol message: KeyPassphrase <key-name> <passphrase>. The server stores the passphrase in backend-private memory and returns ReadyForQuery, and does not log the message payload anywhere. Now you do this: INSERT INTO tbl VALUES (pg_encrypt('user data', 'key-name'); SELECT pg_decrypt(secret_column, 'key-name') FROM tbl; If the passphrase for the named key has not been loaded into the current session's memory, this produces an error; otherwise, it looks up the passphrase and uses it to do the decryption. Now the passphrase never gets logged anywhere, and, also, the user can't persuade the server to provide it with the encryption key, because there's no SQL-level function to access that data. We could take it a step further: suppose that encryption is a column property, and the value of the property is a key name. If the user hasn't sent a KeyPassphrase message with the relevant key name, attempts to access that column just error out. If they have, then the server just does the encryption and decryption automatically. Now the user can just do: INSERT INTO tbl VALUES ('user data'); SELECT secret_column FROM tbl; It's a huge benefit if the SQL doesn't need to be changed. All that an application needs to do in order to use encryption in this scenario is use PQsetKeyPassphrase() or whatever before doing whatever else they want to do. Even with these changes, the security of this whole approach can be criticized on the basis that a good amount of information about the data can be inferred without decrypting anything. You can tell which encrypted values are long and which are short. If someone builds an index on the column, you can tell the order of all the encrypted values even though you may not know what the actual values are. Those could well be meaningful information leaks, but I think such a system might still be of use for certain purposes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company