Re: [HACKERS] WIP: Data at rest encryption
On Mon, Jun 19, 2017 at 12:30 PM, Robert Haas wrote: > On Thu, Jun 15, 2017 at 7:51 PM, Alvaro Herrera > wrote: >> I thought we called it "incremental development". From the opposite >> point of view, would you say we should ban use of passphrase-protected >> SSL key files because the current user interface for them is bad? > > I think that we've got a number of features which exist in the tree > today only because either (a) our standards were lower at the time > that those features were committed than they are today or (b) we > didn't realize how much trouble those features were going to create. > Just because we don't want to hose the people already using those > features does not mean that we want more features engineered to that > same quality level. Obviously, there's room for debate in any > particular case about how reasonable it is to expect someone who wants > to implement A to also improve B, and, well, maybe handling thing as > we do SSL certificates is good enough for this feature, too. I find > myself a bit skeptical about that, though. It preclude as lot of > things we might want to do. You're not going to be able to interface > with some external key management server that way, nor do encryption > of only part of the database, nor have multiple keys for different > parts of the database using that kind of setup. > > One could argue that can all be added later, but I think there's a > question about how much that's going to affect the code structure. > Surely we don't want to install encryption v1 and then find that, by > not considering key management, we've made it really hard to get to > v2, and that it basically can't be done without ripping the whole > implementation out and replacing it. 
Maybe the database needs, at > some rather low level, a concept of whether the encryption key (or an > encryption key) is available yet, and maybe you get out of considering > that by deciding you're just going to prompt very early in startup, > but now when someone comes along and wants to improve things later, > and they've got to try to go find all of the code that depends on the > assumption that the key is always available and fix it. That could be > pretty annoying to validate. I think it's better to give at least > some consideration to these key management questions from the > beginning, lest we back ourselves into a corner. Whether or not the > SSL-passphrase style implementation is above or below the level we'd > consider a minimally viable feature, it's surely not where we want to > end up, and we shouldn't do anything that makes it likely that we'll > get stuck at precisely that point. > > Also, practically, I think that type of solution is going to be > extremely difficult to use for most people. It means that the > database can't be started by the system startup scripts; you'll have > to log into the PG OS user account and launch the database from there. > IIUC, that means it won't be able to be controlled by things like > systemd, that just know about start and stop, but not about ask for a > password in the middle. Maybe I'm wrong about that, though. And > certainly, there will be some users for whom starting the database > manually and prompting for a password will be good enough, if not > ideal. But for people who want to fetch the encryption key from a key > management server, which I bet is a lot of people, that's not really > going to be good enough. I'm not really sure that rushing a first > patch that "works" for sufficiently small values of "works" is > actually going ...to move us forward very much. 
>> I have no use for data-at-rest encryption myself, but I wouldn't stop >> development just because the initial design proposal doesn't include >> top-notch key management. I agree with that, but there's a difference between "not top-notch" and "pretty bad". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 7:51 PM, Alvaro Herrera wrote:
> I thought we called it "incremental development".  From the opposite
> point of view, would you say we should ban use of passphrase-protected
> SSL key files because the current user interface for them is bad?

I think that we've got a number of features which exist in the tree
today only because either (a) our standards were lower at the time
that those features were committed than they are today or (b) we
didn't realize how much trouble those features were going to create.
Just because we don't want to hose the people already using those
features does not mean that we want more features engineered to that
same quality level.  Obviously, there's room for debate in any
particular case about how reasonable it is to expect someone who wants
to implement A to also improve B, and, well, maybe handling things as
we do SSL certificates is good enough for this feature, too.  I find
myself a bit skeptical about that, though.  It precludes a lot of
things we might want to do.  You're not going to be able to interface
with some external key management server that way, nor do encryption
of only part of the database, nor have multiple keys for different
parts of the database using that kind of setup.

One could argue that can all be added later, but I think there's a
question about how much that's going to affect the code structure.
Surely we don't want to install encryption v1 and then find that, by
not considering key management, we've made it really hard to get to
v2, and that it basically can't be done without ripping the whole
implementation out and replacing it.  Maybe the database needs, at
some rather low level, a concept of whether the encryption key (or an
encryption key) is available yet, and maybe you get out of considering
that by deciding you're just going to prompt very early in startup,
but now when someone comes along and wants to improve things later,
they've got to try to go find all of the code that depends on the
assumption that the key is always available and fix it.  That could be
pretty annoying to validate.  I think it's better to give at least
some consideration to these key management questions from the
beginning, lest we back ourselves into a corner.  Whether or not the
SSL-passphrase style implementation is above or below the level we'd
consider a minimally viable feature, it's surely not where we want to
end up, and we shouldn't do anything that makes it likely that we'll
get stuck at precisely that point.

Also, practically, I think that type of solution is going to be
extremely difficult to use for most people.  It means that the
database can't be started by the system startup scripts; you'll have
to log into the PG OS user account and launch the database from there.
IIUC, that means it won't be able to be controlled by things like
systemd, which just know about start and stop, but not about asking
for a password in the middle.  Maybe I'm wrong about that, though.  And
certainly, there will be some users for whom starting the database
manually and prompting for a password will be good enough, if not
ideal.  But for people who want to fetch the encryption key from a key
management server, which I bet is a lot of people, that's not really
going to be good enough.  I'm not really sure that rushing a first
patch that "works" for sufficiently small values of "works" is
actually going to move us forward very much.

> I have no use for data-at-rest encryption myself, but I wouldn't stop
> development just because the initial design proposal doesn't include
> top-notch key management.

I agree with that, but there's a difference between "not top-notch"
and "pretty bad".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
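
[Editorial sketch] Robert's point about giving the database "a concept of whether the encryption key (or an encryption key) is available yet" could look roughly like the following. This is only an illustration in Python, not anything from the patch under discussion: the names KeyRing, KeyNotAvailable, register, unlock and get are all hypothetical, and a real implementation would live in the backend's C code.

```python
class KeyNotAvailable(Exception):
    """Raised when encrypted data is touched before its key was loaded."""


class KeyRing:
    """Tracks which named keys have been loaded so far (hypothetical API)."""

    def __init__(self):
        self._loaders = {}  # key name -> zero-argument callable returning bytes
        self._keys = {}     # key name -> key bytes, filled in by unlock()

    def register(self, name, loader):
        """Declare a key and how to fetch it, without fetching it yet."""
        self._loaders[name] = loader

    def is_available(self, name):
        """Report whether the named key has been loaded."""
        return name in self._keys

    def unlock(self, name):
        """Fetch the key now (prompt, command, key server, ...)."""
        self._keys[name] = self._loaders[name]()

    def get(self, name):
        """Return the key, or fail loudly instead of assuming it exists."""
        if name not in self._keys:
            raise KeyNotAvailable(name)
        return self._keys[name]


if __name__ == "__main__":
    ring = KeyRing()
    # One key per part of the database; the loader here is a stand-in.
    ring.register("tablespace_a", lambda: b"k" * 32)
    print(ring.is_available("tablespace_a"))
    ring.unlock("tablespace_a")
    print(ring.is_available("tablespace_a"))
```

Because callers go through get() rather than a global that is assumed to be set, code that runs before the key arrives fails in one identifiable place, which is the validation concern raised above. Registering several names is one way to leave room for multiple keys later.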
Re: [HACKERS] WIP: Data at rest encryption
Hi all,

I've noticed this thread got resurrected a few days ago, but I haven't
managed to read all the messages until today. I do have a bunch of
comments, but let me share them as a single consistent message instead
of sending a thousand responses to individual messages.

1) Threat model
---

Firstly, I think the thread would seriously benefit from an explanation
and discussion of the threat model - what types of attacks it's meant to
address, and what attacks it can't defend against.

My understanding is that data-at-rest encryption generally addresses
only the "someone stole the disk" case and pretty much nothing else.

Moreover, I claim that encryption implemented at the database level is
strictly weaker compared to LUKS or encrypted disks, because it simply
reveals a lot more information even without decryption (file sizes,
timestamps, etc.)

That is a serious issue in practice, and researchers have been
demonstrating that for a long time now. I do recommend this paper from
Cornell Tech as a great starting point (it cites many papers relevant to
this thread):

    Why Your Encrypted Database Is Not Secure
    Paul Grubbs, Thomas Ristenpart, Vitaly Shmatikov
    http://eprint.iacr.org/2017/468.pdf

The paper explains how encryption schemes on general-purpose databases
fail, due to exactly such side-channels. MVCC, logging and other side
channels turn all attackers into "persistent passive attackers".

Now, this does not mean the feature is useless - nothing is perfect, and
security is not a binary feature. It certainly makes attacks more
difficult compared to a plaintext database. But it's untrue that it's
basically LUKS, just implemented at the database level.

I'm not suggesting that we should not pursue this idea, but the threat
model is a crucial piece of information missing in this thread.

2) How do other databases do it?

It was repeatedly mentioned that other databases support this type of
encryption. So how do they deal with the hard parts? For example, how do
they get and protect the encryption key?

I agree with Stephen that we should not require full key management
from v1 of the patch, that's an incredibly difficult thing. And it
largely depends on the hardware (e.g. it should be possible to move the
key to TrustZone on ARM / SGX on Intel).

3) Why do users prefer this to FDE?
---

I suppose we're discussing this feature because we've been asked about
it by users/customers who can't use FDE. Can we reach out to them and
ask them about the reasons? Why can't they use FDE?

It was mentioned in the thread that the FDE solutions are not portable
between different systems, but honestly - is that an issue? You probably
can't copy the datadir anyway due to locale differences. If you're
running multiple operating systems, FDE is just one of many differences.

4) Other solutions?
---

Clearly, FDE (at the block device level) and DB-level encryption are not
the only solutions. There are also filesystem-level solutions now, for
example.

ext4 (since kernel 4.1) and f2fs (since kernel 4.2) allow encryption at
directory level, are transparent to the user space, and include things
like key management (well, that's provided by kernel). NTFS can do
something quite similar using EFS.

    https://lwn.net/Articles/639427/
    https://blog.quarkslab.com/a-glimpse-of-ext4-filesystem-level-encryption.html

Of course, if you happen to use another filesystem (e.g. XFS), this
won't work for you. But then there's eCryptfs, for example:

    https://en.wikipedia.org/wiki/ECryptfs

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
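
[Editorial sketch] The claim in the threat-model section that database-level encryption still reveals file sizes without any decryption can be illustrated with a few lines of Python. Everything here is an assumption made for the example: the helper names are hypothetical, random bytes stand in for encrypted 8 kB pages, and the file names are invented.

```python
import os
import tempfile


def write_fake_relation(directory, name, n_pages, page_size=8192):
    """Write n_pages of random bytes, standing in for encrypted pages."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        for _ in range(n_pages):
            f.write(os.urandom(page_size))
    return path


def visible_metadata(directory):
    """Return {filename: size_in_bytes} using only stat(); no key is needed."""
    return {
        entry.name: entry.stat().st_size
        for entry in os.scandir(directory)
        if entry.is_file()
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as datadir:
        write_fake_relation(datadir, "rel_small", 1)
        write_fake_relation(datadir, "rel_large", 16)
        # An attacker holding only the disk still learns relative table sizes.
        print(visible_metadata(datadir))
```

With block-device encryption such as LUKS, the directory listing itself is unreadable without the key, which is the difference the message is pointing at.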
Re: [HACKERS] WIP: Data at rest encryption
On Fri, Jun 16, 2017 at 11:06:39AM +0300, Konstantin Knizhnik wrote:
> Encryption is much easier to implement than compression, because it is not
> changing page size. So I do not see any "complexity and flexibility
> challenges" here.
> Just for reference I attached to this mail our own encryption patch. I do

I didn't see you using CPU AES instructions, which can improve
performance by 3-10x.  Is there a reason?

> Postgres buffer manager interface significantly simplifies integration of
> encryption and compression. There is actually single path through which data
> is fetched/stored to the disk.
> It is most obvious and natural solution to decompress/decrypt data when it
> is read from the disk to page pool and compress/encrypt it when it is
> written back. Taking into account that memory is cheap now and many databases
> can completely fit in memory, storing pages in the buffer cache in plain
> (decompressed/decrypted) format allows to minimize overhead of
> compression/encryption and its influence on performance. For read only
> queries working with cached data performance will be exactly the same as
> without encryption/compression.
> Write speed for encrypted pages will be certainly slightly worse, but still
> encryption speed is much higher than disk IO speed.

Good point.

> I do not think that pluggable storage API is right approach to integrate
> compression and especially encryption. It is better to plugin encryption
> between buffer manager and storage device,
> allowing to use it with any storage implementation. Also it is not clear to
> me whether encryption of WAL can be provided using pluggable storage API.

Yes, you are completely correct.  I withdraw my suggestion of doing it
as plugin storage.

> The last discussed question is whether it is necessary to encrypt temporary
> data (BufFile). In our solution we encrypt only main fork of non-system
> relations and do not encrypt temporary relations. It may cause that some
> secret data will be stored at this disk in non-encrypted format. But
> accessing this data is not trivial. You can not just copy/steal disk, open
> database and do "select * from SecreteTable": you will have to extract data
> from raw file yourself. So looks like it is better to allow user to make
> choice whether to encrypt temporary data or not.

If we go forward with in-db encryption, I think we are going to have to
have a discussion about what parts of PGDATA need to be encrypted,
e.g., I don't think pg_clog needs encryption.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
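
[Editorial sketch] The buffer-manager argument quoted above (decrypt when a page is read from disk into the page pool, encrypt when it is written back, keep cached pages in plain form) can be sketched as follows. This is not the attached patch. The class EncryptedPageStore and its methods are hypothetical, and the "cipher" is a toy SHA-256 keystream used only so the example runs with the standard library; it is not AES and must not be used to protect real data.

```python
import hashlib

PAGE_SIZE = 8192


def _keystream(key, page_no, length):
    """Toy keystream (SHA-256 in counter mode). Illustrative only, not vetted."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(
            key + page_no.to_bytes(8, "big") + counter.to_bytes(8, "big")
        ).digest())
        counter += 1
    return bytes(out[:length])


def _xor(data, stream):
    """XOR two equal-length byte strings; output length equals input length."""
    return bytes(a ^ b for a, b in zip(data, stream))


class EncryptedPageStore:
    """Single read path and single write path, as in the quoted argument."""

    def __init__(self, key):
        self.key = key
        self.disk = {}     # page_no -> ciphertext (stands in for the data file)
        self.buffers = {}  # page_no -> plaintext (stands in for the page pool)

    def write_page(self, page_no, plaintext):
        if len(plaintext) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        self.buffers[page_no] = plaintext
        # Reusing one keystream per page number is insecure; a real design
        # needs a proper cipher mode and nonce handling.
        self.disk[page_no] = _xor(
            plaintext, _keystream(self.key, page_no, PAGE_SIZE))

    def read_page(self, page_no):
        if page_no in self.buffers:  # cache hit: no crypto work at all
            return self.buffers[page_no]
        plaintext = _xor(
            self.disk[page_no], _keystream(self.key, page_no, PAGE_SIZE))
        self.buffers[page_no] = plaintext
        return plaintext
```

The sketch shows two properties claimed in the message: the page size is unchanged by encryption, and reads served from the cache pay no encryption cost.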
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 08:08:05PM -0400, Bruce Momjian wrote:
> On Thu, Jun 15, 2017 at 04:56:36PM -0700, Andres Freund wrote:
> > how few concerns about this feature's complexity / maintainability
> > impact have been raised.
>
> Yeah, I guess we will just have to wait to see it since other people are
> excited about it.  My concern is code complexity and usability
> challenges, vs punting the problem to the operating system, though
> admittedly there are some cases where that is not possible.

I know some OS's can create file systems inside files.  Can you encrypt
such file storage as non-root?  I assume that is just too odd.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
On 16.06.2017 03:08, Bruce Momjian wrote:
> Yeah, I guess we will just have to wait to see it since other people are
> excited about it.  My concern is code complexity and usability
> challenges, vs punting the problem to the operating system, though
> admittedly there are some cases where that is not possible.

Let me also share my opinion about encryption and compression support at
the database level. PostgresPro Enterprise does support both. I gave a
presentation about it at PgConf 2016 in Tallinn. I was a little
surprised that there were more questions about encryption than about
compression. But right now we have several customers who are using
compression and none of them use encryption (just because they do not
need to protect their databases). But I am absolutely sure that there
are many Postgres users who first of all need to protect their data.

Encryption is much easier to implement than compression, because it does
not change the page size. So I do not see any "complexity and
flexibility challenges" here. Just for reference, I have attached our
own encryption patch to this mail. I do not want to propose it as an
alternative to Aasma's patch: it is less flexible and doesn't support
encryption of WAL, just encryption of relation data. Also it doesn't
allow custom encryption libraries: the AES implementation is embedded.
The encryption key is taken from an environment variable. At the Tallinn
conference I was informed about a possible security issue with passing
the key through an environment variable: it is possible to inspect the
server's environment variables using a plpython/plperl stored procedure.
This is why we unset this environment variable after reading it. I am
not an expert in security, but I do not know of other issues with such a
solution.

Concerning the question of whether to implement compression/encryption
at the database level or rely on the OS, my opinion is that there are
many scenarios where it is not possible or not desirable to use OS-level
encryption/protection. These first of all include cloud installations
and embedded applications. I do not want to repeat arguments already
mentioned in this thread. But the fact is that there are many people who
really need compression/encryption support and who cannot or do not want
to delegate these aspects to the OS. Almost all DBMSes support
compression and encryption, so the lack of these features in Postgres
definitely cannot be considered a Postgres advantage.

The Postgres buffer manager interface significantly simplifies
integration of encryption and compression. There is actually a single
path through which data is fetched from and stored to the disk. The most
obvious and natural solution is to decompress/decrypt data when it is
read from the disk into the page pool and to compress/encrypt it when it
is written back. Taking into account that memory is cheap now and many
databases can completely fit in memory, storing pages in the buffer
cache in plain (decompressed/decrypted) format minimizes the overhead of
compression/encryption and its influence on performance. For read-only
queries working with cached data, performance will be exactly the same
as without encryption/compression. Write speed for encrypted pages will
certainly be slightly worse, but encryption speed is still much higher
than disk IO speed.

So I do not think that it is really necessary to support encryption of
only some particular tables, storing "non-secret" data in plain format
without encryption. It should not bring a noticeable improvement in
performance, but may complicate the implementation and increase the
possibility of leaking secure data.

I do not think that the pluggable storage API is the right approach to
integrate compression and especially encryption. It is better to plug in
encryption between the buffer manager and the storage device, allowing
it to be used with any storage implementation. Also it is not clear to
me whether encryption of WAL can be provided using the pluggable storage
API.
The last discussed question is whether it is necessary to encrypt
temporary data (BufFile). In our solution we encrypt only the main fork
of non-system relations and do not encrypt temporary relations. This may
mean that some secret data is stored on disk in non-encrypted format.
But accessing this data is not trivial. You cannot just copy/steal the
disk, open the database and do "select * from SecreteTable": you will
have to extract the data from the raw file yourself. So it looks like it
is better to let the user choose whether to encrypt temporary data or
not.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/storage/file/Makefile b/src/backend/storage/file/Makefile
index d2198f2..9492662 100644
--- a/src/backend/storage/file/Makefile
+++ b/src/backend/storage/file/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/storage/file
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = fd.o buffile.o copydir.o reinit.o
+OBJS
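
[Editorial sketch] The "read the key from an environment variable, then unset it" approach described in this message can be sketched as below. The variable name PG_ENCRYPTION_KEY, the hex encoding and the function name are assumptions made for the example; the message does not say what the patch actually uses.

```python
import os


def take_key_from_env(var_name="PG_ENCRYPTION_KEY"):
    """Read a hex-encoded key from the environment, then remove the variable.

    The variable name and the hex encoding are hypothetical.
    """
    hex_key = os.environ.pop(var_name, None)  # pop() also unsets it
    if hex_key is None:
        raise RuntimeError(f"{var_name} is not set")
    return bytes.fromhex(hex_key)


if __name__ == "__main__":
    os.environ["PG_ENCRYPTION_KEY"] = "00112233445566778899aabbccddeeff"
    key = take_key_from_env()
    # Later code in this process, including stored procedures that inspect
    # os.environ, no longer sees the variable.
    print(len(key), "PG_ENCRYPTION_KEY" in os.environ)
```

Unsetting addresses the stored-procedure inspection mentioned in the message. It may not hide the value from every observer: on Linux, /proc/<pid>/environ reflects the environment as it was when the process was executed, not later changes.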
Re: [HACKERS] WIP: Data at rest encryption
> Yeah, I guess we will just have to wait to see it since other people are > excited about it. My concern is code complexity and usability > challenges, vs punting the problem to the operating system, though > admittedly there are some cases where that is not possible. Oracle sells this feature only with the expensive enterprise edition. And people actually buy it. I guess the feature is pretty important for some users. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 04:56:36PM -0700, Andres Freund wrote: > On 2017-06-15 19:44:43 -0400, Bruce Momjian wrote: > > Understood, but now you are promoting a feature with an admittedly-poor > > API, duplication of an OS feature, and perhaps an invasive change to the > > code. > > *Perhaps* an invasive change to the code? To me it's pretty evident > that this'll be a pretty costly feature from that angle. We've quite a > few places that manipulate on-disk files, and they'll all have to be > manipulated. Several of those are essentially critical sections, adding > memory allocations to them wouldn't be good, so we'll need > pre-allocation APIs. > > I've only skimmed the discussion, but based on that I'm very surprised > how few concerns about this feature's complexity / maintainability > impact have been raised. Yeah, I guess we will just have to wait to see it since other people are excited about it. My concern is code complexity and usability challenges, vs punting the problem to the operating system, though admittedly there are some cases where that is not possible. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On 2017-06-15 19:44:43 -0400, Bruce Momjian wrote:
> Understood, but now you are promoting a feature with an admittedly-poor
> API, duplication of an OS feature, and perhaps an invasive change to the
> code.

*Perhaps* an invasive change to the code?  To me it's pretty evident
that this'll be a pretty costly feature from that angle.  We've quite a
few places that manipulate on-disk files, and they'll all have to be
manipulated.  Several of those are essentially critical sections, adding
memory allocations to them wouldn't be good, so we'll need
pre-allocation APIs.

I've only skimmed the discussion, but based on that I'm very surprised
how few concerns about this feature's complexity / maintainability
impact have been raised.

- Andres
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 07:51:36PM -0400, Alvaro Herrera wrote: > Bruce Momjian wrote: > > On Thu, Jun 15, 2017 at 07:27:55PM -0400, Stephen Frost wrote: > > > I expect the same would happen with the shell-command approach suggested > > > up-thread and the prompt-on-stdin approach too, they aren't great but I > > > expect users would still use the feature. As Robert and I have > > > mentioned, there is a good bit of value to having this feature simply > > > because it avoids the need to get someone with root privileges to set up > > > an encrypted volume and I don't think having to use a shell command or > > > providing the password on stdin at startup really changes that very > > > much. > > > > Understood, but now you are promoting a feature with an admittedly-poor > > API, duplication of an OS feature, and perhaps an invasive change to the > > code. Those are high hurdles. > > I thought we called it "incremental development". From the opposite > point of view, would you say we should ban use of passphrase-protected > SSL key files because the current user interface for them is bad? > > I have no use for data-at-rest encryption myself, but I wouldn't stop > development just because the initial design proposal doesn't include > top-notch key management. Yes, but we have to have a plan on how to improve it. Why add a feature that is hard to maintain, and hard to use. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Thu, Jun 15, 2017 at 07:27:55PM -0400, Stephen Frost wrote:
> > I expect the same would happen with the shell-command approach suggested
> > up-thread and the prompt-on-stdin approach too, they aren't great but I
> > expect users would still use the feature.  As Robert and I have
> > mentioned, there is a good bit of value to having this feature simply
> > because it avoids the need to get someone with root privileges to set up
> > an encrypted volume and I don't think having to use a shell command or
> > providing the password on stdin at startup really changes that very
> > much.
>
> Understood, but now you are promoting a feature with an admittedly-poor
> API, duplication of an OS feature, and perhaps an invasive change to the
> code.  Those are high hurdles.

Of those, the only one that worries me, at least, is that it might be an
invasive and difficult to maintain code change.  As Robert said, and I
agree with, "duplication of an OS feature" is something we pretty
routinely, and justifiably, do.  The poor interface is unfortunate, but
if it's consistent with what we have today for a similar feature then
I'm really not too upset with it.  If we can do better, great, I'm all
for that, but if not, then I'd rather have the feature with the poor
interface than not have it at all.

If it's an invasive code change or one which ends up being difficult to
maintain, then that's a problem.  Getting some focus on that aspect
would be great and I certainly appreciate Robert's initial review and
commentary on it.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
Bruce Momjian wrote:
> On Thu, Jun 15, 2017 at 07:27:55PM -0400, Stephen Frost wrote:
> > I expect the same would happen with the shell-command approach suggested
> > up-thread and the prompt-on-stdin approach too, they aren't great but I
> > expect users would still use the feature.  As Robert and I have
> > mentioned, there is a good bit of value to having this feature simply
> > because it avoids the need to get someone with root privileges to set up
> > an encrypted volume and I don't think having to use a shell command or
> > providing the password on stdin at startup really changes that very
> > much.
>
> Understood, but now you are promoting a feature with an admittedly-poor
> API, duplication of an OS feature, and perhaps an invasive change to the
> code.  Those are high hurdles.

I thought we called it "incremental development".  From the opposite
point of view, would you say we should ban use of passphrase-protected
SSL key files because the current user interface for them is bad?

I have no use for data-at-rest encryption myself, but I wouldn't stop
development just because the initial design proposal doesn't include
top-notch key management.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 07:27:55PM -0400, Stephen Frost wrote: > I expect the same would happen with the shell-command approach suggested > up-thread and the prompt-on-stdin approach too, they aren't great but I > expect users would still use the feature. As Robert and I have > mentioned, there is a good bit of value to having this feature simply > because it avoids the need to get someone with root privileges to set up > an encrypted volume and I don't think having to use a shell command or > providing the password on stdin at startup really changes that very > much. Understood, but now you are promoting a feature with an admittedly-poor API, duplication of an OS feature, and perhaps an invasive change to the code. Those are high hurdles. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Thu, Jun 15, 2017 at 06:41:08PM -0400, Stephen Frost wrote: > > > > > One serious difference between in-database-encryption and SSH keys is > > > > > that the use of passwords for SSH is well understood and reasonable to > > > > > use, while I think we all admit that use of passwords for database > > > > > objects like SSL keys is murky. Use of keys for OS-level encryption > > > > > is > > > > > a little better handled, but not as clean as SSH keys. > > > > > > > > Peter pointed out upthread that our handling of SSL passphrases leaves > > > > a lot to be desired, and that maybe we should fix that problem first; > > > > I agree. But I don't think this is any kind of intrinsic limitation > > > > of PostgreSQL vs. encrypted filesystems vs. SSH; it's just a > > > > quality-of-implementation issue. > > > > I'm not thrilled with asking Ants to implement a solution to SSL > > passphrases, and generalizing it to work for this, to get this feature > > accepted. I assume that the reason for asking for that work to be done > > now is because we decided that the current approach for SSL sucks but we > > couldn't actually drop support for it, but we don't want to add other > > features which work in a similar way because, well, it sucks. > > My point is that if our support for db-level encryption is as bad as SSL > key passwords, then it will be nearly useless, so we might as well not > have it. Isn't that obvious? Well, no, because the reason we even have an approach at all for SSL key passwords is because multiple people (myself and Magnus, at least, as I recall) complained as we are aware of installations which are actively using that approach. That doesn't mean it's a great solution or that it doesn't suck- really, I tend to agree that it does, but it's necessary because we need a solution, it works, and users are using it. 
Having a better solution would be great and something agent-based might
be the right answer (or perhaps something where we have support for
using hardware accelerators through an existing library...).

I expect the same would happen with the shell-command approach suggested
up-thread and the prompt-on-stdin approach too, they aren't great but I
expect users would still use the feature.  As Robert and I have
mentioned, there is a good bit of value to having this feature simply
because it avoids the need to get someone with root privileges to set up
an encrypted volume and I don't think having to use a shell command or
providing the password on stdin at startup really changes that very
much.

Thanks!

Stephen
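
[Editorial sketch] The two key-acquisition options mentioned here, a shell command and a prompt on stdin, could be combined roughly as follows. This is only a sketch of the idea as discussed in the thread: the function names and the fallback order are assumptions, not an interface proposed in any patch.

```python
import getpass
import subprocess
import sys


def key_from_command(argv, timeout=10):
    """Run an external command and use its stdout as the passphrase."""
    result = subprocess.run(
        argv, capture_output=True, check=True, timeout=timeout)
    return result.stdout.rstrip(b"\r\n")  # drop the trailing newline only


def obtain_passphrase(command=None):
    """Prefer a configured command; otherwise prompt if a terminal exists."""
    if command:
        return key_from_command(command)
    if sys.stdin.isatty():
        return getpass.getpass("Encryption passphrase: ").encode()
    # Started by a service manager: nobody is there to answer a prompt.
    raise RuntimeError("no key command configured and no terminal to prompt on")


if __name__ == "__main__":
    # Stand-in for a real key-fetching command (hypothetical).
    print(obtain_passphrase([sys.executable, "-c", "print('s3cret')"]))
```

The command variant is the one that can run unattended under a service manager, which is the usability concern raised earlier in the thread about prompting at startup.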
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 06:41:08PM -0400, Stephen Frost wrote: > > > > One serious difference between in-database-encryption and SSH keys is > > > > that the use of passwords for SSH is well understood and reasonable to > > > > use, while I think we all admit that use of passwords for database > > > > objects like SSL keys is murky. Use of keys for OS-level encryption is > > > > a little better handled, but not as clean as SSH keys. > > > > > > Peter pointed out upthread that our handling of SSL passphrases leaves > > > a lot to be desired, and that maybe we should fix that problem first; > > > I agree. But I don't think this is any kind of intrinsic limitation > > > of PostgreSQL vs. encrypted filesystems vs. SSH; it's just a > > > quality-of-implementation issue. > > I'm not thrilled with asking Ants to implement a solution to SSL > passphrases, and generalizing it to work for this, to get this feature > accepted. I assume that the reason for asking for that work to be done > now is because we decided that the current approach for SSL sucks but we > couldn't actually drop support for it, but we don't want to add other > features which work in a similar way because, well, it sucks. My point is that if our support for db-level encryption is as bad as SSL key passwords, then it will be nearly useless, so we might as well not have it. Isn't that obvious? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Thu, Jun 15, 2017 at 05:04:17PM -0400, Robert Haas wrote: > > > Also, there is the sense that security requires > > > trust of the root user, while using Postgres doesn't require the root > > > user to also use Postgres. > > > > I don't understand this. It is certainly true that you're running > > binaries owned by root, the root user could Trojan the binaries and > > break any security you think you have. But that problem is no better > > or worse for PostgreSQL than anything else. > > I couldn't find a cleaner way to see it --- it is that database use > doesn't involve the root user using it, while database security requires > the root user to also be security-conscious. I tend to agree with Robert here (in general). Further, there are certainly environments where the administrator with root access is absolutely security conscious, but that doesn't mean that the DBA has easy access to the root account or to folks who have that level of access and it's much easier for the DBA if they're able to address all of their requirements by building PG themselves, installing it, and, ideally, encrypting the DB. > > > One serious difference between in-database-encryption and SSH keys is > > > that the use of passwords for SSH is well understood and reasonable to > > > use, while I think we all admit that use of passwords for database > > > objects like SSL keys is murky. Use of keys for OS-level encryption is > > > a little better handled, but not as clean as SSH keys. > > > > Peter pointed out upthread that our handling of SSL passphrases leaves > > a lot to be desired, and that maybe we should fix that problem first; > > I agree. But I don't think this is any kind of intrinsic limitation > > of PostgreSQL vs. encrypted filesystems vs. SSH; it's just a > > quality-of-implementation issue. 
I'm not thrilled with asking Ants to implement a solution to SSL passphrases, and generalizing it to work for this, to get this feature accepted. I assume that the reason for asking for that work to be done now is because we decided that the current approach for SSL sucks but we couldn't actually drop support for it, but we don't want to add other features which work in a similar way because, well, it sucks. I get that. I'm not thrilled with it, but I get it. I'm hopeful it ends up not being too bad, but if it ends up meaning we don't get this feature, then I'll reconsider my position about agreeing that it's an acceptable requirement. > I think there are environmental issues that make password use on SSH > easier than the other cases --- it isn't just code quality. However, it > would be good to research how SSH handles it to see if we can get any > ideas. Actually, the approach SSH uses is a really good one, imv, and one which we might be able to leverage. I'm not sure though. I will say that, in general, I like the idea of leveraging the external libraries which handle keys and deal with encryption to make this happen as those allow things like hardware devices to hold the key and possibly perform the encryption/decryption, etc. Thanks! Stephen
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 05:04:17PM -0400, Robert Haas wrote: > > Also, there is the sense that security requires > > trust of the root user, while using Postgres doesn't require the root > > user to also use Postgres. > > I don't understand this. It is certainly true that you're running > binaries owned by root, the root user could Trojan the binaries and > break any security you think you have. But that problem is no better > or worse for PostgreSQL than anything else. I couldn't find a cleaner way to see it --- it is that database use doesn't involve the root user using it, while database security requires the root user to also be security-conscious. > > One serious difference between in-database-encryption and SSH keys is > > that the use of passwords for SSH is well understood and reasonable to > > use, while I think we all admit that use of passwords for database > > objects like SSL keys is murky. Use of keys for OS-level encryption is > > a little better handled, but not as clean as SSH keys. > > Peter pointed out upthread that our handling of SSL passphrases leaves > a lot to be desired, and that maybe we should fix that problem first; > I agree. But I don't think this is any kind of intrinsic limitation > of PostgreSQL vs. encrypted filesystems vs. SSH; it's just a > quality-of-implementation issue. I think there are environmental issues that make password use on SSH easier than the other cases --- it isn't just code quality. However, it would be good to research how SSH handles it to see if we can get any ideas. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 4:29 PM, Bruce Momjian wrote: > I think the big win for having OS features in the database is > selectivity --- the ability to selectively apply a feature to part of > the database. This is what you are doing by putting a password on your > SSH key, and my idea about row encryption. I agree. I think we will eventually want to be able to apply encryption selectively, although I don't think we need to have that feature in a first patch. One problem is that if you don't encrypt the WAL, there's not much point in encrypting the table data, so it becomes tricky to encrypt some things and not others. However, I am sure we can eventually solve those problems, given enough time and development effort. > It is also a question of convenience. If SSH told users they have to > create an encrypted volume to store their SSH keys with a password, it > would be silly, since the files are so small compared to a file system. > I think the assumption is that any security-concerned deployment of > Postgres will already have Postgres on its own partition and have the > root administrator involved. I think it is this assumption that drives > the idea that requiring root to run Postgres doesn't make sense, but it > does to do encryption. I don't think that's a particularly good assumption, though, especially with the proliferation of virtual and containerized environments where access to root privileges tends to be more circumscribed. Also, there are a lot of small databases out there that you might want to be able to encrypt without encrypting everything on the filesystem. For example, there are products that embed PostgreSQL. If a particular PostgreSQL configuration requires root access, then using that configuration means that installing the product which contains it also requires root access. Installing the product means changing /etc/fstab, and uninstalling it means reversing those changes. That's very awkward.
I agree that if you've got a terabyte of sensitive data, you probably want to encrypt the filesystem and involve the DBA, but there are still people who have a gigabyte of sensitive data. For those people, a separate filesystem likely doesn't make sense, but they may still want encryption. > Also, there is the sense that security requires > trust of the root user, while using Postgres doesn't require the root > user to also use Postgres. I don't understand this. It is certainly true that you're running binaries owned by root, the root user could Trojan the binaries and break any security you think you have. But that problem is no better or worse for PostgreSQL than anything else. > One serious difference between in-database-encryption and SSH keys is > that the use of passwords for SSH is well understood and reasonable to > use, while I think we all admit that use of passwords for database > objects like SSL keys is murky. Use of keys for OS-level encryption is > a little better handled, but not as clean as SSH keys. Peter pointed out upthread that our handling of SSL passphrases leaves a lot to be desired, and that maybe we should fix that problem first; I agree. But I don't think this is any kind of intrinsic limitation of PostgreSQL vs. encrypted filesystems vs. SSH; it's just a quality-of-implementation issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 03:09:32PM -0400, Robert Haas wrote: > To be honest, I find the hostility toward this feature a bit baffling. > The argument seems to be essentially that we shouldn't have this > feature because we'd have to maintain the code and many of the same > goals could be accomplished by using facilities that already exist > outside the database server. But that's also true for parallel query > (cf. Stado), logical replication (cf. Slony, Bucardo, Londiste), > physical replication (cf. DRBD), partitioning (cf. pg_partman), RLS > (cf. veil), and anything that could be written as application logic > (eg. psql's \if ... \endif, every procedural language we have, > user-defined functions themselves, database-enforced constraints, > FDWs). Yet, in every one of those cases, we find it worthwhile to > have the feature because it works better and is easier to use when > it's built in. I don't think that a patch for this feature is likely > to be bigger than (or even as large as) the patches for logical > replication or parallel query, and it will probably be less work to > maintain going forward than either. I think the big win for having OS features in the database is selectivity --- the ability to selectively apply a feature to part of the database. This is what you are doing by putting a password on your SSH key, and my idea about row encryption. It is also a question of convenience. If SSH told users they have to create an encrypted volume to store their SSH keys with a password, it would be silly, since the files are so small compared to a file system. I think the assumption is that any security-concerned deployment of Postgres will already have Postgres on its own partition and have the root administrator involved. I think it is this assumption that drives the idea that requiring root to run Postgres doesn't make sense, but it does to do encryption. 
Also, there is the sense that security requires trust of the root user, while using Postgres doesn't require the root user to also use Postgres. One serious difference between in-database-encryption and SSH keys is that the use of passwords for SSH is well understood and reasonable to use, while I think we all admit that use of passwords for database objects like SSL keys is murky. Use of keys for OS-level encryption is a little better handled, but not as clean as SSH keys. I admit there is no hard line here, so I guess we will have to see what the final patch looks like. I am basing my statements on what I guess the complexity will be. Complexity has a cost so we will have to weigh it when we see it. When SSH added password access, it was probably an easy decision because the use-case was high and the complexity was low. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Thu, Jun 15, 2017 at 12:06 PM, Peter Eisentraut wrote: > Making this work well would be a major part of the usability story that > this is being sold on. If the proposed solution is that you can cobble > together a few bits of shell, then not only is that not very > user-friendly, it also won't work consistently across platforms, won't > work under systemd (launchd? Windows service?), and might behave > awkwardly under restricted environments where there is no terminal or > only a limited OS environment. Moreover, it leaves the security aspects > of that part of the solution (keys lingering in memory or in swap) up to > the user. > > There was a discussion a while ago about how to handle passphrase entry > for SSL keys. The conclusion was that it works pretty crappily right > now, and several suggestions for improvement were discussed. I suggest > that fixing that properly and with flexibility could also yield a > solution for encryption key entry. That sounds like a good idea to me. However, I'd like to disagree with the idea that key management is the primary way in which this feature would improve usability. To me, the big advantage is that you don't need to be root (also, we can have more consistency in behavior across operating systems). I disagree vigorously with the idea that anyone who wants to encrypt their PostgreSQL database should just get root privileges on the system and use an encrypted filesystem. In some environments, "just get root privileges" is not something that is very easy to do; but even if you *have* root privileges, you don't necessarily want to have to use them just to install and configure your database. Right now, I can compile and install PostgreSQL from my own user account, run initdb, and start it up. Surely every PostgreSQL developer in the world - and a good number of users - would agree that if I suddenly needed to run initdb as root, that would be a huge usability regression.
You wouldn't even be able to run 'make check' without root privileges, which would suck. In the same way, when we removed (for most users) the need to tune System V shared memory parameters (b0fc0df9364d2d2d17c0162cf3b8b59f6cb09f67), a lot of users were very happy precisely because it eliminated a setup step that formerly had to be done as root. It is very reasonable to suppose that users who need encryption will similarly be happy if they no longer need to be root to get an encrypted PostgreSQL working. Sure, that's only a subset of users rather than all of them, but it's the same kind of issue. Also, I don't think we should be presenting filesystem encryption and built-in encryption as if they were two methods of solving the exact same problem. In some scenarios, they are in fact solving the same problem. However, it's entirely reasonable to want to use both. For example, the hard disk in my laptop is encrypted, because that's a thing Apple does. If somebody steals the hard disk out of my laptop, they may have some difficulty recovering the contents. However, as Stephen also mentioned, that has not deterred me from putting a passphrase on my SSH keys. There are situations in which the passphrase provides protection that the whole-drive encryption won't. For example, if I copy my home directory onto a USB stick and then copy it from there to a new laptop, somebody might steal the USB stick. If they manage to do that, they will get most of my files, but they won't get my SSH keys, or at least not without guessing my passphrase. Similarly, if I foolishly walk away from my laptop in the presence of some nefarious person (say, Magnus) without locking it, that person can't steal my keys. That person might be able to impersonate me for as long as I'm away from the laptop, if the keys are loaded into my SSH agent, but not afterwards. The issues for the database are similar. You might want one of these things or the other or both depending on the exact situation. 
To be honest, I find the hostility toward this feature a bit baffling. The argument seems to be essentially that we shouldn't have this feature because we'd have to maintain the code and many of the same goals could be accomplished by using facilities that already exist outside the database server. But that's also true for parallel query (cf. Stado), logical replication (cf. Slony, Bucardo, Londiste), physical replication (cf. DRBD), partitioning (cf. pg_partman), RLS (cf. veil), and anything that could be written as application logic (eg. psql's \if ... \endif, every procedural language we have, user-defined functions themselves, database-enforced constraints, FDWs). Yet, in every one of those cases, we find it worthwhile to have the feature because it works better and is easier to use when it's built in. I don't think that a patch for this feature is likely to be bigger than (or even as large as) the patches for logical replication or parallel query, and it will probably be less work to maintain going forward than either.
Re: [HACKERS] WIP: Data at rest encryption
On 6/14/17 17:41, Stephen Frost wrote: >> Relying on environment variables is clearly pretty crappy. So if that's >> the proposal, then I think it needs to be better. > I don't believe that was ever intended to be the final solution, I was > just pointing out that it's what the WIP patch did. > > The discussion had moved into having a command called which provided the > key on stdout, as I recall, allowing it to be whatever the user wished, > including binary of any kind. > > If you have other suggestions, I'm sure they would be well received. As > to the question of complexity, it certainly looks like it'll probably be > quite straight-forward for users to use. I think the passphrase entry part of the problem is actually a bit harder than it appears. Making this work well would be a major part of the usability story that this is being sold on. If the proposed solution is that you can cobble together a few bits of shell, then not only is that not very user-friendly, it also won't work consistently across platforms, won't work under systemd (launchd? Windows service?), and might behave awkwardly under restricted environments where there is no terminal or only a limited OS environment. Moreover, it leaves the security aspects of that part of the solution (keys lingering in memory or in swap) up to the user. There was a discussion a while ago about how to handle passphrase entry for SSL keys. The conclusion was that it works pretty crappily right now, and several suggestions for improvement were discussed. I suggest that fixing that properly and with flexibility could also yield a solution for encryption key entry. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 5:41 PM, Stephen Frost wrote: > I don't believe that was ever intended to be the final solution, I was > just pointing out that it's what the WIP patch did. > > The discussion had moved into having a command called which provided the > key on stdout, as I recall, allowing it to be whatever the user wished, > including binary of any kind. > > If you have other suggestions, I'm sure they would be well received. As > to the question of complexity, it certainly looks like it'll probably be > quite straight-forward for users to use. To me, this reads a bit like you're still trying to shut down the discussion here. Perhaps I am misreading it. Upthread, you basically said that we shouldn't talk about key management (specifically, you said, "Key management is an entirely independent discussion from this") which I think is a ridiculous statement. We have to have some kind of simple key management in order to have the feature at all. It does not have to be crazy complicated, but it has to exist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Data at rest encryption
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 6/13/17 18:11, Stephen Frost wrote: > >> Let's see a proposal in those terms then. How easy can you make it, > >> compared to existing OS-level solutions, and will that justify the > >> maintenance overhead? > > From the original post on this thread, which included a WIP patch: > > > > -- > > Usage > > = > > > > Set up database like so: > > > > (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; > > export PGENCRYPTIONKEY > > initdb -k -K pgcrypto $PGDATA ) > > > > Start PostgreSQL: > > > > (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; > > export PGENCRYPTIONKEY > > postgres $PGDATA ) > > -- > > Relying on environment variables is clearly pretty crappy. So if that's > the proposal, then I think it needs to be better. I don't believe that was ever intended to be the final solution, I was just pointing out that it's what the WIP patch did. The discussion had moved into having a command called which provided the key on stdout, as I recall, allowing it to be whatever the user wished, including binary of any kind. If you have other suggestions, I'm sure they would be well received. As to the question of complexity, it certainly looks like it'll probably be quite straight-forward for users to use. Thanks! Stephen
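For illustration, the "command which provides the key on stdout" idea mentioned above could look roughly like the following. This is only a sketch, not what the WIP patch does: the function name read_key_from_command, the 32-byte key length, and the stand-in key command are all assumptions made up for the example. The key is read as raw bytes, since the command may emit binary of any kind.

```python
import subprocess
import sys


def read_key_from_command(argv, key_len=32):
    """Run a user-supplied key command and return its stdout as raw key bytes.

    Hypothetical helper: the name and the fixed key length are assumptions
    of this sketch, not part of any PostgreSQL interface.
    """
    # stdout is captured as bytes so that binary keys pass through untouched.
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    key = result.stdout
    if len(key) != key_len:
        raise ValueError(f"expected {key_len} key bytes, got {len(key)}")
    return key


# Stand-in for a real key command (a KMS client, an agent, a prompt helper...).
demo_command = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(bytes(range(32)))",
]
key = read_key_from_command(demo_command)
print(len(key))  # 32
```

Unlike the environment-variable approach, the key here never appears in the process environment, though it still sits in the caller's memory, which is one of the concerns raised elsewhere in the thread.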
Re: [HACKERS] WIP: Data at rest encryption
On 6/13/17 18:11, Stephen Frost wrote: >> Let's see a proposal in those terms then. How easy can you make it, >> compared to existing OS-level solutions, and will that justify the >> maintenance overhead? > From the original post on this thread, which included a WIP patch: > > -- > Usage > = > > Set up database like so: > > (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; > export PGENCRYPTIONKEY > initdb -k -K pgcrypto $PGDATA ) > > Start PostgreSQL: > > (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; > export PGENCRYPTIONKEY > postgres $PGDATA ) > -- Relying on environment variables is clearly pretty crappy. So if that's the proposal, then I think it needs to be better. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On 6/14/17 05:04, Aleksander Alekseev wrote: > A few companies that hired system administrators that are too > lazy to read two or three man pages is not a reason to re-implement file > system encryption (or compression, or mirroring if that matters) in any > open source RDBMS. To be fair, we did implement our own compression (TOAST) and mirroring (compared to, say, DRBD) because there were clear advantages in simplicity and performance. Checksumming is another area where we moved forward in spite of arguments that the file system should do it. So we will need to see the whole picture. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 06:41:43PM +0300, Ants Aasma wrote: > On Wed, Jun 14, 2017 at 6:26 PM, Bruce Momjian wrote: > > Are you checking the CPU type or if AES instructions are enabled on the > > CPU? I ask this because I just realized in researching my new TLS talk > > that my BIOS defaults to AES instructions disabled, and I had to > > manually enable it. > > There is zero code for that now, but the plan was to check the CPUID > instruction. My understanding is that it should report what is > currently enabled on the CPU. Will double check when actually writing > the code for the check. Just for specifics, I have two Intel Xeon CPU E5620, but the AES instructions were disabled for this CPU since 2012 when I bought it. :-( The good news is that only recently have I forced https in some pages so this is the first time I heavily need it. I now have a boot test, which returns 16: grep -c '\<aes\>' /proc/cpuinfo > >> > I anticipate that one of the trickier problems here will be handling > >> > encryption of the write-ahead log. Suppose you encrypt WAL a block at > >> > a time. In the current system, once you've written and flushed a > >> > block, you can consider it durably committed, but if that block is > >> > encrypted, this is no longer true. A crash might tear the block, > >> > making it impossible to decrypt. Replay will therefore stop at the > >> > end of the previous block, not at the last record actually flushed as > >> > would happen today. > >> > >> My patch is currently doing a block at a time for WAL. The XTS mode > > > > Uh, how are you writing partial writes to the WAL. I assume you are > > doing a streaming cipher so you can write in increments, right? > > We were doing 8kB page aligned writes to WAL anyway. I just encrypt > the block before it gets written out. Oh, we do. The beauty of streaming ciphers built on block ciphers is that you can pre-compute the cipher to be XOR'ed with the data because the block cipher output doesn't depend on the user data.
This is used for SSH, for example. > >> I think we need to require wal_log_hints=on when encryption is > >> enabled. Currently I have not considered tearing on CLOG bits. Other > >> SLRUs probably have similar issues. I need to think a bit about how to > >> solve that. > > > > I am not sure if clog even needs to be encrypted. > > Me neither, but it currently is, and it looks like that's broken in a > "silently corrupts your data" way in face of torn writes. Using OFB > mode (xor plaintext with pseudorandom stream for cipher) looks like it > might help here, if other approaches fail. I would just document the limitation and move on. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
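To make the stream-cipher point concrete, here is a minimal sketch of an OFB-style keystream that is computed before any page data exists and then XOR'ed with the page. It is a toy under stated assumptions: SHA-256 stands in for the block cipher, and the key, the IV, and the 64-byte "page" are invented for the example. It also reuses one keystream for two versions of the same page, which a real design would have to address.

```python
import hashlib


def ofb_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """OFB-style keystream: each block is derived from the previous block.

    SHA-256 is a stand-in for the block cipher (an assumption of this
    sketch); the output never depends on the data being encrypted.
    """
    out = bytearray()
    state = iv
    while len(out) < length:
        state = hashlib.sha256(key + state).digest()
        out += state
    return bytes(out[:length])


def xor_bytes(data: bytes, stream: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"k" * 32        # hypothetical key
iv = b"page-0001"      # hypothetical per-page IV
stream = ofb_keystream(key, iv, 64)  # pre-computed, no plaintext needed

old_page = bytes(64)
flipped = bytearray(old_page)
flipped[10] ^= 0x01    # a single hint-bit flip
new_page = bytes(flipped)

c_old = xor_bytes(old_page, stream)
c_new = xor_bytes(new_page, stream)

# Only the byte holding the flipped bit differs in the ciphertext.
changed = [i for i in range(64) if c_old[i] != c_new[i]]
print(changed)  # [10]

# A torn write (first half new, second half old) still decrypts cleanly
# into a mix of the two page versions.
torn = c_new[:32] + c_old[32:]
recovered = xor_bytes(torn, stream)
print(recovered == new_page[:32] + old_page[32:])  # True
```

With this kind of mode a one-bit change stays a one-bit change on disk, which is why it could help with torn CLOG writes.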
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 6:26 PM, Bruce Momjian wrote: > Are you checking the CPU type or if AES instructions are enabled on the > CPU? I ask this because I just realized in researching my new TLS talk > that my BIOS defaults to AES instructions disabled, and I had to > manually enable it. There is zero code for that now, but the plan was to check the CPUID instruction. My understanding is that it should report what is currently enabled on the CPU. Will double check when actually writing the code for the check. >> > I anticipate that one of the trickier problems here will be handling >> > encryption of the write-ahead log. Suppose you encrypt WAL a block at >> > a time. In the current system, once you've written and flushed a >> > block, you can consider it durably committed, but if that block is >> > encrypted, this is no longer true. A crash might tear the block, >> > making it impossible to decrypt. Replay will therefore stop at the >> > end of the previous block, not at the last record actually flushed as >> > would happen today. >> >> My patch is currently doing a block at a time for WAL. The XTS mode > > Uh, how are you writing partial writes to the WAL. I assume you are > doing a streaming cipher so you can write in increments, right? We were doing 8kB page aligned writes to WAL anyway. I just encrypt the block before it gets written out. >> I think we need to require wal_log_hints=on when encryption is >> enabled. Currently I have not considered tearing on CLOG bits. Other >> SLRUs probably have similar issues. I need to think a bit about how to >> solve that. > > I am not sure if clog even needs to be encrypted. Me neither, but it currently is, and it looks like that's broken in a "silently corrupts your data" way in face of torn writes. Using OFB mode (xor plaintext with pseudorandom stream for cipher) looks like it might help here, if other approaches fail.
Regards, Ants Aasma
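As a rough user-space analogue of the planned CPUID check, the sketch below counts logical CPUs whose flags line lists aes. It assumes x86 Linux, where /proc/cpuinfo has one "flags" line per logical CPU, and it assumes that line reflects what is currently enabled. The function names are made up for the example, and this is not the check the patch would use.

```python
def cpus_with_flag(cpuinfo_text: str, flag: str = "aes") -> int:
    """Count logical CPUs whose 'flags' line lists the given feature flag."""
    count = 0
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        # Match whole flag names only, so 'aes' does not match e.g. 'vaes'.
        if sep and name.strip() == "flags" and flag in value.split():
            count += 1
    return count


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Return the contents of /proc/cpuinfo, or '' where it does not exist."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


sample = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse2 aes avx\n"
    "\n"
    "processor\t: 1\n"
    "flags\t\t: fpu sse2 avx\n"
)
print(cpus_with_flag(sample))  # 1
print(cpus_with_flag(read_cpuinfo()))  # machine-dependent
```

A server-side check would more likely query CPUID directly and fall back to a portable AES implementation when the instructions are unavailable.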
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 06:10:32PM +0300, Ants Aasma wrote: > On Tue, Jun 13, 2017 at 6:35 PM, Robert Haas wrote: > > Performance is likely to be poor on large databases, > > because every time a page transits between shared_buffers and the > > buffer cache we've got to en/decrypt, but as long as it's only poor > > for the people who opt into the feature I don't see a big problem with > > that. > > It would make sense to tune the database with large shared buffers if > encryption is enabled. That should make sure that most shared buffers > traffic is going to disk anyway. As for performance, I have a > prototype assembly implementation of AES that does 3GB/s/core on my > laptop. If we add that behind a CPUID check the overhead should be > quite reasonable. Are you checking the CPU type or if AES instructions are enabled on the CPU? I ask this because I just realized in researching my new TLS talk that my BIOS defaults to AES instructions disabled, and I had to manually enable it. > > I anticipate that one of the trickier problems here will be handling > > encryption of the write-ahead log. Suppose you encrypt WAL a block at > > a time. In the current system, once you've written and flushed a > > block, you can consider it durably committed, but if that block is > > encrypted, this is no longer true. A crash might tear the block, > > making it impossible to decrypt. Replay will therefore stop at the > > end of the previous block, not at the last record actually flushed as > > would happen today. > > My patch is currently doing a block at a time for WAL. The XTS mode Uh, how are you writing partial writes to the WAL. I assume you are doing a streaming cipher so you can write in increments, right? > used to encrypt has the useful property that blocks that share > identical prefix unencrypted also have identical prefix when > encrypted. It requires that the tearing is 16B aligned, but I think > that is true for pretty much all storage systems.
That property of > course has security downsides, but for table/index storage we have a > nonce in the form of LSN in the page header eliminating the issue. > > > So, your synchronous_commit suddenly isn't. A > > similar problem will occur any other page where we choose not to > > protect against torn pages using full page writes. For instance, > > unless checksums are enabled or wal_log_hints=on, we'll write a data > > page where a single bit has been flipped and assume that the bit will > > either make it to disk or not; the page can't really be torn in any > > way that hurts us. But with encryption that's no longer true, because > > the hint bit will turn into much more than a single bit flip, and > > rereading that page with half old and half new contents will be the > > end of the world (TM). I don't know off-hand whether we're > > protecting, say, CLOG page writes with FPWs.: because setting a couple > > of bits is idempotent and doesn't depend on the existing page > > contents, we might not need it currently, but with encryption, every > > bit in the page depends on every other bit in the page, so we > > certainly would. I don't know how many places we've got assumptions > > like this baked into the system, but I'm guessing there are a bunch. > > I think we need to require wal_log_hints=on when encryption is > enabled. Currently I have not considered tearing on CLOG bits. Other > SLRUs probably have similar issues. I need to think a bit about how to > solve that. I am not sure if clog even needs to be encrypted. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
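On the CPUID question above: as far as I know, Linux reports the CPU's feature bits in the `flags` line of /proc/cpuinfo, and `aes` shows up there only when the CPU advertises the AES instructions. A BIOS that disables them would then be expected to hide the flag as well, though that is worth verifying on the hardware in question. Below is a minimal Python sketch of such a check, assuming x86 Linux cpuinfo text. The name `cpu_advertises_aes` is made up for illustration and is not part of the patch.

```python
def cpu_advertises_aes(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in Linux cpuinfo text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        # Only the 'flags' field counts; 'aes' elsewhere (e.g. model name) is ignored.
        if sep and name.strip() == "flags" and "aes" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AES instructions advertised:", cpu_advertises_aes(f.read()))
    except OSError:
        print("no /proc/cpuinfo here; this sketch only covers Linux")
```

A runtime check like this (or the equivalent CPUID test in C) looks at what the CPU advertises right now, not at the CPU model, which seems to be the distinction being asked about.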
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 6:35 PM, Robert Haas wrote: > Of course, what would be even more useful is fine-grained encryption - > encrypt these tables (and the corresponding indexes, toast tables, and > WAL records related to any of that) with this key, encrypt these other > tables (and the same list of associated stuff) with this other key, > and leave the rest unencrypted. The problem with that is that you > probably can't run recovery without all of the keys, and even on a > clean startup there would be a good deal of engineering work involved > in refusing access to tables whose key hadn't been provided yet. That's pretty much the reason why I decided to skip anything more complicated for now. Anything that would be able to run recovery without knowing the encryption key looks like an order of magnitude more complicated to implement. > Performance is likely to be poor on large databases, > because every time a page transits between shared_buffers and the > buffer cache we've got to en/decrypt, but as long as it's only poor > for the people who opt into the feature I don't see a big problem with > that. It would make sense to tune the database with large shared buffers if encryption is enabled. That should make sure that most shared buffers traffic is going to disk anyway. As for performance, I have a prototype assembly implementation of AES that does 3GB/s/core on my laptop. If we add that behind a CPUID check the overhead should be quite reasonable. > I anticipate that one of the trickier problems here will be handling > encryption of the write-ahead log. Suppose you encrypt WAL a block at > a time. In the current system, once you've written and flushed a > block, you can consider it durably committed, but if that block is > encrypted, this is no longer true. A crash might tear the block, > making it impossible to decrypt. Replay will therefore stop at the > end of the previous block, not at the last record actually flushed as > would happen today. 
My patch is currently doing a block at a time for WAL. The XTS mode used to encrypt has the useful property that blocks that share identical prefix unencrypted also have identical prefix when encrypted. It requires that the tearing is 16B aligned, but I think that is true for pretty much all storage systems. That property of course has security downsides, but for table/index storage we have a nonce in the form of LSN in the page header eliminating the issue. > So, your synchronous_commit suddenly isn't. A > similar problem will occur any other page where we choose not to > protect against torn pages using full page writes. For instance, > unless checksums are enabled or wal_log_hints=on, we'll write a data > page where a single bit has been flipped and assume that the bit will > either make it to disk or not; the page can't really be torn in any > way that hurts us. But with encryption that's no longer true, because > the hint bit will turn into much more than a single bit flip, and > rereading that page with half old and half new contents will be the > end of the world (TM). I don't know off-hand whether we're > protecting, say, CLOG page writes with FPWs.: because setting a couple > of bits is idempotent and doesn't depend on the existing page > contents, we might not need it currently, but with encryption, every > bit in the page depends on every other bit in the page, so we > certainly would. I don't know how many places we've got assumptions > like this baked into the system, but I'm guessing there are a bunch. I think we need to require wal_log_hints=on when encryption is enabled. Currently I have not considered tearing on CLOG bits. Other SLRUs probably have similar issues. I need to think a bit about how to solve that. Regards, Ants Aasma
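The prefix property described above can be illustrated with a small sketch. Note that this is NOT XTS and is not secure: it is a toy tweakable block cipher (a 4-round Feistel network built on HMAC-SHA256, Python standard library only) that shares just one structural feature with XTS, namely that every 16-byte block is encrypted independently under a tweak derived from the page number and the block's position. All names (`crypt_page`, the page number 7, the record contents) are invented for the example.

```python
import hashlib
import hmac

BLOCK = 16  # cipher block size in bytes, the same as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, tweak: bytes, rnd: int, half: bytes) -> bytes:
    # Keyed round function; the tweak ties the result to one block position.
    msg = tweak + bytes([rnd]) + half
    return hmac.new(key, msg, hashlib.sha256).digest()[: BLOCK // 2]


def _crypt_block(key: bytes, tweak: bytes, block: bytes, decrypt: bool) -> bytes:
    left, right = block[:8], block[8:]
    if decrypt:
        for rnd in reversed(range(4)):
            left, right = _xor(right, _round(key, tweak, rnd, left)), left
    else:
        for rnd in range(4):
            left, right = right, _xor(left, _round(key, tweak, rnd, right))
    return left + right


def crypt_page(key: bytes, page_no: int, data: bytes, decrypt: bool = False) -> bytes:
    """Process a page one 16-byte block at a time; blocks never influence each other."""
    assert len(data) % BLOCK == 0
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        tweak = page_no.to_bytes(8, "big") + (i // BLOCK).to_bytes(4, "big")
        out += _crypt_block(key, tweak, data[i:i + BLOCK], decrypt)
    return bytes(out)


key = b"k" * 32
# A 64-byte "WAL block" as first flushed, then the same block after appending.
old = b"record-1 ".ljust(64, b"\0")
new = old[:32] + b"record-2 spans two blocks".ljust(32, b"\0")

old_ct = crypt_page(key, 7, old)
new_ct = crypt_page(key, 7, new)

# Identical plaintext prefix gives identical ciphertext prefix.
assert old_ct[:32] == new_ct[:32]

# A write torn on a 16-byte boundary still decrypts, block by block.
torn = new_ct[:48] + old_ct[48:]
assert crypt_page(key, 7, torn, decrypt=True) == new[:48] + old[48:]

# A tear inside a cipher block garbles that one block only.
bad = crypt_page(key, 7, new_ct[:40] + old_ct[40:], decrypt=True)
assert bad[:32] == new[:32] and bad[48:] == old[48:]
assert bad[32:48] not in (old[32:48], new[32:48])
```

The first assertion is also the security downside mentioned above: anyone comparing two versions of the ciphertext can see which leading blocks did not change, unless something like the LSN in the page header makes every version differ.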
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 06:29:20PM -0400, Stephen Frost wrote: > > Isn't the leakage controlled by OS permissions, so is it really leakage, > > i.e., if you can see the leakage, you probably have bypassed the OS > > permissions and see the key and data anyway. > > The case I'm mainly considering is if you somehow lose control over the > medium in which the encrypted database resides- that is, someone steals > the hard drive, or perhaps the hard drive is sold without properly being > wiped first, things like that. > > In such a case, there's no OS permissions to bypass because the OS is > now controlled by the attacker. In that case, if the key wasn't stored > on the hard drive, then the attacker would be able to see the contents > of the filesystem and the associated metadata, but not the contents of > the cluster. > > In that case, the distinction between filesystem-level encryption and > PG-level encryption is that with filesystem-level encryption the > attacker wouldn't be able to even see what files exist or any metadata > about them, whereas with PG-level encryption that information would be > available to the attacker. Yes, Peter Eisentraut pointed that out in an earlier email in this thread. Thanks. > > > > The benefit is allowing configuration > > > > in the database rather than the OS? > > > > > > No, the benefit is that the database administrator can configure it and > > > set it up and not have to get an OS-level administrator involved. There > > > may also be other reasons why filesystem-level encryption is difficult > > > to set up or use in a certain environment, but this wouldn't depend on > > > anything OS-related and therefore could be done. > > > > OK, my only point here is that we are going down a slippery slope of > > implementing OS things in the database. There is nothing wrong with > > that but it has often been something we have avoided, because of the > > added complexity required in the db server. 
> > I'm not sure that I actually agree that encryption is really solely an > OS-level function, or even that encryption at rest is solely the OS's > job. As a counter-example, I encrypt my SSH keys and GPG keys > routinely, even when I'm using OS-level filesystem encryption. Perhaps > that's excessive of me, but, well, I don't find it so. :) Well, I think SSH and GPG keys are a case of selective encryption, which is where I said database encryption could really be a win, because you can't do that outside the database. Just to go crazy, here is something I think would be cool for a fully or partially encrypted data row: (1) row data encrypted with symmetric key k; (2) k encrypted with public key of user 1; (3) k encrypted with public key of user 2; (4) hash of previous fields signed by insert user. This would allow the insert user complete control over who sees the row. The database administrator could corrupt the row or add/remove users, but that would be detected by the hash signature being invalid. This is kind of like TLS bundled in the database. I think this is actually all possible now too. :-) > > As a counter-example, we only added an external collation library to > > Postgres when we clearly saw a benefit, e.g. detecting collation > > changes. > > Right, and there's also the potential for adding more flexibility down > the road, which I'm certainly all for, but I see value in having even > this initial version of the feature too. Understood. > > > Also, tools like pg_basebackup could be used, with nearly zero changes, > > > I think, to get an encrypted copy of the database for backup purposes, > > > removing the need to work out a way to handle encrypting backups. > > > > I do think we need much more documentation on how to encrypt things, > > though that is a separate issue. It might help to document how you > > _should_ do things now to see the limitations we have. 
> > Improving our documentation would certainly be good, but I'm not sure > that we can really recommend specific ways of doing things like > filesystem-level encryption, as that's really OS-dependent and there > could be trade-offs in different ways a given OS might provide that > capability. I'm not sure that having our documentation generically say > "you should use filesystem-level encryption" would really be very > helpful. > > Perhaps I misunderstood your suggestion here though..? I was just throwing out the idea that sometimes writing things down shows the gaps in our feature-set --- it might not apply here. > > > > Is the problem that you have to encrypt before sending and decrypt on > > > > arrival, if you don't trust the transmission link? Is that used a lot? > > > > Is having the db encrypt every write a reasonable solution to that? > > > > > > There's multiple use-cases here. Making it easier to copy the database > > > is just one of them and it isn't the biggest one. The biggest benefit > > > is that there's cases where filesystem-level encryption isn't re
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 04:13:57PM +0300, Aleksander Alekseev wrote: > > > While I agree that configuring full disk encryption is not technically > > > difficult, it requires much more privileged access to the system and > > > basically requires the support of a system administrator. In addition, > > > if a volume is not available for encryption, PostgreSQL support for > > > encryption would still allow for its data to be encrypted and as others > > > have mentioned can be enabled by the DBA alone. > > > > Frankly I'm having difficulties imagining when it could be a real > > problem. It doesn't seem to be such a burden to ask a colleague for > > assistance in case you don't have sufficient permissions to do > > something. And I got a strong feeling that solving bureaucracy issues of > > specific organizations by changing PostgreSQL core in very invasive way > > (keeping in mind testing, maintaining, etc) is misguided. > > At the same time implementing a pluggable storage API and then implementing > encrypted / compressed / whatever storage in a standalone extension using > this API seems to be a reasonable thing to do. Agreed, good point. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
> > While I agree that configuring full disk encryption is not technically > > difficult, it requires much more privileged access to the system and > > basically requires the support of a system administrator. In addition, > > if a volume is not available for encryption, PostgreSQL support for > > encryption would still allow for its data to be encrypted and as others > > have mentioned can be enabled by the DBA alone. > > Frankly I'm having difficulties imagining when it could be a real > problem. It doesn't seem to be such a burden to ask a colleague for > assistance in case you don't have sufficient permissions to do > something. And I got a strong feeling that solving bureaucracy issues of > specific organizations by changing PostgreSQL core in very invasive way > (keeping in mind testing, maintaining, etc) is misguided. At the same time implementing a pluggable storage API and then implementing encrypted / compressed / whatever storage in a standalone extension using this API seems to be a reasonable thing to do. -- Best regards, Aleksander Alekseev
Re: [HACKERS] WIP: Data at rest encryption
Hi Kenneth, > > > File system encryption already exists and is well-tested. I don't see > > > any big advantages in re-implementing all of this one level up. You > > > would have to touch every single place in PostgreSQL backend and tool > > > code where a file is being read or written. Yikes. > > > > I appreciate your work, but unfortunately I must agree with Peter. > > > > On Linux you can configure the full disc encryption using LUKS / > > dm-crypt in like 5 minutes [1]. On FreeBSD you can do the same using > > geli [2]. In my personal opinion PostgreSQL is already complicated > > enough. A few companies that hired system administrators that are too > > lazy to read two or three man pages is not a reason to re-implement file > > system encryption (or compression, or mirroring if that matters) in any > > open source RDBMS. > > While I agree that configuring full disk encryption is not technically > difficult, it requires much more privileged access to the system and > basically requires the support of a system administrator. In addition, > if a volume is not available for encryption, PostgreSQL support for > encryption would still allow for its data to be encrypted and as others > have mentioned can be enabled by the DBA alone. Frankly I'm having difficulties imagining when it could be a real problem. It doesn't seem to be such a burden to ask a colleague for assistance in case you don't have sufficient permissions to do something. And I got a strong feeling that solving bureaucracy issues of specific organizations by changing PostgreSQL core in very invasive way (keeping in mind testing, maintaining, etc) is misguided. -- Best regards, Aleksander Alekseev
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 12:04:26PM +0300, Aleksander Alekseev wrote: > Hi Ants, > > On Tue, Jun 13, 2017 at 09:07:49AM -0400, Peter Eisentraut wrote: > > On 6/12/17 17:11, Ants Aasma wrote: > > > I'm curious if the community thinks this is a feature worth having? > > > Even considering that security experts would classify this kind of > > > encryption as a checkbox feature. > > > > File system encryption already exists and is well-tested. I don't see > > any big advantages in re-implementing all of this one level up. You > > would have to touch every single place in PostgreSQL backend and tool > > code where a file is being read or written. Yikes. > > I appreciate your work, but unfortunately I must agree with Peter. > > On Linux you can configure the full disc encryption using LUKS / > dm-crypt in like 5 minutes [1]. On FreeBSD you can do the same using > geli [2]. In my personal opinion PostgreSQL is already complicated > enough. A few companies that hired system administrators that are too > lazy to read two or three man pages is not a reason to re-implement file > system encryption (or compression, or mirroring if that matters) in any > open source RDBMS. > Hi Aleksander, While I agree that configuring full disk encryption is not technically difficult, it requires much more privileged access to the system and basically requires the support of a system administrator. In addition, if a volume is not available for encryption, PostgreSQL support for encryption would still allow for its data to be encrypted and as others have mentioned can be enabled by the DBA alone. Regards, Ken
Re: [HACKERS] WIP: Data at rest encryption
Hi Ants, On Tue, Jun 13, 2017 at 09:07:49AM -0400, Peter Eisentraut wrote: > On 6/12/17 17:11, Ants Aasma wrote: > > I'm curious if the community thinks this is a feature worth having? > > Even considering that security experts would classify this kind of > > encryption as a checkbox feature. > > File system encryption already exists and is well-tested. I don't see > any big advantages in re-implementing all of this one level up. You > would have to touch every single place in PostgreSQL backend and tool > code where a file is being read or written. Yikes. I appreciate your work, but unfortunately I must agree with Peter. On Linux you can configure the full disc encryption using LUKS / dm-crypt in like 5 minutes [1]. On FreeBSD you can do the same using geli [2]. In my personal opinion PostgreSQL is already complicated enough. A few companies that hired system administrators that are too lazy to read two or three man pages is not a reason to re-implement file system encryption (or compression, or mirroring if that matters) in any open source RDBMS. [1] http://eax.me/dm-crypt/ [2] http://eax.me/freebsd-geli/ -- Best regards, Aleksander Alekseev
Re: [HACKERS] WIP: Data at rest encryption
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Tue, Jun 13, 2017 at 03:20:12PM -0400, Stephen Frost wrote: > > > OK, so let's go back. You are saying there are no security benefits to > > > this vs. file system encryption. > > > > I'm not sure that I can see any, myself.. Perhaps I'm wrong there, but > > it seems unlikely that this would be an improvement over filesystem > > level encryption in the general sense. I'm not sure that I see it as > > really worse than filesystem-level encryption either, to be clear. > > There's a bit of increased information leakage, as Peter mentioned and I > > agreed with, but it's not a lot and I expect in most cases that > > information leak would be acceptable. That seems like something which > > would need to be considered on a case-by-case basis. > > Isn't the leakage controlled by OS permissions, so is it really leakage, > i.e., if you can see the leakage, you probably have bypassed the OS > permissions and see the key and data anyway. The case I'm mainly considering is if you somehow lose control over the medium in which the encrypted database resides- that is, someone steals the hard drive, or perhaps the hard drive is sold without properly being wiped first, things like that. In such a case, there's no OS permissions to bypass because the OS is now controlled by the attacker. In that case, if the key wasn't stored on the hard drive, then the attacker would be able to see the contents of the filesystem and the associated metadata, but not the contents of the cluster. In that case, the distinction between filesystem-level encryption and PG-level encryption is that with filesystem-level encryption the attacker wouldn't be able to even see what files exist or any metadata about them, whereas with PG-level encryption that information would be available to the attacker. 
In terms of an online attack where an attacker has gained access to the system then, in general, you're right that if the attacker is able to see into the PG data directory at all then they've figured out a way to bypass the OS-level permissions and would then be able to see the data directly anyway. That's a different scenario which would most likely be helped by something like SELinux being used, which could prevent the attacker from being able to look at the PG data directory because the attacker has connected to the system from a network which isn't allowed to directly access those files. > > > The benefit is allowing configuration > > > in the database rather than the OS? > > > > No, the benefit is that the database administrator can configure it and > > set it up and not have to get an OS-level administrator involved. There > > may also be other reasons why filesystem-level encryption is difficult > > to set up or use in a certain environment, but this wouldn't depend on > > anything OS-related and therefore could be done. > > OK, my only point here is that we are going down a slippery slope of > implementing OS things in the database. There is nothing wrong with > that but it has often been something we have avoided, because of the > added complexity required in the db server. I'm not sure that I actually agree that encryption is really solely an OS-level function, or even that encryption at rest is solely the OS's job. As a counter-example, I encrypt my SSH keys and GPG keys routinely, even when I'm using OS-level filesystem encryption. Perhaps that's excessive of me, but, well, I don't find it so. :) > As a counter-example, we only added an external collation library to > Postgres when we clearly saw a benefit, e.g. detecting collation > changes. Right, and there's also the potential for adding more flexibility down the road, which I'm certainly all for, but I see value in having even this initial version of the feature too. 
> > Of course, either way you'd have to provide for a way to get the key > > from one system to the other. > > Uh, doesn't scp do this? I have trouble seeing how avoiding calling > openssl justifies changes to our database server. scp may not be an option as it requires network connectivity between the systems. This is also just one of the use-cases, and not the main reason, at least in my view, to add this feature. > > Also, tools like pg_basebackup could be used, with nearly zero changes, > > I think, to get an encrypted copy of the database for backup purposes, > > removing the need to work out a way to handle encrypting backups. > > I do think we need much more documentation on how to encrypt things, > though that is a separate issue. It might help to document how you > _should_ do things now to see the limitations we have. Improving our documentation would certainly be good, but I'm not sure that we can really recommend specific ways of doing things like filesystem-level encryption, as that's really OS-dependent and there could be trade-offs in different ways a given OS might provide that capability. I'm not sure that having our doc
Re: [HACKERS] WIP: Data at rest encryption
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 6/13/17 15:20, Stephen Frost wrote: > > And then you would need openssl on the other system to decrypt it. > > Or make the USB file system encrypted as well? If you're in that kind > of environment, that would surely be feasible, if not required. Right, but requiring file system encryption to work on a USB stick across different types of systems strikes me as actually a higher bar than requiring openssl to exist on both the source and destination sides. Naturally, if the environment you're in has already solved that problem across the enterprise then it's a good approach, although you might want to use a different encryption key, perhaps, though hopefully that's something you'd be able to do pretty easily too. Thanks! Stephen
Re: [HACKERS] WIP: Data at rest encryption
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 6/13/17 15:20, Stephen Frost wrote: > > No, the benefit is that the database administrator can configure it and > > set it up and not have to get an OS-level administrator involved. There > > may also be other reasons why filesystem-level encryption is difficult > > to set up or use in a certain environment, but this wouldn't depend on > > anything OS-related and therefore could be done. > > Let's see a proposal in those terms then. How easy can you make it, > compared to existing OS-level solutions, and will that justify the > maintenance overhead? From the original post on this thread, which included a WIP patch: -- Usage: Set up database like so: (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; export PGENCRYPTIONKEY; initdb -k -K pgcrypto $PGDATA ) Start PostgreSQL: (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo; export PGENCRYPTIONKEY; postgres $PGDATA ) -- That certainly seems very straight-forward to me, though I expect that packagers would probably improve upon it further. > Considering how ubiquitous file-system encryption is, I have my doubts > that the trade-offs will come out right, but let's see. There's definitely environments out there where DBAs aren't able to have root privileges and that limits what they're able to do. I'm not really sure how to objectively weigh "you don't need to be root to encrypt the database" vs. maintenance overhead of this feature. Subjectively, for my 2c anyway, it seems well worth it, but that's naturally subjective. :) Thanks! Stephen
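As a rough idea of what "packagers would probably improve upon it" might look like, here is a minimal Python sketch of a non-interactive launcher. It assumes, based only on the usage text quoted above, that the WIP patch reads the passphrase from the `PGENCRYPTIONKEY` environment variable. The key-fetching command is entirely hypothetical (it stands in for whatever a site uses to retrieve secrets), the helper names are invented, and the server is started with the standard `postgres -D` form instead of the bare `postgres $PGDATA` shown in the post.

```python
import os
import subprocess


def env_with_key(passphrase: str, base_env=None) -> dict:
    """Copy the environment and add the variable the WIP patch appears to read."""
    env = dict(os.environ if base_env is None else base_env)
    env["PGENCRYPTIONKEY"] = passphrase
    return env


def fetch_passphrase(command: list) -> str:
    """Run a (hypothetical) key-fetching command; return stdout minus the trailing newline."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout.rstrip("\n")


def start_postgres(pgdata: str, key_command: list) -> subprocess.Popen:
    """Start the server with the passphrase supplied through the environment."""
    env = env_with_key(fetch_passphrase(key_command))
    return subprocess.Popen(["postgres", "-D", pgdata], env=env)
```

A wrapper along these lines would let a service manager start the server without an interactive prompt, at the cost of moving the trust question to wherever the key command gets its secret.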
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 10:28:14AM -0400, Peter Eisentraut wrote: > On 6/13/17 09:24, Stephen Frost wrote: > > but there are use-cases where it'd be really nice to be able to > > have PG doing the encryption instead of the filesystem because > > then you can do things like backup the database, copy it somewhere > > else directly, and then restore it using the regular PG > > mechanisms, as long as you have access to the key. That's not > > something you can directly do with filesystem-level encryption > > Interesting point. > > I wonder what the proper extent of "encryption at rest" should be. > If you encrypt just on a file or block level, then someone looking > at the data directory or a backup can still learn a number of things > about the number of tables, transaction rates, various configuration > settings, and so on. In the end, information leaks at a strictly positive baud rate because physics (cf. Claude Shannon, et al). Encryption at rest is one technique whereby people can slow this rate, but there's no such thing as getting it to zero. Let's not creep this feature in the ultimately futile attempt to do so. > In the scenario of a sensitive application hosted on a shared > SAN, I don't think that is good enough. > > Also, in the use case you describe, if you use pg_basebackup to make a > direct encrypted copy of a data directory, I think that would mean you'd > have to keep using the same key for all copies. Right. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
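To make the leakage point concrete, here is a small Python sketch of what someone holding a copy of the data directory could tally from file names and sizes alone, without decrypting anything. It relies on the standard on-disk layout, where relation files under `base/<database oid>/` are named after a numeric relfilenode with optional fork and segment suffixes. That block-level encryption leaves this layout visible is taken from the discussion above, and `visible_metadata` is a name invented for the example.

```python
import os
import re

# Relation files are named after a numeric relfilenode, optionally followed by
# a fork suffix (_fsm, _vm, _init) and a segment number (.1, .2, ...).
RELFILE = re.compile(r"^(\d+)(_fsm|_vm|_init)?(\.\d+)?$")


def visible_metadata(pgdata: str) -> dict:
    """Summarize what names and sizes alone reveal; file contents are never read."""
    relations = set()
    total_bytes = 0
    for dirpath, _dirs, files in os.walk(os.path.join(pgdata, "base")):
        db_oid = os.path.basename(dirpath)
        for name in files:
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
            match = RELFILE.match(name)
            if match:
                # Forks and segments of one relation share a relfilenode.
                relations.add((db_oid, match.group(1)))
    return {"relation_files": len(relations), "total_bytes": total_bytes}
```

Comparing two such summaries taken at different times would also hint at growth rates, which is the sort of thing filesystem-level encryption hides and page-level encryption does not.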
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 04:08:29PM -0400, Peter Eisentraut wrote: > On 6/13/17 15:51, Bruce Momjian wrote: > > Isn't the leakage controlled by OS permissions, so is it really leakage, > > i.e., if you can see the leakage, you probably have bypassed the OS > > permissions and see the key and data anyway. > > One scenario (among many) is when you're done with the disk. If the > content was fully encrypted, then you can just throw it into the trash > or have your provider dispose of it or reuse it. If not, then, > depending on policy, you will have to physically obtain it and burn it. Oh, I see your point --- db-level encryption stores the file system as mountable on the device, while it is not with storage-level encryption --- got it. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
On 6/13/17 15:51, Bruce Momjian wrote: > Isn't the leakage controlled by OS permissions, so is it really leakage, > i.e., if you can see the leakage, you probably have bypassed the OS > permissions and see the key and data anyway. One scenario (among many) is when you're done with the disk. If the content was fully encrypted, then you can just throw it into the trash or have your provider dispose of it or reuse it. If not, then, depending on policy, you will have to physically obtain it and burn it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 03:20:12PM -0400, Stephen Frost wrote: > Bruce, > > * Bruce Momjian (br...@momjian.us) wrote: > > On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote: > > > It's good to discuss what the feature would bring and what cases it > > > doesn't cover, as well as discussing how it can be designed to make sure > > > that later improvements are able to be done without having to change it > > > around. I do think it's a good idea for us to consider taking an > > > incremental approach where we're adding pieces and building things up as > > > we go. I'm concerned that if we try to do too much in the initial > > > implementation that we'll end up not having anything. > > > > > > As it relates to the different attack vectors that this would address, > > > it's primarily the same ones which filesystem-level encryption also > > > addresses, but it's an improvement when it comes to ease of use. > > > Unfortunately, it won't address cases where the OS is compromised. > > > > OK, so let's go back. You are saying there are no security benefits to > > this vs. file system encryption. > > I'm not sure that I can see any, myself.. Perhaps I'm wrong there, but > it seems unlikely that this would be an improvement over filesystem > level encryption in the general sense. I'm not sure that I see it as > really worse than filesystem-level encryption either, to be clear. > There's a bit of increased information leakage, as Peter mentioned and I > agreed with, but it's not a lot and I expect in most cases that > information leak would be acceptable. That seems like something which > would need to be considered on a case-by-case basis. Isn't the leakage controlled by OS permissions, so is it really leakage, i.e., if you can see the leakage, you probably have bypassed the OS permissions and see the key and data anyway. > > The benefit is allowing configuration > > in the database rather than the OS? 
> > No, the benefit is that the database administrator can configure it and > set it up and not have to get an OS-level administrator involved. There > may also be other reasons why filesystem-level encryption is difficult > to set up or use in a certain environment, but this wouldn't depend on > anything OS-related and therefore could be done. OK, my only point here is that we are going down a slippery slope of implementing OS things in the database. There is nothing wrong with that but it has often been something we have avoided, because of the added complexity required in the db server. As a counter-example, we only added an external collation library to Postgres when we clearly saw a benefit, e.g. detecting collation changes. > > You stated you can transfer > > db-level encrypted files between servers, but can't you do that anyway? > > If the filesystem is encrypted and you wanted to transfer the entire > cluster from one system to another, keeping it encrypted with the same > key, you'd have to transfer the entire filesystem at a block level. > That's not typically very easy to do (ZFS, specifically, has this > capability where you can export a filesystem and send it from one > machine to another, but I don't know of any other filesystems which do > and ZFS isn't always an option..). > > You could go through a process of re-encrypting the files prior to > transferring them, or deciding that simply having the transport > mechanism encrypted is sufficient (eg: SSH), but if what you really want > to do is keep the existing encryption of the database and transfer it to > another system, this allows that pretty easily. > > For example, you could simply do: > > cp -a /path/to/PG /mnt/usb > > and you're done. 
If you're using filesystem level encryption then you'd > have to re-encrypt the data, using something like: > > tar -cf - /path/to/PG | openssl -key private.key > > /mnt/usb/encrypted_cluster.tar > > And then you would need openssl on the other system to decrypt it. > > Of course, either way you'd have to provide for a way to get the key > from one system to the other. Uh, doesn't scp do this? I have trouble seeing how avoiding calling openssl justifies changes to our database server. > Also, tools like pg_basebackup could be used, with nearly zero changes, > I think, to get an encrypted copy of the database for backup purposes, > removing the need to work out a way to handle encrypting backups. I do think we need much more documentation on how to encrypt things, though that is a separate issue. It might help to document how you _should_ do things now to see the limitations we have. > > Is the problem that you have to encrypt before sending and decrypt on > > arrival, if you don't trust the transmission link? Is that used a lot? > > Is having the db encrypt every write a reasonable solution to that? > > There's multiple use-cases here. Making it easier to copy the database > is just one of them and it isn't the biggest one. The biggest benefit > is that there's case
Re: [HACKERS] WIP: Data at rest encryption
On 6/13/17 15:20, Stephen Frost wrote:
> For example, you could simply do:
>
>   cp -a /path/to/PG /mnt/usb
>
> and you're done. If you're using filesystem level encryption then you'd
> have to re-encrypt the data, using something like:
>
>   tar -cf - /path/to/PG | openssl -key private.key > /mnt/usb/encrypted_cluster.tar
>
> And then you would need openssl on the other system to decrypt it.

Or make the USB file system encrypted as well? If you're in that kind
of environment, that would surely be feasible, if not required.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On 6/13/17 15:20, Stephen Frost wrote:
> No, the benefit is that the database administrator can configure it and
> set it up and not have to get an OS-level administrator involved. There
> may also be other reasons why filesystem-level encryption is difficult
> to set up or use in a certain environment, but this wouldn't depend on
> anything OS-related and therefore could be done.

Let's see a proposal in those terms then. How easy can you make it,
compared to existing OS-level solutions, and will that justify the
maintenance overhead?

Considering how ubiquitous file-system encryption is, I have my doubts
that the trade-offs will come out right, but let's see.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote:
> > It's good to discuss what the feature would bring and what cases it
> > doesn't cover, as well as discussing how it can be designed to make sure
> > that later improvements are able to be done without having to change it
> > around. I do think it's a good idea for us to consider taking an
> > incremental approach where we're adding pieces and building things up as
> > we go. I'm concerned that if we try to do too much in the initial
> > implementation that we'll end up not having anything.
> >
> > As it relates to the different attack vectors that this would address,
> > it's primarily the same ones which filesystem-level encryption also
> > addresses, but it's an improvement when it comes to ease of use.
> > Unfortunately, it won't address cases where the OS is compromised.
>
> OK, so let's go back. You are saying there are no security benefits to
> this vs. file system encryption.

I'm not sure that I can see any, myself.. Perhaps I'm wrong there, but
it seems unlikely that this would be an improvement over filesystem
level encryption in the general sense. I'm not sure that I see it as
really worse than filesystem-level encryption either, to be clear.
There's a bit of increased information leakage, as Peter mentioned and I
agreed with, but it's not a lot and I expect in most cases that
information leak would be acceptable. That seems like something which
would need to be considered on a case-by-case basis.

> The benefit is allowing configuration
> in the database rather than the OS?

No, the benefit is that the database administrator can configure it and
set it up and not have to get an OS-level administrator involved. There
may also be other reasons why filesystem-level encryption is difficult
to set up or use in a certain environment, but this wouldn't depend on
anything OS-related and therefore could be done.

> You stated you can transfer
> db-level encrypted files between servers, but can't you do that anyway?

If the filesystem is encrypted and you wanted to transfer the entire
cluster from one system to another, keeping it encrypted with the same
key, you'd have to transfer the entire filesystem at a block level.
That's not typically very easy to do (ZFS, specifically, has this
capability where you can export a filesystem and send it from one
machine to another, but I don't know of any other filesystems which do
and ZFS isn't always an option..).

You could go through a process of re-encrypting the files prior to
transferring them, or deciding that simply having the transport
mechanism encrypted is sufficient (eg: SSH), but if what you really want
to do is keep the existing encryption of the database and transfer it to
another system, this allows that pretty easily.

For example, you could simply do:

  cp -a /path/to/PG /mnt/usb

and you're done. If you're using filesystem level encryption then you'd
have to re-encrypt the data, using something like:

  tar -cf - /path/to/PG | openssl -key private.key > /mnt/usb/encrypted_cluster.tar

And then you would need openssl on the other system to decrypt it.

Of course, either way you'd have to provide for a way to get the key
from one system to the other.

Also, tools like pg_basebackup could be used, with nearly zero changes,
I think, to get an encrypted copy of the database for backup purposes,
removing the need to work out a way to handle encrypting backups.

> Is the problem that you have to encrypt before sending and decrypt on
> arrival, if you don't trust the transmission link? Is that used a lot?
> Is having the db encrypt every write a reasonable solution to that?

There's multiple use-cases here. Making it easier to copy the database
is just one of them and it isn't the biggest one. The biggest benefit
is that there's cases where filesystem-level encryption isn't really an
option or, even if it is, it's not desirable for other reasons.

> As far as future features, we don't have to add all the features at this
> time, but if someone has a good idea for an API and we can make it work
> easily while adding this feature, why wouldn't we do that?

Sure, I'm all for specific suggestions about how to improve the API, or
just in general recommendations of how to improve the patch. The
suggestions which have been made about key management don't really come
across to me as specific API-level recommendations but rather "this
would also be nice to have" kind of comments, which isn't really the
same.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 02:38:58PM -0400, Stephen Frost wrote:
> It's good to discuss what the feature would bring and what cases it
> doesn't cover, as well as discussing how it can be designed to make sure
> that later improvements are able to be done without having to change it
> around. I do think it's a good idea for us to consider taking an
> incremental approach where we're adding pieces and building things up as
> we go. I'm concerned that if we try to do too much in the initial
> implementation that we'll end up not having anything.
>
> As it relates to the different attack vectors that this would address,
> it's primarily the same ones which filesystem-level encryption also
> addresses, but it's an improvement when it comes to ease of use.
> Unfortunately, it won't address cases where the OS is compromised.

OK, so let's go back. You are saying there are no security benefits to
this vs. file system encryption. The benefit is allowing configuration
in the database rather than the OS? You stated you can transfer
db-level encrypted files between servers, but can't you do that anyway?
Is the problem that you have to encrypt before sending and decrypt on
arrival, if you don't trust the transmission link? Is that used a lot?
Is having the db encrypt every write a reasonable solution to that?

As far as future features, we don't have to add all the features at this
time, but if someone has a good idea for an API and we can make it work
easily while adding this feature, why wouldn't we do that?

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 02:23:39PM -0400, Stephen Frost wrote:
> > I'm not trying to shut down discussion, I'm simply pointing out where
> > this feature will be helpful and where it won't be. If there's a way to
> > make it better and able to address an attack where the OS permission
> > system is bypassed, that'd be great, but I certainly don't know of any
> > way to do that and we don't want to claim that this feature will protect
> > against an attack vector that it won't.
> >
> > If the lack of that means you don't support the feature, that's
> > unfortunate as it seems to imply, to me at least, that we'll never have
> > any kind of encryption because there's no way for it to prevent attacks
> > where the OS permission system is able to be bypassed.
>
> It means if we can't discuss the actual benefits that this feature
> brings, and doesn't bring, and how it will deal with future feature
> additions, then you are right we will never have it.

I apologize for having come across as trying to shut down discussion,
that was not my intent.

It's good to discuss what the feature would bring and what cases it
doesn't cover, as well as discussing how it can be designed to make sure
that later improvements are able to be done without having to change it
around. I do think it's a good idea for us to consider taking an
incremental approach where we're adding pieces and building things up as
we go. I'm concerned that if we try to do too much in the initial
implementation that we'll end up not having anything.

As it relates to the different attack vectors that this would address,
it's primarily the same ones which filesystem-level encryption also
addresses, but it's an improvement when it comes to ease of use.
Unfortunately, it won't address cases where the OS is compromised.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 02:23:39PM -0400, Stephen Frost wrote:
> I'm not trying to shut down discussion, I'm simply pointing out where
> this feature will be helpful and where it won't be. If there's a way to
> make it better and able to address an attack where the OS permission
> system is bypassed, that'd be great, but I certainly don't know of any
> way to do that and we don't want to claim that this feature will protect
> against an attack vector that it won't.
>
> If the lack of that means you don't support the feature, that's
> unfortunate as it seems to imply, to me at least, that we'll never have
> any kind of encryption because there's no way for it to prevent attacks
> where the OS permission system is able to be bypassed.

It means if we can't discuss the actual benefits that this feature
brings, and doesn't bring, and how it will deal with future feature
additions, then you are right we will never have it.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:25:00PM -0400, Stephen Frost wrote:
> > > I think the big win of Postgres doing the encryption is that the
> > > user-visible file system is no longer a target (assuming OS permissions
> > > are bypassed), while for file system encryption it is the storage device
> > > that is encrypted.
> >
> > If OS permissions are bypassed then the encryption isn't going to help
> > because the attacker can just access shared memory.
> >
> > The big wins for doing the encryption in PostgreSQL are, as Robert and I
> > have both mentioned on this thread already, that it provides
> > data-at-rest encryption in an easier to deploy fashion which will work
> > the same across different systems and allows the encrypted cluster to be
> > transferred more easily between systems. There are almost certainly
> > other wins from having PG do the encryption, but the above strikes me as
> > the big ones, and those are certainly valuable enough on their own for
> > us to seriously consider adding this capability.
>
> Since you seem to be trying to shut down discussion, I will simply say I
> am unimpressed that this use-case is sufficient justification to add the
> feature.

I'm not trying to shut down discussion, I'm simply pointing out where
this feature will be helpful and where it won't be. If there's a way to
make it better and able to address an attack where the OS permission
system is bypassed, that'd be great, but I certainly don't know of any
way to do that and we don't want to claim that this feature will protect
against an attack vector that it won't.

If the lack of that means you don't support the feature, that's
unfortunate as it seems to imply, to me at least, that we'll never have
any kind of encryption because there's no way for it to prevent attacks
where the OS permission system is able to be bypassed.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:44:51PM -0400, Stephen Frost wrote:
> > Just to be clear, I don't have any issue with discussing the idea that
> > we want to get to a point where we can work with multiple keys and
> > encrypt different tables with different keys (or not encrypt certain
> > tables, et al) with the goal of implementing the single-key approach in
> > a way that allows us to expand on it down the road easily, I just don't
> > think we need to have it all done in the very first patch which adds the
> > ability to encrypt the data files. Maybe you're not saying that it has
> > to be included in the first implementation, in which case we seem to
> > just be talking past each other, but that isn't the impression I got..
>
> We don't want to implement all-cluster encryption with a simple user API
> and then realize we need another API for later encryption features, do
> we?

I actually suspect that's exactly where we'll end up- but due to
necessity rather than because there's a way to avoid it. We are going
to want to encrypt cluster-wide components of the system (shared
catalogs, WAL, etc) and that means that we have to have a key provided
very early on. That's a very different thing from being able to do
something like encrypt specific tables, tablespaces, etc, where the key
can be provided much later and we'll want to allow users to use SQL or
the libpq protocol to be able to specify what to encrypt and possibly
even provide the encryption keys.

That said, the approach outlined here could be used for both by
expanding on the command string, perhaps passing it a keyid which is
what we store in the catalog to indicate what key a table is encrypted
with and then the keyid is "%k" in the command string and the command
has to return the specified key for us to decrypt the table. That would
involve adding a new catalog table to identify keys and their keyids,
I'd think, and an additional column in pg_class which specifies the key
(or perhaps we'd just have a new catalog table that says what tables are
encrypted in what way).

Thanks!

Stephen
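
The "%k" idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the `TABLE_KEY_IDS` dictionary stands in for the hypothetical catalog that would map tables to key ids, the function names are invented, and the demo "key command" is a Python one-liner that merely echoes a fake key. The only thing taken from the thread is that the key id replaces "%k" in the command string and that the command's stdout is read as the key.

```python
import shlex
import subprocess
import sys

# Hypothetical stand-in for a catalog mapping tables to key ids.
TABLE_KEY_IDS = {"accounts": "key-17", "audit_log": "key-42"}


def expand_key_command(template, key_id):
    """Split the template into arguments first, then substitute %k.

    Splitting before substituting means the key id always stays a single
    argument, whatever characters it contains.
    """
    return [arg.replace("%k", key_id) for arg in shlex.split(template)]


def fetch_table_key(template, table):
    """Look up the table's key id, run the key command, return stdout bytes."""
    key_id = TABLE_KEY_IDS[table]
    argv = expand_key_command(template, key_id)
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Demo command (an assumption for illustration): echoes a fake key that is
# derived from the key id it receives as its first argument.
DEMO_TEMPLATE = shlex.join(
    [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write('secret-for-' + sys.argv[1])",
        "%k",
    ]
)

accounts_key = fetch_table_key(DEMO_TEMPLATE, "accounts")
audit_key = fetch_table_key(DEMO_TEMPLATE, "audit_log")
```

A real key command would talk to a key store rather than echo its argument; the point of the sketch is only that one command template can serve many keys once a key id is passed in.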
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 01:44:51PM -0400, Stephen Frost wrote:
> Just to be clear, I don't have any issue with discussing the idea that
> we want to get to a point where we can work with multiple keys and
> encrypt different tables with different keys (or not encrypt certain
> tables, et al) with the goal of implementing the single-key approach in
> a way that allows us to expand on it down the road easily, I just don't
> think we need to have it all done in the very first patch which adds the
> ability to encrypt the data files. Maybe you're not saying that it has
> to be included in the first implementation, in which case we seem to
> just be talking past each other, but that isn't the impression I got..

We don't want to implement all-cluster encryption with a simple user API
and then realize we need another API for later encryption features, do
we? And we are not going to know that if we don't talk about it, but
hey, this is just an email thread and I can marshal opposition to the
feature later when it appears, and point this all out again.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
Joe,

* Joe Conway (m...@joeconway.com) wrote:
> On 06/13/2017 10:20 AM, Stephen Frost wrote:
> > * Joe Conway (m...@joeconway.com) wrote:
> >> Except shell escaping issues, etc, etc
> >
> > That's not an issue- we're talking about reading the stdout of some
> > other process, there's no shell escaping that has to be done there.
>
> It could be an issue depending on how the user stores their master key.

... eh? The user gives us a command to run, we run it, it spits out
some binary blob to stdout which we read in and use as the key. I don't
see where in that there's any need to be concerned about shell escaping
issues.

> > I disagree that proper key management is "simple". If we really get to
> > a point where we think we have a simple answer to it then perhaps that
> > can be implemented in addition to the encryption piece in the same
> > release cycle- but they certainly don't need to be in the same patch,
> > nor do we need to make good key management a requirement for adding
> > encryption support.
>
> I never said key management was simple. Indeed it is the most complex
> and hazardous part of all this as you said earlier. What is simple is
> implementing a master key encrypting actual keys scheme. Keeping the
> user's master key management out of this design is unchanged by what I
> proposed, and what I proposed is a superior yet simple method. Yes, it
> can be done separately but what is the point? We should at least discuss
> it as part of the design.

The point is that we haven't got any encryption of any kind and you're
suggesting we introduce key management which you agree isn't simple.
That you're trying to argue that it actually is simple because it's just
<> is a bit bizarre to me.

> > No, but it seriously changes the level of complexity. I feel like we're
> > trying to go from zero to light speed here because there's an idea that
> > it's "simple" to add X, Y or Z additional requirement beyond the basic
> > feature, but we don't have anything yet.
>
> I think that is hyperbole. It does not significantly add to the
> complexity of what is being discussed.

If I stipulate that it's, indeed, simple to implement a system where we
have a master key and other keys- where are those other keys going to be
kept (even if they're encrypted)? How many extra keys are we talking
about? When are those keys going to be used and how do we know what key
to use when? If we're going to do per-tablespace or per-table
encryption, how are we going to handle the WAL for that? Will we have
an independent key for WAL (in which case, what's the point of using
different keys for tables, et al, when all the data is in the WAL?)?

Having a single key which is used cluster-wide is extremely simple and
lets us get some form of encryption first before we try to tackle the
more complicated multi-key/partial-encryption system.

Just to be clear, I don't have any issue with discussing the idea that
we want to get to a point where we can work with multiple keys and
encrypt different tables with different keys (or not encrypt certain
tables, et al) with the goal of implementing the single-key approach in
a way that allows us to expand on it down the road easily, I just don't
think we need to have it all done in the very first patch which adds the
ability to encrypt the data files. Maybe you're not saying that it has
to be included in the first implementation, in which case we seem to
just be talking past each other, but that isn't the impression I got..

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 01:25:00PM -0400, Stephen Frost wrote:
> > I think the big win of Postgres doing the encryption is that the
> > user-visible file system is no longer a target (assuming OS permissions
> > are bypassed), while for file system encryption it is the storage device
> > that is encrypted.
>
> If OS permissions are bypassed then the encryption isn't going to help
> because the attacker can just access shared memory.
>
> The big wins for doing the encryption in PostgreSQL are, as Robert and I
> have both mentioned on this thread already, that it provides
> data-at-rest encryption in an easier to deploy fashion which will work
> the same across different systems and allows the encrypted cluster to be
> transferred more easily between systems. There are almost certainly
> other wins from having PG do the encryption, but the above strikes me as
> the big ones, and those are certainly valuable enough on their own for
> us to seriously consider adding this capability.

Since you seem to be trying to shut down discussion, I will simply say I
am unimpressed that this use-case is sufficient justification to add the
feature.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
On 06/13/2017 10:20 AM, Stephen Frost wrote:
> * Joe Conway (m...@joeconway.com) wrote:
>> Except shell escaping issues, etc, etc
>
> That's not an issue- we're talking about reading the stdout of some
> other process, there's no shell escaping that has to be done there.

It could be an issue depending on how the user stores their master key.

> I disagree that proper key management is "simple". If we really get to
> a point where we think we have a simple answer to it then perhaps that
> can be implemented in addition to the encryption piece in the same
> release cycle- but they certainly don't need to be in the same patch,
> nor do we need to make good key management a requirement for adding
> encryption support.

I never said key management was simple. Indeed it is the most complex
and hazardous part of all this as you said earlier. What is simple is
implementing a master key encrypting actual keys scheme. Keeping the
user's master key management out of this design is unchanged by what I
proposed, and what I proposed is a superior yet simple method. Yes, it
can be done separately but what is the point? We should at least discuss
it as part of the design.

> No, but it seriously changes the level of complexity. I feel like we're
> trying to go from zero to light speed here because there's an idea that
> it's "simple" to add X, Y or Z additional requirement beyond the basic
> feature, but we don't have anything yet.

I think that is hyperbole. It does not significantly add to the
complexity of what is being discussed.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
Re: [HACKERS] WIP: Data at rest encryption
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jun 13, 2017 at 01:01:32PM -0400, Stephen Frost wrote:
> > > Well, usually the symmetric key is stored using RSA and a symmetric
> > > cipher is used to encrypt/decrypt the data. I was thinking of a case
> > > where you encrypt a row using a symmetric key, then store RSA-encrypted
> > > versions of the symmetric key encrypted that only specific users could
> > > decrypt and get the key to decrypt the data.
> >
> > This goes back to key management and I agree that it often makes sense
> > to use RSA or similar to encrypt the symmetric key, and this approach
> > would allow the user to do so. That doesn't actually give you a
> > "write-only" encryption option though, since any user who can decrypt
> > the symmetric key is able to use the symmetric key for both encryption
> > and decryption, and someone who only has access to the RSA encryption
> > key can't actually encrypt the data since they can't access the
> > symmetric key.
>
> I think the big win of Postgres doing the encryption is that the
> user-visible file system is no longer a target (assuming OS permissions
> are bypassed), while for file system encryption it is the storage device
> that is encrypted.

If OS permissions are bypassed then the encryption isn't going to help
because the attacker can just access shared memory.

The big wins for doing the encryption in PostgreSQL are, as Robert and I
have both mentioned on this thread already, that it provides
data-at-rest encryption in an easier to deploy fashion which will work
the same across different systems and allows the encrypted cluster to be
transferred more easily between systems. There are almost certainly
other wins from having PG do the encryption, but the above strikes me as
the big ones, and those are certainly valuable enough on their own for
us to seriously consider adding this capability.

> My big question is how many times are the OS permissions bypassed in a
> way that would also not expose the db cluster's key or db data?

This is not the attack vector that this solution is attempting to
address, so there really isn't much point in discussing it on this
thread.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
Joe,

* Joe Conway (m...@joeconway.com) wrote:
> Except shell escaping issues, etc, etc

That's not an issue- we're talking about reading the stdout of some
other process, there's no shell escaping that has to be done there.

> > Let us, please, stop stressing over the right way to do key management
> > as part of this discussion about providing encryption. The two are
> > different things and we do not need to solve both at once.
>
> Not stressing, but this is an important part of the design and should be
> done correctly. It is also very simple, so should not be hard to add.

I disagree that proper key management is "simple". If we really get to
a point where we think we have a simple answer to it then perhaps that
can be implemented in addition to the encryption piece in the same
release cycle- but they certainly don't need to be in the same patch,
nor do we need to make good key management a requirement for adding
encryption support.

> > Further, yes, we will definitely want to get to a point where we can
> > encrypt subsets of the system in different ways, but that doesn't have
> > to be done in the first implementation either.
>
> No, it doesn't, but that doesn't change the utility of doing it this way
> from the start.

No, but it seriously changes the level of complexity. I feel like we're
trying to go from zero to light speed here because there's an idea that
it's "simple" to add X, Y or Z additional requirement beyond the basic
feature, but we don't have anything yet.

I continue to be of the feeling that we should start simple and keep it
to the basic feature first and make sure that we can actually get that
right before we start looking into adding on additional bits.

Thanks!

Stephen
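
The stdout argument made above can be illustrated with a small sketch. It assumes a POSIX system with `cat` and uses invented names (`read_key_from_command`, `awkward_key`); it is not how the patch reads its key. The command string itself goes through the shell, but the bytes the command writes to stdout are read back verbatim, so a key full of shell metacharacters, quotes, a newline and non-text bytes survives unchanged.

```python
import os
import shlex
import subprocess
import tempfile


def read_key_from_command(command):
    """Run the command through the shell and return its stdout as raw bytes."""
    result = subprocess.run(command, shell=True, stdout=subprocess.PIPE, check=True)
    return result.stdout


# A deliberately awkward "key": shell metacharacters, quotes, a newline,
# a NUL byte and a non-UTF-8 byte.
awkward_key = b"$HOME; `id` 'q' \"d\"\n\x00\xff"

with tempfile.NamedTemporaryFile(delete=False) as key_file:
    key_file.write(awkward_key)
    key_path = key_file.name

try:
    # Only the file *path* appears on the command line, so only the path
    # needs quoting; the key bytes never pass through shell parsing.
    fetched_key = read_key_from_command("cat " + shlex.quote(key_path))
finally:
    os.unlink(key_path)
```

The concern raised on the other side of the exchange would apply where key material or user-supplied values are placed on the command line itself; that is the part of the string a shell does interpret.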
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 01:01:32PM -0400, Stephen Frost wrote:
> > Well, usually the symmetric key is stored using RSA and a symmetric
> > cipher is used to encrypt/decrypt the data. I was thinking of a case
> > where you encrypt a row using a symmetric key, then store RSA-encrypted
> > versions of the symmetric key encrypted that only specific users could
> > decrypt and get the key to decrypt the data.
>
> This goes back to key management and I agree that it often makes sense
> to use RSA or similar to encrypt the symmetric key, and this approach
> would allow the user to do so. That doesn't actually give you a
> "write-only" encryption option though, since any user who can decrypt
> the symmetric key is able to use the symmetric key for both encryption
> and decryption, and someone who only has access to the RSA encryption
> key can't actually encrypt the data since they can't access the
> symmetric key.

I think the big win of Postgres doing the encryption is that the
user-visible file system is no longer a target (assuming OS permissions
are bypassed), while for file system encryption it is the storage device
that is encrypted.

My big question is how many times are the OS permissions bypassed in a
way that would also not expose the db cluster's key or db data?

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
On 06/13/2017 10:05 AM, Stephen Frost wrote: > Bruce, Joe, > > * Bruce Momjian (br...@momjian.us) wrote: >> On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote: >> > > That way, if the user wants to store the key in an unencrypted text >> > > file, they can set the encryption_key_command = 'cat /not/very/secure' >> > > and call it a day. If they want to prompt the user on the console or >> > > request the key from an HSM or get it in any other way, they just have >> > > to write the appropriate shell script. We just provide mechanism, not >> > > policy, and the user can adopt any policy they like, from an extremely >> > > insecure policy to one suitable for Fort Knox. >> > >> > Agreed, but as Bruce alluded to, we want this to be a master key, which >> > is in turn used to encrypt the actual key, or keys, that are used to >> > encrypt the data. The actual data encryption keys could be very long >> > randomly generated binary, and there could be more than one of them >> > (e.g. one per tablespace) in a file which is encrypted with the master >> > key. This is more secure and allows, for example the master key to be >> > changed without having to decrypt/re-encrypt the entire database. >> >> Yes, thank you. Also, you can make multiple RSA-encrypted copies of the >> symetric key, one for each role you want to view the data. And good >> point on the ability to change the RSA key/password without having to >> reencrypt the data. > > There's nothing in this proposal that prevents the user from using a > very long randomly generated binary key. We aren't talking about > prompting the user for a password unless that's what they decide the > shell script should do, unless the user decides to do that and if they > do then that's their choice. Except shell escaping issues, etc, etc > Let us, please, stop stressing over the right way to do key management > as part of this discussion about providing encryption. 
The two are > different things and we do not need to solve both at once. Not stressing, but this is an important part of the design and should be done correctly. It is also very simple, so should not be hard to add. > Further, yes, we will definitely want to get to a point where we can > encrypt subsets of the system in different ways, but that doesn't have > to be done in the first implementation either. No, it doesn't, but that doesn't change the utility of doing it this way from the start. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [HACKERS] WIP: Data at rest encryption
Bruce, Joe, * Bruce Momjian (br...@momjian.us) wrote: > On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote: > > > That way, if the user wants to store the key in an unencrypted text > > > file, they can set the encryption_key_command = 'cat /not/very/secure' > > > and call it a day. If they want to prompt the user on the console or > > > request the key from an HSM or get it in any other way, they just have > > > to write the appropriate shell script. We just provide mechanism, not > > > policy, and the user can adopt any policy they like, from an extremely > > > insecure policy to one suitable for Fort Knox. > > > > Agreed, but as Bruce alluded to, we want this to be a master key, which > > is in turn used to encrypt the actual key, or keys, that are used to > > encrypt the data. The actual data encryption keys could be very long > > randomly generated binary, and there could be more than one of them > > (e.g. one per tablespace) in a file which is encrypted with the master > > key. This is more secure and allows, for example the master key to be > > changed without having to decrypt/re-encrypt the entire database. > > Yes, thank you. Also, you can make multiple RSA-encrypted copies of the > symetric key, one for each role you want to view the data. And good > point on the ability to change the RSA key/password without having to > reencrypt the data. There's nothing in this proposal that prevents the user from using a very long randomly generated binary key. We aren't talking about prompting the user for a password unless that's what they decide the shell script should do, unless the user decides to do that and if they do then that's their choice. Let us, please, stop stressing over the right way to do key management as part of this discussion about providing encryption. The two are different things and we do not need to solve both at once. 
Further, yes, we will definitely want to get to a point where we can encrypt subsets of the system in different ways, but that doesn't have to be done in the first implementation either. Thanks! Stephen
Re: [HACKERS] WIP: Data at rest encryption
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Tue, Jun 13, 2017 at 12:23:01PM -0400, Stephen Frost wrote: > > > Of course, if the > > > key stored in the database is visible to someone using the operating > > > system, we really haven't added much/any security --- I guess my point > > > is that the OS easily can hide the key from the database, but the > > > database can't easily hide the key from the operating system. > > > > This is correct- the key must be available to the PostgreSQL process > > and therefore someone with privileged access to the OS would be able to > > retrieve the key, but that's also true of filesystem encryption. > > > > Basically, if the server is doing the encryption and you have the > > ability to read all memory on the server then you can get the key. Of > > course, if you can read all memory then you can just look at shared > > buffers and you don't really need to bother yourself with the key or > > the encryption, and it doesn't make any difference if you're encrypting > > in the database or in the filesystem. That attack vector is not one > > which this is intending to address. > > My point is that if you have the key accessible to the database server, > both the database server and OS have access to it. If you store it in > the OS, only the OS has access to it. I understand that, but that's not a particularly interesting distinction as the database server must, necessairly, have access to the data in the database. > > > I have to admit we tend to avoid heavy-API solutions that are designed > > > just to work around deployment challenges. Commercial databases are > > > fine in doing that, but it leads to very complex products. > > > > I'm not following what you mean here. > > By adding all-cluster encryption, we are re-implementing something the > OS does just fine, in most cases. We are going to have API overhead to > do it in the database, and historically we have avoided that. 
We do try to avoid the overhead of additional function calls, in places where it matters. As this is all about reading and writing data, the overhead for the additional check to see if we're doing encryption or not is unlikely to be interesting. > > > One cool idea I have is using public encryption to store the encryption > > > key by users who don't know the decryption key, e.g. RSA. It would be a > > > write-only encryption option. Not sure how useful that is, but it > > > easily possible, and doesn't require us to keep the _encryption_ key > > > secret, just the decryption one. > > > > The downside here is that asymmetric encryption is much more expensive > > than symmetric encryption and that probably makes it a non-starter. I > > do think we'll want to support multiple encryption methods and perhaps > > we can have an option where asymmetric encryption is used, but that's > > not what I expect will be typically used. > > Well, usually the symetric key is stored using RSA and a symetric > cipher is used to encrypt/decrypt the data. I was thinking of a case > where you encrypt a row using a symetric key, then store RSA-encrypted > versions of the symetric key encrypted that only specific users could > decrypt and get the key to decrypt the data. This goes back to key management and I agree that it often makes sense to use RSA or similar to encrypt the symmetric key, and this approach would allow the user to do so. That doesn't actually give you a "write-only" encryption option though, since any user who can decrypt the symmetric key is able to use the symmetric key for both encryption and decryption, and someone who only has access to the RSA encryption key can't actually encrypt the data since they can't access the symmetric key. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 09:55:10AM -0700, Joe Conway wrote: > > That way, if the user wants to store the key in an unencrypted text > > file, they can set the encryption_key_command = 'cat /not/very/secure' > > and call it a day. If they want to prompt the user on the console or > > request the key from an HSM or get it in any other way, they just have > > to write the appropriate shell script. We just provide mechanism, not > > policy, and the user can adopt any policy they like, from an extremely > > insecure policy to one suitable for Fort Knox. > > Agreed, but as Bruce alluded to, we want this to be a master key, which > is in turn used to encrypt the actual key, or keys, that are used to > encrypt the data. The actual data encryption keys could be very long > randomly generated binary, and there could be more than one of them > (e.g. one per tablespace) in a file which is encrypted with the master > key. This is more secure and allows, for example the master key to be > changed without having to decrypt/re-encrypt the entire database. Yes, thank you. Also, you can make multiple RSA-encrypted copies of the symetric key, one for each role you want to view the data. And good point on the ability to change the RSA key/password without having to reencrypt the data. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On 06/13/2017 09:28 AM, Robert Haas wrote: > On Tue, Jun 13, 2017 at 12:23 PM, Stephen Frost wrote: >> Key management is an entirely independent discussion from this and the >> proposal from Ants, as I understand it, is that the key would *not* be >> in the database but could be anywhere that a shell command could get it >> from, including possibly a HSM (hardware device). > > Yes. I think the right way to implement this is something like: > > 1. Have a GUC that runs a shell command to get the key. > > 2. If the command successfully gets the key, it prints it to stdout > and returns 0. > > 3. If it doesn't get successfully get the key, it returns 1. The > database can retry or give up, whatever we decide to do. > > That way, if the user wants to store the key in an unencrypted text > file, they can set the encryption_key_command = 'cat /not/very/secure' > and call it a day. If they want to prompt the user on the console or > request the key from an HSM or get it in any other way, they just have > to write the appropriate shell script. We just provide mechanism, not > policy, and the user can adopt any policy they like, from an extremely > insecure policy to one suitable for Fort Knox. Agreed, but as Bruce alluded to, we want this to be a master key, which is in turn used to encrypt the actual key, or keys, that are used to encrypt the data. The actual data encryption keys could be very long randomly generated binary, and there could be more than one of them (e.g. one per tablespace) in a file which is encrypted with the master key. This is more secure and allows, for example the master key to be changed without having to decrypt/re-encrypt the entire database. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
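The two-level scheme Joe describes (a master key that only encrypts a file of data encryption keys) can be sketched roughly as follows. This is a minimal illustration, not anything from the patch: the function names, the tablespace names, and the file layout are all hypothetical, and the cipher is a toy keystream built from HMAC-SHA256 that stands in for a real cipher such as AES only so the example runs with the standard library.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode).

    Stand-in for a real cipher, for illustration only. Applying it twice
    with the same key and nonce returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(
            hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        )
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap_keys(master_key: bytes, data_keys: dict) -> dict:
    """Encrypt every data key under the master key: name -> (nonce, wrapped key)."""
    wrapped = {}
    for name, data_key in data_keys.items():
        nonce = secrets.token_bytes(16)
        wrapped[name] = (nonce, keystream_xor(master_key, nonce, data_key))
    return wrapped


def unwrap_keys(master_key: bytes, wrapped: dict) -> dict:
    """Recover the data keys from the wrapped key file contents."""
    return {
        name: keystream_xor(master_key, nonce, blob)
        for name, (nonce, blob) in wrapped.items()
    }


def rotate_master_key(old_master: bytes, new_master: bytes, wrapped: dict) -> dict:
    """Re-wrap the small key file only; data encrypted with the data keys is untouched."""
    return wrap_keys(new_master, unwrap_keys(old_master, wrapped))


if __name__ == "__main__":
    # Hypothetical setup: one long random data key per tablespace.
    data_keys = {
        "pg_default": secrets.token_bytes(32),
        "ts_archive": secrets.token_bytes(32),
    }
    old_master = secrets.token_bytes(32)
    new_master = secrets.token_bytes(32)
    key_file = wrap_keys(old_master, data_keys)
    key_file = rotate_master_key(old_master, new_master, key_file)
    print(unwrap_keys(new_master, key_file) == data_keys)
```

The point of the sketch is the cost of rotation: changing the master key rewrites a few dozen bytes per data key, while the data keys themselves, and everything encrypted with them, stay as they are.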
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 12:23:01PM -0400, Stephen Frost wrote: > > As I understand it, having encryption in the database means the key is > > stored in the database, while having encryption in the file system means > > the key is stored in the operating system somewhere. > > Key management is an entirely independent discussion from this and the > proposal from Ants, as I understand it, is that the key would *not* be > in the database but could be anywhere that a shell command could get it > from, including possibly a HSM (hardware device). > > Having the data encrypted by PostgreSQL does not mean the key is stored > in the database. Yes, I was just simplifying. > > Of course, if the > > key stored in the database is visible to someone using the operating > > system, we really haven't added much/any security --- I guess my point > > is that the OS easily can hide the key from the database, but the > > database can't easily hide the key from the operating system. > > This is correct- the key must be available to the PostgreSQL process > and therefore someone with privileged access to the OS would be able to > retrieve the key, but that's also true of filesystem encryption. > > Basically, if the server is doing the encryption and you have the > ability to read all memory on the server then you can get the key. Of > course, if you can read all memory then you can just look at shared > buffers and you don't really need to bother yourself with the key or > the encryption, and it doesn't make any difference if you're encrypting > in the database or in the filesystem. That attack vector is not one > which this is intending to address. My point is that if you have the key accessible to the database server, both the database server and OS have access to it. If you store it in the OS, only the OS has access to it. > > I have to admit we tend to avoid heavy-API solutions that are designed > > just to work around deployment challenges. 
Commercial databases are > > fine in doing that, but it leads to very complex products. > > I'm not following what you mean here. By adding all-cluster encryption, we are re-implementing something the OS does just fine, in most cases. We are going to have API overhead to do it in the database, and historically we have avoided that. > > One cool idea I have is using public encryption to store the encryption > > key by users who don't know the decryption key, e.g. RSA. It would be a > > write-only encryption option. Not sure how useful that is, but it > > easily possible, and doesn't require us to keep the _encryption_ key > > secret, just the decryption one. > > The downside here is that asymmetric encryption is much more expensive > than symmetric encryption and that probably makes it a non-starter. I > do think we'll want to support multiple encryption methods and perhaps > we can have an option where asymmetric encryption is used, but that's > not what I expect will be typically used. Well, usually the symetric key is stored using RSA and a symetric cipher is used to encrypt/decrypt the data. I was thinking of a case where you encrypt a row using a symetric key, then store RSA-encrypted versions of the symetric key encrypted that only specific users could decrypt and get the key to decrypt the data. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 12:23 PM, Stephen Frost wrote:
> Key management is an entirely independent discussion from this and the
> proposal from Ants, as I understand it, is that the key would *not* be
> in the database but could be anywhere that a shell command could get it
> from, including possibly a HSM (hardware device).

Yes. I think the right way to implement this is something like:

1. Have a GUC that runs a shell command to get the key.

2. If the command successfully gets the key, it prints it to stdout and returns 0.

3. If it doesn't successfully get the key, it returns 1. The database can retry or give up, whatever we decide to do.

That way, if the user wants to store the key in an unencrypted text file, they can set the encryption_key_command = 'cat /not/very/secure' and call it a day. If they want to prompt the user on the console or request the key from an HSM or get it in any other way, they just have to write the appropriate shell script. We just provide mechanism, not policy, and the user can adopt any policy they like, from an extremely insecure policy to one suitable for Fort Knox.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
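The three-step contract above (run a command, read the key from stdout on exit status 0, treat any other status as failure) could look roughly like this from the caller's side. This is only a sketch of the contract as described in the message, not PostgreSQL code: `fetch_key` and its `retries` parameter are hypothetical names, and a POSIX shell is assumed.

```python
import subprocess


def fetch_key(command: str, retries: int = 0) -> bytes:
    """Run the configured shell command and return the key it prints on stdout.

    Exit status 0 means the key was obtained. Any other status counts as a
    failure, after which the caller-chosen policy here is to retry a fixed
    number of times and then give up.
    """
    last_error = "command was never run"
    for _ in range(retries + 1):
        result = subprocess.run(command, shell=True, capture_output=True)
        if result.returncode == 0:
            key = result.stdout.rstrip(b"\n")  # drop the trailing newline from echo/cat
            if key:
                return key
            last_error = "command succeeded but printed no key"
        else:
            last_error = f"command exited with status {result.returncode}"
    raise RuntimeError(f"could not obtain encryption key: {last_error}")


if __name__ == "__main__":
    # Stand-in for something like 'cat /not/very/secure' or an HSM client script.
    print(len(fetch_key("echo not-a-real-key")))
```

Because the command is an opaque string handed to the shell, the policy (plain file, console prompt, HSM client) lives entirely in the script the administrator writes, which is the "mechanism, not policy" split the message argues for.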
Re: [HACKERS] WIP: Data at rest encryption
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Tue, Jun 13, 2017 at 11:04:21AM -0400, Stephen Frost wrote: > > > Also, in the use case you describe, if you use pg_basebackup to make a > > > direct encrypted copy of a data directory, I think that would mean you'd > > > have to keep using the same key for all copies. > > > > That's true, but that might be acceptable and possibly even desirable in > > certain cases. On the other hand, it would certainly be a useful > > feature to have a way to migrate from one key to another. Perhaps that > > would start out as an off-line tool, but maybe we'd be able to work out > > a way to support having it done on-line in the future (certainly > > non-trivial, but if we supported multiple keys concurrently with a > > preference for which key is used to write data back out, and required > > that checksums be in place to allow us to test if decrypting with a > > specific key worked ... lots more hand-waving here... ). > > As I understand it, having encryption in the database means the key is > stored in the database, while having encryption in the file system means > the key is stored in the operating system somewhere. Key management is an entirely independent discussion from this and the proposal from Ants, as I understand it, is that the key would *not* be in the database but could be anywhere that a shell command could get it from, including possibly a HSM (hardware device). Having the data encrypted by PostgreSQL does not mean the key is stored in the database. > Of course, if the > key stored in the database is visible to someone using the operating > system, we really haven't added much/any security --- I guess my point > is that the OS easily can hide the key from the database, but the > database can't easily hide the key from the operating system. 
This is correct: the key must be available to the PostgreSQL process and therefore someone with privileged access to the OS would be able to retrieve the key, but that's also true of filesystem encryption. Basically, if the server is doing the encryption and you have the ability to read all memory on the server then you can get the key. Of course, if you can read all memory then you can just look at shared buffers and you don't really need to bother yourself with the key or the encryption, and it doesn't make any difference if you're encrypting in the database or in the filesystem. That attack vector is not one which this is intending to address. > Of course, if the storage is split from the database server then having > the key on the database server seems like a win. However, I think a db > server could easily encrypt blocks before sending them to the SAN > server. This would not work for NAS, of course, since it is file-based. The key doesn't necessarily have to be stored anywhere on the server; it just needs to be kept in memory while the database process is running and made available to the database at startup, unless an external system is used to perform the encryption, which might be possible with an extension, as discussed. In some environments, it might be acceptable to have the key stored on the database server, of course, but there's no requirement for the key to be stored on the database server or in the database at all. > I have to admit we tend to avoid heavy-API solutions that are designed > just to work around deployment challenges. Commercial databases are > fine in doing that, but it leads to very complex products. I'm not following what you mean here. > I think the larger issue is where to store the key. I would love for us > to come up with a unified solution to that and then build encryption on > that, including all-cluster encryption.
Honestly, key management is something that I'd rather we *not* worry about in an initial implementation, which is one reason that I liked the approach discussed here of having a command which runs to provide the key. We could certainly look into improving that in the future, but key management really is a largely independent issue from encryption and it's much more difficult and complicated and whatever we come up with would still almost certainly be usable with the approach proposed here. > One cool idea I have is using public encryption to store the encryption > key by users who don't know the decryption key, e.g. RSA. It would be a > write-only encryption option. Not sure how useful that is, but it > easily possible, and doesn't require us to keep the _encryption_ key > secret, just the decryption one. The downside here is that asymmetric encryption is much more expensive than symmetric encryption and that probably makes it a non-starter. I do think we'll want to support multiple encryption methods and perhaps we can have an option where asymmetric encryption is used, but that's not what I expect will be typically used. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 11:04:21AM -0400, Stephen Frost wrote: > > Also, in the use case you describe, if you use pg_basebackup to make a > > direct encrypted copy of a data directory, I think that would mean you'd > > have to keep using the same key for all copies. > > That's true, but that might be acceptable and possibly even desirable in > certain cases. On the other hand, it would certainly be a useful > feature to have a way to migrate from one key to another. Perhaps that > would start out as an off-line tool, but maybe we'd be able to work out > a way to support having it done on-line in the future (certainly > non-trivial, but if we supported multiple keys concurrently with a > preference for which key is used to write data back out, and required > that checksums be in place to allow us to test if decrypting with a > specific key worked ... lots more hand-waving here... ). As I understand it, having encryption in the database means the key is stored in the database, while having encryption in the file system means the key is stored in the operating system somewhere. Of course, if the key stored in the database is visible to someone using the operating system, we really haven't added much/any security --- I guess my point is that the OS easily can hide the key from the database, but the database can't easily hide the key from the operating system. Of course, if the storage is split from the database server then having the key on the database server seems like a win. However, I think a db server could easily encrypt blocks before sending them to the SAN server. This would not work for NAS, of course, since it is file-based. I have to admit we tend to avoid heavy-API solutions that are designed just to work around deployment challenges. Commercial databases are fine in doing that, but it leads to very complex products. I think the larger issue is where to store the key. 
I would love for us to come up with a unified solution to that and then build encryption on that, including all-cluster encryption. One cool idea I have is using public-key encryption to store the encryption key by users who don't know the decryption key, e.g. RSA. It would be a write-only encryption option. Not sure how useful that is, but it is easily possible, and doesn't require us to keep the _encryption_ key secret, just the decryption one. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 13, 2017 at 11:35:03AM -0400, Robert Haas wrote: > I anticipate that one of the trickier problems here will be handling > encryption of the write-ahead log. Suppose you encrypt WAL a block at > a time. In the current system, once you've written and flushed a > block, you can consider it durably committed, but if that block is > encrypted, this is no longer true. A crash might tear the block, > making it impossible to decrypt. Replay will therefore stop at the > end of the previous block, not at the last record actually flushed as > would happen today. So, your synchronous_commit suddenly isn't. A > similar problem will occur any other page where we choose not to > protect against torn pages using full page writes. For instance, > unless checksums are enabled or wal_log_hints=on, we'll write a data > page where a single bit has been flipped and assume that the bit will > either make it to disk or not; the page can't really be torn in any > way that hurts us. But with encryption that's no longer true, because > the hint bit will turn into much more than a single bit flip, and > rereading that page with half old and half new contents will be the > end of the world (TM). I don't know off-hand whether we're > protecting, say, CLOG page writes with FPWs.: because setting a couple > of bits is idempotent and doesn't depend on the existing page > contents, we might not need it currently, but with encryption, every > bit in the page depends on every other bit in the page, so we > certainly would. I don't know how many places we've got assumptions > like this baked into the system, but I'm guessing there are a bunch. That is not necessarily true. You are describing a cipher mode where the user data goes through the cipher, e.g. AES in CBC mode. However, if you are using a stream cipher based on a block cipher, e.g. CTR, GCM, you XOR the user data with a random bit stream, and in that case one bit change in user data would be one bit change in the cipher output.
-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
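Bruce's point about counter-style modes can be checked with a small experiment. The sketch below is an assumption-laden illustration rather than real AES-CTR: the keystream comes from SHA-256 over key, nonce, and a block counter so that it runs with the standard library, and the key, nonce, and page size are made up. What it shares with CTR is the structure that matters here, namely that ciphertext is plaintext XORed with a keystream that does not depend on the plaintext.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Counter-mode keystream; SHA-256 stands in for the block cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream (the same call also decrypts)."""
    stream = keystream(key, nonce, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    key = b"k" * 32           # illustrative key
    nonce = b"page-0000042"   # illustrative per-page nonce
    page = bytearray(8192)    # an all-zero 8 kB page
    hinted = bytearray(page)
    hinted[100] ^= 0x01       # flip one "hint bit"

    before = ctr_encrypt(key, nonce, bytes(page))
    after = ctr_encrypt(key, nonce, bytes(hinted))
    print(bit_difference(before, after))
```

Note that the sketch encrypts both versions of the page under the same key and nonce; that reuse is what makes the one-bit property visible, and whether it is acceptable for a given threat model is a separate question the sketch does not try to answer.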
Re: [HACKERS] WIP: Data at rest encryption
On Mon, Jun 12, 2017 at 5:11 PM, Ants Aasma wrote: > Fundamentally there doesn't seem to be a big benefit of implementing > the encryption at PostgreSQL level instead of the filesystem. The > patch doesn't take any real advantage from the higher level knowledge > of the system, nor do I see much possibility for it to do that. The > main benefit for us is that it's much easier to get a PostgreSQL based > solution deployed. I agree with all of that, but ease of deployment has some value unto itself. I think pretty much every modern operating system has some way of encrypting a filesystem, but it's different on Linux vs. Windows vs. macOS vs. BSD, and you probably need to be the system administrator on any of those systems in order to set it up. Something built into PostgreSQL could run without administrator privileges and work the same way on every platform we support. That would be useful. Of course, what would be even more useful is fine-grained encryption - encrypt these tables (and the corresponding indexes, toast tables, and WAL records related to any of that) with this key, encrypt these other tables (and the same list of associated stuff) with this other key, and leave the rest unencrypted. The problem with that is that you probably can't run recovery without all of the keys, and even on a clean startup there would be a good deal of engineering work involved in refusing access to tables whose key hadn't been provided yet. I don't think we should wait to have this feature until all of those problems are solved. In my opinion, something coarse-grained that just encrypts the whole cluster would be a pretty useful place to start and would meet the needs of enough people to be worthwhile all on its own. Performance is likely to be poor on large databases, because every time a page transits between shared_buffers and the buffer cache we've got to en/decrypt, but as long as it's only poor for the people who opt into the feature I don't see a big problem with that. 
I anticipate that one of the trickier problems here will be handling encryption of the write-ahead log. Suppose you encrypt WAL a block at a time. In the current system, once you've written and flushed a block, you can consider it durably committed, but if that block is encrypted, this is no longer true. A crash might tear the block, making it impossible to decrypt. Replay will therefore stop at the end of the previous block, not at the last record actually flushed as would happen today. So, your synchronous_commit suddenly isn't. A similar problem will occur for any other page where we choose not to protect against torn pages using full page writes. For instance, unless checksums are enabled or wal_log_hints=on, we'll write a data page where a single bit has been flipped and assume that the bit will either make it to disk or not; the page can't really be torn in any way that hurts us. But with encryption that's no longer true, because the hint bit will turn into much more than a single bit flip, and rereading that page with half old and half new contents will be the end of the world (TM). I don't know off-hand whether we're protecting, say, CLOG page writes with FPWs: because setting a couple of bits is idempotent and doesn't depend on the existing page contents, we might not need it currently, but with encryption, every bit in the page depends on every other bit in the page, so we certainly would. I don't know how many places we've got assumptions like this baked into the system, but I'm guessing there are a bunch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Data at rest encryption
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > I wonder what the proper extent of "encryption at rest" should be. If > you encrypt just on a file or block level, then someone looking at the > data directory or a backup can still learn a number of things about the > number of tables, transaction rates, various configuration settings, and > so on. In the scenario of a sensitive application hosted on a shared > SAN, I don't think that is good enough. If someone has access to the SAN, it'd be very difficult to avoid revealing some information about transaction rates or I/O throughput. Being able to have the configuration files encrypted would be good (thinking particularly about pg_hba.conf/pg_ident.conf) but I don't know that it's strictly necessary or that it would have to be done in an initial version. Certainly, there is a trade-off here when it comes to the information which someone can learn about the system by looking at the number and sizes of files from using PG-based encryption vs. what information someone can learn from being able to look at only an encrypted filesystem, but that's a trade-off which security experts are good at making a determination on and will be case-by-case, based on how easy setting up filesystem-encryption is in a particular environment and what the use-cases are for the system. > Also, in the use case you describe, if you use pg_basebackup to make a > direct encrypted copy of a data directory, I think that would mean you'd > have to keep using the same key for all copies. That's true, but that might be acceptable and possibly even desirable in certain cases. On the other hand, it would certainly be a useful feature to have a way to migrate from one key to another. 
Perhaps that would start out as an off-line tool, but maybe we'd be able to work out a way to support having it done on-line in the future (certainly non-trivial, but if we supported multiple keys concurrently with a preference for which key is used to write data back out, and required that checksums be in place to allow us to test if decrypting with a specific key worked ... lots more hand-waving here... ). Thanks! Stephen
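The on-line key migration idea is explicitly hand-waved in the message, so the following is only one possible reading of it: keep several keys configured, always write with the preferred one, and on read use a per-page checksum to tell which key actually decrypted the page. Everything here is hypothetical (function names, page layout, the use of CRC32 as the checksum), and the cipher is again a toy HMAC-SHA256 keystream standing in for a real one.

```python
import hashlib
import hmac
import zlib


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode), for illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(
            hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        )
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal_page(key: bytes, page_no: int, payload: bytes) -> bytes:
    """Append a CRC32 of the payload, then encrypt payload and checksum together."""
    body = payload + zlib.crc32(payload).to_bytes(4, "big")
    return keystream_xor(key, page_no.to_bytes(8, "big"), body)


def open_page(keys: list, page_no: int, blob: bytes) -> tuple:
    """Try each key in order; return (key index, payload) for the first key
    whose decrypted checksum verifies."""
    nonce = page_no.to_bytes(8, "big")
    for index, key in enumerate(keys):
        body = keystream_xor(key, nonce, blob)
        payload, stored = body[:-4], int.from_bytes(body[-4:], "big")
        if zlib.crc32(payload) == stored:
            return index, payload
    raise ValueError("no configured key decrypts this page")


if __name__ == "__main__":
    keys = [b"new-key", b"old-key"]  # keys[0] is the preferred key for writes
    on_disk = seal_page(keys[1], 7, b"tuple data")   # page still under the old key
    index, payload = open_page(keys, 7, on_disk)
    if index != 0:
        on_disk = seal_page(keys[0], 7, payload)     # migrate on write-back
    print(open_page(keys, 7, on_disk)[0])
```

A 32-bit checksum gives a small but non-zero chance that the wrong key appears to verify, which is one of the details a real design would have to settle.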
Re: [HACKERS] WIP: Data at rest encryption
On 6/13/17 09:24, Stephen Frost wrote: > but there are > use-cases where it'd be really nice to be able to have PG doing the > encryption instead of the filesystem because then you can do things like > backup the database, copy it somewhere else directly, and then restore > it using the regular PG mechanisms, as long as you have access to the > key. That's not something you can directly do with filesystem-level > encryption Interesting point. I wonder what the proper extent of "encryption at rest" should be. If you encrypt just on a file or block level, then someone looking at the data directory or a backup can still learn a number of things about the number of tables, transaction rates, various configuration settings, and so on. In the scenario of a sensitive application hosted on a shared SAN, I don't think that is good enough. Also, in the use case you describe, if you use pg_basebackup to make a direct encrypted copy of a data directory, I think that would mean you'd have to keep using the same key for all copies. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Data at rest encryption
Ants, all,

* Ants Aasma (ants.aa...@eesti.ee) wrote:
> Yes, the plan is to pick it up again, Real Soon Now(tm). There are a
> couple of loose ends for stuff that should be encrypted, but in the
> current state of the patch aren't yet (from the top of my head,
> logical decoding and pg_stat_statements write some files). The code
> handling keys could really take better precautions as Peter pointed
> out in another e-mail. And I expect there to be a bunch of polishing
> work to make the APIs as good as they can be.

Very glad to hear that you're going to be continuing to work on this effort.

> To answer Peter's question about HSMs, many enterprise deployments are
> on top of shared storage systems. For regulatory reasons or to limit
> security clearance of storage administrators, the data on shared
> storage should be encrypted. Now for there to be any point to this
> endeavor, the key needs to be stored somewhere else. This is where
> hardware security modules come in. They are basically hardware key
> storage appliances that can either output the key when requested, or
> for higher security hold onto the key and perform
> encryption/decryption on behalf of the user. The patch enables the
> user to use a custom shell command to go and fetch the key from the
> HSM, for example using the KMIP protocol. Or a motivated person could
> write an extension that implements the encryption hooks to delegate
> encryption/decryption of blocks to an HSM.

An extension, or perhaps even something built-in, would certainly be good here but I don't think it's necessary in an initial implementation as long as it's something we can do later.

> Fundamentally there doesn't seem to be a big benefit of implementing
> the encryption at PostgreSQL level instead of the filesystem. The
> patch doesn't take any real advantage from the higher level knowledge
> of the system, nor do I see much possibility for it to do that. The
> main benefit for us is that it's much easier to get a PostgreSQL based
> solution deployed.

Making it easier to get a PostgreSQL solution deployed is certainly a very worthwhile goal.

> I'm curious if the community thinks this is a feature worth having?
> Even considering that security experts would classify this kind of
> encryption as a checkbox feature.

I would say that some security experts would consider it a 'checkbox' feature, while others would say that it's actually a quite useful capability for a database to have and isn't just for being able to check a given box.

I tended to lean towards the 'checkbox' camp and encouraged people to use filesystem encryption also, but there are use-cases where it'd be really nice to be able to have PG doing the encryption instead of the filesystem because then you can do things like backup the database, copy it somewhere else directly, and then restore it using the regular PG mechanisms, as long as you have access to the key. That's not something you can directly do with filesystem-level encryption (unless you happen to be lucky enough to be able to use ZFS, which can do exporting, or you can do a block-level exact copy to an exactly identically sized partition on the remote side or similar..), and while you could encrypt the PG files during the backup, that requires that you make sure both sides agree on how that encryption is done and have the same tools for performing the encryption/decryption. Possible, certainly, but not nearly as convenient.

+1 for having this capability.

Thanks!

Stephen
Re: [HACKERS] WIP: Data at rest encryption
On 6/12/17 17:11, Ants Aasma wrote:
> I'm curious if the community thinks this is a feature worth having?
> Even considering that security experts would classify this kind of
> encryption as a checkbox feature.

File system encryption already exists and is well-tested. I don't see any big advantages in re-implementing all of this one level up. You would have to touch every single place in PostgreSQL backend and tool code where a file is being read or written. Yikes.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On Mon, Jun 12, 2017 at 10:38 PM, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 11:07 AM, Peter Eisentraut
> wrote:
>> On 6/7/16 9:56 AM, Ants Aasma wrote:
>>>
>>> Similar things can be achieved with filesystem level encryption.
>>> However this is not always optimal for various reasons. One of the
>>> better reasons is the desire for HSM based encryption in a storage
>>> area network based setup.
>>
>> Could you explain this in more detail?
>
> I don't think Ants ever responded to this point.
>
> I'm curious whether this is something that is likely to be pursued for
> PostgreSQL 11.

Yes, the plan is to pick it up again, Real Soon Now(tm). There are a couple of loose ends for stuff that should be encrypted but, in the current state of the patch, isn't yet (off the top of my head, logical decoding and pg_stat_statements write some files). The code handling keys could really take better precautions as Peter pointed out in another e-mail. And I expect there to be a bunch of polishing work to make the APIs as good as they can be.

To answer Peter's question about HSMs, many enterprise deployments are on top of shared storage systems. For regulatory reasons or to limit security clearance of storage administrators, the data on shared storage should be encrypted. Now for there to be any point to this endeavor, the key needs to be stored somewhere else. This is where hardware security modules come in. They are basically hardware key storage appliances that can either output the key when requested, or for higher security hold onto the key and perform encryption/decryption on behalf of the user. The patch enables the user to use a custom shell command to go and fetch the key from the HSM, for example using the KMIP protocol. Or a motivated person could write an extension that implements the encryption hooks to delegate encryption/decryption of blocks to an HSM.

Fundamentally there doesn't seem to be a big benefit of implementing the encryption at PostgreSQL level instead of the filesystem. The patch doesn't take any real advantage from the higher level knowledge of the system, nor do I see much possibility for it to do that. The main benefit for us is that it's much easier to get a PostgreSQL based solution deployed.

I'm curious if the community thinks this is a feature worth having? Even considering that security experts would classify this kind of encryption as a checkbox feature.

Regards,
Ants Aasma
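The "custom shell command to go and fetch the key" arrangement described above can be sketched roughly like this. The message does not say how the patch runs the command or what format the key arrives in, so the function name `fetch_key`, the hex-on-stdout convention, and the accepted key lengths are all assumptions made for illustration.

```python
import shlex
import subprocess


def fetch_key(command: str, timeout: float = 10.0) -> bytes:
    """Run an operator-configured command and return the key it prints.

    Assumption: the command writes the key as a hex string on stdout,
    for example a small wrapper script that talks to an HSM or key server.
    """
    result = subprocess.run(
        shlex.split(command),      # no shell, so the command line is not re-parsed
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,           # do not hang startup forever on a dead key server
        check=True,                # a non-zero exit status raises CalledProcessError
    )
    key = bytes.fromhex(result.stdout.decode("ascii").strip())
    if len(key) not in (16, 32):   # 128-bit or 256-bit keys only, in this sketch
        raise ValueError("expected a 128-bit or 256-bit key")
    return key
```

With this shape the key never appears in the server's own environment or command line. The server would still need to be able to report a failed or slow fetch cleanly at startup.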
Re: [HACKERS] WIP: Data at rest encryption
On Mon, Jun 13, 2016 at 11:07 AM, Peter Eisentraut wrote:
> On 6/7/16 9:56 AM, Ants Aasma wrote:
>>
>> Similar things can be achieved with filesystem level encryption.
>> However this is not always optimal for various reasons. One of the
>> better reasons is the desire for HSM based encryption in a storage
>> area network based setup.
>
> Could you explain this in more detail?

I don't think Ants ever responded to this point.

I'm curious whether this is something that is likely to be pursued for PostgreSQL 11.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Data at rest encryption
On 06/14/2016 09:59 PM, Jim Nasby wrote:
> On 6/12/16 2:13 AM, Ants Aasma wrote:
>> On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote:
>>> 1. Instead of doing the entire database files encryption, how about
>>> providing user an option to protect only some particular tables that
>>> wants the encryption at table/tablespace level. This not only provides
>>> an option to the user, it reduces the performance impact on tables
>>> that doesn't need any encryption. The problem with this approach
>>> is that every xlog record needs to validate to handle the encryption
>>> /decryption, instead of at page level.
>>
>> Is there a real need for this? The customers I have talked to want to
>> encrypt the whole database and my goal is to make the feature fast
>> enough to make that feasible for pretty much everyone. I guess
>> switching encryption off per table would be feasible, but the key
>> setup would still need to be done at server startup. Per record
>> encryption would result in some additional information leakage though.
>> Overall I thought it would not be worth it, but I'm willing to have my
>> mind changed on this.
>
> I actually design with this in mind. Tables that contain sensitive info
> go into designated schemas, partly so that you can blanket move all of
> those to an encrypted tablespace (or safer would be to move things not
> in those schemas to an unencrypted tablespace). Since that can be done
> with an encrypted filesystem maybe that's good enough. (It's not really
> clear to me what this buys us over an encrypted FS, other than a feature
> comparison checkmark...)

the reason why this is needed is actually very simple: security guidelines and legal requirements ... we have dealt with a couple of companies recently, who explicitly demanded PostgreSQL level encryption in a transparent way to fulfill some internal or legal requirements. this is especially true for financial stuff. and yes, sure ... you can do a lot of stuff with filesystem encryption. the core idea of this entire thing is however to have a counterpart on the database level. if you don't have the key you cannot start the instance and if you happen to get access to the filesystem you are still not able to fire up the DB. as I said: requirements by ever bigger companies.

as far as benchmarking is concerned: i did a quick test yesterday (not with the final AES implementation yet) and i got pretty good results. with a reasonably well cached database in a typical application I expect to lose around 10-20%. if everything fits in memory there is 0 loss of course. the worst I got with the standard AES (no hardware support used yet) was a loss of around 45% or so. but this requires a value as low as 32 MB of shared buffers or so.

many thanks,

hans

--
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Re: [HACKERS] WIP: Data at rest encryption
On 6/12/16 2:13 AM, Ants Aasma wrote:
> On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote:
>> 1. Instead of doing the entire database files encryption, how about
>> providing user an option to protect only some particular tables that
>> wants the encryption at table/tablespace level. This not only provides
>> an option to the user, it reduces the performance impact on tables
>> that doesn't need any encryption. The problem with this approach
>> is that every xlog record needs to validate to handle the encryption
>> /decryption, instead of at page level.
>
> Is there a real need for this? The customers I have talked to want to
> encrypt the whole database and my goal is to make the feature fast
> enough to make that feasible for pretty much everyone. I guess
> switching encryption off per table would be feasible, but the key
> setup would still need to be done at server startup. Per record
> encryption would result in some additional information leakage though.
> Overall I thought it would not be worth it, but I'm willing to have my
> mind changed on this.

I actually design with this in mind. Tables that contain sensitive info go into designated schemas, partly so that you can blanket move all of those to an encrypted tablespace (or safer would be to move things not in those schemas to an unencrypted tablespace). Since that can be done with an encrypted filesystem maybe that's good enough. (It's not really clear to me what this buys us over an encrypted FS, other than a feature comparison checkmark...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] WIP: Data at rest encryption
On Sun, Jun 12, 2016 at 5:13 PM, Ants Aasma wrote:
> On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi
> wrote:
>
>> 2. Instead of depending on a contrib module for the encryption, how
>> about integrating pgcrypto contrib in to the core and add that as a
>> default encryption method. And also provide an option to the user
>> to use a different encryption methods if needs.
>
> Technically that would be simple enough, this is more of a policy
> decision. I think having builtin encryption provided by pgcrypto is
> completely fine. If a consensus emerges that it needs to be
> integrated, it would need to be a separate patch anyway.

In our proprietary database, we are using the encryption methods provided by OpenSSL [1]. Maybe we can have a look at those methods for encryption in builds under USE_SSL. Ignore this if you have already evaluated them.

>> 5. Instead of providing passphrase through environmental variable,
>> better to provide some options to pg_ctl etc.
>
> That looks like it would be worse from a security perspective.
> Integrating a passphrase prompt would be an option, but a way for
> scripts to provide passphrases would still be needed.

What I felt was that if we store the passphrase in an environment variable, a person who has access to the system can read it and may then be able to decrypt the data files.

[1] - https://www.openssl.org/docs/manmaster/crypto/EVP_EncryptInit.html

Regards,
Hari Babu
Fujitsu Australia
Re: [HACKERS] WIP: Data at rest encryption
On 6/12/16 3:13 AM, Ants Aasma wrote:
>> 5. Instead of providing passphrase through environmental variable,
>> better to provide some options to pg_ctl etc.
>
> That looks like it would be worse from a security perspective.
> Integrating a passphrase prompt would be an option, but a way for
> scripts to provide passphrases would still be needed.

Environment variables and command-line options are visible to other processes on the machine, so neither of these approaches is really going to work. We would need some kind of integration with secure password-entry mechanisms, such as pinentry.

Also note that all tools that work directly on the data directory would need password-entry and encryption/decryption support, including pg_basebackup, pg_controldata, pg_ctl, pg_receivexlog, pg_resetxlog, pg_rewind, pg_upgrade, pg_xlogdump.

It seems that your implementation doesn't encrypt pg_control, thus avoiding some of that. But that doesn't seem right.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
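One way scripts could hand over a passphrase without putting it in the environment or on the command line is an inherited pipe, which does not show up in either place. The sketch below illustrates only that alternative; it is not pinentry integration and not something the patch is known to do, and `read_passphrase_from_fd` is a hypothetical name.

```python
import os


def read_passphrase_from_fd(fd: int, max_len: int = 1024) -> bytes:
    """Read one passphrase from an inherited file descriptor.

    Reading stops at the first newline or at end of file. The descriptor is
    closed afterwards so that child processes cannot inherit it.
    """
    chunks = []
    total = 0
    try:
        while total <= max_len:
            # Ask for one byte more than allowed so oversized input is detectable.
            chunk = os.read(fd, max_len + 1 - total)
            if not chunk:
                break
            chunks.append(chunk)
            total += len(chunk)
            if b"\n" in chunk:
                break
    finally:
        os.close(fd)
    passphrase = b"".join(chunks).split(b"\n", 1)[0]
    if not passphrase or len(passphrase) > max_len:
        raise ValueError("passphrase missing or too long")
    return passphrase
```

A launcher script would create the pipe, write the passphrase into the write end, and pass only the read end's descriptor number to the server. Every tool in the list above would need the same hook.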
Re: [HACKERS] WIP: Data at rest encryption
On 6/7/16 9:56 AM, Ants Aasma wrote:
> Similar things can be achieved with filesystem level encryption.
> However this is not always optimal for various reasons. One of the
> better reasons is the desire for HSM based encryption in a storage
> area network based setup.

Could you explain this in more detail?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP: Data at rest encryption
On Mon, Jun 13, 2016 at 5:17 AM, Michael Paquier wrote:
> On Sun, Jun 12, 2016 at 4:13 PM, Ants Aasma wrote:
>>> I feel separate file is better to include the key data instead of pg_control
>>> file.
>>
>> I guess that would be more flexible. However I think at least the fact
>> that the database is encrypted should remain in the control file to
>> provide useful error messages for faulty backup procedures.
>
> Another possibility could be always to do some encryption at data-type
> level for text data. For example I recalled the following thing while
> going through this thread:
> https://github.com/nec-postgres/tdeforpg
> Though I don't quite understand the use for encrypt.enable in this
> code... This has the advantage to not patch upstream.

While certainly possible, this does not cover the requirement I want to satisfy: user data never gets stored on disk unencrypted, and no changes to the application or schema are needed. This seems to be mostly about separating administrator roles, specifically that centralised storage and backup administrators should not have access to database contents. I see this as orthogonal to per column encryption, which in my opinion is better done in the application.

Regards,
Ants Aasma
Re: [HACKERS] WIP: Data at rest encryption
On Sun, Jun 12, 2016 at 4:13 PM, Ants Aasma wrote:
>> I feel separate file is better to include the key data instead of pg_control
>> file.
>
> I guess that would be more flexible. However I think at least the fact
> that the database is encrypted should remain in the control file to
> provide useful error messages for faulty backup procedures.

Another possibility could be to always do some encryption at data-type level for text data. For example I recalled the following thing while going through this thread:
https://github.com/nec-postgres/tdeforpg
Though I don't quite understand the use for encrypt.enable in this code... This has the advantage of not patching upstream.

--
Michael
Re: [HACKERS] WIP: Data at rest encryption
On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote:
> 1. Instead of doing the entire database files encryption, how about
> providing user an option to protect only some particular tables that
> wants the encryption at table/tablespace level. This not only provides
> an option to the user, it reduces the performance impact on tables
> that doesn't need any encryption. The problem with this approach
> is that every xlog record needs to validate to handle the encryption
> /decryption, instead of at page level.

Is there a real need for this? The customers I have talked to want to encrypt the whole database and my goal is to make the feature fast enough to make that feasible for pretty much everyone. I guess switching encryption off per table would be feasible, but the key setup would still need to be done at server startup. Per record encryption would result in some additional information leakage though. Overall I thought it would not be worth it, but I'm willing to have my mind changed on this.

> 2. Instead of depending on a contrib module for the encryption, how
> about integrating pgcrypto contrib in to the core and add that as a
> default encryption method. And also provide an option to the user
> to use a different encryption methods if needs.

Technically that would be simple enough, this is more of a policy decision. I think having builtin encryption provided by pgcrypto is completely fine. If a consensus emerges that it needs to be integrated, it would need to be a separate patch anyway.

> 3. Currently entire xlog pages are encrypted and stored in the file.
> can pg_xlogdump works with those files?

Technically yes, with the patch as it stands, no. Added this to my todo list.

> 4. For logical decoding, how about the adding a decoding behavior
> based on the module to decide whether data to be encrypted/decrypted.

The data to be encrypted does not depend on the module used, so I don't think it should be module controlled. The reorder buffer contains pretty much the same stuff as the xlog, so not encrypting it does not look like a valid choice. For logical heap rewrites it could be argued that nothing useful is leaked in most cases, but encrypting it is not hard. Just a small matter of programming.

> 5. Instead of providing passphrase through environmental variable,
> better to provide some options to pg_ctl etc.

That looks like it would be worse from a security perspective. Integrating a passphrase prompt would be an option, but a way for scripts to provide passphrases would still be needed.

> 6. I don't have any idea whether is it possible to integrate the checksum
> and encryption in a single shot to avoid performance penalty.

Currently no, the checksum gets stored in the page header and for any decent cipher mode the encryption of the rest of the page will depend on it. However, the performance difference should be negligible because both algorithms are compute bound for cached data. The data is very likely to be completely in L1 cache as the operations are done in quick succession.

The non-cryptographic checksum algorithm could actually be an attack vector for an adversary that can trigger repeated encryption by tweaking a couple of bytes at the end of the page to see when the checksum matches and try to infer the data from that, similarly to the CRIME attack. However the LSN stored at the beginning of the page header basically provides a nonce that makes this impossible. This also means that encryption needs to imply wal_log_hints. Will include this in the next version of the patch.

>> I would also like to incorporate some database identifier as a salt in
>> key setup. However, system identifier stored in control file doesn't
>> fit this role well. It gets initialized somewhat too late in the
>> bootstrap process, and more importantly, gets changed on pg_upgrade.
>> This will make link mode upgrades impossible, which seems like a no
>> go. I'm torn whether to add a new value for this purpose (perhaps
>> stored outside the control file) or allow setting of system identifier
>> via initdb. The first seems like a better idea, the file could double
>> as a place to store additional encryption parameters, like key length
>> or different cipher primitive.
>
> I feel separate file is better to include the key data instead of pg_control
> file.

I guess that would be more flexible. However I think at least the fact that the database is encrypted should remain in the control file to provide useful error messages for faulty backup procedures.

Thanks for your input.

Regards,
Ants Aasma
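The point about the LSN acting as a nonce can be illustrated with a toy model: if the per-page encryption input includes the page's location and its LSN, then rewriting the same block with the same contents but a new LSN yields unrelated ciphertext. The sketch below is not the patch's AES-XTS construction. It uses an HMAC-SHA256 keystream purely as a runnable stand-in, and `page_nonce` and `crypt_page` are hypothetical names.

```python
import hashlib
import hmac
import struct


def page_nonce(tablespace_oid: int, database_oid: int,
               block_number: int, lsn: int) -> bytes:
    """Pack the page's location and its LSN into a fixed-width nonce."""
    return struct.pack(">IIIQ", tablespace_oid, database_oid, block_number, lsn)


def crypt_page(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a keystream derived from (key, nonce).

    The same call both encrypts and decrypts. Toy construction only.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        message = nonce + struct.pack(">Q", counter)
        stream.extend(hmac.new(key, message, hashlib.sha256).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))
```

In this model an attacker who flips bytes and forces a rewrite sees a fresh keystream every time the LSN advances, which is the property the reply relies on. It is also why hint-bit-only changes would have to bump the LSN, hence the wal_log_hints remark.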
Re: [HACKERS] WIP: Data at rest encryption
On Tue, Jun 7, 2016 at 11:56 PM, Ants Aasma wrote:
> Hi all,
>
> I have been working on data-at-rest encryption support for PostgreSQL.
> In my experience this is a common request that customers make. The
> short of the feature is that all PostgreSQL data files are encrypted
> with a single master key and are decrypted when read from the OS. It
> does not provide column level encryption which is an almost orthogonal
> feature, arguably better done client side.
>
> Similar things can be achieved with filesystem level encryption.
> However this is not always optimal for various reasons. One of the
> better reasons is the desire for HSM based encryption in a storage
> area network based setup.
>
> Attached to this mail is a work in progress patch that adds an
> extensible encryption mechanism. There are some loose ends left to tie
> up, but the general concept and architecture is at a point where it's
> ready for some feedback, fresh ideas and bikeshedding.

Yes, encryption is really a nice and wanted feature.

Following are my thoughts regarding the approach.

1. Instead of encrypting all database files, how about giving the user an option to protect only particular tables that need encryption, at table/tablespace level? This not only gives the user a choice, it also reduces the performance impact on tables that don't need any encryption. The problem with this approach is that every xlog record needs to be checked to handle the encryption/decryption, instead of doing it at page level.

2. Instead of depending on a contrib module for the encryption, how about integrating the pgcrypto contrib module into core and making that the default encryption method? Also provide an option for the user to use a different encryption method if needed.

3. Currently entire xlog pages are encrypted and stored in the file. Can pg_xlogdump work with those files?

4. For logical decoding, how about adding a decoding behavior, based on the module, that decides whether data is to be encrypted/decrypted?

5. Instead of providing the passphrase through an environment variable, it would be better to provide some options to pg_ctl etc.

6. I don't have any idea whether it is possible to integrate the checksum and encryption in a single pass to avoid the performance penalty.

> I would also like to incorporate some database identifier as a salt in
> key setup. However, system identifier stored in control file doesn't
> fit this role well. It gets initialized somewhat too late in the
> bootstrap process, and more importantly, gets changed on pg_upgrade.
> This will make link mode upgrades impossible, which seems like a no
> go. I'm torn whether to add a new value for this purpose (perhaps
> stored outside the control file) or allow setting of system identifier
> via initdb. The first seems like a better idea, the file could double
> as a place to store additional encryption parameters, like key length
> or different cipher primitive.

I feel a separate file is a better place for the key data than the pg_control file.

Regards,
Hari Babu
Fujitsu Australia
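The separate-file idea discussed above (a salt chosen once, stored outside pg_control next to parameters such as key length) combines naturally with a passphrase-based key derivation such as PBKDF2. The sketch below is an assumption-laden illustration: the JSON layout, the field names, and the function names are invented for the example, and the iteration count is arbitrary.

```python
import hashlib
import json
import os


def new_key_params(key_len: int = 16, iterations: int = 200_000) -> dict:
    """Create encryption parameters once, at initdb time in this sketch.

    The random salt is independent of the system identifier, so it would
    survive a pg_upgrade that changes that identifier.
    """
    return {
        "kdf": "pbkdf2-hmac-sha256",
        "salt": os.urandom(16).hex(),
        "iterations": iterations,
        "key_len": key_len,        # 16 bytes corresponds to a 128-bit key
    }


def encode_key_params(params: dict) -> str:
    """Serialize the parameters for storage in a (hypothetical) separate file."""
    return json.dumps(params, sort_keys=True)


def decode_key_params(text: str) -> dict:
    """Parse parameters previously written by encode_key_params."""
    return json.loads(text)


def derive_key(passphrase: str, params: dict) -> bytes:
    """Derive the master key from a passphrase and the stored parameters."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        bytes.fromhex(params["salt"]),
        params["iterations"],
        dklen=params["key_len"],
    )
```

Only the salt and tuning parameters live in the file. The key itself is recomputed from the passphrase at startup, so copying the file alone does not reveal it.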
[HACKERS] WIP: Data at rest encryption
Hi all,

I have been working on data-at-rest encryption support for PostgreSQL. In my experience this is a common request that customers make. The short of the feature is that all PostgreSQL data files are encrypted with a single master key and are decrypted when read from the OS. It does not provide column level encryption which is an almost orthogonal feature, arguably better done client side.

Similar things can be achieved with filesystem level encryption. However this is not always optimal for various reasons. One of the better reasons is the desire for HSM based encryption in a storage area network based setup.

Attached to this mail is a work in progress patch that adds an extensible encryption mechanism. There are some loose ends left to tie up, but the general concept and architecture is at a point where it's ready for some feedback, fresh ideas and bikeshedding.

Usage
=====

Set up database like so:

    (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo;
     export PGENCRYPTIONKEY
     initdb -k -K pgcrypto $PGDATA
    )

Start PostgreSQL:

    (read -sp "Postgres passphrase: " PGENCRYPTIONKEY; echo;
     export PGENCRYPTIONKEY
     postgres -D $PGDATA
    )

Design
======

The patch adds a new GUC called encryption_library, when specified the named library is loaded before shared_preload_libraries and is expected to register its encryption routines. For now the API is pretty narrow, one parameterless function that lets the extension do key setup on its own terms, and two functions for encrypting/decrypting an arbitrary sized block of data with tweak. The tweak should alter the encryption function so that identical block contents are encrypted differently based on their location. The GUC needs to be set at bootstrap time, so it gets set by a new option for initdb. During bootstrap an encryption sample gets stored in the control file, enabling useful error messages.

The library name is not stored in controldata. I'm not quite sure about this decision. On one hand it would be very useful to tell the user what he needs to get at his data if the configuration somehow goes missing and it would get rid of the extra GUC. On the other hand I don't really want to bloat control data, and the same encryption algorithm could be provided by different implementations.

For now the encryption is done for everything that goes through md, xlog and slru. Based on a review of read/write/fread/fwrite calls this list is missing:

* BufFile - needs refactoring
* Logical reorder buffer serialization - probably needs a stream mode cipher API addition.
* logical_heap_rewrite - can be encrypted as one big block
* 2PC state data - ditto
* pg_stat_statements - query texts get appended so a stream mode cipher might be needed here too.

copydir needed some changes too because tablespace and database oid are included in the tweak and so copying also needs to decrypt and encrypt with the new tweak value.

For demonstration purposes I imported Brian Gladman's AES-128-XTS mode implementation into pgcrypto and used an environment variable for key setup. This part is not really in any reviewable state, the XTS code needs heavy cleanup to bring it up to PostgreSQL coding standards, key setup needs something secure, like PBKDF2 or scrypt.

Performance with current AES implementation is not great, but not horrible either, I'm seeing around 2x slowdown for larger than shared_buffers, smaller than free memory workloads. However the plan is to fix this - I have a prototype AES-NI implementation that does 3GB/s per core on my Haswell based laptop (1.25 B/cycle).

Open questions
==============

The main question is what to do about BufFile? It currently provides both unaligned random access and a block based interface. I wonder if it would be a good idea to refactor it to be fully block based under the covers.

I would also like to incorporate some database identifier as a salt in key setup. However, system identifier stored in control file doesn't fit this role well. It gets initialized somewhat too late in the bootstrap process, and more importantly, gets changed on pg_upgrade. This will make link mode upgrades impossible, which seems like a no go. I'm torn whether to add a new value for this purpose (perhaps stored outside the control file) or allow setting of system identifier via initdb. The first seems like a better idea, the file could double as a place to store additional encryption parameters, like key length or different cipher primitive.

Regards,
Ants Aasma

diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
index 18bad1a..04ce887 100644
--- a/contrib/pgcrypto/Makefile
+++ b/contrib/pgcrypto/Makefile
@@ -20,7 +20,7 @@ SRCS = pgcrypto.c px.c px-hmac.c px-crypt.c \
 		mbuf.c pgp.c pgp-armor.c pgp-cfb.c pgp-compress.c \
 		pgp-decrypt.c pgp-encrypt.c pgp-info.c pgp-mpi.c \
 		pgp-pubdec.c pgp-pubenc.c pgp-pubkey.c pgp-s2k.c \
-		pgp-pgsql.c
+		pgp-pgsql.c xts.c
 
 MODULE_big	= pgcrypto
 OBJS		= $(SRCS:.c=.o) $(WIN32R