On Tue, Mar 28, 2023 at 12:57:42AM +0200, Stephen Frost wrote:
> I consider the operating system and its processes as much more of a
> single entity than TLS over a network.
>
> This may be the case sometimes but there’s absolutely no shortage of other
> cases and it’s almost more the rule these days, that there is some kind of
> network between the OS processes and the storage- a SAN, an iSCSI network,
> NFS, are all quite common.

Yes, but consider that the database cluster is having to get its data
from that remote storage --- the remote storage is not an independent
entity that can be corrupted without the database server being
compromised. If everything in PGDATA was GCM-verified, it would be
secure, but because some parts are not, I don't think it would be.

> > As specific examples, consider:
> >
> > An attack against the database system where the database server is shut down,
> > or a backup, and the encryption key isn’t available on the system.
> >
> > The backup system itself, not running as the PG user (an option supported by PG
> > and at least pgbackrest) being compromised, thus allowing for injection of
> > changes into a backup or into a restore.
>
> I then question why we are not adding encryption to pg_basebackup or
> pgbackrest rather than the database system.
>
> Pgbackrest has encryption and authentication of it … but that doesn’t actually
> address the attack vector that I outlined. If the backup user is compromised
> then they can change the data before it gets to the storage. If the backup
> user is compromised then they have access to whatever key is used to encrypt
> and authenticate the backup and therefore can trivially manipulate the data.

So the idea is that the backup user can be compromised without the data
being vulnerable --- makes sense, though that use-case seems narrow.

> What were the _technical_ reasons for those objections?
>
> I believe largely the ones I’m bringing up here and which I outline above… I
> don’t mean to pretend that any of this is of my own independent construction. I
> don’t believe it is and my apologies if it came across that way.

Yes, there is value beyond the check-box, but in most cases those values
are limited considering the complexity of the features, and the
check-box is what most people are asking for, I think.

> > I’ve grown weary of this argument as the other major piece of work it was
> > routinely applied to was RLS and yet that has certainly been seen broadly as a
> > beneficial feature with users clearly leveraging it and in more than some
> > “checkbox” way.
>
> RLS has to overcome that objection, and I think it did, as was better
> for doing that.
>
> Beyond it being called a checkbox - what were the arguments against it? I

The RLS arguments were that queries could expose some of the underlying
data, but in summary, that was considered acceptable.

> > We, as a community, are clearly losing value by lack of this capability, if by
> > no other measure than simply the numerous users of the commercial
> > implementations feeling that they simply can’t use PG without this feature, for
> > whatever their reasoning.
>
> That is true, but I go back to my concern over useful feature vs. check
> box.
>
> While it’s easy to label something as checkbox, I don’t feel we have been fair

No, actually, it isn't. I am not sure why you are saying that.

> to our users in doing so as it has historically prevented features which our
> users are demanding and end up getting from commercial providers until we
> implement them ultimately anyway. This particular argument simply doesn’t seem
> to actually hold the value that proponents of it claim, for us at least, and we
> have clear counter-examples which we can point to and I hope we learn from
> those.

I don't think you are addressing actual issues above.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws. They make you human, rather than perfect,
  which you will never be.