Re: Proposed patch for key management

2021-01-05 Thread Alastair Turner
On Mon, 4 Jan 2021 at 17:56, Bruce Momjian  wrote:
>
> On Sat, Jan  2, 2021 at 12:47:19PM +, Alastair Turner wrote:
> >
> > There is also a further validation task - probably beyond the scope of
> > the key management patch and into the encryption patch[es] territory -
> > checking that the keys supplied are the same keys in use for the data
> > currently on disk. It feels to me like this should be done at startup,
> > rather than as each file is accessed, which could make startup quite
> > slow if there are a lot of keys with narrow scope.
>
> We do that already on startup by using GCM to validate the KEK when
> decrypting each DEK.
>
Which validates two things - that the KEK is the same one which was
used to encrypt the DEKs (instead of returning garbage plaintext when
given a garbage key), and that the DEKs have not been tampered with at
rest. What it does not check is that the DEKs are the keys actually used to
encrypt the data - that one has not been copied or restored independently
of the other.
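
A minimal sketch of that distinction, for illustration only (this is not the
patch's code or file format; it assumes the Python "cryptography" package):
unwrapping a DEK with AES-GCM authenticates the KEK and the wrapped bytes,
but nothing binds the DEK to the data files it was actually used on.

    import os
    from cryptography.exceptions import InvalidTag
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    kek = AESGCM.generate_key(bit_length=256)   # key-encryption key
    dek = AESGCM.generate_key(bit_length=256)   # data-encryption key

    # Wrap the DEK; the GCM tag covers the ciphertext and associated data.
    nonce = os.urandom(12)
    wrapped = AESGCM(kek).encrypt(nonce, dek, b"key-0")

    # Startup-style check: unwrapping with the supplied KEK either yields the
    # DEK or raises InvalidTag (wrong KEK, or wrapped key modified at rest).
    try:
        assert AESGCM(kek).decrypt(nonce, wrapped, b"key-0") == dek
    except InvalidTag:
        print("KEK mismatch or wrapped key tampered with")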




Re: Proposed patch for key management

2021-01-05 Thread Alastair Turner
Hi Bruce

On Mon, 4 Jan 2021 at 18:23, Bruce Momjian  wrote:
>
> On Fri, Jan  1, 2021 at 06:26:36PM +, Alastair Turner wrote:
> > After the long intro, my question - If using a standard format,
> > managed by a library, for the internal keystore does not result in a
> > smaller or simpler patch, is there enough other value for this to be
> > worth considering? For security features, standards and building on
> > well exercised code bases do really have value, but is it enough...
>
> I know Stephen Frost looked at PKCS12 but didn't see much value in it
> since the features provided by PKCS12 didn't seem to add much for
> Postgres, and added complexity...

Yeah, OpenSSL 1.1's PKCS12 implementation doesn't offer much for
managing arbitrary secrets. Building a reasonable abstraction over the
primitives it does provide requires a non-trivial amount of code.

> ... Using GCM for key wrapping, heap/index
> files, and WAL seemed simple.

Having Postgres take on the task of wrapping the keys and deriving the
wrapping key requires less code than getting a library to do it, sure.
It also expands the functional scope of Postgres - what Postgres is
responsible for. That isn't how libraries should ideally be used in
software development, but it's not uncommon, and it's very fertile
ground for the kind of discussion which has dogged this feature's
development.

There's a question of whether the new scope should be taken on at all -
earlier in the thread, Fabien expressed a strong opinion that it should
not. Even if you don't take an absolute view on the additional scope,
lines of code do not all create equal risk. Ensuring the confidentiality
of an encryption key is far higher-stakes code than serialising and
deserialising a key which is being secured and stored by a library or
external provider. Which is why I like the idea of having OpenSSL (and,
at some point, nss) do the KDF, wrapping and storage bit, and why I have
been hacking on some code in that direction.

If all that Postgres is doing is the serialisation and deserialisation,
that's also something which can objectively be said to be right or wrong.
Key derivation and wrapping are only ever "strong enough for the moment"
or "in line with policy/best practice". Which brings us to flexibility.

> There is all sorts of flexibility being proposed:
>
> *  scope of keys
> *  encryption method
> *  encryption mode
> *  internal/external
>
> Some of this is related to wrapping the data keys, and some of it gets
> to the encryption of cluster files. I just don't see how anything with
> the level of flexibility being asked for could ever be reliable or
> simple to administer,...

Externalising the key wrap and its KDF (either downward to OpenSSL or
to a separate process) makes the related points of flexibility
something else's problem. OpenSSL and the like have put a lot of
effort into making these things configurable and building the APIs
(mainly PKCS#11 and KMIP) for handing the functions off to dedicated
hardware. Many or most organisations requiring that sort of
integration will have those complexities budgeted in already.

>... and I don't see the security value of that
> flexibility.  I feel at a certain point, we just need to choose the best
> options and move forward.
>
> I thought this had all been debated, but people continue to show up and
> ask for it to be debated again.  At one level, these discussions are
> valuable, but at a certain point you end up re-litigating this so much that you
> get nothing else done.  I know some people who have given up working on
> this feature for this reason.

Apologies for covering the PKCS#12 ground again; I really did look for
any on-list and wiki references to the topic.

> I am not asking we stop discussing it, but I do ask we address the
> negatives of suggestions, especially when the negatives have already
> been clearly stated.

The negatives are increasing the amount of code involved in delivering
TDE and, implicitly, pushing back the delivery of TDE. I'm not
claiming that they aren't valid. I was hoping that PKCS#12 would
deliver some benefits on the code side, but a bit of investigation
showed that wasn't the case.

To offset these, I believe that taking a slightly longer route now
(and I am keen to do some of that slog) has significant benefits in
getting Postgres (or at least the core server processes) out of the
business of key derivation and wrapping. Not just the implementation
detail, but also decisions about what is "good enough" - the KDF
worries me particularly in this regard. With my presales techie hat
on, I'm very aware that everyone's "good enough" is slightly different,
and not just on one axis. Using library functions (let's say OpenSSL)
will provide a range of flexibility to fit these varied boundaries, at
no incremental cost beyond the library integration, and avoid any
implementation decisions becoming a lightning rod for objections to
Postgres.
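
To make the "good enough" point concrete, a hedged sketch (the Python
"cryptography" package stands in here for whichever library does the work,
and the parameters are illustrative rather than a recommendation): deriving
the KEK with a library KDF keeps the algorithm, salt and work factor in
configuration instead of hand-rolled code.

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    passphrase = b"cluster passphrase from the key command"
    salt = os.urandom(16)               # stored alongside the wrapped keys

    kdf = PBKDF2HMAC(
        algorithm=hashes.SHA256(),      # swappable hash, or another KDF entirely
        length=32,                      # 256-bit KEK
        salt=salt,
        iterations=600_000,             # tune to policy; the library does the work
    )
    kek = kdf.derive(passphrase)

    # Verifying a candidate passphrase later re-runs the same derivation;
    # verify() raises InvalidKey on mismatch.
    PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
               iterations=600_000).verify(passphrase, kek)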

Regards
Alastair




Re: Proposed patch for key management

2021-01-04 Thread Neil Chen
On Tue, Jan 5, 2021 at 10:18 AM Bruce Momjian  wrote:

> On Fri, Jan  1, 2021 at 06:26:36PM +, Alastair Turner wrote:
>
> There is all sorts of flexibility being proposed:
>
> *  scope of keys
> *  encryption method
> *  encryption mode
> *  internal/external
>
> Some of this is related to wrapping the data keys, and some of it gets
> to the encryption of cluster files.  I just don't see how anything with
> the level of flexibility being asked for could ever be reliable or
> simple to administer, and I don't see the security value of that
> flexibility.  I feel at a certain point, we just need to choose the best
>
+1.

> I thought this had all been debated, but people continue to show up and
> ask for it to be debated again.  At one level, these discussions are
> valuable, but at a certain point you end up re-litigating this so much that you
> get nothing else done.  I know some people who have given up working on
> this feature for this reason.
>
> I am not asking we stop discussing it, but I do ask we address the
> negatives of suggestions, especially when the negatives have already
> been clearly stated.

Well, in fact, because the discussion about TDE/KMS has several very long
threads, maybe not everyone can read them completely first, and inevitably,
there will be topics that have already been discussed.

At least for the initial version, I think we should pay more attention to
whether the currently provided functionality is acceptable, instead of always
staring at what it lacks and how it could be more flexible.

For the basic KMS functionality, we need to ensure stability and safety. Using
a more flexible API to support different encryption algorithms does look more
powerful, but it doesn't add much value for stability or security. Whether or
not we eventually want to do that, I think it can at least be left out of the
discussion for the first version. Likewise, we need not debate whether it is
safer to use more keys, or even to introduce HSM management. For the initial
version, we only need to evaluate whether this design can resist attacks under
our threat models.

Of course, submitting a feature means accepting the opinions of many people,
but for a large and complex feature, I think dividing the work into stages to
limit the scope of the discussion will help us avoid endless disputes.

-- 
There is no royal road to learning.
HighGo Software Co.


Re: Proposed patch for key management

2021-01-04 Thread Bruce Momjian
On Fri, Jan  1, 2021 at 06:26:36PM +, Alastair Turner wrote:
> After the long intro, my question - If using a standard format,
> managed by a library, for the internal keystore does not result in a
> smaller or simpler patch, is there enough other value for this to be
> worth considering? For security features, standards and building on
> well exercised code bases do really have value, but is it enough...

I know Stephen Frost looked at PKCS12 but didn't see much value in it
since the features provided by PKCS12 didn't seem to add much for
Postgres, and added complexity.  Using GCM for key wrapping, heap/index
files, and WAL seemed simple.

FYI, the current patch has an initdb option --copy-encryption-key, which
copies the key from an old cluster, for use by pg_upgrade.  Technically
you could create a directory called pg_cryptokey/live, put wrapped files
0/1 in there, and use that when initializing a new cluster.  That would
allow you to generate the data keys using your own random source.  I am
not sure how useful that would be, but we could document this ability.
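
Purely as a hypothetical illustration of that workflow - the wrapped-file
format actually written by the patch is not reproduced here, so this only
sketches the idea of generating data keys from your own random source and
staging KEK-wrapped copies as files 0 and 1:

    import os
    from pathlib import Path
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    kek = AESGCM.generate_key(bit_length=256)       # from your own KEK source
    live = Path("pg_cryptokey/live")
    live.mkdir(parents=True, exist_ok=True)

    for key_id in (0, 1):                           # the two wrapped key files
        dek = os.urandom(32)                        # your own random source
        nonce = os.urandom(12)
        wrapped = AESGCM(kek).encrypt(nonce, dek, str(key_id).encode())
        (live / str(key_id)).write_bytes(nonce + wrapped)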

There is all sorts of flexibility being proposed:

*  scope of keys
*  encryption method
*  encryption mode
*  internal/external

Some of this is related to wrapping the data keys, and some of it gets
to the encryption of cluster files.  I just don't see how anything with
the level of flexibility being asked for could ever be reliable or
simple to administer, and I don't see the security value of that
flexibility.  I feel at a certain point, we just need to choose the best
options and move forward.

I thought this had all been debated, but people continue to show up and
ask for it to be debated again.  At one level, these discussions are
valuable, but at a certain point you end up re-litigating this so much that you
get nothing else done.  I know some people who have given up working on
this feature for this reason.

I am not asking we stop discussing it, but I do ask we address the
negatives of suggestions, especially when the negatives have already
been clearly stated.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key management

2021-01-04 Thread Bruce Momjian
On Sat, Jan  2, 2021 at 12:47:19PM +, Alastair Turner wrote:
> If keys can have arbitrary scope, then the pg backend won't know what
> to ask for. So the API becomes even simpler with no specific request
> on stdin and all the relevant keys on stdout. I generally like this
> approach as well, and it will be the only option for some
> integrations. On the other hand, there is an advantage to having the
> key retrieval piece of key management in-process - the keys are not
> being passed around in plain.
> 
> There is also a further validation task - probably beyond the scope of
> the key management patch and into the encryption patch[es] territory -
> checking that the keys supplied are the same keys in use for the data
> currently on disk. It feels to me like this should be done at startup,
> rather than as each file is accessed, which could make startup quite
> slow if there are a lot of keys with narrow scope.

We do that already on startup by using GCM to validate the KEK when
decrypting each DEK.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key management

2021-01-04 Thread Bruce Momjian
On Sat, Jan  2, 2021 at 10:50:15AM +0100, Fabien COELHO wrote:
> > No, this isn't anything like the Kerberos model and there's no case
> > where the PG account won't have access to the DEK (data encryption key)
> > during normal operation (except with the possibility of off-loading to a
> > hardware crypto device which, while interesting, is definitely something
> > that's of very limited utility and which could be added on as a
> > capability later anyway).  This is also well understood by those who are
> > interested in this capability and it's perfectly acceptable that PG will
> > have, in memory, the DEK.
> 
> I'm sorry that I'm not very clear. My ranting is about having a KEK "master
> key" in pg memory. I think I'm fine at this point with having DEKs available
> on the host to encrypt/decrypt files. What I meant about Kerberos is that the
> setup I was describing was making the database dependent on a remote host,
> which is somehow what Kerberos does.

We bzero the master key once we are done with it in the server.  Why is
having the master key in memory for a short time a problem, while having
the data keys always in memory is acceptable?  Aren't the data keys more
valuable, and hence the fact that the master key is in memory for a short
time no additional risk?

> > The keys which are stored on disk with the proposed patch are encrypted
> > using a KEK which is external to PG- that's all part of the existing
> > patch and design.
> 
> Yep. My point is that the patch hardwires a cryptographic design with many
> assumptions, whereas I think it should design an API compatible with a
> larger class of designs, and possibly implement one with KEK and so on, I'm
> fine with that.

You are not addressing my complexity/fragility reasons for having a
single method.  As stated above, this feature has limited usefulness,
and if it breaks or you lose the KEK, your database server and backups
might be gone.

> > so it's unclear to me what the argument here is.
> 
> The argument is that professional cryptographers get designs wrong, and I do
> not see why you would do differently. So instead of hardwiring choices that
> you think are THE good ones, ISTM that pg should design a reasonably flexible
> API, and also provide an implementation, obviously.

See above.

> > Further, we haven't even gotten to actual cluster encryption yet- this
> > is all just the key management side of things,
> 
> With hardwired choices about 1 KEK and 2 DEKs that are debatable, see below.

Then let's debate them, since this patch isn't moving forward as long as
people are asking questions about the implementation.

> > > I'm not sure of the design of the key rotation stuff as I recall it from
> > > the patches I looked at, but the API design looks like the more pressing
> > > issue. First, the mere existence of a "master key" is a cryptographic
> > > choice which is debatable, because it creates issues if one must be able
> > > to change it, hence the whole "key rotation" stuff. ISTM that what core
> > > needs to know is that one particular key is changed by a new one and the
> > > underlying file is rewritten. It should not need to know anything about
> > > master keys and key derivation at all. It should need to know that the
> > > user wants to change file keys, though.
> > 
> > No, it's not debatable that a KEK is needed, I disagree entirely.
> 
> ISTM that there is a cryptographic design which suits your needs and you seem
> to have decided that it is the only possible way to do it. It seems that you
> know. As a researcher, I know that I do not know:-) As a software engineer,
> I'm trying to suggest enabling more with the patch, including not hardwiring
> a three-key management scheme.

Well, your open API goal might work, but I am not comfortable
implementing that for reasons already explained, so what do we do?

> I'm advocating that pg not decide for others what is best for them, but
> rather allow multiple choices and implementations through an API. Note that
> I'm not claiming that the 1 KEK + 2 DEK + AES… design is bad. I'm claiming
> that it is just one possible design, quite peculiar though, and that pg
> should not hardwire it, and it does not need to anyway.
> 
> The documentation included in the key management patch states: "Key zero is
> the key used to encrypt database heap and index files which are stored in
> the file system, plus temporary files created during database operation."
> Read as is, it suggests that the same key is used to encrypt many files.
> From a cryptographic point of view this looks like a bad idea. The mode used
> is unclear. If this is a stream cypher generated in counter mode, it could
> be a very bad idea. Hopefully this is not what is intended, but the included
> documentation is frankly misleading in that case. IMHO there should be one
> key per file. Also, the CTR mode should be avoided if possible because it
> has quite special properties, unless these properties are absolutely
> 

Re: Proposed patch for key management

2021-01-02 Thread Alastair Turner
Hi Fabien

On Sat, 2 Jan 2021 at 09:50, Fabien COELHO  wrote:
>
...
> ISTM that pg at the core level should (only) directly provide:
>
> (1) a per-file encryption scheme, with loadable (hook-replaceable
> functions??) to manage pages, maybe:
>
>encrypt(page_id, *key, *clear_page, *encrypted_page);
>decrypt(page_id, *key, *encrypted_page, *clear_page);
>
> What encrypt/decrypt does is beyond pg core stuff. Ok, a reasonable
> implementation should be provided, obviously, possibly in core. There may
> be some block-size issues if not full pages are encrypted, so maybe the
> interface needs to be a little more subtle.
>

There are a lot of specifics of the encryption implementation which
need to be addressed in future patches. This patch focuses on making
keys available to the encryption processes at run-time, so ...

>
> (2) offer a key management scheme interface, to manage *per-file* keys,
> possibly loadable (hook replaceable?). If all files have the same key,
> which is stored in a directory and encoded with a KEK, this is just one
> (debatable) implementation choice. For that, ISTM that what is needed at
> this level is:
>
>get_key(file_id (relative name? oid? 8 or 16 bytes something?));
>

Per-cluster keys for permanent data and WAL allow a useful level of
protection, even if it could be improved upon. It's also going to be
quicker/simpler to implement, so any API should allow for it. If
there's an arbitrary number of DEKs, using a scope label for
accessing them sounds right, so "WAL", "local_data",
"local_data/tablespaceoid" or "local_data/dboid/tableoid".

>
...
> (3) ISTM that the key management interface should be external, or at least
> it should be possible to make it external easily. I do not think that
> there is a significant performance issue because keys are needed once, and
> once loaded they are there. A simple way to do that is a separate process
> with a basic protocol on stdin/stdout to implement "get_key", which is
> basically already half implemented in the patch for retrieving the KEK.
>

If keys can have arbitrary scope, then the pg backend won't know what
to ask for. So the API becomes even simpler with no specific request
on stdin and all the relevant keys on stdout. I generally like this
approach as well, and it will be the only option for some
integrations. On the other hand, there is an advantage to having the
key retrieval piece of key management in-process - the keys are not
being passed around in plain.
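
A thought sketch of that "nothing on stdin, all the relevant keys on stdout"
shape (the line format and scope names are invented for illustration; the
patch defines no such protocol):

    #!/usr/bin/env python3
    # Hypothetical external key provider: emits one "scope=hex-key" line per
    # key and exits. In practice the keys would come from a KMS, HSM or vault.
    import os
    import sys

    def fetch_keys():
        return {
            "WAL": os.urandom(32),
            "local_data": os.urandom(32),
        }

    for scope, key in fetch_keys().items():
        sys.stdout.write(f"{scope}={key.hex()}\n")
    sys.stdout.flush()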

There is also a further validation task - probably beyond the scope of
the key management patch and into the encryption patch[es] territory -
checking that the keys supplied are the same keys in use for the data
currently on disk. It feels to me like this should be done at startup,
rather than as each file is accessed, which could make startup quite
slow if there are a lot of keys with narrow scope.

Regards
Alastair




Re: Proposed patch for key management

2021-01-02 Thread Fabien COELHO


Hello Stephen,


I'm unsure about what is the "common use case". ISTM that having the
postgres process holding the master key looks like a "no" for me in any use
case with actual security concern: as the database must be remotely
accessible, a reasonable security assumption is that a pg account could be
compromised, and the "master key" (if any, that is just one particular
cryptographic design) should not be accessible in that case. The first
barrier would be pg admin account. With external, the options are that the
key is held by another id, so the host must be compromised as well, and
could even be managed remotely on another host (I agree that this latter
option would be fragile because of the implied network connection, but it
would be a tradeoff depending on the security concerns, but it pretty much
corresponds to the Kerberos model).


No, this isn't anything like the Kerberos model and there's no case
where the PG account won't have access to the DEK (data encryption key)
during normal operation (except with the possibility of off-loading to a
hardware crypto device which, while interesting, is definitely something
that's of very limited utility and which could be added on as a
capability later anyway).  This is also well understood by those who are
interested in this capability and it's perfectly acceptable that PG will
have, in memory, the DEK.


I'm sorry that I'm not very clear. My ranting is about having a KEK "master
key" in pg memory. I think I'm fine at this point with having DEKs available
on the host to encrypt/decrypt files. What I meant about Kerberos is that the
setup I was describing was making the database dependent on a remote host,
which is somehow what Kerberos does.



The keys which are stored on disk with the proposed patch are encrypted
using a KEK which is external to PG- that's all part of the existing
patch and design.


Yep. My point is that the patch hardwires a cryptographic design with many 
assumptions, whereas I think it should design an API compatible with a 
larger class of designs, and possibly implement one with KEK and so on, 
I'm fine with that.



Adding a pre-defined system will not prevent future work on a
completely external option.


Yes and no.

Having two side-by-side cluster-encryption schemes in core, the first that
could be implemented on top of the second without much effort, would look
pretty weird. Moreover, required changes in core to allow an API are the
very same as the one in this patch. The other concern I have with doing the
cleaning work afterwards is that the API would be somehow in the middle of
the existing functions if it has not been thought of before.


This has been considered and the functions which are proposed are
intentionally structured to allow for multiple implementations already,


Does it? This is unclear to me from looking at the code, and the limited
documentation does not point to that. I can see that the "kek provider" 
and the "random provider" can be changed, but the overall cryptographic 
design seems hardwired.



so it's unclear to me what the argument here is.


The argument is that professional cryptographers get designs wrong, and I do
not see why you would do differently. So instead of hardwiring choices that
you think are THE good ones, ISTM that pg should design a reasonably
flexible API, and also provide an implementation, obviously.


Further, we haven't even gotten to actual cluster encryption yet- this 
is all just the key management side of things,


With hardwired choices about 1 KEK and 2 DEKs that are debatable, see
below.


which is absolutely a necessary and important part but arguing that it
doesn't address the cluster encryption approach in core simply doesn't 
make any sense as that's not a part of this patch.


Let's have that discussion when we actually get to the point of talking
about cluster encryption.


I know archive_command is completely external, but that is much simpler
than what would need to be done for key management, key rotation, and key
verification.


I'm not sure of the design of the key rotation stuff as I recall it from the
patches I looked at, but the API design looks like the more pressing issue.
First, the mere existence of a "master key" is a cryptographic choice which
is debatable, because it creates issues if one must be able to change it,
hence the whole "key rotation" stuff. ISTM that what core needs to know is
that one particular key is changed by a new one and the underlying file is
rewritten. It should not need to know anything about master keys and key
derivation at all. It should need to know that the user wants to change file
keys, though.


No, it's not debatable that a KEK is needed, I disagree entirely.


ISTM that there is a cryptographic design which suits your needs and you
seem to have decided that it is the only possible way to do it. It seems that
you know. As a researcher, I know that I do not know:-) As a software
engineer, I'm trying to suggest enabling more with the

Re: Proposed patch for key management

2021-01-01 Thread Alastair Turner
Hi Stephen, Bruce, Fabien

On Thu, 31 Dec 2020 at 17:05, Stephen Frost  wrote:
>
> Greetings,
>
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Thu, Dec 31, 2020 at 11:11:11AM +0100, Fabien COELHO wrote:
> > > > I am not sure what else I can add to this discussion.  Having something
> > > > that is completely external might be a nice option, but I don't think it
> > > > is the common use-case, and will make the most common use cases more
> > > > complex.
> > >
> > > I'm unsure about what is the "common use case". ISTM that having the
> > > postgres process holding the master key looks like a "no" for me in any use
> > > case with actual security concern: as the database must be remotely
> > > accessible, a reasonable security assumption is that a pg account could be
> > > compromised, and the "master key" (if any, that is just one particular
> > > cryptographic design) should not be accessible in that case. The first
> > > barrier would be pg admin account.
> >
> > Let's unpack this logic, since I know others, like Alastair Turner (CC
> > added), had similar concerns.  Frankly, I feel this feature has limited
> > security usefulness, so if we don't feel it has sufficient value, let's
> > mark it as not wanted and move on.
>
> Considering the amount of requests for this, I don't feel it's at all
> reasonable to suggest that it's 'not wanted'.
>

FWIW, I think that cluster file encryption has great value, even if we
only consider its incremental value on top of disk or volume
encryption. Particularly in big IT estates where monitoring tools,
backup managers, etc have read access to a lot of mounted volumes.
Limiting the impact of these systems (or their credentials) being
compromised is definitely useful.

>
> > I think there are two basic key configurations:
> >
> > 1.  pass data encryption keys in from an external source
> > 2.  store the data encryption keys wrapped by a key encryption key (KEK)
> > passed in from an external source
>
> I agree those are two basic configurations (though they aren't the only
> possible ones).
>

If we rephrase these two in terms of what the server process does to
get a key, we could say

 1. Get the keys from another process at startup (and maybe reload) time
 2. Get the wrapped keys from a local file...
 2.a. directly and unwrap them using the server's own logic and user prompt
 2.b. via a library which manages wrapping and user prompting

With the general feeling in the mailing list discussions favouring an
approach in the #2 family, I have been looking at options around 2b. I
was hoping to provide some flexibility for users without adding
complexity to the pg code base and use some long standing,
standardised, features rather than reinventing or recoding the wheel.
It has not been a complete success, or failure.

The nearest thing to a standard for wrapped secret store files is
PKCS#12. I searched and cannot find it discussed on-list in this
context before. It is the format used for Java local keystores in
recent versions. The format is supported, up to a point, by OpenSSL,
nss and gnuTLS. OpenSSL also has command line support for many
operations on the files, including updating the wrapping password.

Content in the PKCS12 files is stored in typed "bags". The challenge
is that the OpenSSL command line tools currently support only the
content types related to TLS operations - certificates, public or
private keys, CRLs, ... and not the untyped secret bag which is
appropriate for storing AES keys. The C APIs will work with this
content but there are not as many helper functions as for other
content types.

For future information - the nss command line tools understand the
secret bags, but don't offer nearly so many options for configuration.

For building something now - the OpenSSL pkcs12 functions provide
features like tags and friendly names for keys, and configurable key
derivation from passwords, but the code to interact with the
secretBags is accumulating quickly in my prototype.

After the long intro, my question - If using a standard format,
managed by a library, for the internal keystore does not result in a
smaller or simpler patch, is there enough other value for this to be
worth considering? For security features, standards and building on
well exercised code bases do really have value, but is it enough...

Regards
Alastair




Re: Proposed patch for key management

2020-12-31 Thread Joshua Drake
>
>
> > >I will say that if the community feels external-only should be the only
> > >option, I will stop working on this feature because I feel the result
> > >would be too fragile to be reliable,
> >
> > I do not see why it would be the case. I'm just arguing to have key
> > management in a separate, possibly suid something-else, process, which,
> > given the security concerns which dictate the feature, looks like a must
> > have, or at least must be possible. From a line count point of view, it
> > should be a small addition to the current code.
>
> All of this hand-waving really isn't helping.
>
> If it's a small addition to the current code then it'd be fantastic if
> you'd propose a specific patch which adds what you're suggesting.  I
> don't think either Bruce or I would have any issue with others helping
> out on this effort, but let's be clear- we need something that *is* part
> of core PG, even if we have an ability to have other parts exist outside
> of PG.
>

+1

JD


Re: Proposed patch for key management

2020-12-31 Thread Joshua Drake
On Wed, Dec 30, 2020 at 3:47 PM Bruce Momjian  wrote:

>
> I will say that if the community feels external-only should be the only
> option, I will stop working on this feature because I feel the result
> would be too fragile to be reliable, and I would not want to be
> associated with it.
>
>
I can say that the people that I work with would prefer an "officially"
supported mechanism from postgresql.org. The use of a generic API would be
useful for the ecosystem that wants to build on core functionality, but
PostgreSQL should have competent built-in support as well.

JD

>
>


Re: Proposed patch for key management

2020-12-31 Thread Bruce Momjian
On Thu, Dec 31, 2020 at 12:05:53PM -0500, Stephen Frost wrote:
> > Let's unpack this logic, since I know others, like Alastair Turner (CC
> > added), had similar concerns.  Frankly, I feel this feature has limited
> > security usefulness, so if we don't feel it has sufficient value, let's
> > mark it as not wanted and move on.
> 
> Considering the amount of requests for this, I don't feel it's at all
> reasonable to suggest that it's 'not wanted'.

Well, people ask for (or used to ask for) query hints, so request count
and usefulness aren't always correlated.  I brought up this point
because people sometimes want to add complexity to try and guard against
some exploit that is impossible to guard against, so I thought we should
be clear what we can't protect, no matter how we configure it.

> > When using AES256 with GCM (for verification), is #1 more secure than #2?
> > I don't think so.  If the Postgres account is compromised, they can grab
> > the data encryption keys as they come in from the external script (#1),
> > or when they are decrypted by the KEK (#2), or while they are in shared
> > memory while the server is running.  If the postgres account is not
> > compromised, I don't think it is any easier to get the data key wrapped
> > by a KEK than it is to try decrypting the actual heap/index/WAL blocks.
> > (Once you find the key for one, they would all decrypt.)
> 
> I can see some usefulness for supporting #1, but that has got next to
> nothing to do with what this patch is about and is all about the *next*
> step, which is to actually look at supporting encryption of the rest of
> the cluster, beyond just the keys.  We need to get past this first step
> though, and it seems to be nearly impossible to do so, which has
> frustrated multiple attempts to actually accomplish anything here.
> 
> I agree that none of these approaches address an online compromise of
> the postgres account.  Thankfully, that isn't actually what this is
> intended to address and to talk about that case is just a distraction
> that isn't solvable and wastes time.

I assume people don't want the data keys stored in the file system, even
in encrypted form, and they believe this makes the system more secure.  I
don't think it does, and I think it makes external key rotation very
difficult, among other negatives.  Also, keep in mind any failure of the
key management system could make the cluster unreadable, so making it as
simple/error-free as possible is a big requirement for me.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key management

2020-12-31 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Thu, Dec 31, 2020 at 11:11:11AM +0100, Fabien COELHO wrote:
> > > I am not sure what else I can add to this discussion.  Having something
> > > that is completely external might be a nice option, but I don't think it
> > > is the common use-case, and will make the most common use cases more
> > > complex.
> > 
> > I'm unsure about what is the "common use case". ISTM that having the
> > postgres process holding the master key looks like a "no" for me in any use
> > case with actual security concern: as the database must be remotely
> > accessible, a reasonable security assumption is that a pg account could be
> > compromised, and the "master key" (if any, that is just one particular
> > cryptographic design) should not be accessible in that case. The first
> > barrier would be pg admin account.
> 
> Let's unpack this logic, since I know others, like Alastair Turner (CC
> added), had similar concerns.  Frankly, I feel this feature has limited
> security usefulness, so if we don't feel it has sufficient value, let's
> mark it as not wanted and move on.

Considering the amount of requests for this, I don't feel it's at all
reasonable to suggest that it's 'not wanted'.

> I think there are two basic key configurations:
> 
> 1.  pass data encryption keys in from an external source
> 2.  store the data encryption keys wrapped by a key encryption key (KEK)
> passed in from an external source

I agree those are two basic configurations (though they aren't the only
possible ones).

> The current code implements #2 because it simplifies administration,
> checking, external key (KEK) rotation, and provides good error reporting
> when something goes wrong.  For example, with #1, there is no way to
> rotate the externally-stored key except reencrypting the entire cluster.

Right, and there also wouldn't be a very simple way to actually get PG
going with a single key or a passphrase.

> When using AES256 with GCM (for verification), is #1 more secure than #2?
> I don't think so.  If the Postgres account is compromised, they can grab
> the data encryption keys as they come in from the external script (#1),
> or when they are decrypted by the KEK (#2), or while they are in shared
> memory while the server is running.  If the postgres account is not
> compromised, I don't think it is any easier to get the data key wrapped
> by a KEK than it is to try decrypting the actual heap/index/WAL blocks.
> (Once you find the key for one, they would all decrypt.)

I can see some usefulness for supporting #1, but that has got next to
nothing to do with what this patch is about and is all about the *next*
step, which is to actually look at supporting encryption of the rest of
the cluster, beyond just the keys.  We need to get past this first step
though, and it seems to be nearly impossible to do so, which has
frustrated multiple attempts to actually accomplish anything here.

I agree that none of these approaches address an online compromise of
the postgres account.  Thankfully, that isn't actually what this is
intended to address and to talk about that case is just a distraction
that isn't solvable and wastes time.

Thanks,

Stephen




Re: Proposed patch for key management

2020-12-31 Thread Bruce Momjian
On Thu, Dec 31, 2020 at 11:11:11AM +0100, Fabien COELHO wrote:
> > I am not sure what else I can add to this discussion.  Having something
> > that is completely external might be a nice option, but I don't think it
> > is the common use-case, and will make the most common use cases more
> > complex.
> 
> I'm unsure about what is the "common use case". ISTM that having the
> postgres process holding the master key looks like a "no" for me in any use
> case with actual security concern: as the database must be remotely
> accessible, a reasonable security assumption is that a pg account could be
> compromised, and the "master key" (if any, that is just one particular
> cryptographic design) should not be accessible in that case. The first
> barrier would be pg admin account.

Let's unpack this logic, since I know others, like Alastair Turner (CC
added), had similar concerns.  Frankly, I feel this feature has limited
security usefulness, so if we don't feel it has sufficient value, let's
mark it as not wanted and move on.

I think there are two basic key configurations:

1.  pass data encryption keys in from an external source
2.  store the data encryption keys wrapped by a key encryption key (KEK)
passed in from an external source

The current code implements #2 because it simplifies administration,
checking, external key (KEK) rotation, and provides good error reporting
when something goes wrong.  For example, with #1, there is no way to
rotate the externally-stored key except reencrypting the entire cluster.

When using AES256 with GCM (for verification), is #1 more secure than #2?
I don't think so.  If the Postgres account is compromised, they can grab
the data encryption keys as they come in from the external script (#1),
or when they are decrypted by the KEK (#2), or while they are in shared
memory while the server is running.  If the postgres account is not
compromised, I don't think it is any easier to get the data key wrapped
by a KEK than it is to try decrypting the actual heap/index/WAL blocks.
(Once you find the key for one, they would all decrypt.)

> With external, the options are that the
> key is held by another id, so the host must be compromised as well, and
> could even be managed remotely on another host (I agree that this latter
> option would be fragile because of the implied network connection, but it
> would be a tradeoff depending on the security concerns, but it pretty much
> corresponds to the Kerberos model).

We don't store the data encryption keys raw on the server, but rather
wrap them with a KEK.  If the external keystore is compromised using #1,
you can't rotate those keys without reencrypting the entire cluster.

> > Adding a pre-defined system will not prevent future work on a
> > completely external option.
> 
> Yes and no.
> 
> Having two side-by-side cluster-encryption schemes in core, the first that
> could be implemented on top of the second without much effort, would look
> pretty weird. Moreover, required changes in core to allow an API are the
> very same as the one in this patch. The other concern I have with doing the
> cleaning work afterwards is that the API would be somehow in the middle of
> the existing functions if it has not been thought of before.

The question is whether there is a common use-case, and if so, do we
implement that first, with all the safeguards, and then allow others to
customize?  I don't want to optimize for the rare case if that makes the
common case more error-prone.

> > I know archive_command is completely external, but that is much simpler
> > than what would need to be done for key management, key rotation, and
> > key verification.
> 
> I'm not sure of the design of the key rotation stuff as I recall it from the
> patches I looked at, but the API design looks like the more pressing issue.
> First, the mere existence of a "master key" is a cryptographic choice which
> is debatable, because it creates issues if one must be able to change it,
> hence the whole "key rotation" stuff. ISTM that what core needs to know is

Well, you would want to change the externally-stored key independently of
the data encryption keys, which are harder to change.

> > I will say that if the community feels external-only should be the only
> > option, I will stop working on this feature because I feel the result
> > would be too fragile to be reliable,
> 
> I do not see why it would be the case. I'm just arguing to have key
> management in a separate, possibly suid something-else, process, which,
> given the security concerns which dictate the feature, looks like a must
> have, or at least must be possible. From a line count point of view, it
> should be a small addition to the current code.

See above.  Please explain the security value of this complexity.

> > and I would not want to be associated with it.
> 
> You do as you want. I'm no one, you can ignore me and proceed to commit
> whatever you want. I'm only advising, to the best of my ability, on what I

Re: Proposed patch for key management

2020-12-31 Thread Stephen Frost
Greetings,

* Fabien COELHO (coe...@cri.ensmp.fr) wrote:
> >>The API should NOT make assumptions about the cryptographic design, what
> >>depends about what, where things are stored… ISTM that Pg should only care
> >>about naming keys, holding them when created/retrieved (but not create
> >>them), basically interacting with the key manager, passing the stuff to
> >>functions for encryption/decryption seen as black boxes.
> >>
> >>I may have suggested something along these lines at the beginning of the key
> >management thread, probably. Not going this way implicitly implies making
> >>some assumptions which may or may not suit other use cases, so makes them
> >>specific not generic. I do not think pg should do that.
> >
> >I am not sure what else I can add to this discussion.  Having something
> >that is completely external might be a nice option, but I don't think it
> >is the common use-case, and will make the most common use cases more
> >complex.
> 
> I'm unsure about what is the "common use case". ISTM that having the
> postgres process holding the master key looks like a "no" for me in any use
> case with actual security concern: as the database must be remotely
> accessible, a reasonable security assumption is that a pg account could be
> compromised, and the "master key" (if any, that is just one particular
> cryptographic design) should not be accessible in that case. The first
> barrier would be pg admin account. With external, the options are that the
> key is held by another id, so the host must be compromised as well, and
> could even be managed remotely on another host (I agree that this latter
> option would be fragile because of the implied network connection, but it
> would be a tradeoff depending on the security concerns, but it pretty much
> corresponds to the Kerberos model).

No, this isn't anything like the Kerberos model and there's no case
where the PG account won't have access to the DEK (data encryption key)
during normal operation (except with the possibility of off-loading to a
hardware crypto device which, while interesting, is definitely something
that's of very limited utility and which could be added on as a
capability later anyway).  This is also well understood by those who are
interested in this capability and it's perfectly acceptable that PG will
have, in memory, the DEK.

The keys which are stored on disk with the proposed patch are encrypted
using a KEK which is external to PG- that's all part of the existing
patch and design.

> >Adding a pre-defined system will not prevent future work on a
> >completely external option.
> 
> Yes and no.
> 
> Having two side-by-side cluster-encryption schemes in core, the first that
> could be implemented on top of the second without much effort, would look
> pretty weird. Moreover, required changes in core to allow an API are the
> very same as the one in this patch. The other concern I have with doing the
> cleaning work afterwards is that the API would be somehow in the middle of
> the existing functions if it has not been thought of before.

This has been considered and the functions which are proposed are
intentionally structured to allow for multiple implementations already,
so it's unclear to me what the argument here is.  Further, we haven't
even gotten to actual cluster encryption yet- this is all just the key
management side of things, which is absolutely a necessary and important
part but arguing that it doesn't address the cluster encryption
approach in core simply doesn't make any sense as that's not a part of
this patch.

Let's have that discussion when we actually get to the point of talking
about cluster encryption.

> >I know archive_command is completely external, but that is much simpler
> >than what would need to be done for key management, key rotation, and key
> >verification.
> 
> I'm not sure of the design of the key rotation stuff as I recall it from the
> patches I looked at, but the API design looks like the more pressing issue.
> First, the mere existence of a "master key" is a cryptographic choice which
> is debatable, because it creates issues if one must be able to change it,
> hence the whole "key rotation" stuff. ISTM that what core needs to know is
> that one particular key is changed by a new one and the underlying file is
> rewritten. It should not need to know anything about master keys and key
> derivation at all. It should need to know that the user wants to change file
> keys, though.

No, it's not debatable that a KEK is needed, I disagree entirely.
Perhaps we can have support for an external key store in the future for
the DEKs, but we really need to have our own key management and the
ability to reasonably rotate keys (which is what the KEK provides).
Ideally, we'll get to a point where we can even have multiple DEKs and
deal with rotating them too, but that, if anything, points even more strongly
to the need to have a key management system and KEK as we'll be dealing
with that many more keys.
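
A sketch of why the KEK makes that rotation cheap (illustrative only, using
the Python "cryptography" package rather than anything in the patch):
rotating the externally supplied secret means unwrapping each DEK with the
old KEK and rewrapping it with the new one; the DEKs, and therefore the
encrypted cluster data, never change.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def rewrap(wrapped: bytes, old_kek: bytes, new_kek: bytes, key_id: bytes) -> bytes:
        nonce, ct = wrapped[:12], wrapped[12:]
        dek = AESGCM(old_kek).decrypt(nonce, ct, key_id)   # fails if old KEK is wrong
        new_nonce = os.urandom(12)
        return new_nonce + AESGCM(new_kek).encrypt(new_nonce, dek, key_id)

    old_kek = AESGCM.generate_key(bit_length=256)
    new_kek = AESGCM.generate_key(bit_length=256)
    dek = os.urandom(32)
    nonce = os.urandom(12)
    wrapped = nonce + AESGCM(old_kek).encrypt(nonce, dek, b"0")

    rewrapped = rewrap(wrapped, old_kek, new_kek, b"0")
    assert AESGCM(new_kek).decrypt(rewrapped[:12], rewrapped[12:], b"0") == dek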

> 

Re: Proposed patch for key management

2020-12-31 Thread Fabien COELHO


Hello Bruce,


The API should NOT make assumptions about the cryptographic design, what
depends about what, where things are stored… ISTM that Pg should only care
about naming keys, holding them when created/retrieved (but not create
them), basically interacting with the key manager, passing the stuff to
functions for encryption/decryption seen as black boxes.

I may have suggested something along these lines at the beginning of the key
management thread, probably. Not going this way implicitly implies making
some assumptions which may or may not suit other use cases, so makes them
specific not generic. I do not think pg should do that.


I am not sure what else I can add to this discussion.  Having something 
that is completely external might be a nice option, but I don't think it 
is the common use-case, and will make the most common use cases more 
complex.


I'm unsure about what is the "common use case". ISTM that having the 
postgres process holding the master key looks like a "no" for me in any 
use case with actual security concern: as the database must be remotely 
accessible, a reasonable security assumption is that a pg account could be 
compromised, and the "master key" (if any, that is just one particular 
cryptographic design) should not be accessible in that case. The first 
barrier would be pg admin account. With external, the options are that the 
key is held by another id, so the host must be compromised as well, and
could even be managed remotely on another host (I agree that this latter
option would be fragile because of the implied network connection, but it
would be a tradeoff depending on the security concerns, but it pretty much
corresponds to the Kerberos model).



Adding a pre-defined system will not prevent future work on a
completely external option.


Yes and no.

Having two side-by-side cluster-encryption schemes in core, the first that
could be implemented on top of the second without much effort, would look 
pretty weird. Moreover, required changes in core to allow an API are the 
very same as the one in this patch. The other concern I have with doing 
the cleaning work afterwards is that the API would be somehow in the 
middle of the existing functions if it has not been thought of before.


I know archive_command is completely external, but that is much simpler 
than what would need to be done for key management, key rotation, and 
key verification.


I'm not sure of the design of the key rotation stuff as I recall it from 
the patches I looked at, but the API design looks like the more pressing 
issue. First, the mere existence of a "master key" is a cryptographic
choice which is debatable, because it creates issues if one must be able 
to change it, hence the whole "key rotation" stuff. ISTM that what core 
needs to know is that one particular key is changed by a new one and the 
underlying file is rewritten. It should not need to know anything about 
master keys and key derivation at all. It should need to know that the 
user wants to change file keys, though.



I will say that if the community feels external-only should be the only
option, I will stop working on this feature because I feel the result
would be too fragile to be reliable,


I do not see why it would be the case. I'm just arguing to have key
management in a separate, possibly suid something-else, process, which,
given the security concerns which dictate the feature, looks like a must
have, or at least must be possible. From a line count point of view, it
should be a small addition to the current code.



and I would not want to be associated with it.


You do as you want. I'm no one, you can ignore me and proceed to commit 
whatever you want. I'm only advising, to the best of my ability, on what I
think should be the level of API pursued for such a feature in pg, at the 
design level.


I feel that the current proposal is built around one particular use case 
with many implicit and unnecessary assumptions without giving much
thought to the generic API, which should really be pg's concern, and I do
not think that it can really be corrected once the ship has sailed, hence 
my attempt at having some thoughts given to the matter before that.


IMHO, the *minimum* should be to have the API clearly designed, and the 
implementation compatible with it at some level, including not having 
assumptions about underlying cryptographic functions and key derivations 
(I mean at the API level, the code itself can do it obviously).


If you want a special "key_mgt_command = internal" because you feel that 
processes are fragile and unreliable and do not bring security, so be it, 
but I think that the API should be designed beforehand nevertheless.


Note that if people think that I'm wrong, then I'm wrong by definition.

--
Fabien.

Re: Proposed patch for key management

2020-12-30 Thread Bruce Momjian
On Mon, Dec 28, 2020 at 06:15:44PM -0400, Fabien COELHO wrote:
> The API should NOT make assumptions about the cryptographic design, what
> depends about what, where things are stored… ISTM that Pg should only care
> about naming keys, holding them when created/retrieved (but not create
> them), basically interacting with the key manager, passing the stuff to
> functions for encryption/decryption seen as black boxes.
> 
> I may have suggested something along these lines at the beginning of the key
> management thread, probably. Not going this way implicitly implies making
> some assumptions which may or may not suit other use cases, so makes them
> specific not generic. I do not think pg should do that.

I am not sure what else I can add to this discussion.  Having something
that is completely external might be a nice option, but I don't think it
is the common use-case, and will make the most common use cases more
complex.  Adding a pre-defined system will not prevent future work on a
completely external option.  I know archive_command is completely
external, but that is much simpler than what would need to be done for
key management, key rotation, and key verification.

I will say that if the community feels external-only should be the only
option, I will stop working on this feature because I feel the result
would be too fragile to be reliable, and I would not want to be
associated with it.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee