On Fri, Oct 4, 2019 at 5:49 PM Bruce Momjian <br...@momjian.us> wrote:
> We spend a lot of time figuring out exactly how to safely encrypt WAL,
> heap, index, and pgsql_tmp files.   The idea of doing this for another
> 20 types of files --- to find a safe nonce, to be sure a file rewrite
> doesn't reuse the nonce, figuring the API, crash recovery, forensics,
> tool interface --- is something I would like to avoid.  I want to avoid
> it not because I don't like work, but because I am afraid the code
> impact and fragility will doom the feature.

I'm concerned about that, too, but there's no getting around the fact
that there are a bunch of types of files and that they do all need to
be dealt with. If we have a good scheme for doing that, hopefully
extending it to additional types of files is not that bad, which would
then spare us the trouble of arguing about each one individually, and
also be more secure.

As I also said to Stephen, the people who are discussing this here
should *really really really* be looking at the Cybertec patch instead
of trying to invent everything from scratch - unless that patch has,
like, typhoid, or something, in which case please let me know so that
I, too, can avoid looking at it. Even if you wanted to use 0% of the
code, you could look at the list of file types that they consider
encrypting and think about whether you agree with the decisions they
made. I suspect that you would quickly find that you've left some
things out of your list. In fact, I can think of a couple pretty clear
examples, like the stats files, which clearly contain user data.

Another reason that you should go look at that patch is because it
actually tries to grapple with the exact problem that you're worrying
about in the abstract: there are a LOT of different kinds of files and
they all need to be handled somehow. Even if you can convince yourself
that things like pg_clog don't need encryption, which I think is a
pretty tough sell, there are a LOT of file types that directly contain
user data and do need to be handled. A lot of the code that writes
those various types of files is pretty ad-hoc. It doesn't necessarily
do nice things like build up a block of data and then write it out
together; it may for example write a byte at a time. That's not going to
work well for encryption, I think, so the Cybertec patch changes that
stuff around. I personally don't think that the patch does that in a
way that is sufficiently clean and carefully considered for it to be
integrated into core, and my plan had been to work on that with the
patch authors.
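
To make the byte-at-a-time point concrete, here is a rough sketch of the
general idea of collecting small writes into fixed-size blocks before
transforming them. It is not taken from the Cybertec patch or from
PostgreSQL. Every name in it (BlockBufferedWriter, toy_transform,
BLOCK_SIZE, read_back) is hypothetical, and the XOR step is only a
placeholder for a real cipher and provides no security.

```python
import io

BLOCK_SIZE = 16  # hypothetical block size, chosen only for this demo


def toy_transform(block: bytes, block_no: int) -> bytes:
    """Placeholder for a real cipher. XOR is NOT encryption; it only
    shows that the transform runs once per block, keyed by block number."""
    key = (block_no + 1) & 0xFF
    return bytes(b ^ key for b in block)


class BlockBufferedWriter:
    """Accumulates arbitrarily small writes and emits whole blocks."""

    def __init__(self, raw, block_size=BLOCK_SIZE, transform=toy_transform):
        self.raw = raw
        self.block_size = block_size
        self.transform = transform
        self.buf = bytearray()
        self.block_no = 0
        self.transform_calls = 0

    def write(self, data: bytes) -> None:
        # Callers may still hand over one byte at a time; nothing is
        # transformed until a full block has been collected.
        self.buf += data
        while len(self.buf) >= self.block_size:
            self._emit(bytes(self.buf[: self.block_size]))
            del self.buf[: self.block_size]

    def _emit(self, block: bytes) -> None:
        self.raw.write(self.transform(block, self.block_no))
        self.block_no += 1
        self.transform_calls += 1

    def close(self) -> None:
        # Flush the final short block. A real block cipher mode would
        # need padding or a stream mode here; the toy XOR does not.
        if self.buf:
            self._emit(bytes(self.buf))
            self.buf.clear()


def read_back(data: bytes, block_size=BLOCK_SIZE, transform=toy_transform) -> bytes:
    """Inverse of the writer for the toy XOR transform (XOR is its own inverse)."""
    out = bytearray()
    for block_no, off in enumerate(range(0, len(data), block_size)):
        out += transform(data[off : off + block_size], block_no)
    return bytes(out)


if __name__ == "__main__":
    raw = io.BytesIO()
    writer = BlockBufferedWriter(raw)
    for byte in b"written one byte at a time":
        writer.write(bytes([byte]))
    writer.close()
    print(writer.transform_calls, read_back(raw.getvalue()))
```

The caller's byte-at-a-time habit is preserved, but the transform only
ever sees whole blocks (plus one short tail), which is the kind of
restructuring that ad-hoc file writers would need in some form.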

However, that plan has been somewhat derailed by the fact that we now
have hundreds of emails arguing about the design, because I don't want
to be trying to push water up a hill if everyone else is going in a
different direction. It looks to me, though, like we haven't really
gotten beyond the point where that patch already was. The issues of
nonce and many file types have already been thought about carefully
there. I rather suspect that they did not get it all right. But, it
seems to me that it would be a lot more useful to look at the code
actually written and think about what it gets right and wrong than to
discuss these points as a strictly theoretical matter.

In other words: maybe I'm wrong here, but it looks to me like we're
laboriously reinventing the wheel when we could be working on
improving the working prototype.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

