On 12/09/2016 05:58 AM, Michael Paquier wrote:
On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
Actually, we don't give away that information currently. If you try to log
in with password or MD5 authentication, and the user doesn't exist, you get
the same error as with an incorrect password. So, I think we do need to give
the client a made-up salt and iteration count in that case, to hide the fact
that the user doesn't exist. Furthermore, you can't just generate random
salt and iteration count, because then you could simply try connecting
twice, and see if you get the same salt and iteration count. We need to
deterministically derive the salt from the username, so that you get the
same salt/iteration count every time you try connecting with that username.
But it needs indistinguishable from a random salt, to the client. Perhaps a
SHA hash of the username and some per-cluster secret value, created by
initdb. There must be research papers out there on how to do this..

A simple idea would be to use the system ID when generating this fake
salt? That's generated by initdb, once per cluster. I am wondering if
it would be risky to use it for the salt. For the number of iterations
the default number could be used.

I think I'd feel better with a completely separate randomly-generated value for this. System ID is not too difficult to guess, and there's no need to skimp on this. Yes, default number of iterations makes sense.

We cannot completely avoid leaking information through this, unfortunately. For example, if you have a user with a non-default number of iterations, and an attacker probes that, he'll know that the username was valid, because he got back a non-default number of iterations. But let's do our best.

I have been thinking more about this part quite a bit, and here is the
most simple thing that we could do while respecting the protocol.
That's more or less what I think you have in mind by re-reading
upthread, but it does not hurt to rewrite the whole flow to be clear:
1) Server gets the startup packet, maps pg_hba.conf and moves on to
the scram authentication code path.
2) Server sends back sendAuthRequest() to request user to provide a
password. This maps to the plain/md5 behavior as no errors would be
issued to user until he has provided a password.
3) Client sends back the password, and the first message with the user name.
4) Server receives it, and checks the data. If a failure happens at
this stage, just ERROR on PG-side without sending back a e= message.
This includes the username-mismatch, empty password and end of
password validity. So we would never use e=unknown-user. This sticks
with what you quoted upthread that the server may end the exchange
before sending the final message.

If we want to mimic the current behavior with MD5 authentication, I think we need to follow through with the challenge, and only fail in the last step, even if we know the password was empty or expired. MD5 authentication doesn't currently give away that information to the user.

But it's OK to bail out early on OOM, or if the client sends an outright broken message. Those don't give away any information on the user account.

5) Server sends back the challenge, and client answers back with its
reply to it.

Then enters the final stage of the exchange, at which point the server
would issue its final message that would be e= in case of errors. If
something like an OOM happens, no message would be sent so failing on
an OOM ERROR on PG side would be fine as well.

6) Read final message from client and validate.
7) issue final message of server.

On failure at steps 6) or 7), an e= message is returned instead of the
final message. Does that look right?

Yep.

One thing is: when do we look up at pg_authid? After receiving the
first message from client or before beginning the exchange? As the
first message from client has the user name, it would make sense to do
the lookup after receiving it, but from PG prospective it would just
make sense to use the data already present in the startup packet. The
current patch does the latter. What do you think?

Let's see what fits the program flow best. Probably best to do it before beginning the exchange. I'm hacking on this right now...

By the way, I have pushed the extra patches you sent into this branch:
https://github.com/michaelpq/postgres/tree/scram

Thanks! We had a quick chat with Michael, and agreed that we'd hack together on that github repository, to avoid stepping on each other's toes, and cut rebased patch sets from there to pgsql-hackers every now and then.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to