[HACKERS] Tightening binary receive functions
After the thread started by Andrew McNamara a while ago: http://archives.postgresql.org/message-id/e419f08d-b908-446d-9b1e-f3520163c...@object-craft.com.au the question was left hanging whether other binary recv functions are missing sanity checks that the corresponding text input functions have, and whether they might pose a security issue.

Tom Lane went through all the recv functions after that and found a few missing checks, but nothing that would present a security issue; he posted them to the pgsql-security list for a double check. I just went through them again, and found no security issues either.

We should nevertheless fix the recv functions so that they don't accept any values that the corresponding text functions reject. While the server itself handles them, such values will throw an error on pg_dump/restore, for example. Attached patch adds the required sanity checks. I'm thinking of applying this to CVS HEAD, but not back-patching, just like Tom did with the original problem reported with time_recv() and timetz_recv().

The most notable of these is the change to the "char" datatype. The patch tightens it so that it no longer accepts values > 127 with a multi-byte database encoding. Also, the handling of encoding in xml_recv() was bogus.

Here's the list of issues found. Tom Lane wrote:
> array_recv
>
> Allows zero-length dimensions (because ArrayGetNItems doesn't
> complain); whereas array_in does not. Not sure if this is an issue.

We have had -hackers discussions about the behavior of zero-length arrays, so I think there may be other ways to get them into the system anyway.

> Doesn't check lower bounds at all, so it's possible that lowerbound
> plus length overflows an int. Not sure if this is a security problem
> either, but it seems like something that should be rejected.

Yes, that seems dangerous.
I note that the handling of large bounds isn't very rigid in the text input function either:

postgres=# SELECT '[:9] = {1}'::integer[];
            int4
-----------------------------
 [2147483647:2147483647]={1}
(1 row)

We use plain atoi() to convert the subscripts to integers, which returns INT_MAX for an overflow. I didn't do anything about that now, but the patch adds a check for the upper bound overflow in array_recv().

> charrecv
>
> Doesn't bother its pretty little head with maintaining encoding
> validity. Of course the entire datatype doesn't, so this is hardly the
> fault of the recv routine in particular. It might be that the type is
> fine but we ought to constrain char_text() to fail on high-bit-set
> char values unless DB encoding is single-byte.

Constraining char_text() seems like a good idea. The current behavior of char_text() with a high-bit-set byte is not useful, while using the full range of "char" can be. However, we have to constrain charout() as well, or we're back to square one with charout(c)::text. And if we constrain charout(), then we should constrain charin() as well. Which brings us back to forbidding such values altogether.

The patch constrains all the functions that you can use to get a "char" into the system: charin(), charrecv(), i4tochar() and text_char(). They now reject values with the high bit set when using a multi-byte database encoding.

> date_recv
>
> Fails to do any range check, so the value might cause odd behavior
> later. Should probably limit to the values date_in would accept.

Yep.

> float4recv, float8recv, and geometric and other types depending on
> pq_getmsgfloatN
>
> These all accept any bit pattern whatever for a float. Now I know of
> no machine where float doesn't consider all bitpatterns valid, but
> nonetheless there are some issues here:
>
> * We don't currently allow any other way to inject an IEEE signaling
> NaN into the database. As far as I can think, this could only lead to
> float traps in places where one might perhaps not have expected one,
> so I don't think this is a real problem.

Agreed.
This is frankly the first time I have even heard about signaling NaNs, and after reading up on them a bit, I get the impression that even on platforms that support them, you need a #define or a compiler flag to enable them.

> * Some of these types probably aren't expecting NaNs or Infinities at
> all. Can anything really bad happen if they get one?

Geometric types seem to handle NaNs and Infs gracefully, although I wonder what it means to have e.g. a box with the X-coordinate of one corner being NaN. I think it's OK from a security point of view.

timestamp_recv (with float timestamps) accepts NaN, while timestamp_in does not. I'm not sure what happens if you pass a NaN timestamp to the system, but we should forbid that anyway.

> interval_recv
>
> Fails to do any range check, but I think it's okay since we don't
> have any a-priori restrictions on the range of intervals.

Should disallow Infs and NaNs.

> numeric_recv
>
> Allows any weight or dscale value. Can this cause problems?

It seems OK to me. make_result() normalizes and checks for overflow of those.
[HACKERS] set_client_encoding is broken
If you look at gothic_moth and comet_moth

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=gothic_moth&dt=2009-08-30%2020:06:00
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2009-08-29%2021:06:00

you can see the following error:

../../src/test/regress/pg_regress --inputdir=. --psqldir=/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin --dbname=contrib_regression --multibyte=UTF8 --no-locale unaccent
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
psql: FATAL:  invalid value for parameter "client_encoding": "UTF8"
command failed: "/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin/psql" -X -c "DROP DATABASE IF EXISTS \"contrib_regression\"" postgres
gmake[1]: *** [installcheck] Error 2
gmake[1]: Leaving directory `/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/pgsql.4092/contrib/unaccent'

[4a9ae815.696e:1] LOG:  connection received: host=[local]
[4a9ae815.696e:2] LOG:  connection authorized: user=postgres database=postgres
[4a9ae815.696e:3] LOG:  conversion between UTF8 and LATIN2 is not supported
[4a9ae815.696e:4] FATAL:  invalid value for parameter "client_encoding": "UTF8"

The assign_client_encoding -> SetClientEncoding path fails to find the conversion function:

http://doxygen.postgresql.org/backend_2commands_2variable_8c.html#7f2d0624e7b7fb46644c5ce284e6479c
http://doxygen.postgresql.org/mbutils_8c.html#8eeff4ecab443ba7073c426fcd4bc4d6

I guess that the flat auth file patch http://archives.postgresql.org/pgsql-committers/2009-08/msg00301.php is the culprit. It seems that the backend does not have pg_encoding loaded yet when SetClientEncoding is processed.

Zdenek

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Largeobject access controls
The attached patch is the revised version of largeobject access controls. It reverts the pg_largeobject system catalog, and adds a new pg_largeobject_meta system catalog to store the owner identifier and its ACLs.

The definition of pg_largeobject_meta:

  #define LargeObjectMetaRelationId  2336

  CATALOG(pg_largeobject_meta,2336)
  {
      Oid        lomowner;     /* OID of the largeobject owner */
      aclitem    lomacl[1];    /* access permissions */
  } FormData_pg_largeobject_meta;

The pg_largeobject system catalog is still used to store the data chunks of largeobjects, and pg_largeobject.loid is associated with the OID of the pg_largeobject_meta system catalog.

* It also handles the DROP ROLE and REASSIGN/DROP OWNED cases using the existing dependency mechanism.
* A new "ALTER LARGE OBJECT <oid> OWNER TO <user>" statement was added.
* Permission checks on creation of largeobjects are dropped. It implicitly allows everyone to create a new largeobject. (CREATE USER LARGEOBJECT/NOLARGEOBJECT is also dropped.)
* The default ACL allows public to read/write new largeobjects as long as the owner does not revoke permissions. (MEMO: It might be made configurable with a GUC whether the default allows public to read/write, or not.)

[Performance measurement]

We measured the time to execute \lo_import with two large files (one well compressible, the other not so) and \lo_export of them. In the result, it seems to me there is no significant regression here.

* Environment
  CPU: Pentium4 3.20GHz
  Mem: 512MB
  Kernel: 2.6.30-6.fc12.i586
  PostgreSQL configuration: all parameters are at default.
* Base PostgreSQL

- Import/Export an uncompressible file

[kai...@saba ~]$ time -p psql postgres -c '\lo_import 512MB_Rnd'
lo_import 16386
real 132.33
user 1.01
sys 5.06
[kai...@saba ~]$ time -p psql postgres -c '\lo_export 16386 /dev/null'
lo_export
real 77.57
user 0.79
sys 3.76

- Import/Export a well compressible file

[kai...@saba ~]$ time -p psql postgres -c '\lo_import 512MB_Zero'
lo_import 16387
real 45.84
user 0.91
sys 5.38
[kai...@saba ~]$ time -p psql postgres -c '\lo_export 16387 /dev/null'
lo_export
real 13.51
user 0.62
sys 2.98

* With largeobject access control patch

- Import/Export an uncompressible file

[kai...@saba ~]$ time -p psql postgres -c '\lo_import 512MB_Rnd'
lo_import 16384
real 132.49
user 1.13
sys 5.10
[kai...@saba ~]$ time -p psql postgres -c '\lo_export 16384 /dev/null'
lo_export
real 76.14
user 0.81
sys 3.63

- Import/Export a well compressible file

[kai...@saba ~]$ time -p psql postgres -c '\lo_import 512MB_Zero'
lo_import 16385
real 44.21
user 0.91
sys 5.51
[kai...@saba ~]$ time -p psql postgres -c '\lo_export 16385 /dev/null'
lo_export
real 14.27
user 0.66
sys 3.11

Thanks,

[kai...@saba blob]$ diffstat sepgsql-02-blob-8.5devel-r2272.patch.gz
 doc/src/sgml/ref/allfiles.sgml             |    1
 doc/src/sgml/ref/alter_large_object.sgml   |   75
 doc/src/sgml/ref/grant.sgml                |    8
 doc/src/sgml/ref/revoke.sgml               |    6
 doc/src/sgml/reference.sgml                |    1
 src/backend/catalog/Makefile               |    6
 src/backend/catalog/aclchk.c               |  247 ++
 src/backend/catalog/dependency.c           |   14 +
 src/backend/catalog/pg_largeobject.c       |  270 +!!!
 src/backend/catalog/pg_shdepend.c          |    8
 src/backend/commands/alter.c               |    5
 src/backend/commands/comment.c             |   14 !
 src/backend/commands/tablecmds.c           |    1
 src/backend/libpq/be-fsstubs.c             |   49 ++--
 src/backend/parser/gram.y                  |   20 ++
 src/backend/storage/large_object/inv_api.c |  115 +++-
 src/backend/tcop/utility.c                 |    3
 src/backend/utils/adt/acl.c                |    5
 src/backend/utils/cache/syscache.c         |   13 +
 src/include/catalog/dependency.h           |    1
 src/include/catalog/indexing.h             |    3
 src/include/catalog/pg_largeobject_meta.h  |   66 +++
 src/include/nodes/parsenodes.h             |    1
 src/include/utils/acl.h                    |    6
 src/include/utils/syscache.h               |    1
 src/test/regress/expected/privileges.out   |  162 +
 src/test/regress/expected/sanity_check.out |    3
 src/test/regress/sql/privileges.sql        |   65 ++
 28 files changed, 859 insertions(+), 73 deletions(-), 237 modifications(!)

--
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com

sepgsql-02-blob-8.5devel-r2272.patch.gz
Description: application/gzip
Re: [HACKERS] hot standby - further cleanup of recovery procs stuff
Robert Haas wrote:
> I've made a few further cleanups to the hot standby patch:

Thanks!

> I am not sure why we have a single GUC to size both the number of
> PGPROC structures we allow and the size of UnobservedXids. A read-only
> slave might only need to allow a few connections for reporting
> purposes, while the master needs to allow many.

Yeah, it is true that the two don't necessarily have much in common. We could well make it a separate GUC. We'd still need to find a reasonable default, though.

It should be noted that the UnobservedXids array is allocated in shared memory, but goes completely unused after recovery, becoming a waste of memory. It's only a few hundred kB at most, so I think that's acceptable, but it would be nice to be able to release that memory somehow. Perhaps it should be backed by a file, which would also have the benefit that it could grow as needed, eliminating the need for the GUC. Storing it in a new SLRU might be a good idea.

I started to look at the subtrans.c changes. The patch changes the role of pg_subtrans substantially. It is no longer simply cleared at startup; we now require it to contain valid data when we start recovery, for transactions that have overflowed the in-memory cache. That makes me a bit uncomfortable, although I don't see anything obviously wrong with it. The comment in CheckpointSUBTRANS() claiming that flushing pg_subtrans is just a debugging aid is now wrong, however.

I think there's also a bug in ExtendSUBTRANS(): it will zap the first page it touches in recovery, but right after we start recovery, and replay the first RunningXacts WAL record, we need to have pg_subtrans correctly set for the transactions in that RunningXacts record (those that have overflowed the in-memory subxid cache). Zapping the pg_subtrans page can destroy that information.
--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Tightening binary receive functions
On Mon, Aug 31, 2009 at 9:12 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> The most notable of these is the change to char datatype. The patch
> tightens it so that it no longer accepts values > 127 with a multi-byte
> database encoding.

That doesn't sound right to me. We accept casts from integer to "char" for all values in range (-128..127). The question should be what the text representation should be, since the raw bytes aren't valid mb encodings.

--
greg
http://mit.edu/~gsstark/resume.pdf
Re: [HACKERS] Tightening binary receive functions
Greg Stark wrote:
> On Mon, Aug 31, 2009 at 9:12 AM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
>> The most notable of these is the change to char datatype. The patch
>> tightens it so that it no longer accepts values > 127 with a multi-byte
>> database encoding.
>
> That doesn't sound right to me. We accept casts from integer to char
> for all values in range (-128..127).

The patch limits that range to 0..127 with multibyte encodings.

> The question should be what the text representation should be since
> the raw bytes aren't valid mb encodings.

Hmm, perhaps we should follow what we did with chr() and ascii(): map the integer to Unicode code points if the database encoding is UTF-8, and restrict the range to 0..127 for other multi-byte encodings.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Tightening binary receive functions
On Mon, Aug 31, 2009 at 12:01 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> Hmm, perhaps we should follow what we did with chr() and ascii(): map
> the integer to Unicode code points if the database encoding is UTF-8,
> and restrict the range to 0..127 for other multi-byte encodings.

I don't think we even have to worry about the database's encoding. Just make the textual representation of char be \xxx (or perhaps we could switch to \xHH now) if the value isn't a printable ascii character. As long as char reads that in properly it doesn't matter if it's not a reasonable multibyte character. That allows people to treat it as a 1-byte integer type which happens to allow input or output as a single ascii character, which is convenient sometimes.

--
greg
http://mit.edu/~gsstark/resume.pdf
[HACKERS] combined indexes with Gist - planner issues?
hello everybody,

we are seriously fighting with some planner issue which seems to be slightly obscure to us. we have a table which is nicely indexed (several GB in size). i am using btree_gist operator classes to use a combined index including an FTI expression along with a number:

db=# \d product.t_product
                              Table "product.t_product"
    Column     |     Type      |                           Modifiers
---------------+---------------+----------------------------------------------------------------
 id            | bigint        | not null default nextval('product.t_product_id_seq'::regclass)
 shop_id       | integer       |
 art_number    | text          |
 title         | text          |
 description   | text          |
 display_price | numeric(10,4) |
Indexes:
    "t_product_pkey" PRIMARY KEY, btree (id)
    "idx_test" gist (display_price, to_tsvector('german'::regconfig, (title || ' '::text) || description))
    "idx_test2" gist (to_tsvector('german'::regconfig, (title || ' '::text) || description), display_price)

what we basically expected here is that postgres will scan the table using the index to give us the cheapest products containing the words we are looking for. i am totally surprised to see that we have to fetch all products given the words, sort and then do the limit. this totally kills performance because some words simply show up millions of times; this totally kills everything.
the plans look like this:

db=# explain analyze SELECT art_number, title FROM product.t_product WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('harddisk') ORDER BY display_price LIMIT 10;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=108340.08..108340.10 rows=10 width=54) (actual time=1328.900..1328.909 rows=10 loops=1)
   ->  Sort  (cost=108340.08..108422.48 rows=32961 width=54) (actual time=1328.899..1328.905 rows=10 loops=1)
         Sort Key: display_price
         Sort Method:  top-N heapsort  Memory: 18kB
         ->  Bitmap Heap Scan on t_product  (cost=2716.62..107627.80 rows=32961 width=54) (actual time=1052.706..1328.772 rows=55 loops=1)
               Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))
               ->  Bitmap Index Scan on idx_test2  (cost=0.00..2708.38 rows=32961 width=0) (actual time=1052.576..1052.576 rows=55 loops=1)
                     Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))
 Total runtime: 1328.942 ms
(9 rows)

runtime increases badly as words become more common ...
db=# explain analyze SELECT art_number, title FROM product.t_product WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('spiel') ORDER BY display_price LIMIT 10;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=108340.08..108340.10 rows=10 width=54) (actual time=33489.675..33489.682 rows=10 loops=1)
   ->  Sort  (cost=108340.08..108422.48 rows=32961 width=54) (actual time=33489.675..33489.675 rows=10 loops=1)
         Sort Key: display_price
         Sort Method:  top-N heapsort  Memory: 18kB
         ->  Bitmap Heap Scan on t_product  (cost=2716.62..107627.80 rows=32961 width=54) (actual time=774.923..33408.522 rows=56047 loops=1)
               Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))
               ->  Bitmap Index Scan on idx_test2  (cost=0.00..2708.38 rows=32961 width=0) (actual time=759.078..759.078 rows=56047 loops=1)
                     Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))
 Total runtime: 33489.906 ms
(9 rows)

i am wondering why postgres is not able to use a combined index here? is this some obscure thing related to gist, a logical problem or a planner deficiency?

ideas are welcome.

many thanks,

	hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de
Re: [HACKERS] Add YAML option to explain
On 08/28/2009 02:16 PM, Greg Sabino Mullane wrote:
> Attached patch adds YAML output option to explain:
> explain (format YAML) select * from information_schema.columns;

Updated version of the patch attached; fixes two small errors.

--
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 200908310847
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

Index: contrib/auto_explain/auto_explain.c
===================================================================
RCS file: /projects/cvsroot/pgsql/contrib/auto_explain/auto_explain.c,v
retrieving revision 1.7
diff -c -r1.7 auto_explain.c
*** contrib/auto_explain/auto_explain.c	10 Aug 2009 05:46:49 -0000	1.7
--- contrib/auto_explain/auto_explain.c	31 Aug 2009 13:36:41 -0000
***************
*** 29,34 ****
--- 29,35 ----
  	{"text", EXPLAIN_FORMAT_TEXT, false},
  	{"xml", EXPLAIN_FORMAT_XML, false},
  	{"json", EXPLAIN_FORMAT_JSON, false},
+ 	{"yaml", EXPLAIN_FORMAT_YAML, false},
  	{NULL, 0, false}
  };

Index: doc/src/sgml/auto-explain.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/auto-explain.sgml,v
retrieving revision 1.4
diff -c -r1.4 auto-explain.sgml
*** doc/src/sgml/auto-explain.sgml	10 Aug 2009 05:46:50 -0000	1.4
--- doc/src/sgml/auto-explain.sgml	31 Aug 2009 13:36:41 -0000
***************
*** 114,120 ****
       <varname>auto_explain.log_format</varname> selects the
       <command>EXPLAIN</> output format to be used.
       The allowed values are <literal>text</literal>, <literal>xml</literal>,
!      and <literal>json</literal>. The default is text.
       Only superusers can change this setting.
      </para>
     </listitem>
--- 114,120 ----
       <varname>auto_explain.log_format</varname> selects the
       <command>EXPLAIN</> output format to be used.
       The allowed values are <literal>text</literal>, <literal>xml</literal>,
!      <literal>json</literal>, and <literal>yaml</literal>. The default is text.
       Only superusers can change this setting.
      </para>
     </listitem>

Index: doc/src/sgml/release-8.5.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/release-8.5.sgml,v
retrieving revision 1.4
diff -c -r1.4 release-8.5.sgml
*** doc/src/sgml/release-8.5.sgml	19 Aug 2009 08:18:48 -0000	1.4
--- doc/src/sgml/release-8.5.sgml	31 Aug 2009 13:36:41 -0000
***************
*** 96,102 ****
   <itemizedlist>
    <listitem>
     <para>
!     EXPLAIN allows output of plans in XML or JSON format for automated
      processing of explain plans by analysis or visualization tools.
     </para>
    </listitem>
--- 96,102 ----
   <itemizedlist>
    <listitem>
     <para>
!     EXPLAIN allows output of plans in XML, JSON, or YAML format for automated
      processing of explain plans by analysis or visualization tools.
     </para>
    </listitem>

Index: doc/src/sgml/ref/explain.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/explain.sgml,v
retrieving revision 1.46
diff -c -r1.46 explain.sgml
*** doc/src/sgml/ref/explain.sgml	10 Aug 2009 05:46:50 -0000	1.46
--- doc/src/sgml/ref/explain.sgml	31 Aug 2009 13:36:41 -0000
***************
*** 31,37 ****
  </refsynopsisdiv>
 <synopsis>
! EXPLAIN [ ( { ANALYZE <replaceable class="parameter">boolean</replaceable> | VERBOSE <replaceable class="parameter">boolean</replaceable> | COSTS <replaceable class="parameter">boolean</replaceable> | FORMAT { TEXT | XML | JSON } } [, ...] ) ] <replaceable class="parameter">statement</replaceable>
  EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replaceable>
 </synopsis>
 </refsynopsisdiv>
--- 31,37 ----
  </refsynopsisdiv>
 <synopsis>
! EXPLAIN [ ( { ANALYZE <replaceable class="parameter">boolean</replaceable> | VERBOSE <replaceable class="parameter">boolean</replaceable> | COSTS <replaceable class="parameter">boolean</replaceable> | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] <replaceable class="parameter">statement</replaceable>
  EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replaceable>
 </synopsis>
 </refsynopsisdiv>
***************
*** 143,150 ****
     <term><literal>FORMAT</literal></term>
     <listitem>
      <para>
!      Specify the output format, which can be TEXT, XML, or JSON.
!      XML or JSON output contains the same information as the text output
       format, but is easier for programs to parse. This parameter defaults to
       <literal>TEXT</literal>.
      </para>
--- 143,150 ----
     <term><literal>FORMAT</literal></term>
     <listitem>
      <para>
!      Specify the output format, which can be TEXT, XML, JSON, or YAML.
!      Non-text output contains the same information as the text output
       format, but is easier for programs to parse. This parameter defaults to
       <literal>TEXT</literal>.
      </para>

Index: src/backend/commands/explain.c
[HACKERS] Bison crashes postgresql
Hi, I have code in which I translate some code from sqlf to sql, but when it comes to yy_parse the server crashes. I have no idea why, because it works fine in other situations. This is the code (the problem is in parse_sqlf, when I call sqlf_yyparse):

#include "postgres.h"
#include "gram.h"
#include "utils/builtins.h"
#include "funcapi.h"
#include "executor/spi.h"
#include "access/heapam.h"
#include "fmgr.h"
#include "miscadmin.h"

extern Datum sqlf(PG_FUNCTION_ARGS);
char *parse_sqlf();

PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(sqlf);

Datum
sqlf(PG_FUNCTION_ARGS)
{
	char	   *query = text_to_cstring(PG_GETARG_TEXT_PP(0));
	char	   *sql;
	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
	Tuplestorestate *tupstore;
	TupleDesc	tupdesc;
	int			call_cntr;
	int			max_calls;
	AttInMetadata *attinmeta;
	SPITupleTable *spi_tuptable;
	TupleDesc	spi_tupdesc;
	bool		firstpass;
	char	   *lastrowid;
	int			i;
	int			num_categories;
	MemoryContext per_query_ctx;
	MemoryContext oldcontext;
	int			ret;
	int			proc;

	sql = (char *) palloc(strlen(query) * sizeof(char *));
	sql = parse_sqlf(query);

	/* check to see if caller supports us returning a tuplestore */
	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("set-valued function called in context that cannot accept a set")));
	if (!(rsinfo->allowedModes & SFRM_Materialize))
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("materialize mode required, but it is not allowed in this context")));

	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;

	/* Connect to SPI manager */
	if ((ret = SPI_connect()) < 0)
		/* internal error */
		elog(ERROR, "SPI_connect returned %d", ret);

	/* Retrieve the desired rows */
	ret = SPI_execute(sql, true, 0);
	proc = SPI_processed;

	/* If no qualifying tuples, fall out early */
	if (ret != SPI_OK_SELECT || proc <= 0)
	{
		SPI_finish();
		rsinfo->isDone = ExprEndResult;
		PG_RETURN_NULL();
	}

	spi_tuptable = SPI_tuptable;
	spi_tupdesc = spi_tuptable->tupdesc;

	/* get a tuple descriptor for our result type */
	switch (get_call_result_type(fcinfo, NULL, &tupdesc))
	{
		case TYPEFUNC_COMPOSITE:
			/* success */
			break;
		case TYPEFUNC_RECORD:
			/* failed to determine actual type of RECORD */
			ereport(ERROR,
					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
					 errmsg("function returning record called in context that cannot accept type record")));
			break;
		default:
			/* result type isn't composite */
			elog(ERROR, "return type must be a row type");
			break;
	}

	/*
	 * switch to long-lived memory context
	 */
	oldcontext = MemoryContextSwitchTo(per_query_ctx);

	/* make sure we have a persistent copy of the result tupdesc */
	tupdesc = CreateTupleDescCopy(tupdesc);

	/* initialize our tuplestore in long-lived context */
	tupstore = tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random,
									 false, work_mem);
	MemoryContextSwitchTo(oldcontext);

	/*
	 * Generate attribute metadata needed later to produce tuples from raw C
	 * strings
	 */
	attinmeta = TupleDescGetAttInMetadata(tupdesc);

	/* total number of tuples to be examined */
	max_calls = proc;

	/* the return tuple always must have 1 rowid + num_categories columns */
	num_categories = tupdesc->natts;

	firstpass = true;
	lastrowid = NULL;

	for (call_cntr = 0; call_cntr < max_calls; call_cntr++)
	{
		char	  **values;
		HeapTuple	spi_tuple;
		HeapTuple	tuple;

		/* allocate and zero space */
		values = (char **) palloc0((1 + num_categories) * sizeof(char *));

		/* get the next sql result tuple */
		spi_tuple = spi_tuptable->vals[call_cntr];

		/*
		 * now loop through the sql results and assign each value in sequence
		 * to the next category
		 */
		for (i = 0; i < num_categories; i++)
		{
			/* see if we've gone too far already */
			if (call_cntr >= max_calls)
				break;
			values[i] = SPI_getvalue(spi_tuple, spi_tupdesc, i + 1);
		}

		/* build the tuple */
Re: [HACKERS] combined indexes with Gist - planner issues?
Hans-Juergen Schoenig -- PostgreSQL <postg...@cybertec.at> writes:
> what we basically expected here is that postgres will scan the table
> using the index to give us the cheapest products containing the words
> we are looking for. i am totally surprised to see that we have to
> fetch all products given the words, sort and then do the limit.

I don't know why you'd find that surprising. GIST indexes have no support for ordering.

			regards, tom lane
Re: [HACKERS] combined indexes with Gist - planner issues?
Tom Lane wrote:
> I don't know why you'd find that surprising. GIST indexes have no
> support for ordering.

ok, i thought it would be something gist specific i was not aware of. the golden question now is: i am looking for the cheapest products given a certain text in an insane amount of data. how to do it? other quals which could narrow down the amount of data would not help. i cannot see an option with regular weapons ...

maybe you have an idea how to fix core to make it work? maybe there is a mechanism we could use. we really have to make this work - no matter what it takes. we are willing to put effort into that.

many thanks,

	hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de
Re: [HACKERS] Bison crashes postgresql
Andrew Dunstan wrote:
> Werner Echezuria wrote:
>> Hi, I have code in which I translate some code from sqlf to sql, but
>> when it comes to yy_parse the server crashes. I have no idea why,
>> because it works fine in other situations.
>
> I don't understand why you're doing what you're doing this way.
> Wouldn't it be better to patch the main postgres parser and make your
> functionality first class, rather than having it run via an SQL string
> and a function that calls a secondary parser?
>
> cheers
>
> andrew

yes, this is the thing i had in mind as well. what is your ultimate goal?

many thanks,

	hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de
Re: [HACKERS] Add YAML option to explain
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Greg, can we see a few examples of the YAML output compared to both
> json and text?

Sure. Be warned it will make this email long. Because email may wrap things funny, I'll post the same thing here:

Query 1: http://pgsql.privatepaste.com/298pqiSwdH

Note that YAML quotes things like JSON does, but only when the quotes are needed.

Query 2: http://pgsql.privatepaste.com/610uDDyMu6

greg=# explain (format text, analyze on) select * from pg_class where relname ~ 'x' order by 1,2,3;
                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Sort  (cost=12.82..13.10 rows=111 width=185) (actual time=1.176..1.401 rows=105 loops=1)
   Sort Key: relname, relnamespace, reltype
   Sort Method:  quicksort  Memory: 44kB
   ->  Seq Scan on pg_class  (cost=0.00..9.05 rows=111 width=185) (actual time=0.066..0.828 rows=105 loops=1)
         Filter: (relname ~ 'x'::text)
 Total runtime: 1.676 ms

greg=# explain (format json, analyze on) select * from pg_class where relname ~ 'x' order by 1,2,3;
                     QUERY PLAN
----------------------------------------------------
 [
   {
     "Plan": {
       "Node Type": "Sort",
       "Startup Cost": 12.82,
       "Total Cost": 13.10,
       "Plan Rows": 111,
       "Plan Width": 185,
       "Actual Startup Time": 1.152,
       "Actual Total Time": 1.373,
       "Actual Rows": 105,
       "Actual Loops": 1,
       "Sort Key": ["relname", "relnamespace", "reltype"],
       "Sort Method": "quicksort",
       "Sort Space Used": 44,
       "Sort Space Type": "Memory",
       "Plans": [
         {
           "Node Type": "Seq Scan",
           "Parent Relationship": "Outer",
           "Relation Name": "pg_class",
           "Alias": "pg_class",
           "Startup Cost": 0.00,
           "Total Cost": 9.05,
           "Plan Rows": 111,
           "Plan Width": 185,
           "Actual Startup Time": 0.067,
           "Actual Total Time": 0.817,
           "Actual Rows": 105,
           "Actual Loops": 1,
           "Filter": "(relname ~ 'x'::text)"
         }
       ]
     }
   }
 ]
Re: [HACKERS] Feature request : add REMAP_SCHEMA-like option to pg_restore
On Mon, Aug 31, 2009 at 9:59 AM, Jean-Paul Argudo jean-p...@postgresqlfr.org wrote: Hope you find the idea interesting. I'm willing to test anything or add more specification to the feature. I must admit too this is a patch I'd like to write too (it would be my very first) but I don't know if my C skills are good enough to do so. Well, you have a much better chance of having it happen if you write the patch... people are usually willing to tell you what you did wrong and give a few hints on how to fix it. ...Robert
Re: [HACKERS] combined indexes with Gist - planner issues?
On Mon, Aug 31, 2009 at 04:06:22PM +0200, Hans-Juergen Schoenig -- PostgreSQL wrote: ok, i thought it would be something gist specific i was not aware of. the golden question now is: i am looking for the cheapest products given a certain text in an insane amount of data. how to do it? other quals which could narrow down the amount of data would not help. i cannot see an option with regular weapons ... maybe you have an idea how to fix core to make it work? maybe there is a mechanism we would need. we really have to make this work - no matter what it takes. we are willing to put effort into that. The way I usually attack such a problem is to think of a data structure+algorithm that could produce the output you want. Once you've got that it's usually clear how you can make postgres do it and what changes would need to be made. At first glance I don't see any nice data structure specific for your problem. But it occurs to me that maybe you could just have a (btree) index on the price and just scan in ascending order until you have enough records. Expensive if the first matching record is expensive. Another possibility is to change your query to use the price in the GiST index: execute multiple queries of the form: ... AND display_price >= 0.01 AND display_price < 1; ... AND display_price >= 1 AND display_price < 10; Because you match fewer records the sort won't be so expensive and you can stop once you have enough records. Hope this helps, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines.
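Martijn's banded-query idea can be sketched in C. This is an illustrative model of mine, not PostgreSQL source: the array stands in for the table, the `matches_text` flag stands in for the full-text qual, and each price band corresponds to one `... AND display_price >= lo AND display_price < hi` query. The search stops as soon as the limit is satisfied, which is the whole point of the trick. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* One row of the hypothetical product table. */
typedef struct
{
    double price;        /* display_price */
    bool   matches_text; /* result of the full-text qual */
} Product;

/*
 * Collect up to 'limit' matching rows by issuing one "query" per price
 * band [bounds[b], bounds[b+1]) and stopping once enough rows are found.
 * Rows within a band come back unsorted; sorting the (small) band result
 * is the cheap step the email alludes to.
 */
size_t
banded_search(const Product *tbl, size_t n,
              const double *bounds, size_t nbands,
              size_t limit, const Product **out)
{
    size_t found = 0;

    for (size_t b = 0; b < nbands && found < limit; b++)
    {
        double lo = bounds[b];
        double hi = bounds[b + 1];

        /* stand-in for one GiST search restricted to [lo, hi) */
        for (size_t i = 0; i < n && found < limit; i++)
            if (tbl[i].matches_text &&
                tbl[i].price >= lo && tbl[i].price < hi)
                out[found++] = &tbl[i];
    }
    return found;
}
```

Because cheaper bands are visited first, the expensive bands are never touched when the cheap ones already yield enough rows; the cost is only bad when matches are concentrated at the expensive end.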
Re: [HACKERS] Bison crashes postgresql
Werner Echezuria wrote: Hi, I have a code in which I translate some code from sqlf to sql, but when it comes to yy_parse the server crashes, I have no idea why, because it works fine in other situations. I don't understand why you're doing what you're doing this way. Wouldn't it be better to patch the main postgres parser and make your functionality first class rather than having it run via an SQL string and a function that calls a secondary parser? cheers andrew
Re: [HACKERS] Feature request : add REMAP_SCHEMA-like option to pg_restore
Robert Haas wrote: On Mon, Aug 31, 2009 at 9:59 AM, Jean-Paul Argudo jean-p...@postgresqlfr.org wrote: Hope you find the idea interesting. I'm willing to test anything or add more specification to the feature. I must admit too this is a patch I'd like to write too (it would be my very first) but I don't know if my C skills are good enough to do so. Well, you have a much better chance of having it happen if you write the patch... people are usually willing to tell you what you did wrong and give a few hints on how to fix it. Or pay someone to do it. I gather this requirement is from a commercial user. There are plenty of hired guns available. AIUI the requirement doesn't seem very difficult or complex. cheers andrew
[HACKERS] Feature request : add REMAP_SCHEMA-like option to pg_restore
Hi there, I searched the wiki and the lists about the REMAP_SCHEMA option's idea of a well known RDBMS, since its 10g version. Here's the short description: REMAP_SCHEMA: Objects from one schema are loaded into another schema. The idea is when we have a given schema (let's say this schema is prod) in a dump (custom format, done with pg_dump -Fc) we want to restore it in another schema with a different name (let's say this target schema is prod_copy). With the clean option this would give a user restoring a production schema into a test (or dev, whatever) schema an easy way to refresh the schema. For sure at the moment there are workarounds, like creating the same prod schema in the development database, and then rename it with an alter. But this could be done automatically with options to pg_restore like: pg_restore [...] --remap_schema=source_schema:target_schema [...] or something like pg_restore [...] --from_schema=source_schema --to_schema=target_schema [...] Well, I think you get the idea. What do you think of it? No need to say that the need comes from a big company using this RDBMS in version 10g willing to migrate to PostgreSQL and not-adapt-that-much-things to have their habits with this RDBMS not-that-much changed. (yes, they use REMAP_SCHEMA a lot) Hope you find the idea interesting. I'm willing to test anything or add more specification to the feature. I must admit too this is a patch I'd like to write too (it would be my very first) but I don't know if my C skills are good enough to do so. Cheers, -- Jean-Paul Argudo www.PostgreSQL.fr www.Dalibo.com
Re: [HACKERS] set_client_encoding is broken
Zdenek Kotala zdenek.kot...@sun.com writes:

[4a9ae815.696e:1] LOG: connection received: host=[local]
[4a9ae815.696e:2] LOG: connection authorized: user=postgres database=postgres
[4a9ae815.696e:3] LOG: conversion between UTF8 and LATIN2 is not supported
[4a9ae815.696e:4] FATAL: invalid value for parameter client_encoding: UTF8

The assign_client_encoding->SetClientEncoding path fails to find the conversion function.

Hmm. The reason this used to work is that SetClientEncoding does no real work if it's invoked before InitializeClientEncoding. The old method of handling client_encoding in the client's startup message had the setting get processed before InitPostgres; then in InitPostgres we'd call InitializeClientEncoding within the startup transaction, and it would complete the unfinished catalog lookup. In CVS HEAD we postpone the GUC processing till after InitPostgres, but it's still outside of any transaction, so SetClientEncoding just fails. There are a number of possible solutions:

1. We could revert the changes in GUC handling, ie go back to applying non-SUSET GUCs before InitPostgres and SUSET ones afterwards. I don't much care for this; that code was really ugly, and I'm still worried about the possible security exposure of letting not-yet-authenticated users set GUCs, even ones we think are harmless.

2. Move the InitializeClientEncoding call out of InitPostgres and put it in PostgresMain after the GUC variables are all set. This is pretty bad from a performance point of view, though, since it appears to require a second startup-time transaction.

3. Push the startup-packet GUC processing (approx. lines 3340..3395 of postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during the startup transaction. This is not too unclean, though it would mean exporting process_postgres_switches() from postgres.c; I guess the main thing I don't like about it is that InitPostgres has enough weird responsibilities already.
I'm leaning to the third choice, but I wonder if anyone has any comments or better ideas. regards, tom lane
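The ordering bug Tom describes can be reduced to a toy state machine. This is a model of my own making, not the actual GUC/encoding code: before initialization the encoding request is merely recorded (Tom's "does no real work"), while after initialization the catalog lookup must run, which requires being inside a transaction. The old startup order deferred; the CVS HEAD order hits the failure path.

```c
#include <stdbool.h>
#include <string.h>

/* Toy global state standing in for mbutils.c internals (hypothetical). */
bool encoding_initialized = false;   /* has InitializeClientEncoding run? */
bool in_transaction = false;         /* can we do catalog lookups? */
char pending_encoding[32] = "";      /* request deferred until init */
char client_encoding[32] = "SQL_ASCII";

/*
 * Model of SetClientEncoding: returns false on the failure that surfaces
 * as the FATAL in Zdenek's log.
 */
bool
set_client_encoding(const char *name)
{
    if (!encoding_initialized)
    {
        /* old order: harmless no-op, just remember the request */
        strcpy(pending_encoding, name);
        return true;
    }
    if (!in_transaction)
        return false;   /* new order: conversion lookup needs a transaction */
    strcpy(client_encoding, name);
    return true;
}

/* Model of InitializeClientEncoding, run inside the startup transaction. */
void
initialize_client_encoding(void)
{
    encoding_initialized = true;
    if (pending_encoding[0] != '\0')
        strcpy(client_encoding, pending_encoding);  /* finish deferred lookup */
}
```

Option 3 in Tom's list amounts to moving the `set_client_encoding` call back to a point where either branch succeeds: before initialization, or inside the startup transaction.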
[HACKERS] autovacuum launcher using InitPostgres
Hi, This seems to be the last holdup for flatfiles.c. This patch removes usage of that code in autovacuum launcher, instead making it go through the most basic relcache initialization so that it is able to read pg_database. To this end, InitPostgres has been split in two. The launcher only calls the first half, the rest of the callers have been patched to invoke the second half. The only thing I'm aware is missing from this patch is fixing up avlauncher's signal handling, and adding a bit more commentary; also I haven't tested it under EXEC_BACKEND yet. Comments? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support Index: src/backend/bootstrap/bootstrap.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/bootstrap/bootstrap.c,v retrieving revision 1.252 diff -c -p -r1.252 bootstrap.c *** src/backend/bootstrap/bootstrap.c 2 Aug 2009 22:14:51 - 1.252 --- src/backend/bootstrap/bootstrap.c 31 Aug 2009 13:57:37 - *** CheckerModeMain(void) *** 469,475 * Do backend-like initialization for bootstrap mode */ InitProcess(); ! InitPostgres(NULL, InvalidOid, NULL, NULL); proc_exit(0); } --- 469,476 * Do backend-like initialization for bootstrap mode */ InitProcess(); ! InitPostgres(); ! InitPostgresPhase2(NULL, InvalidOid, NULL, NULL); proc_exit(0); } *** BootstrapModeMain(void) *** 493,499 * Do backend-like initialization for bootstrap mode */ InitProcess(); ! InitPostgres(NULL, InvalidOid, NULL, NULL); /* Initialize stuff for bootstrap-file processing */ for (i = 0; i MAXATTR; i++) --- 494,501 * Do backend-like initialization for bootstrap mode */ InitProcess(); ! InitPostgres(); ! 
InitPostgresPhase2(NULL, InvalidOid, NULL, NULL); /* Initialize stuff for bootstrap-file processing */ for (i = 0; i MAXATTR; i++) Index: src/backend/postmaster/autovacuum.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.103 diff -c -p -r1.103 autovacuum.c *** src/backend/postmaster/autovacuum.c 27 Aug 2009 17:18:44 - 1.103 --- src/backend/postmaster/autovacuum.c 31 Aug 2009 14:45:39 - *** StartAutoVacLauncher(void) *** 386,391 --- 386,392 return 0; } + bool in_transaction = false; /* * Main loop for the autovacuum launcher process. */ *** AutoVacLauncherMain(int argc, char *argv *** 424,432 #endif /* ! * Set up signal handlers. Since this is an auxiliary process, it has ! * particular signal requirements -- no deadlock checker or sinval ! * catchup, for example. */ pqsignal(SIGHUP, avl_sighup_handler); --- 425,433 #endif /* ! * Set up signal handlers. We operate on databases much like a regular ! * backend, so we use the same signal handling. See equivalent code in ! * tcop/postgres.c. */ pqsignal(SIGHUP, avl_sighup_handler); *** AutoVacLauncherMain(int argc, char *argv *** 451,459 * had to do some stuff with LWLocks). */ #ifndef EXEC_BACKEND ! InitAuxiliaryProcess(); #endif /* * Create a memory context that we will do all our work in. We do this so * that we can reset the context during error recovery and thereby avoid --- 452,462 * had to do some stuff with LWLocks). */ #ifndef EXEC_BACKEND ! InitProcess(); #endif + InitPostgres(); + /* * Create a memory context that we will do all our work in. We do this so * that we can reset the context during error recovery and thereby avoid *** AutoVacLauncherMain(int argc, char *argv *** 483,496 /* Report the error to the server log */ EmitErrorReport(); ! /* ! * These operations are really just a minimal subset of ! * AbortTransaction(). We don't have very many resources to worry ! * about, but we do have LWLocks. ! */ ! LWLockReleaseAll(); ! AtEOXact_Files(); ! 
AtEOXact_HashTables(false); /* * Now return to normal top-level context and clear ErrorContext for --- 486,507 /* Report the error to the server log */ EmitErrorReport(); ! if (in_transaction) ! { ! AbortCurrentTransaction(); ! in_transaction = false; ! } ! else ! { ! /* ! * These operations are really just a minimal subset of ! * AbortTransaction(). We don't have very many resources to worry ! * about, but we do have LWLocks. ! */ ! LWLockReleaseAll(); ! AtEOXact_Files(); ! AtEOXact_HashTables(false); ! } /* * Now return to normal top-level context and clear ErrorContext for *** AutoVacWorkerMain(int argc, char *argv[] *** 1620,1626 * Note: if we have selected a
Re: [HACKERS] Feature request : add REMAP_SCHEMA-like option to pg_restore
On Mon, Aug 31, 2009 at 10:42 AM, Andrew Dunstan and...@dunslane.net wrote: Hope you find the idea interesting. I'm willing to test anything or add more specification to the feature. I must admit too this is a patch I'd like to write too (it would be my very first) but I don't know if my C skills are good enough to do so. Well, you have a much better chance of having it happen if you write the patch... people are usually willing to tell you what you did wrong and give a few hints on how to fix it. Or pay someone to do it. I gather this requirement is from a commercial user. There are plenty of hired guns available. Yep, that works too. ...Robert
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: To this end, InitPostgres has been split in two. The launcher only calls the first half, the rest of the callers have been patched to invoke the second half. This just seems truly messy :-(. Let me see if I can find something cleaner. BTW, is it *really* the case that the AV launcher won't need RecentGlobalXmin? The way the HOT stuff works, I think anything that examines heap pages at all had better have that set. regards, tom lane
Re: [HACKERS] combined indexes with Gist - planner issues?
Hans-Juergen Schoenig -- PostgreSQL wrote: my knowledge of how gist works internally is not too extensive. any kickstart idea would be appreciated. If there's not too many of those common words, you can create a simple partial b-tree index for each, and handle the less common words with the gist index you have (you can drop the display_price column from the index). Another idea: Create a table containing one row for each word in each product: CREATE TABLE t_product_word (id bigint, word text, display_price numeric(10,4)); with triggers to keep it up-to-date. You can then create a regular two column b-tree index on that: CREATE INDEX idx_word_price ON t_product_word (word, display_price); And query with: SELECT p.art_number, p.title FROM t_product p INNER JOIN t_product_word pw ON p.id = pw.id WHERE pw.word = 'harddisk' ORDER BY pw.display_price DESC LIMIT 10; The t_product_word table will be huge, but with a few gigabytes of data it should still be manageable. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
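Why the (word, display_price) index answers the query with a short scan can be mimicked with a sorted array. This is a toy model of mine, not a real btree and not Heikki's actual schema: entries are kept ordered by word, then by price descending, so the LIMIT query is just "walk the matching run and stop after k rows". The data and names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* One entry of the hypothetical t_product_word index. */
typedef struct
{
    const char *word;
    double      price;   /* display_price */
    int         id;      /* product id */
} WordEntry;

/* Order by word ASC, then price DESC -- the index's sort order. */
static int
cmp_entry(const void *a, const void *b)
{
    const WordEntry *x = a;
    const WordEntry *y = b;
    int c = strcmp(x->word, y->word);

    if (c != 0)
        return c;
    if (x->price != y->price)
        return x->price > y->price ? -1 : 1;
    return 0;
}

void
build_index(WordEntry *idx, size_t n)
{
    qsort(idx, n, sizeof(WordEntry), cmp_entry);
}

/*
 * Equivalent of:
 *   SELECT id FROM t_product_word
 *   WHERE word = $1 ORDER BY display_price DESC LIMIT k;
 * (a real btree would binary-search to the run instead of scanning)
 */
size_t
top_k_by_word(const WordEntry *idx, size_t n,
              const char *word, size_t k, int *out_ids)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < k; i++)
        if (strcmp(idx[i].word, word) == 0)
            out_ids[found++] = idx[i].id;
    return found;
}
```

The key property is that no sort happens at query time: the ORDER BY is satisfied by the index order, so only LIMIT-many entries are ever touched.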
Re: [HACKERS] combined indexes with Gist - planner issues?
hello ... we did some experiments with doing such a table. the problem is if you want to allow arbitrary combinations of words which can be modeled perfectly with FTI. you would instantly end up with a self join with 5 relations or so - which is again bad. there are too many common words to consider doing it partly with gist and partly with a btree. is there any option to adapt gist in a way that a combined index would make sense here? many thanks, hans Heikki Linnakangas wrote: Hans-Juergen Schoenig -- PostgreSQL wrote: my knowledge of how gist works internally is not too extensive. any kickstart idea would be appreciated. If there's not too many of those common words, you can create a simple partial b-tree index for each, and handle the less common words with the gist index you have (you can drop the display_price column from the index). Another idea: Create a table containing one row for each word in each product: CREATE TABLE t_product_word (id bigint, word text, display_price numeric(10,4)); with triggers to keep it up-to-date. You can then create a regular two column b-tree index on that: CREATE INDEX idx_word_price ON t_product_word (word, display_price); And query with: SELECT p.art_number, p.title FROM t_product p INNER JOIN t_product_word pw ON p.id = pw.id WHERE pw.word = 'harddisk' ORDER BY pw.display_price DESC LIMIT 10; The t_product_word table will be huge, but with a few gigabytes of data it should still be manageable. -- Cybertec Schoenig & Schoenig GmbH Reyergasse 9 / 2 A-2700 Wiener Neustadt Web: www.postgresql-support.de
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: To this end, InitPostgres has been split in two. The launcher only calls the first half, the rest of the callers have been patched to invoke the second half. This just seems truly messy :-(. Let me see if I can find something cleaner. I was considering having InitPostgres be an umbrella function, so that extant callers stay as today, but the various underlying pieces are skipped depending on who's calling. For example I didn't like the bit about starting a transaction or not depending on whether it was the launcher. BTW, is it *really* the case that the AV launcher won't need RecentGlobalXmin? The way the HOT stuff works, I think anything that examines heap pages at all had better have that set. Ugh. I forgot about that. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: This just seems truly messy :-(. Let me see if I can find something cleaner. I was considering having InitPostgres be an umbrella function, so that extant callers stay as today, but the various underlying pieces are skipped depending on who's calling. For example I didn't like the bit about starting a transaction or not depending on whether it was the launcher. Yeah. If you have InitPostgres know that much about the AV launcher's requirements, it's not clear why it shouldn't just know everything. Having it return with the initial transaction still open just seems completely horrid. BTW, is it *really* the case that the AV launcher won't need RecentGlobalXmin? The way the HOT stuff works, I think anything that examines heap pages at all had better have that set. Ugh. I forgot about that. We could possibly put if (IsAutovacuumLauncher()) { CommitTransactionCommand(); return; } right after the RelationCacheInitializePhase2 call. No uglier than what's here, for sure. While I was looking at this I wondered whether RelationCacheInitializePhase2 really needs to be inside the startup transaction at all. I think it could probably be moved up before that. However, if the AV launcher has to do GetTransactionSnapshot then it's not clear that improves matters anyway. regards, tom lane
Re: [HACKERS] XLogFlush
On Fri, Aug 21, 2009 at 1:18 AM, Jeff Janes jeff.ja...@gmail.com wrote:

Maybe this is one of those things that is obvious when someone points it out to you, but right now I am not seeing it. If you look at the last eight lines of this snippet from XLogFlush, you see that if we obtain WriteRqstPtr under the WALInsertLock, then we both write and flush up to the highest write request. But if we obtain it under the info_lck, then we write up to the highest write request but flush only up to our own record's flush request. Why the disparate treatment? The effect of this seems to be that when WALInsertLock is busy, group commits are suppressed.

I realized I was misinterpreting this. XLogWrite doesn't just flush up to WriteRqst.Flush, because fsync doesn't work that way. If it flushes at all (which I think it always will when invoked from XLogFlush, as otherwise XLogFlush would not call it), it will flush up to WriteRqst.Write anyway, even if WriteRqst.Flush is behind. So as long as record <= WriteRqst.Flush <= WriteRqst.Write, then it doesn't matter exactly what WriteRqst.Flush is.

The problem with group commit on a busy WALInsertLock is that if the xlogctl->LogwrtRqst.Write does get advanced by someone else, it is almost surely going to be while we are waiting on the WALWriteLock, and so too late for us to have discovered it when we previously checked under the protection of info_lck. We should probably have an else branch on the LWLockConditionalAcquire so that if it fails, we get the info_lck and check again for advancement of xlogctl->LogwrtRqst.Write. But since Simon is doing big changes as part of sync rep, I'll hold off on doing much experimentation on this until then.

		LWLockRelease(WALInsertLock);
		WriteRqst.Write = WriteRqstPtr;
		WriteRqst.Flush = WriteRqstPtr;
	}
	else
	{
		WriteRqst.Write = WriteRqstPtr;
		WriteRqst.Flush = record;
	}

Cheers, Jeff
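The two branches Jeff quotes can be isolated into a small pure function. This is a simplified model, not the real xlog.c: `shared_write_rqst` stands for `xlogctl->LogwrtRqst.Write` as read under `info_lck`, and `insert_pt` for the current insert position known only while `WALInsertLock` is held. It captures his invariant `record <= Flush <= Write`.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* simplified: flat 64-bit position */

typedef struct
{
    XLogRecPtr Write;                   /* write out WAL up to here */
    XLogRecPtr Flush;                   /* fsync at least up to here */
} XLogwrtRqst;

/*
 * Model of how XLogFlush picks its request.  With WALInsertLock held we
 * know the true end of inserted WAL and ask to write *and* flush all of
 * it; without it we only dare claim our own commit record as the flush
 * point, although XLogWrite's fsync will cover up to Write anyway.
 */
XLogwrtRqst
compute_write_rqst(XLogRecPtr record, XLogRecPtr shared_write_rqst,
                   bool got_insert_lock, XLogRecPtr insert_pt)
{
    XLogwrtRqst rqst;
    XLogRecPtr  write_ptr = shared_write_rqst;

    if (got_insert_lock)
    {
        if (insert_pt > write_ptr)
            write_ptr = insert_pt;      /* advance to end of inserted WAL */
        rqst.Write = write_ptr;
        rqst.Flush = write_ptr;
    }
    else
    {
        rqst.Write = write_ptr;
        rqst.Flush = record;            /* just our own record */
    }
    return rqst;
}
```

Jeff's group-commit point in this model: when `got_insert_lock` is false, any later advance of `shared_write_rqst` by another backend is invisible to us, so the extra WAL another committer inserted while we waited is not covered by our request.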
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: This just seems truly messy :-(. Let me see if I can find something cleaner. I was considering having InitPostgres be an umbrella function, so that extant callers stay as today, but the various underlying pieces are skipped depending on who's calling. For example I didn't like the bit about starting a transaction or not depending on whether it was the launcher. Yeah. If you have InitPostgres know that much about the AV launcher's requirements, it's not clear why it shouldn't just know everything. Having it return with the initial transaction still open just seems completely horrid. How about this? While I was looking at this I wondered whether RelationCacheInitializePhase2 really needs to be inside the startup transaction at all. I think it could probably be moved up before that. However, if the AV launcher has to do GetTransactionSnapshot then it's not clear that improves matters anyway. Well, the difference is that the initial transaction would be a few microsec shorter ... not sure if that matters. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. Index: src/backend/postmaster/autovacuum.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.103 diff -c -p -r1.103 autovacuum.c *** src/backend/postmaster/autovacuum.c 27 Aug 2009 17:18:44 - 1.103 --- src/backend/postmaster/autovacuum.c 31 Aug 2009 15:49:14 - *** AutoVacLauncherMain(int argc, char *argv *** 424,432 #endif /* ! * Set up signal handlers. Since this is an auxiliary process, it has ! * particular signal requirements -- no deadlock checker or sinval ! * catchup, for example. */ pqsignal(SIGHUP, avl_sighup_handler); --- 424,432 #endif /* ! * Set up signal handlers. We operate on databases much like a regular ! * backend, so we use the same signal handling. See equivalent code in ! * tcop/postgres.c. 
*/ pqsignal(SIGHUP, avl_sighup_handler); *** AutoVacLauncherMain(int argc, char *argv *** 451,459 * had to do some stuff with LWLocks). */ #ifndef EXEC_BACKEND ! InitAuxiliaryProcess(); #endif /* * Create a memory context that we will do all our work in. We do this so * that we can reset the context during error recovery and thereby avoid --- 451,461 * had to do some stuff with LWLocks). */ #ifndef EXEC_BACKEND ! InitProcess(); #endif + InitPostgres(NULL, InvalidOid, NULL, NULL); + /* * Create a memory context that we will do all our work in. We do this so * that we can reset the context during error recovery and thereby avoid *** AutoVacLauncherMain(int argc, char *argv *** 470,476 /* * If an exception is encountered, processing resumes here. * ! * This code is heavily based on bgwriter.c, q.v. */ if (sigsetjmp(local_sigjmp_buf, 1) != 0) { --- 472,478 /* * If an exception is encountered, processing resumes here. * ! * This code is a stripped down version of PostgresMain error recovery. */ if (sigsetjmp(local_sigjmp_buf, 1) != 0) { *** AutoVacLauncherMain(int argc, char *argv *** 483,496 /* Report the error to the server log */ EmitErrorReport(); ! /* ! * These operations are really just a minimal subset of ! * AbortTransaction(). We don't have very many resources to worry ! * about, but we do have LWLocks. ! */ ! LWLockReleaseAll(); ! AtEOXact_Files(); ! AtEOXact_HashTables(false); /* * Now return to normal top-level context and clear ErrorContext for --- 485,492 /* Report the error to the server log */ EmitErrorReport(); ! /* Abort the current transaction in order to recover */ ! AbortCurrentTransaction(); /* * Now return to normal top-level context and clear ErrorContext for *** autovac_balance_cost(void) *** 1784,1829 /* * get_database_list * ! * Return a list of all databases. Note we cannot use pg_database, ! * because we aren't connected; we use the flat database file. */ static List * get_database_list(void) { - char *filename; List *dblist = NIL; ! 
char		thisname[NAMEDATALEN];
!	FILE	   *db_file;
!	Oid			db_id;
!	Oid			db_tablespace;
!	TransactionId db_frozenxid;
!
!	filename = database_getflatfilename();
!	db_file = AllocateFile(filename, "r");
!	if (db_file == NULL)
!		ereport(FATAL,
!				(errcode_for_file_access(),
!				 errmsg("could not open file \"%s\": %m", filename)));
!	while (read_pg_database_line(db_file, thisname, &db_id,
!								 &db_tablespace, &db_frozenxid))
	{
!		avw_dbase  *avdb;
		avdb = (avw_dbase *) palloc(sizeof(avw_dbase));
!
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: This just seems truly messy :-(. Let me see if I can find something cleaner. I quite like the idea of splitting initialization into two phases, one that lets you access shared catalogs, and one to bind to a database. I didn't look into the details, though. I was considering having InitPostgres be an umbrella function, so that extant callers stay as today, but the various underlying pieces are skipped depending on who's calling. For example I didn't like the bit about starting a transaction or not depending on whether it was the launcher. Yeah. If you have InitPostgres know that much about the AV launcher's requirements, it's not clear why it shouldn't just know everything. Having it return with the initial transaction still open just seems completely horrid. Yeah, that sounds messy. Can AV launcher simply open a 2nd initial transaction? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] autovacuum launcher using InitPostgres
Heikki Linnakangas wrote: Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: This just seems truly messy :-(. Let me see if I can find something cleaner. I quite like the idea of splitting initialization into two phases, one that lets you access shared catalogs, and one to bind to a database. I didn't look into the details, though. The problem is that it only gives you access to pg_database, because the other shared catalogs require more relcache initialization than we do. So I'm not sure it'd be useful for anything else. I was considering having InitPostgres be an umbrella function, so that extant callers stay as today, but the various underlying pieces are skipped depending on who's calling. For example I didn't like the bit about starting a transaction or not depending on whether it was the launcher. Yeah. If you have InitPostgres know that much about the AV launcher's requirements, it's not clear why it shouldn't just know everything. Having it return with the initial transaction still open just seems completely horrid. Yeah, that sounds messy. Can AV launcher simply open a 2nd initial transaction? The main body of avlauncher opens a new transaction whenever it needs one. The problem is that the transaction in InitPostgres is closed in the second half -- the code I had was skipping StartTransactionCommand in the launcher case, but as Tom says that was wrong because it was failing to set RecentGlobalXmin. If we're looking at simplifying InitPostgres, one thing we could do is separate the part for the bootstrap process, which is like 10% of the work for a regular backend ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: How about this? I think the accounting for the AV launcher in shmem allocation is a bit confused yet --- for instance, isn't MaxBackends already including the launcher? I wonder if it would be cleaner to include the launcher in the autovacuum_max_workers parameter, and increase the min/default values of that by one. While I was looking at this I wondered whether RelationCacheInitializePhase2 really needs to be inside the startup transaction at all. I think it could probably be moved up before that. However, if the AV launcher has to do GetTransactionSnapshot then it's not clear that improves matters anyway. Well, the difference is that the initial transaction would be a few microsec shorter ... not sure if that matters. I can't see how it would. At the point where that runs we'd not be holding any locks of interest, so there's no concurrency benefit to be gained. With this setup it wouldn't make InitPostgres any cleaner anyway, so I'd leave it alone. regards, tom lane
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: Heikki Linnakangas wrote: I quite like the idea of splitting initialization into two phases, one that lets you access shared catalogs, and one to bind to a database. I didn't look into the details, though. The problem is that it only gives you access to pg_database, because the other shared catalogs require more relcache initialization than we do. So I'm not sure it'd be useful for anything else. The part I was unhappy about was falling out with the startup transaction still open (which the patch as written didn't do, but would have needed to given the snapshot issue). I don't object to trying to restructure InitPostgres along the above lines if it can be done cleanly. But on the third hand, Alvaro's right that there isn't any other obvious use for it. We've also got the client_encoding (GUC initialization timing) issue to fix in this area. I suggest that any major restructuring-for-beauty wait till after the dust settles. I think what we have here (in the revised patch) is ugly but not too ugly to live with. regards, tom lane
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: How about this? I think the accounting for the AV launcher in shmem allocation is a bit confused yet --- for instance, isn't MaxBackends already including the launcher? I wonder if it would be cleaner to include the launcher in the autovacuum_max_workers parameter, and increase the min/default values of that by one. Huh, yeah, sorry about that -- fixed here. I think the name of the param, which includes worker, precludes us from raising the values. Changes between v2 and v3:

diff -u src/backend/storage/lmgr/proc.c src/backend/storage/lmgr/proc.c
--- src/backend/storage/lmgr/proc.c	31 Aug 2009 13:36:56 -
+++ src/backend/storage/lmgr/proc.c	31 Aug 2009 16:14:08 -
@@ -103,7 +103,7 @@
 	/* AuxiliaryProcs */
 	size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
 	/* MyProcs, including autovacuum workers and launcher */
-	size = add_size(size, mul_size(MaxBackends + 1, sizeof(PGPROC)));
+	size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
 	/* ProcStructLock */
 	size = add_size(size, sizeof(slock_t));
@@ -192,6 +192,7 @@
 		ProcGlobal->freeProcs = &procs[i];
 	}
+	/* note: the +1 here accounts for the autovac launcher */
 	procs = (PGPROC *) ShmemAlloc((autovacuum_max_workers + 1) * sizeof(PGPROC));
 	if (!procs)
 		ereport(FATAL,
diff -u src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- src/backend/utils/misc/guc.c	31 Aug 2009 03:07:47 -
+++ src/backend/utils/misc/guc.c	31 Aug 2009 16:12:56 -
@@ -7570,7 +7570,7 @@
 static bool
 assign_maxconnections(int newval, bool doit, GucSource source)
 {
-	if (newval + autovacuum_max_workers > INT_MAX / 4)
+	if (newval + autovacuum_max_workers + 1 > INT_MAX / 4)
 		return false;
 	if (doit)

-- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.103
diff -c -p
-r1.103 autovacuum.c
*** src/backend/postmaster/autovacuum.c	27 Aug 2009 17:18:44 -0000	1.103
--- src/backend/postmaster/autovacuum.c	31 Aug 2009 15:49:14 -0000
*************** AutoVacLauncherMain(int argc, char *argv
*** 424,432 ****
  #endif

  	/*
! 	 * Set up signal handlers.  Since this is an auxiliary process, it has
! 	 * particular signal requirements -- no deadlock checker or sinval
! 	 * catchup, for example.
  	 */
  	pqsignal(SIGHUP, avl_sighup_handler);
--- 424,432 ----
  #endif

  	/*
! 	 * Set up signal handlers.  We operate on databases much like a regular
! 	 * backend, so we use the same signal handling.  See equivalent code in
! 	 * tcop/postgres.c.
  	 */
  	pqsignal(SIGHUP, avl_sighup_handler);
*************** AutoVacLauncherMain(int argc, char *argv
*** 451,459 ****
  	 * had to do some stuff with LWLocks).
  	 */
  #ifndef EXEC_BACKEND
! 	InitAuxiliaryProcess();
  #endif

  	/*
  	 * Create a memory context that we will do all our work in.  We do this so
  	 * that we can reset the context during error recovery and thereby avoid
--- 451,461 ----
  	 * had to do some stuff with LWLocks).
  	 */
  #ifndef EXEC_BACKEND
! 	InitProcess();
  #endif

+ 	InitPostgres(NULL, InvalidOid, NULL, NULL);
+
  	/*
  	 * Create a memory context that we will do all our work in.  We do this so
  	 * that we can reset the context during error recovery and thereby avoid
*************** AutoVacLauncherMain(int argc, char *argv
*** 470,476 ****
  	/*
  	 * If an exception is encountered, processing resumes here.
  	 *
! 	 * This code is heavily based on bgwriter.c, q.v.
  	 */
  	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
  	{
--- 472,478 ----
  	/*
  	 * If an exception is encountered, processing resumes here.
  	 *
! 	 * This code is a stripped down version of PostgresMain error recovery.
  	 */
  	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
  	{
*************** AutoVacLauncherMain(int argc, char *argv
*** 483,496 ****
  		/* Report the error to the server log */
  		EmitErrorReport();

! 		/*
! 		 * These operations are really just a minimal subset of
! 		 * AbortTransaction().  We don't have very many resources to worry
! 		 * about, but we do have LWLocks.
! 		 */
! 		LWLockReleaseAll();
! 		AtEOXact_Files();
! 		AtEOXact_HashTables(false);

  		/*
  		 * Now return to normal top-level context and clear ErrorContext for
--- 485,492 ----
  		/* Report the error to the server log */
  		EmitErrorReport();

! 		/* Abort the current transaction in order to recover */
! 		AbortCurrentTransaction();

  		/*
  		 * Now return to normal top-level context and clear ErrorContext for
*************** autovac_balance_cost(void)
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: I wonder if it would be cleaner to include the launcher in the autovacuum_max_workers parameter, and increase the min/default values of that by one. Huh, yeah, sorry about that -- fixed here. I think the name of the param, which includes worker, precludes from raising the values. Well, I'm not sure the average user knows or cares about the difference between the launcher and the workers. The thing that was in the back of my mind was that we would now have the option to have the launcher show up in pg_stat_activity. If we were to do that then the case for counting it in the user-visible number-of-processes parameter would get a lot stronger (enough to justify renaming the parameter, if you insist that the launcher isn't a worker). I don't however have any strong opinion on whether we *should* include it in pg_stat_activity --- comments? In the meantime, this looks reasonably sane in a fast read-through, but I saw a few comments that could use improvement, and I have not tried to actually review it (like look for missed places to change). Do you mind if I work it over for an hour or two? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add YAML option to explain
Greg Sabino Mullane <g...@turnstep.com> wrote:

> - Plan:
>     Node Type: Index Scan
>     Scan Direction: Forward
>     Index Name: pg_class_relname_nsp_index
>     Relation Name: pg_class
>     Alias: pg_class
>     Startup Cost: 0.00
>     Total Cost: 8.27
>     Plan Rows: 1
>     Plan Width: 0
>     Actual Startup Time: 0.019
>     Actual Total Time: 0.019
>     Actual Rows: 0
>     Actual Loops: 1
>     Index Cond: (relname = 'foo\bar\'::name)
>   Triggers:
>   Total Runtime: 0.058

+1 for including this format. On a ten point scale for human readability, I'd give this about a nine. It's something I'd be comfortable generating in order to annotate and include in an email to programmers or managers who wouldn't have a clue how to read the current text version of a plan. -Kevin
Re: [HACKERS] Linux LSB init script
Peter Eisentraut pete...@gmx.net wrote: If it's a new file, it's pointless to send a patch anyway. If people are OK with just sending the new file, that's cool with me. From the other posts, it appears that I need to have my own copy of the repository to do it as a patch, or download a tool. (Thanks to those who offered the suggestions.) You could also consider putting it in place of the linux file. Is the LSB standard sufficiently widely adopted to omit the older format? I was concerned that there might be Linux versions we wanted to support which wouldn't have a /lib/lsb/init-functions file to source. If that's not an issue, I could submit this as a patch to the existing file. (It'd be a - for almost every non-blank line in the old, and a + for almost every non-blank line in the new, of course.) -Kevin
Re: [HACKERS] 8.5 release timetable, again
On Mon, 2009-08-31 at 10:30 -0700, Josh Berkus wrote:
> > > Another solution would be to make major releases less frequent.
> > That's not a solution and you know it.
> > I do?
> Ok, here's the reasons it's not a solution: Per the above, it would
> not.  It would make things worse.  This has been true at every other
> OSS project I've seen documented (disastrously so with MySQL); there
> is no reason to believe that Postgres would be any different.  I also
> do not see why you are so resistant to the idea of documenting and
> tracking the post-CF steps so that we can get more people on them.

I love how we all have the same arguments, every year, year after year. So let me just throw in my proverbial two cents.

As I see it we can *NOT* increase our development time line. Open Source just doesn't work that way. People want it, and want it now. Period. It will alienate feature contributors and make us fodder for bad press (blogs, whatever) when we are lacking in some area where another isn't.

We can decrease our development cycle. We could do an Ubuntu (and similarly Fedora) style cycle where people that want the hot new features now, can. They would do this by using our 6-month releases, while stable enterprises would use our LTS release. This is kind of happening now with our new Alpha release status.

We can release annually and go all balls toward each release.

The second option seems to be the middle ground that we will settle on regardless of what arguments are presented. The third option is what I would like to see happen, which means we would actually have a 9-month development cycle/cutoff and a three-month alpha/beta/release.

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Re: [HACKERS] 8.5 release timetable, again
Bruce Momjian br...@momjian.us wrote: Yep, the bottom line here is that patches get into CVS, but issues come up related to the patch, and we keep looking for good fixes, but once the final commit-fest is over, we _have_ to fix these issues. If, hypothetically, it might hold up the release for two weeks while such issues are sorted out, might it be better to revert and say the patch missed the release because it wasn't workable enough at the end of the last CF to allow a beta release to be generated? If the net result was that a feature or two were delayed until the next release, but all developers had two more weeks of development time in the next release cycle, it seems like reverting would be a net gain. -Kevin
Re: [HACKERS] 8.5 release timetable, again
Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
On Mon, Aug 31, 2009 at 1:57 PM, Bruce Momjian <br...@momjian.us> wrote: Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. OK, I so request. :-) ...Robert
Re: [HACKERS] 8.5 release timetable, again
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: Yep, the bottom line here is that patches get into CVS, but issues come up related to the patch, and we keep looking for good fixes, but once the final commit-fest is over, we _have_ to fix these issues. If, hypothetically, it might hold up the release for two weeks while such issues are sorted out, might it be better to revert and say the patch missed the release because it wasn't workable enough at the end of the last CF to allow a beta release to be generated? If the net result was that a feature or two were delayed until the next release, but all developers had two more weeks of development time in the next release cycle, it seems like reverting would be a net gain. The problem is that many of these decisions are complex, so it gets no easier to make the decisions later rather than now. The delay forces us to make a final decision. We often had months to make the decision earlier, but didn't. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
Robert Haas wrote: On Mon, Aug 31, 2009 at 1:57 PM, Bruce Momjian <br...@momjian.us> wrote: Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. OK, I so request. :-) What do you want to know? Would someone post exactly what question I have not answered in the past? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
> > Another solution would be to make major releases less frequent.
> That's not a solution and you know it.
> I do?

Ok, here's the reasons it's not a solution:

1) Having a longer development cycle would frustrate many users who want new features sooner, not later. The current 1 year is a good compromise between reliability and "release often". A longer period would not be.

2) Lengthening the development period would make things less efficient. The amount of effort we need to test, document, integrate, package, etc., gets *greater* per patch when we have hundreds of patches. So if we *planned* an 18-month release, I expect that it would end up being a 24-month release.

3) If we deliberately lengthen the release cycle without doing anything about why the post-CF portion takes so long, it will continue to get longer, indefinitely. Eventually we're at 3.5-year releases and our users are abandoning Postgres for another database that can actually get a release out.

4) It does nothing to address the *contributor* complaint that the non-development part of our dev cycle is too long and keeps getting longer. A longer release cycle would make that worse.

If we could conceivably do a release every 4 months, I believe that it would be easy to keep the non-development portion of our cycle down to 30% or less. We can't, so we need to look at ways to speed up the work we're already doing.

> I have no idea how you know so much about me, but don't realize I was
> saying that we should extend the release cycle so we don't release as
> often, make major releases less frequent (every 12-14 months).  This
> has nothing to do with how we process the releases, parallel or not.

OK, to restate: making the cycle longer will not help the development-to-integration/testing ratio. It will make it worse.

> As I have said in the past, we are nearing feature-completeness (in a
> way), so having perhaps an 18-month release cycle is an idea.  That
> would give more time for development compared to beta, etc.
Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Re: [HACKERS] 8.5 release timetable, again
On Mon, Aug 31, 2009 at 1:59 PM, Bruce Momjian <br...@momjian.us> wrote: Robert Haas wrote: On Mon, Aug 31, 2009 at 1:57 PM, Bruce Momjian <br...@momjian.us> wrote: Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. OK, I so request. :-) What do you want to know? Would someone post exactly what question I have not answered in the past? I don't know whether there is a specific question that you have refused to answer in the past, or not. My suspicion is that there isn't, but perhaps someone else is aware of something I'm not. That having been said, I think there is a legitimate concern about organizing and documenting the steps that are required to get a release out the door. A number of people have said (on this thread and previous ones) that we didn't know what we were supposed to be doing during the period after the end of the last CommitFest and prior to release. It appeared that all of the activity (to the extent that there was any activity) was by committers, particularly you and Tom. Well, OK. We want the release to happen faster next time. We're willing to help. In order to help, we first need a list of the tasks that need to be completed after the last CommitFest and before release. Then we can try to figure out whether any of those tasks can be done (or assisted with) by someone other than you or Tom. Can you provide one? ...Robert
Re: [HACKERS] 8.5 release timetable, again
On Mon, 2009-08-31 at 13:59 -0400, Bruce Momjian wrote: Robert Haas wrote: On Mon, Aug 31, 2009 at 1:57 PM, Bruce Momjian <br...@momjian.us> wrote: Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. OK, I so request. :-) What do you want to know? Would someone post exactly what question I have not answered in the past? This is a fair point. I bet 10 bucks that a lot of the questions that would be asked would be answered with "check the archives". Didn't we do a release wiki page at one point? Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
Re: [HACKERS] 8.5 release timetable, again
Bruce,

> Huh, who has asked for a list from me?  This entire post is mostly
> over-the-top and not worth responding to.

To quote myself:

Post-CF: Make a list (now, not in January) of the tasks which need to be done between CF-end and Beta. We'll find that some of them could be done by someone other than Tom and Bruce, and that others could be done before CF-end.

Beta: Create a mailing list (why don't we have a list for testers? is testing less important than the JDBC driver?) and a simple web app or templated wiki page for testers. Allow people to check in with which version they tested (Alpha 1,2,3, Beta 1,2,3) and what tests they ran, and issues encountered (if any). We should do this now so we can get started with Alpha testing. When this testing gets into swing, we can also look at recruiting volunteers to run various popular OSS apps' test suites against the developing version of PostgreSQL.

Once beta starts, have a list of pending issues in some editable/commentable location (wiki is ok for this, or we can pervert the commitfest app) *as soon as those issues arise* so that as many hackers as possible can work on those issues. We did do a pending issues list for 8.4, but not until we were already over a month into beta, and then the list wasn't very accurate.

Therefore: I will create a "cleanup issues" wiki page. Please contribute to it by listing the *general* kinds of things you need to do between CF and beta. Then we can look at which things don't need your special experience and could be done by other contributors.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Re: [HACKERS] set_client_encoding is broken
Tom Lane wrote on Mon, 31 Aug 2009 at 11:00 -0400:
> 3. Push the startup-packet GUC processing (approx. lines 3340..3395 of
> postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during
> the startup transaction.  This is not too unclean, though it would mean
> exporting process_postgres_switches() from postgres.c; I guess the main
> thing I don't like about it is that InitPostgres has enough weird
> responsibilities already.  I'm leaning to the third choice, but I
> wonder if anyone has any comments or better ideas.

It seems to me that option 3 is OK. Another possibility is that InitPostgres only fills up the relcache, and the GUC processing stays in the same place. But in general, this problem can affect any other GUC variable which has an assign hook and needs to do a lookup. I don't know how it worked before, but I'm afraid that the user can get an error message in the server encoding before the client encoding is correctly set.

Zdenek
Re: [HACKERS] 8.5 release timetable, again
Robert Haas wrote: That having been said, I think there is a legitimate concern about organizing and documenting the steps that are required to get a release out the door. A number of people have said (on this thread and previous ones) that we didn't know what we were supposed to be doing during the period after the end of the last CommitFest and prior to release. It appeared that all of the activity (to the extent that there was any activity) was by committers, particularly you and Tom. Well, OK. We want the release to happen faster next time. We're willing to help. In order to help, we first need a list of the tasks that need to be completed after the last CommitFest and before release. Then we can try to figure out whether any of those tasks can be done (or assisted with) by someone other than you or Tom. Can you provide one? Well, at the end of the release I have a mailbox full of open items that I think need to be addressed before we go into beta. Tom has a similar list. I usually put my mbox file up on a web site, and sometimes it is transferred to a wiki by others. Knowing about the problems usually isn't hard (e.g. \df), but getting agreement on them is. One nifty idea would be to do a commit-fest for open items so we can get to beta. The last commit-fest usually is long because we can't postpone patches easily and often we are not 100% sure how to apply them either, which makes it extra-long. I am not sure what other checklist items there would be (or I am refusing to divulge). -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
Joshua D. Drake wrote: On Mon, 2009-08-31 at 13:59 -0400, Bruce Momjian wrote: Robert Haas wrote: On Mon, Aug 31, 2009 at 1:57 PM, Bruce Momjian <br...@momjian.us> wrote: Josh Berkus wrote: Per the above, it would not. It would make things worse. This has been true at every other OSS project I've seen documented (disastrously so with MySQL); there is no reason to believe that Postgres would be any different. I also do not see why you are so resistant to the idea of documenting and tracking the post-CF steps so that we can get more people on them. Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. OK, I so request. :-) What do you want to know? Would someone post exactly what question I have not answered in the past? This is a fair point. I bet 10 bucks that a lot of the questions that would be asked would be answered with "check the archives". Didn't we do a release wiki page at one point? Yes, we have a wiki for open items for the current major release. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
Josh Berkus wrote: Bruce, Huh, who has asked for a list from me? This entire post is mostly over-the-top and not worth responding to. To quote myself: Post-CF: Make a list (now, not in January) of the tasks which need to be done between CF-end and Beta. We'll find that some of them could be done by someone other than Tom and Bruce, and that others could be done before CF-end. Beta: Create a mailing list (why don't we have a list for testers? is testing less important than the JDBC driver?) and a simple web app or templated wiki page for testers. Allow people to check in with which version they tested (Alpha 1,2,3, Beta 1,2,3) and what tests they ran, and issues encountered (if any). We should do this now so we can get started with Alpha testing. When this testing gets into swing, we can also look at recruiting volunteers to run various popular OSS apps' test suites against the developing version of PostgreSQL. Once beta starts, have a list of pending issues in some editable/commentable location (wiki is ok for this, or we can pervert the commitfest app) *as soon as those issues arise* so that as many hackers as possible can work on those issues. We did do a pending issues list for 8.4, but not until we were already over a month into beta, and then the list wasn't very accurate. Therefore: I will create a "cleanup issues" wiki page. Please contribute to it by listing the *general* kinds of things you need to do between CF and beta. Then we can look at which things don't need your special experience and could be done by other contributors. That was a request for me to answer? I had no idea. The issues are different for every commitfest-beta period, so I have no idea what to list there, but we do always have an open-items wiki that is maintained, at least for the most recent releases. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
On Mon, Aug 31, 2009 at 2:20 PM, Bruce Momjian <br...@momjian.us> wrote: Knowing about the problems usually isn't hard (e.g. \df), but getting agreement on them is. One nifty idea would be to do a commit-fest for open items so we can get to beta. I like that idea very much. The last commit-fest usually is long because we can't postpone patches easily and often we are not 100% sure how to apply them either, which makes it extra-long. I am not sure what other checklist items there would be (or I am refusing to divulge). LOL. Well, there are things like release notes... and maybe others? ...Robert
Re: [HACKERS] 8.5 release timetable, again
Bruce Momjian br...@momjian.us wrote: The issues are different for every commitfest-beta period, so I have no idea what to list there, but we do always have an open-items wiki that is maintained, at least for the most recent releases. After a quick search of the wiki, it appears that the list for 8.4 was: http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items and that there is not yet a list for 8.5. Is that correct? If I understand what you're saying, this list would contain issues where a patch was committed and later found to have problems which need to be fixed. Did I understand that correctly? Anything else go on there, or possibly belong on there? Can we take the absence of a list for 8.5 to indicate that no such problems have been found with any patches committed since 8.4 was tagged? -Kevin
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: While I was looking at this I wondered whether RelationCacheInitializePhase2 really needs to be inside the startup transaction at all. I think it could probably be moved up before that. However, if the AV launcher has to do GetTransactionSnapshot then it's not clear that improves matters anyway. Well, the difference is that the initial transaction would be a few microsec shorter ... not sure if that matters. Actually, there is a better way to do this: if we move up the RelationCacheInitializePhase2 call, then we can have the AV launcher case just fall out *before* the transaction start. It can do GetTransactionSnapshot inside its own transaction that reads pg_database. This is a better solution because it'll have a more up-to-date version of RecentGlobalXmin while scanning pg_database. (Indeed, I think this might be *necessary* over the very long haul.) I think I've got the signal handling cleaned up, but need to test. Is there any really good reason why autovacuum has its own avl_quickdie instead of using quickdie() for SIGQUIT? regards, tom lane
Re: [HACKERS] 8.5 release timetable, again
Bruce Momjian br...@momjian.us wrote: it gets no easier to make the decisions later rather than now. The delay forces us to make a final decision. We often had months to make the decision earlier, but didn't. So you're advocating that we find a way to force more timely decisions? -Kevin
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Actually, there is a better way to do this: if we move up the RelationCacheInitializePhase2 call, then we can have the AV launcher case just fall out *before* the transaction start. It can do GetTransactionSnapshot inside its own transaction that reads pg_database. This is a better solution because it'll have a more up-to-date version of RecentGlobalXmin while scanning pg_database. (Indeed, I think this might be *necessary* over the very long haul.) Hmm, good idea. I think I've got the signal handling cleaned up, but need to test. Is there any really good reason why autovacuum has its own avl_quickdie instead of using quickdie() for SIGQUIT? No, probably I just copied it because the others were static. Not sure about quickdie() itself. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] autovacuum launcher using InitPostgres
Tom Lane wrote: Alvaro Herrera <alvhe...@commandprompt.com> writes: Tom Lane wrote: I wonder if it would be cleaner to include the launcher in the autovacuum_max_workers parameter, and increase the min/default values of that by one. Huh, yeah, sorry about that -- fixed here. I think the name of the param, which includes "worker", precludes raising the values. Well, I'm not sure the average user knows or cares about the difference between the launcher and the workers. The thing that was in the back of my mind was that we would now have the option to have the launcher show up in pg_stat_activity. If we were to do that then the case for counting it in the user-visible number-of-processes parameter would get a lot stronger (enough to justify renaming the parameter, if you insist that the launcher isn't a worker). I don't however have any strong opinion on whether we *should* include it in pg_stat_activity --- comments? The user may not care about the difference, but there's a point in having the limit be the simpler concept of "this is the maximum number of processes running vacuum at any time". The launcher is very uninteresting to users. In the meantime, this looks reasonably sane in a fast read-through, but I saw a few comments that could use improvement, and I have not tried to actually review it (like look for missed places to change). Do you mind if I work it over for an hour or two? Please go ahead. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Hot Standby, conflict cache
On Mon, 2009-08-31 at 15:50 +0300, Heikki Linnakangas wrote: The conflict cache code is broken I've already removed that code from the version I'm working on as mentioned on another thread, for the same reasons you just mentioned. I think that the conflict options require more discussion and are an area that many people will wish input into; we need a longer and wider discussion about user-controls for that. I suggest we get the basic patch ready to commit and then add back the gloss later, if any. There is some hope for a per-relation conflict cache, just not yet. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Linux LSB init script
Andrew Dunstan and...@dunslane.net wrote: cvsutils

That allowed me to use 'cvsdo add' and 'cvs diff -cN' to generate the attached. This contains a couple minor fixes to what I posted in new file form. I was holding off on the CommitFest entry until I sorted out the format; I'll link here as version 1 of the patch.

-Kevin

Index: contrib/start-scripts/linux-lsb
===
RCS file: contrib/start-scripts/linux-lsb
diff -N contrib/start-scripts/linux-lsb
*** /dev/null	1 Jan 1970 00:00:00 -
--- contrib/start-scripts/linux-lsb	31 Aug 2009 19:02:11 -
***
*** 0
--- 1,298
+ #! /bin/sh
+
+ ### BEGIN INIT INFO
+ # Provides: postgresql
+ # Required-Start: $local_fs $network $syslog
+ # Should-Start: $remote_fs $named $time
+ # Required-Stop: $local_fs $network $syslog
+ # Should-Stop: $remote_fs $named
+ # Default-Start: 2 3 4 5
+ # Default-Stop: 0 1 6
+ # Short-Description: PostgreSQL RDBMS
+ # Description: PostgreSQL RDBMS service.
+ #              The world's most advanced open source database.
+ #              See http://www.postgresql.org/ for more information.
+ ### END INIT INFO
+
+ # This is an example of a Linux LSB conforming init script.
+ # See http://refspecs.freestandards.org/ for more information on LSB.
+
+ # Original author: Kevin Grittner
+
+ # $PostgreSQL$
+
+ # The only edits needed should be in the INIT INFO block above
+ # and between the lines of dashes below. If any other
+ # changes are needed, or you find a way to enhance the script,
+ # consider posting to the PostgreSQL hackers list.
+
+ # Installation prefix
+ prefix=/usr/local/pgsql
+
+ # Data directory
+ PGDATA=/var/local/pgsql/data
+
+ # Who to run the postmaster as, usually postgres. (NOT root)
+ PGUSER=postgres
+
+ # Where to keep a log file
+ PGLOG=$PGDATA/serverlog
+
+ # The path that is to be used for the script
+ PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
+
+ # The LSB functions must be present.
+ lsbf=/lib/lsb/init-functions
+ test -r $lsbf || {
+     echo "$0: not able to read $lsbf: script cannot run" 1>&2
+     exit 5
+ }
+
+ # Source the functions.
+ . $lsbf
+ # All output from the script should be through the LSB msg functions after this.
+
+ # Define usage string, used in more than one place.
+ usage="Usage: $0 {start|stop|restart|try-restart|reload|force-reload|status}"
+
+ # Check that we have one parameter: action
+ if [ $# -ne 1 ] ; then
+     if [ $# -lt 1 -o "$1" = "" ] ; then
+         log_failure_msg "$0: action not specified"
+     else
+         log_failure_msg "$0: too many parameters"
+     fi
+     log_warning_msg "$usage"
+     exit 2
+ fi
+ action="$1"
+
+ # What to use to manipulate the postmaster.
+ PGCTL="$prefix/bin/pg_ctl"
+
+ # Only start if we can find pg_ctl.
+ test -x $PGCTL || {
+     if [ "$action" = "stop" ] ; then
+         log_warning_msg "$0: executable $PGCTL not found: $action request ignored"
+         exit 0
+     else
+         log_failure_msg "$0: executable $PGCTL not found: $action request failed"
+         exit 5
+     fi
+ }
+
+ pidfile="$PGDATA/postmaster.pid"
+ servicename=$( basename "$0" )
+ daemon="$prefix/bin/postgres"
+
+ pg_initd_start () {
+     echo -n "Starting $servicename: "
+     su -c ". '$lsbf' ; start_daemon -p '$pidfile' '$PGCTL' -w -D '$PGDATA' -l '$PGLOG' start" - $PGUSER
+     rc=$?
+ }
+
+ pg_initd_stop () {
+     pidlist=$( pidofproc -p "$pidfile" "$daemon" )
+     if [ "$pidlist" = "" ] ; then
+         # If this happens, the process went away after the initial check.
+         echo "$servicename: not running"
+         rc=0
+         return
+     fi
+     echo -n "Shutting down $servicename: "
+     su -c ". \"$lsbf\" ; killproc -p '$pidfile' '$daemon' SIGINT" - $PGUSER
+     echo -n 'waiting for server to stop...'
+     rc=1
+     # Try fast shutdown for a while.
+     for seconds in $( seq 50 ) ; do
+         echo -n '.'
+         if ! ps -o pid= -p $pidlist > /dev/null ; then
+             rc=0
+             break
+         fi
+         sleep 1
+     done
+     # Fast didn't do it; try immediate shutdown.
+     if [ $rc -ne 0 ] ; then
+         su -c ". \"$lsbf\" ; killproc -p '$pidfile' '$daemon' SIGQUIT" - $PGUSER
+         for seconds in $( seq 10 ) ; do
+             echo -n '!'
+             if ! ps -o pid= -p $pidlist > /dev/null ; then
+                 rc=0
+                 break
+             fi
+             sleep 1
+         done
+     fi
+     ! ps -o pid= -p $pidlist > /dev/null
+     rc=$?
+     if [ $rc -eq 0 ] ; then
+         echo ' done'
+         rm -f "$pidfile"
+     else
+         echo ' failed'
+     fi
+ }
+
+ pg_initd_reload () {
+     su -c "$PGCTL reload -D '$PGDATA'" - $PGUSER
+     rc=$?
+ }
+
+ pg_initd_status () {
+     if [ ! -f "$pidfile" ] ; then
+         rc=3
+     else
+         su -c ". \"$lsbf\" ; pidofproc -p '$pidfile' '$daemon'" - $PGUSER 1>/dev/null
+         rc=$?
+         if [ $rc -eq 0 ] ; then
+             su
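The stop logic in pg_initd_stop (fast shutdown via SIGINT, escalating to an immediate shutdown via SIGQUIT if the postmaster lingers) can be sketched as a generic escalation loop. This is an illustrative Python model only, not part of the script; the is_stopped/send_sigint/send_sigquit callables are hypothetical stand-ins for the real ps and killproc invocations:

```python
import time

def wait_for(predicate, seconds, tick):
    """Poll predicate once per iteration, up to `seconds` tries; True if it held."""
    for _ in range(seconds):
        if predicate():
            return True
        print(tick, end="")
        time.sleep(0)  # the real script sleeps 1 second per iteration
    return False

def stop_server(is_stopped, send_sigint, send_sigquit):
    """Mirror pg_initd_stop: fast shutdown for ~50s, then immediate for ~10s."""
    send_sigint()                      # pg_ctl-style fast shutdown
    if wait_for(is_stopped, 50, "."):
        return 0
    send_sigquit()                     # escalate to immediate shutdown
    if wait_for(is_stopped, 10, "!"):
        return 0
    return 1                           # still running: report failure

# Toy postmaster that ignores the fast shutdown but stops on SIGQUIT:
state = {"running": True}
rc = stop_server(
    is_stopped=lambda: not state["running"],
    send_sigint=lambda: None,
    send_sigquit=lambda: state.update(running=False),
)
# → rc == 0
```

The two-phase structure is what makes the script LSB-friendly: it reports success (rc 0) only once the process is confirmed gone, which is what the later status checks depend on.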
Re: [HACKERS] Linux LSB init script
On Mon, 31 Aug 2009, Kevin Grittner wrote: Is the LSB standard sufficiently widely adopted to omit the older format? I'm not sure, and I'm similarly not sure just what the benefits of a LSB compatible script are here given that several major distributions like RHEL/Debian have their own thing they're doing and are unlikely to change. Given that there was recent chatter on removing the Linux init scripts altogether, I think that the first thing to do here is survey where the current script and a LSB-based one might fit into current/future Linux init script plans on the most popular platforms. Your code is interesting but I'm not sure what problem it's intended to solve yet. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 8.5 release notes idea
Just noticed in the release notes for 8.4, there are two items accredited to just Greg (but there are three of us Gregs who contributed to 8.4 and are in the notes). While this is very likely Greg Stark, due to the context, I think at this point in the project we simply should spell out everyone's complete name every time, rather than the hit or miss style we are developing. (Enjoy while you can, Bruce, we'll get another Bruce to contribute someday! Zoltan might be harder... :) -- Greg Sabino Mullane g...@turnstep.com PGP Key: 0x14964AC8
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: The only thing I'm aware is missing from this patch is fixing up avlauncher's signal handling, and adding a bit more commentary; also I haven't tested it under EXEC_BACKEND yet. I did the signal handling work and fixed a couple of small oversights, and applied it. I'm not sure what other commentary you had in mind, but feel free to add. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release notes idea
On Mon, 31 Aug 2009, Greg Sabino Mullane wrote: Just noticed in the release notes for 8.4, there are two items accredited to just Greg (but there are three of us Gregs who contributed to 8.4 and are in the notes). While this is very likely Greg Stark, due to the context... That's correct--you and I are spelled out completely everywhere we show up there, so all the Greg references without a last name are to Greg Stark. I would not be surprised to find so many Greg Smiths one day that we all end up identified by SHA-1 hash. Enjoy while you can, Bruce, we'll get another Bruce to contribute someday! I think any Bruce or Tom who tries to submit a patch will have to be given a funny nickname instead. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] autovacuum launcher using InitPostgres
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: Well, I'm not sure the average user knows or cares about the difference between the launcher and the workers. The thing that was in the back of my mind was that we would now have the option to have the launcher show up in pg_stat_activity. If we were to do that then the case for counting it in the user-visible number-of-processes parameter would get a lot stronger (enough to justify renaming the parameter, if you insist that the launcher isn't a worker). I don't however have any strong opinion on whether we *should* include it in pg_stat_activity --- comments? The user may not care about the difference, but there's a point in having the limit be the simpler concept of this is the maximum amount of processes running vacuum at any time. The launcher is very uninteresting to users. I committed things that way, but I'm still not convinced that we shouldn't expose the launcher in pg_stat_activity. The thing that is bothering me is that it is now able to take locks and potentially could block some other process or even participate in a deadlock. Do we really want to have entries in pg_locks that don't match any entry in pg_stat_activity? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Linux LSB init script
Greg Smith gsm...@gregsmith.com wrote: I'm similarly not sure just what the benefits of a LSB compatible script are here given that several major distributions like RHEL/Debian have their own thing they're doing and are unlikely to change. I don't know about other platforms, but on SuSE Linux, you're not likely to get things installed properly to start and stop in the right timing with other services unless you have a good INIT INFO block and use the appropriate OS tools to install it. You might get it right for the time being by adding a bunch of symlinks by hand, but it'd be liable to break next time something was installed by the book. Given that there was recent chatter on removing the Linux init scripts altogether, I thought that suggestion got rather a cool reception... http://archives.postgresql.org/pgsql-hackers/2009-08/msg01393.php http://archives.postgresql.org/pgsql-hackers/2009-08/msg01394.php http://archives.postgresql.org/pgsql-hackers/2009-08/msg01398.php I think that the first thing to do here is survey where the current script and a LSB-based one might fit into current/future Linux init script plans on the most popular platforms. Your code is interesting but I'm not sure what problem it's intended to solve yet. The current linux script, and the techniques recommended so far, don't play well in an environment where you want the LSB INIT INFO specifications of the services to ensure that each services waits until the right time to start. It's still somewhat flawed, in that PostgreSQL doesn't give you a way to wait until it's ready to accept connections: http://archives.postgresql.org/pgsql-hackers/2009-08/msg01735.php but this script could be expanded to deal with that better. I see it as a pretty solid base to build on. 
I think it might be premature to try to address that issue because of the interest in creating a pg_ping functionality, which is what would make this nice and clean: http://archives.postgresql.org/pgsql-hackers/2009-08/msg01741.php I didn't proceed to try to write up a solid patch which I felt suitable for public distribution without someone seconding the motion, as it were: http://archives.postgresql.org/pgsql-hackers/2009-08/msg01780.php Let me know if you have any concerns I didn't address. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
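The gap discussed above, that PostgreSQL gives an init script no way to wait until the server is ready to accept connections, comes down to polling some readiness probe until it succeeds or a deadline passes, which is what a pg_ping facility would make clean. A minimal sketch, assuming nothing beyond the standard library; tcp_probe is a hypothetical, crude stand-in for a real protocol-level check:

```python
import socket
import time

def wait_until_ready(check, timeout=30.0, interval=0.5):
    """Poll `check` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False

def tcp_probe(host="localhost", port=5432):
    """Crude readiness stand-in: can we open a TCP connection at all?
    A real pg_ping-style probe would also confirm the server completes
    protocol startup rather than rejecting connections during recovery."""
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False

# Example: a check that is already true returns immediately.
ready = wait_until_ready(lambda: True, timeout=1.0)
```

The design point is that the polling loop and the probe are separable; once a trustworthy probe exists (pg_ping), the init script's start action can block on it and report an honest LSB status code.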
Re: [HACKERS] set_client_encoding is broken
Zdenek Kotala zdenek.kot...@sun.com writes: Tom Lane wrote on Mon, 31 Aug 2009 at 11:00 -0400: I'm leaning to the third choice, but I wonder if anyone has any comments or better ideas. It seems to me that 3 is OK. Another possibility is that InitPostgres can only fill up the rel cache and GUC processing can stay in the same place. But in general, this problem can affect any other GUC variable which has an assign hook and needs to do a lookup. Yeah, if it was *only* client_encoding then I wouldn't mind a hack solution too much, but search_path is similarly affected and it's not hard to foresee other GUCs in future that might require catalog access. So I'd prefer a reasonably clean solution. I don't know how it worked before, but I'm afraid that the user can get an error message in the server encoding before it is correctly set. It's always been the case that messages could come out before we can set client_encoding. I believe we have things set up so that you'll get the untranslated, plain-ASCII-English message in that situation. Feel free to test ;-) regards, tom lane
Re: [HACKERS] Linux LSB init script
On mån, 2009-08-31 at 12:07 -0500, Kevin Grittner wrote: Is the LSB standard sufficiently widely adopted to omit the older format? I was concerned that there might be Linux versions we wanted to support which wouldn't have a /lib/lsb/init-functions file to source. If that's not an issue, I could submit this as a patch to the existing file. (It'd be a - for almost every non-blank line in the old, and a + for almost every non-blank line in the new, of course.) While the major distributions support LSB, the major distributions also have PostgreSQL packages available and so will likely not need the init script shipped in the source. It will most likely be useful for various do-it-yourself setups on fringe distributions. So I don't know; it might be best to keep both, if they are maintained. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add YAML option to explain
On Mon, Aug 31, 2009 at 02:15:08PM -, Greg Sabino Mullane wrote: Greg, can we see a few examples of the YAML output compared to both json and text? ...

greg=# explain (format json, analyze on) select * from pg_class where relname ~ 'x' order by 1,2,3;

  QUERY PLAN
  ----------

An interesting property of json: it is almost exactly the same as python data structure syntax. If I paste the following into python:

plan = [
  {
    "Plan": {
      "Node Type": "Sort",
      "Startup Cost": 12.82,
      "Total Cost": 13.10,
      "Plan Rows": 111,
      "Plan Width": 185,
      "Actual Startup Time": 1.152,
      "Actual Total Time": 1.373,
      "Actual Rows": 105,
      "Actual Loops": 1,
      "Sort Key": ["relname", "relnamespace", "reltype"],
      "Sort Method": "quicksort",
      "Sort Space Used": 44,
      "Sort Space Type": "Memory",
      "Plans": [
        {
          "Node Type": "Seq Scan",
          "Parent Relationship": "Outer",
          "Relation Name": "pg_class",
          "Alias": "pg_class",
          "Startup Cost": 0.00,
          "Total Cost": 9.05,
          "Plan Rows": 111,
          "Plan Width": 185,
          "Actual Startup Time": 0.067,
          "Actual Total Time": 0.817,
          "Actual Rows": 105,
          "Actual Loops": 1,
          "Filter": "(relname ~ 'x'::text)"
        }
      ]
    },
    "Triggers": [],
    "Total Runtime": 1.649
  }
]

I get a python data structure. Which can be manipulated directly, or pretty printed:

import pprint
pprint.pprint(plan)
[{'Plan': {'Actual Loops': 1,
           'Actual Rows': 105,
           'Actual Startup Time': 1.1519,
           'Actual Total Time': 1.373,
           'Node Type': 'Sort',
           'Plan Rows': 111,
           'Plan Width': 185,
           'Plans': [{'Actual Loops': 1,
                      'Actual Rows': 105,
                      'Actual Startup Time': 0.067004,
                      'Actual Total Time': 0.81695,
                      'Alias': 'pg_class',
                      'Filter': "(relname ~ 'x'::text)",
                      'Node Type': 'Seq Scan',
                      'Parent Relationship': 'Outer',
                      'Plan Rows': 111,
                      'Plan Width': 185,
                      'Relation Name': 'pg_class',
                      'Startup Cost': 0.0,
                      'Total Cost': 9.0507}],
           'Sort Key': ['relname', 'relnamespace', 'reltype'],
           'Sort Method': 'quicksort',
           'Sort Space Type': 'Memory',
           'Sort Space Used': 44,
           'Startup Cost': 12.82,
           'Total Cost': 13.1},
  'Total Runtime': 1.649,
  'Triggers': []}]

I'm not sure if all json can be read this way, but the python and json notations are very similar.
-dg -- David Gould da...@sonic.net 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
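Following up on the observation above: the EXPLAIN (FORMAT JSON) output is not just Python-literal-compatible, it is valid JSON, so the stdlib json module parses it directly and the resulting tree can be walked programmatically. A sketch with a trimmed-down plan; the field names follow the output quoted above, but the values here are abridged for brevity:

```python
import json

# A cut-down EXPLAIN (FORMAT JSON) result, abridged from the sample above.
raw = '''
[
  {
    "Plan": {
      "Node Type": "Sort",
      "Total Cost": 13.10,
      "Sort Key": ["relname", "relnamespace", "reltype"],
      "Plans": [
        {
          "Node Type": "Seq Scan",
          "Relation Name": "pg_class",
          "Filter": "(relname ~ 'x'::text)"
        }
      ]
    },
    "Total Runtime": 1.649
  }
]
'''
plan = json.loads(raw)

def node_types(node):
    """Walk the plan tree top-down, yielding each node's type."""
    yield node["Node Type"]
    for child in node.get("Plans", []):
        yield from node_types(child)

types = list(node_types(plan[0]["Plan"]))
# → ['Sort', 'Seq Scan']
```

This is the practical argument for a machine-readable EXPLAIN format: tools can consume it with any JSON parser rather than scraping the text output.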
Re: [HACKERS] Linux LSB init script
Peter Eisentraut pete...@gmx.net wrote: While the major distributions support LSB, the major distributions also have PostgreSQL packages available and so will likely not need the init script shipped in the source. My counter-argument to that would be that the SuSE distribution's version of PostgreSQL is so out-of-date that we don't install it. It also doesn't enable debug info or integer date times. So we switched to build from source before we actually got as far as loading any data. I'm inclined to recommend the same to others. it might be best to keep both, if they are maintained. Sounds good to me; although, now that there is a full LSB version, I should probably withdraw my meager suggested patch to the existing linux script, eh? (If they're using an LSB conforming implementation, they'll want the linux-lsb script, and if they're not, the suggested patch has no point, I think.) Unless someone thinks otherwise, I'll drop that patch to the linux script from the CF page. Any thoughts on what that script needs, if anything? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \d+ for long view definitions?
Peter Eisentraut pete...@gmx.net writes: On sön, 2009-08-30 at 18:43 -0400, Tom Lane wrote: Seems like a more general answer would be for \d output to go through the pager ... That should also be fixed, but I'm not sure if it really does it for me. Why not? Just quit out of the pager when you've seen enough. If the view definition precedes other data that is deemed more important, then we'd need to adjust the ordering, but I'm not entirely seeing the point of having to suppress the definition. I especially don't like the thought of making it depend on the length of the definition. I would prefer \d not showing it at all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
On sön, 2009-08-30 at 21:09 -0400, Robert Haas wrote: I really can't understand why it isn't possible for us to find a way to make an annual release happen, and with more than 8-12 weeks of development time (ie non-CommitFest non-beta) available during that year. I understand that there is a need for some time to be set aside for reviewing, testing, debugging, packaging, and so forth, but the current schedule contemplates that this time is upwards of 75%, and I think that's excessive. Well, the best way to improve that is to organize people and push things forward when the time comes.
Re: [HACKERS] 8.5 release notes idea
On Mon, Aug 31, 2009 at 8:47 PM, Greg Smith gsm...@gregsmith.com wrote: That's correct--you and I are spelled out completely everywhere we show up there, so all the Greg references without a last name are to Greg Stark. Well they're intended to be. I'm not sure I actually contributed much to those though. -- greg http://mit.edu/~gsstark/resume.pdf
Re: [HACKERS] 8.5 release timetable, again
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: it gets no easier to make the decisions later rather than now. The delay forces us to make a final decision. We often had months to make the decision earlier, but didn't. So you're advocating that we find a way to force more timely decisions? That would be good. :-) -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release timetable, again
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: The issues are different for every commitfest-beta period, so I have no idea what to list there, but we do always have an open issues wiki that is maintained, at least for the most recent releases. After a quick search of the wiki, it appears that the list for 8.4 was: http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items and that there is not yet a list for 8.5. Is that correct? If I understand what you're saying, this list would contain issues where a patch was committed and later found to have problems which need to be fixed. Did I understand that correctly? Anything else go on there, or possibly belong on there? Can we take the absence of a list for 8.5 to indicate that no such problems have been found with any patches committed since 8.4 was tagged? Yes, though I have a few items that I should transfer from my mailbox to that list. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] 8.5 release notes idea
Greg Sabino Mullane wrote: Just noticed in the release notes for 8.4, there are two items accredited to just Greg (but there are three of us Gregs who contributed to 8.4 and are in the notes). While this is very likely Greg Stark, due to the context, I think at this point in the project we simply should spell out everyone's complete name every time, rather than the hit or miss style we are developing. (Enjoy while you can, Bruce, we'll get another Bruce to contribute someday! Zoltan might be harder... :) That is an interesting suggestion. I have tried to use short names where possible to avoid having the usernames become too prominent. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [PATCH] plpythonu datatype conversion improvements
On sön, 2009-08-16 at 02:44 +0300, Peter Eisentraut wrote: The remaining problem is that the patch loses domain checking on the return types, because some paths no longer go through the data type's input function. I have marked these places as FIXME, and the regression tests also contain a failing test case for this. What's needed here, I think, is an API that takes a datum plus type information and checks whether the datum is valid within the domain. I haven't found one that is exported, but maybe someone could give a tip. Got that fixed now. Updated patch is attached. I will sleep over it, but I think it's good to go.

diff --git a/src/backend/utils/adt/domains.c b/src/backend/utils/adt/domains.c
index ffd5c7a..bda200b 100644
--- a/src/backend/utils/adt/domains.c
+++ b/src/backend/utils/adt/domains.c
@@ -302,3 +302,40 @@ domain_recv(PG_FUNCTION_ARGS)
 	else
 		PG_RETURN_DATUM(value);
 }
+
+/*
+ * domain_check - check that a datum satisfies the constraints of a
+ * domain.  extra and mcxt can be passed if they are available from,
+ * say, a FmgrInfo structure, or they can be NULL, in which case the
+ * setup is repeated for each call.
+ */
+void
+domain_check(Datum value, bool isnull, Oid domainType, void **extra, MemoryContext mcxt)
+{
+	DomainIOData *my_extra = NULL;
+
+	if (mcxt == NULL)
+		mcxt = CurrentMemoryContext;
+
+	/*
+	 * We arrange to look up the needed info just once per series of calls,
+	 * assuming the domain type doesn't change underneath us.
+	 */
+	if (extra)
+		my_extra = (DomainIOData *) *extra;
+	if (my_extra == NULL)
+	{
+		my_extra = (DomainIOData *) MemoryContextAlloc(mcxt,
+													   sizeof(DomainIOData));
+		domain_state_setup(my_extra, domainType, true, mcxt);
+		if (extra)
+			*extra = (void *) my_extra;
+	}
+	else if (my_extra->domain_type != domainType)
+		domain_state_setup(my_extra, domainType, true, mcxt);
+
+	/*
+	 * Do the necessary checks to ensure it's a valid domain value.
+ */ + domain_check_input(value, isnull, my_extra); +} diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 9b0a2b7..df37b16 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -137,6 +137,7 @@ extern Datum char_text(PG_FUNCTION_ARGS); /* domains.c */ extern Datum domain_in(PG_FUNCTION_ARGS); extern Datum domain_recv(PG_FUNCTION_ARGS); +extern void domain_check(Datum value, bool isnull, Oid domainType, void **extra, MemoryContext mcxt); /* encode.c */ extern Datum binary_encode(PG_FUNCTION_ARGS); diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out index 19b3c9e..2a08834 100644 --- a/src/pl/plpython/expected/plpython_types.out +++ b/src/pl/plpython/expected/plpython_types.out @@ -32,6 +32,74 @@ CONTEXT: PL/Python function test_type_conversion_bool (1 row) +-- test various other ways to expression Booleans in Python +CREATE FUNCTION test_type_conversion_bool_other(n int) RETURNS bool AS $$ +# numbers +if n == 0: + ret = 0 +elif n == 1: + ret = 5 +# strings +elif n == 2: + ret = '' +elif n == 3: + ret = 'fa' # true in Python, false in PostgreSQL +# containers +elif n == 4: + ret = [] +elif n == 5: + ret = [0] +plpy.info(ret, not not ret) +return ret +$$ LANGUAGE plpythonu; +SELECT * FROM test_type_conversion_bool_other(0); +INFO: (0, False) +CONTEXT: PL/Python function test_type_conversion_bool_other + test_type_conversion_bool_other +- + f +(1 row) + +SELECT * FROM test_type_conversion_bool_other(1); +INFO: (5, True) +CONTEXT: PL/Python function test_type_conversion_bool_other + test_type_conversion_bool_other +- + t +(1 row) + +SELECT * FROM test_type_conversion_bool_other(2); +INFO: ('', False) +CONTEXT: PL/Python function test_type_conversion_bool_other + test_type_conversion_bool_other +- + f +(1 row) + +SELECT * FROM test_type_conversion_bool_other(3); +INFO: ('fa', True) +CONTEXT: PL/Python function test_type_conversion_bool_other + 
test_type_conversion_bool_other +- + t +(1 row) + +SELECT * FROM test_type_conversion_bool_other(4); +INFO: ([], False) +CONTEXT: PL/Python function test_type_conversion_bool_other + test_type_conversion_bool_other +- + f +(1 row) + +SELECT * FROM test_type_conversion_bool_other(5); +INFO: ([0], True) +CONTEXT: PL/Python function test_type_conversion_bool_other + test_type_conversion_bool_other +- + t +(1 row) + CREATE FUNCTION test_type_conversion_char(x char) RETURNS char AS $$ plpy.info(x, type(x)) return x @@ -278,13 +346,21 @@ plpy.info(x, type(x)) return x $$ LANGUAGE plpythonu; SELECT * FROM test_type_conversion_bytea('hello world'); -INFO: ('\\x68656c6c6f20776f726c64', type 'str') +INFO: ('hello world', type 'str') CONTEXT: PL/Python function test_type_conversion_bytea
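The test_type_conversion_bool_other regression cases above lean on Python's general truth rules: zero numbers, empty strings, and empty containers are false; everything else, including the two-character string 'fa', is true. A quick demonstration in plain Python, outside plpython:

```python
# Python truth values mirrored by the plpython bool conversion tests above:
cases = [0, 5, '', 'fa', [], [0]]
truths = [bool(v) for v in cases]
# 0 -> False, 5 -> True, '' -> False, 'fa' -> True (any non-empty string),
# [] -> False, [0] -> True
```

Note that 'fa' is true in Python even though PostgreSQL's boolean input function would read it as false, which is exactly the discrepancy the test's comment calls out: the conversion follows Python semantics, not the text input function's.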
Re: [HACKERS] [PATCH] Largeobject access controls
Tom Lane wrote: KaiGai Kohei kai...@kaigai.gr.jp writes: BTW, currently, the default ACL of largeobject allows anything for owner and nothing for world. Do you have any comment for the default behavior? Mph. I think the backlash will be too great. You have to leave the default behavior the same as it is now, ie, world access. BTW, as a default it is pretty bad. Should we have a GUC var to set the default LO permissions? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] autovacuum launcher using InitPostgres
Hi, Tom Lane t...@sss.pgh.pa.us writes: The user may not care about the difference, but there's a point in having the limit be the simpler concept of this is the maximum amount of processes running vacuum at any time. The launcher is very uninteresting to users. Adding to this, the launcher will not consume maintenance_work_mem whereas each worker is able to allocate that much, IIUC. I committed things that way, but I'm still not convinced that we shouldn't expose the launcher in pg_stat_activity. The thing that is bothering me is that it is now able to take locks and potentially could block some other process or even participate in a deadlock. Do we really want to have entries in pg_locks that don't match any entry in pg_stat_activity? Having the launcher locks show as such gets my vote too, but then I'm following on your opinion that a launcher ain't a worker and that users need to know about it. Let's keep the autovacuum_max_workers GUC naming, not counting the there can be only one launcher so that we're able to size maintenance_work_mem. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
Bruce, I am not sure what other checklist items there would be (or I am refusing to divulge). Hopefully the last is a joke. ;-) So, the only post-CF tasks are issues with specific patches? This seems resolvable, especially if we take a hard line with patch readiness. There isn't anything else in that period? This doesn't sound that difficult then. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
[HACKERS] 8.5 release note editing
I just talked to Josh Berkus via phone. We have decided to put the rough commit messages on a wiki for 8.5 final so they can be easily edited by the community, before markup is added and they are moved to the SGML docs. This should increase community involvement in editing the items, and if it goes well, we can try to do this in a more fine-grained fashion for every alpha release for 8.6. For those who need details on how long each item takes, see: http://momjian.us/main/blogs/pgblog.html#March_25_2009 -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.5 release timetable, again
Josh Berkus wrote: Bruce, I am not sure what other checklist items there would be (or I am refusing to divulge). Hopefully the last is a joke. ;-) Yes. So, the only post-CF tasks are issues with specific patches? This seems resolvable, especially if we take a hard line with patch readiness. There isn't anything else in that period? Nope. Release notes and open items is all I remember, and the release notes were done at the end of the commit-fest, so that isn't really even an issue. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] remove flatfiles.c
This patch removes flatfiles.c for good.  It doesn't change the keeping of
locks in dbcommands.c and user.c, because at least some of them are still
required.

Regarding sync commits that previously happened and now won't, I think the
only case worth worrying about is the one in vacuum.c.  Do we need a
ForceSyncCommit() in there?  I'm not sure if vacuum itself already forces
sync commit.

--
Alvaro Herrera  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Index: src/backend/access/transam/twophase_rmgr.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/access/transam/twophase_rmgr.c,v
retrieving revision 1.8
diff -c -p -r1.8 twophase_rmgr.c
*** src/backend/access/transam/twophase_rmgr.c	1 Jan 2009 17:23:36 -	1.8
--- src/backend/access/transam/twophase_rmgr.c	31 Aug 2009 21:52:17 -
***************
*** 18,24 ****
  #include "commands/async.h"
  #include "pgstat.h"
  #include "storage/lock.h"
- #include "utils/flatfiles.h"
  #include "utils/inval.h"
--- 18,23 ----
*************** const TwoPhaseCallback twophase_recover_
*** 27,33 ****
  	NULL,						/* END ID */
  	lock_twophase_recover,		/* Lock */
  	NULL,						/* Inval */
- 	NULL,						/* flat file update */
  	NULL,						/* notify/listen */
  	NULL						/* pgstat */
  };
--- 26,31 ----
*************** const TwoPhaseCallback twophase_postcomm
*** 37,43 ****
  	NULL,						/* END ID */
  	lock_twophase_postcommit,	/* Lock */
  	inval_twophase_postcommit,	/* Inval */
- 	flatfile_twophase_postcommit,	/* flat file update */
  	notify_twophase_postcommit,	/* notify/listen */
  	pgstat_twophase_postcommit	/* pgstat */
  };
--- 35,40 ----
*************** const TwoPhaseCallback twophase_postabor
*** 47,53 ****
  	NULL,						/* END ID */
  	lock_twophase_postabort,	/* Lock */
  	NULL,						/* Inval */
- 	NULL,						/* flat file update */
  	NULL,						/* notify/listen */
  	pgstat_twophase_postabort	/* pgstat */
  };
--- 44,49 ----
Index: src/backend/access/transam/xact.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.274
diff -c -p -r1.274 xact.c
*** src/backend/access/transam/xact.c	11 Jun 2009 14:48:54 -	1.274
--- src/backend/access/transam/xact.c	31 Aug 2009 21:50:41 -
***************
*** 43,49 ****
  #include "storage/sinvaladt.h"
  #include "storage/smgr.h"
  #include "utils/combocid.h"
- #include "utils/flatfiles.h"
  #include "utils/guc.h"
  #include "utils/inval.h"
  #include "utils/memutils.h"
--- 43,48 ----
*************** CommitTransaction(void)
*** 1608,1619 ****
  	/* NOTIFY commit must come before lower-level cleanup */
  	AtCommit_Notify();

- 	/*
- 	 * Update flat files if we changed pg_database, pg_authid or
- 	 * pg_auth_members.  This should be the last step before commit.
- 	 */
- 	AtEOXact_UpdateFlatFiles(true);
-
  	/* Prevent cancel/die interrupt while cleaning up */
  	HOLD_INTERRUPTS();
--- 1607,1612 ----
*************** PrepareTransaction(void)
*** 1797,1803 ****
  	/* close large objects before lower-level cleanup */
  	AtEOXact_LargeObject(true);

! 	/* NOTIFY and flatfiles will be handled below */

  	/*
  	 * Don't allow PREPARE TRANSACTION if we've accessed a temporary table in
--- 1790,1796 ----
  	/* close large objects before lower-level cleanup */
  	AtEOXact_LargeObject(true);

! 	/* NOTIFY will be handled below */

  	/*
  	 * Don't allow PREPARE TRANSACTION if we've accessed a temporary table in
*************** PrepareTransaction(void)
*** 1860,1866 ****
  	StartPrepare(gxact);

  	AtPrepare_Notify();
- 	AtPrepare_UpdateFlatFiles();
  	AtPrepare_Inval();
  	AtPrepare_Locks();
  	AtPrepare_PgStat();
--- 1853,1858 ----
*************** PrepareTransaction(void)
*** 1909,1915 ****
  	/* Clean up the snapshot manager */
  	AtEarlyCommit_Snapshot();

! 	/* notify and flatfiles don't need a postprepare call */

  	PostPrepare_PgStat();
--- 1901,1907 ----
  	/* Clean up the snapshot manager */
  	AtEarlyCommit_Snapshot();

! 	/* notify doesn't need a postprepare call */

  	PostPrepare_PgStat();
*************** AbortTransaction(void)
*** 2036,2042 ****
  	AtAbort_Portals();
  	AtEOXact_LargeObject(false);	/* 'false' means it's abort */
  	AtAbort_Notify();
- 	AtEOXact_UpdateFlatFiles(false);

  	/*
  	 * Advertise the fact that we aborted in pg_clog (assuming that we got as
--- 2028,2033 ----
*************** CommitSubTransaction(void)
*** 3764,3771 ****
  	AtEOSubXact_LargeObject(true, s->subTransactionId,
  							s->parent->subTransactionId);
  	AtSubCommit_Notify();
- 	AtEOSubXact_UpdateFlatFiles(true, s->subTransactionId,
- 								s->parent->subTransactionId);
  	CallSubXactCallbacks(SUBXACT_EVENT_COMMIT_SUB, s->subTransactionId,
  						 s->parent->subTransactionId);
--- 3755,3760 ----
*************** AbortSubTransaction(void)
*** 3885,3892
Re: [HACKERS] remove flatfiles.c
Alvaro Herrera alvhe...@commandprompt.com writes:
> This patch removes flatfiles.c for good.

Aw, you beat me to it.

> Regarding sync commits that previously happened and now won't, I think the
> only case worth worrying about is the one in vacuum.c.  Do we need a
> ForceSyncCommit() in there?  I'm not sure if vacuum itself already forces
> sync commit.

Hmm, I had been assuming we wouldn't need that anymore.

			regards, tom lane
Re: [HACKERS] remove flatfiles.c
Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
> > Regarding sync commits that previously happened and now won't, I think
> > the only case worth worrying about is the one in vacuum.c.  Do we need a
> > ForceSyncCommit() in there?  I'm not sure if vacuum itself already forces
> > sync commit.
> Hmm, I had been assuming we wouldn't need that anymore.

The comment in user.c and dbcommands.c says

	/*
	 * Force synchronous commit, thus minimizing the window between
	 * creation of the database files and commital of the transaction. If
	 * we crash before committing, we'll have a DB that's taking up disk
	 * space but is not in pg_database, which is not good.
	 */
	ForceSyncCommit();

so I think those ones are still necessary.  There's another call in
RenameDatabase() which I don't think needs a sync commit (because it won't
change the dir name), and one in vacuum.c:

  	/*
! 	 * If we were able to advance datfrozenxid, mark the flat-file copy of
! 	 * pg_database for update at commit, and see if we can truncate pg_clog.
! 	 * Also force update if the shared XID-wrap-limit info is stale.
  	 */
  	if (dirty || !TransactionIdLimitIsValid())
- 	{
- 		database_file_update_needed();
  		vac_truncate_clog(newFrozenXid);
- 	}
  }

AFAICT this doesn't need a sync commit.  (Right now, VACUUM FULL forces one,
but lazy vacuum doesn't.)

--
Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] remove flatfiles.c
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
> > Hmm, I had been assuming we wouldn't need that anymore.
> The comment in user.c and dbcommands.c says [...] so I think those ones
> are still necessary.

Yeah, after a look through the code I think you can trust the associated
comments: if it says it needs sync commit, put in ForceSyncCommit, else we
don't need it.

			regards, tom lane
Re: [HACKERS] [PATCH] Largeobject access controls
Alvaro Herrera wrote:
> Tom Lane wrote:
> > KaiGai Kohei kai...@kaigai.gr.jp writes:
> > > BTW, currently, the default ACL of largeobject allows anything for
> > > owner and nothing for world.  Do you have any comment for the default
> > > behavior?
> > Mph.  I think the backlash will be too great.  You have to leave the
> > default behavior the same as it is now, ie, world access.
> BTW as a default it is pretty bad.  Should we have a GUC var to set the
> default LO permissions?

That direction seems reasonable to me.  However, it might be better to add a
GUC variable that turns the LO permission feature on or off, not only one
for the default permissions.  That would let us control whether the
privilege mechanism behaves in a backward-compatible way or not.

--
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] Linux LSB init script
On Mon, 31 Aug 2009, Kevin Grittner wrote:
> My counter-argument to that would be that the SuSE distribution's version
> of PostgreSQL is so out-of-date that we don't install it.

Given that it's also RPM based, is it possible to get SuSE in sync so that
it shares the same init script as RHEL?  Devrim is in the middle of a major
overhaul of the RHEL/Fedora init script lately, adding better support for
multiple postmasters and the like, and I'd think that SuSE users would like
to take advantage of that work too.

It seems to me that the only popular Linux versions that people use for
PostgreSQL work that don't have active init script maintainers are SuSE and
Gentoo.  If your LSB-based approach could be made to work on both of those,
that would be a nice step forward.  I don't know what the state of the init
script that Gentoo ships is though; I'm guessing it may diverge from the
standard approach due to its slots implementation.

--
* Greg Smith  gsm...@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
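For context on what an "LSB-based approach" buys here: a generic skeleton (not the actual script under discussion) showing the standard `### BEGIN INIT INFO` comment block that LSB-compliant distributions read to order boot dependencies, which is what would let one script serve RHEL, SuSE, and others.  The paths and values below are purely illustrative.

```shell
#!/bin/sh
# Generic LSB init-script skeleton; paths are illustrative, not from the
# patch being discussed.  insserv/chkconfig parse the INIT INFO block.
### BEGIN INIT INFO
# Provides:          postgresql
# Required-Start:    $local_fs $remote_fs $network
# Required-Stop:     $local_fs $remote_fs $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: PostgreSQL RDBMS server
### END INIT INFO

PGDATA=/var/lib/pgsql/data	# illustrative data directory
PGCTL=/usr/bin/pg_ctl

case "$1" in
    start)   "$PGCTL" -D "$PGDATA" -w start ;;
    stop)    "$PGCTL" -D "$PGDATA" -m fast stop ;;
    restart) "$PGCTL" -D "$PGDATA" -m fast restart ;;
    status)  "$PGCTL" -D "$PGDATA" status ;;
    *)       echo "Usage: $0 {start|stop|restart|status}" >&2; exit 1 ;;
esac
```

Supporting multiple postmasters, as the RHEL/Fedora overhaul does, mostly means parameterizing PGDATA per instance; the LSB header itself stays the same.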
[HACKERS] pgbench hard coded constants
pgbench has #defines for the number of branches, tellers, and accounts.
These are used to populate the tables with -i, but when running the actual
benchmark it has the values separately hard-coded in the query metacommands.
This patch makes the metacommands obtain their values from the relevant
#defines.

It has been tested to this extent: after changing the #define naccounts
downward, running -i once and then the benchmark once leads to inconsistent
results without the patch (select sum(abalance) from pgbench_accounts does
not equal sum(delta) from pgbench_history), while with the patch the two
sums are equal.

Cheers,

Jeff

Index: pgbench.c
===
RCS file: /home/jjanes/pgrepo/pgsql/contrib/pgbench/pgbench.c,v
retrieving revision 1.90
diff -c -r1.90 pgbench.c
*** pgbench.c	3 Aug 2009 18:30:55 -	1.90
--- pgbench.c	31 Aug 2009 21:04:46 -
***************
*** 132,138 ****
   * end of configurable parameters
   */

! #define nbranches	1
  #define ntellers	10
  #define naccounts	100000
--- 132,138 ----
   * end of configurable parameters
   */

! #define nbranches	1	/* Makes little sense to change this.  Change -s instead */
  #define ntellers	10
  #define naccounts	100000
***************
*** 232,240 ****
  /* default scenario */
  static char *tpc_b = {
! 	"\\set nbranches :scale\n"
! 	"\\set ntellers 10 * :scale\n"
! 	"\\set naccounts 100000 * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"\\setrandom bid 1 :nbranches\n"
  	"\\setrandom tid 1 :ntellers\n"
--- 232,240 ----
  /* default scenario */
  static char *tpc_b = {
! 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
! 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
! 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"\\setrandom bid 1 :nbranches\n"
  	"\\setrandom tid 1 :ntellers\n"
***************
*** 250,258 ****
  /* -N case */
  static char *simple_update = {
! 	"\\set nbranches :scale\n"
! 	"\\set ntellers 10 * :scale\n"
! 	"\\set naccounts 100000 * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"\\setrandom bid 1 :nbranches\n"
  	"\\setrandom tid 1 :ntellers\n"
--- 250,258 ----
  /* -N case */
  static char *simple_update = {
! 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
! 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
! 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"\\setrandom bid 1 :nbranches\n"
  	"\\setrandom tid 1 :ntellers\n"
***************
*** 266,272 ****
  /* -S case */
  static char *select_only = {
! 	"\\set naccounts 100000 * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
  };
--- 266,272 ----
  /* -S case */
  static char *select_only = {
! 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
  	"\\setrandom aid 1 :naccounts\n"
  	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
  };