On 10/1/14, 9:11 AM, Heikki Linnakangas wrote:
I spent a little time cleaning up the regression tests and docs, and
ended up with the attached. But then I realized that there's a problem
with UTF-8 conversion in the armor() function. It returns the armored
blob as text, but forcibly converts the keys and values to UTF-8. That's
not cool, because you will get invalidly encoded strings into the
database, if you use the function while connected to a database that
uses some other encoding than UTF-8.

RFC4880 says that the headers are in UTF-8, but armor() cannot safely
return UTF-8 encoded text unless the database's encoding is also UTF-8.
It also rightly says that using anything else than plain ASCII, even
though nominally it's UTF-8, is a bad idea, because the whole point of
ASCII-armoring is to make the format safe from encoding conversions.

Ugh.  Right.

We have two options:

1. Throw an error if there are any non-ASCII characters in the key/value
arrays.
2. Don't convert them to UTF-8, but use the current database encoding.

Both seem sane to me. If we use the current database encoding, then we
have to also decide what to do with the input, in pgp_armor_headers().
If armor() uses the database encoding, but pgp_armor_headers() treats
the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
won't work.

Yeah. Both options seem fine to me. Throwing an error perhaps slightly more so.


.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to