On 06/10/2013 11:22 PM, Noah Misch wrote:
On Mon, Jun 10, 2013 at 11:20:13AM -0400, Andrew Dunstan wrote:
On 06/10/2013 10:18 AM, Tom Lane wrote:
Andrew Dunstan <and...@dunslane.net> writes:
After thinking about this some more I have come to the conclusion that
we should only do any de-escaping of \uxxxx sequences, whether or not
they are for BMP characters, when the server encoding is utf8. For any
other encoding, which is already a violation of the JSON standard
anyway, and should be avoided if you're dealing with JSON, we should
just pass them through even in text output. This will be a simple and
very localized fix.
Hmm.  I'm not sure that users will like this definition --- it will seem
pretty arbitrary to them that conversion of \u sequences happens in some
databases and not others.
Yep.  Suppose you have a LATIN1 database.  Changing it to a UTF8 database
where everyone uses client_encoding = LATIN1 should not change the semantics
of successful SQL statements.  Some statements that fail with one database
encoding will succeed in the other, but a user should not witness a changed
non-error result.  (Except functions like decode() that explicitly expose byte
representations.)  Having "SELECT '["\u00e4"]'::json ->> 0" emit 'ä' in the
UTF8 database and '\u00e4' in the LATIN1 database would move PostgreSQL in the
wrong direction relative to that ideal.

Then what should we do when there is no matching codepoint in the
database encoding? First we'll have to delay the evaluation so it's not
done over-eagerly, and then we'll have to try the conversion and throw
an error if it doesn't work. The second part is what's happening now,
but the delayed evaluation is not.
+1 for doing it that way.
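
To make the "delay, then convert or fail" idea concrete, here is a rough Python sketch. It is not the PostgreSQL code: the name extract_text, the regular expression, and the use of Python codec names to stand in for the server encoding are all assumptions made for illustration. The escape stays untouched in the stored JSON and is only converted when text is extracted; a code point with no equivalent in the target encoding raises an error. Surrogate pairs are not handled and are simply rejected.

```python
import re

# Matches any backslash escape, so that an escaped backslash ("\\\\")
# is consumed as a unit and never mistaken for the start of \uXXXX.
_ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|.)')


def extract_text(raw_json_string: str, server_encoding: str) -> str:
    """De-escape \\uXXXX sequences at extraction time (hypothetical helper).

    raw_json_string is the body of a JSON string as stored, escapes intact.
    server_encoding is a Python codec name used here as a stand-in for the
    database encoding (an assumption of this sketch).
    """
    def replace(match):
        body = match.group(1)
        if len(body) != 5:
            # Not a \uXXXX escape; left as-is to keep the sketch short.
            return match.group(0)
        char = chr(int(body[1:], 16))
        try:
            # The conversion is attempted only now, not when the JSON
            # value was first parsed.
            char.encode(server_encoding)
        except UnicodeEncodeError:
            raise ValueError(
                f"\\u{body[1:]} has no equivalent in {server_encoding}"
            ) from None
        return char

    return _ESCAPE.sub(replace, raw_json_string)
```

With this shape, extract_text(r'\u00e4', 'latin-1') and extract_text(r'\u00e4', 'utf-8') both yield the same character, while extract_text(r'\u20ac', 'latin-1') fails instead of silently passing the escape through.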




As a final counterexample, let me note that Postgres itself handles Unicode escapes differently in UTF8 databases: in other databases it only accepts Unicode escapes up to U+007F, i.e. ASCII characters.
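
Stated as a rule, that behaviour looks roughly like the Python sketch below. The function name escape_allowed and the encoding spellings it checks are made up for illustration, and the sketch ignores finer points such as surrogate ranges.

```python
def escape_allowed(code_point: int, server_encoding: str) -> bool:
    """Rule described above: full range in UTF8, ASCII only elsewhere.

    Hypothetical helper; the accepted spellings of the encoding name
    are an assumption of this sketch.
    """
    if server_encoding.upper() in ("UTF8", "UTF-8"):
        return 0 <= code_point <= 0x10FFFF
    # Any other server encoding: only escapes up to U+007F are accepted.
    return 0 <= code_point <= 0x7F
```

So escape_allowed(0x00E4, "LATIN1") is False even though LATIN1 can represent that character, which is the asymmetry being pointed out.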

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
