Re: [HACKERS] JSON and unicode surrogate pairs

Andrew Dunstan Tue, 11 Jun 2013 15:59:49 -0700


On 06/11/2013 06:26 PM, Noah Misch wrote:

>> As a final counter example, let me note that Postgres itself handles
>> Unicode escapes differently in UTF8 databases - in other databases it
>> only accepts Unicode escapes up to U+007f, i.e. ASCII characters.
>
> I don't see a counterexample there; every database that accepts without error
> a given Unicode escape produces from it the same text value.  The proposal to
> which I objected was akin to having non-UTF8 databases silently translate
> E'\u0220' to E'\\u0220'.


What?

There will be no silent translation. The only debate here is about how these databases turn string values inside a json datum into PostgreSQL text values via the documented operation of certain functions and operators. If the JSON datum doesn't already contain a Unicode escape then nothing of what's been discussed would apply. Nothing whatever that's been proposed would cause a Unicode escape sequence to be emitted that wasn't already there in the first place, and no patch that I have submitted has contained any escape sequence generation at all.


cheers

andrew
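
Illustration of the point above, not taken from any patch in this thread: a minimal Python sketch of what "turning a string value inside a json datum into a text value" means. It assumes Python's standard json module as a stand-in for the de-escaping step; the helper name json_string_to_text is hypothetical and this is not PostgreSQL's implementation. It only shows that decoding acts on escapes already present in the input, that a surrogate pair decodes to one code point, and that nothing here generates an escape sequence.

```python
import json


def json_string_to_text(json_string_literal: str) -> str:
    """Decode one JSON string literal (quotes included) into plain text.

    Hypothetical helper for illustration only; Python's json module
    stands in for the de-escaping step discussed in the message.
    """
    value = json.loads(json_string_literal)
    if not isinstance(value, str):
        raise TypeError("expected a JSON string literal")
    return value


# No Unicode escape in the input: the text comes back unchanged.
plain = json_string_to_text('"abc"')

# An escape already present in the input is decoded to its character.
escaped = json_string_to_text('"\\u00e9"')

# A surrogate pair (two \u escapes) decodes to a single code point.
pair = json_string_to_text('"\\ud83d\\ude00"')
```

In every case the output contains only decoded characters; no backslash escape appears that was not in the input. How a non-UTF8 database should treat escapes above U+007f is the open question in this thread and is deliberately not modeled here.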









