Robert Haas <robertmh...@gmail.com> writes:
> The code I've written so far does no canonicalization of the input
> value of any kind, just as we do for XML.

Fair enough.

> So, given that framework, what the patch does is this: if you're using
> UTF-8, then \uXXXX is accepted, provided that XXXX is something that
> equates to a legal Unicode code point.  It isn't converted to the
> corresponding character: it's just validated.  If you're NOT using
> UTF-8, then it allows \uXXXX for code points up through 127 (which we
> assume are the same in all encodings) and anything higher than that is
> rejected.

This seems a bit silly.  If you're going to leave the escape sequence as
ASCII, then why not just validate that it names a legal Unicode code
point and be done?  There is no reason whatever that that behavior needs
to depend on the database encoding.

			regards, tom lane
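
To make the suggestion concrete, here is a minimal sketch of an encoding-independent check. It is not code from the patch: the names parse_hex4 and unicode_escapes_are_valid are hypothetical, and the definition of "legal" used here (four hex digits, with surrogates accepted only as a high/low pair) is an assumption. The escape text is never converted, and the database encoding is never consulted.

```c
#include <ctype.h>
#include <stdbool.h>

/* Parse four hex digits at s; return their value, or -1 if malformed.
 * A terminating NUL is not a hex digit, so we never read past the end. */
static int
parse_hex4(const char *s)
{
    int value = 0;

    for (int i = 0; i < 4; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (!isxdigit(c))
            return -1;
        value = value * 16 + (isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    return value;
}

/* Hypothetical helper: validate every \uXXXX escape in a NUL-terminated
 * string without converting it and without looking at any encoding.
 * Assumption: a surrogate is legal only as a high/low pair. */
static bool
unicode_escapes_are_valid(const char *s)
{
    while (*s)
    {
        if (s[0] == '\\' && s[1] == 'u')
        {
            int cp = parse_hex4(s + 2);

            if (cp < 0)
                return false;           /* not four hex digits */
            s += 6;
            if (cp >= 0xDC00 && cp <= 0xDFFF)
                return false;           /* low surrogate with no high one */
            if (cp >= 0xD800 && cp <= 0xDBFF)
            {
                int low;

                if (s[0] != '\\' || s[1] != 'u')
                    return false;       /* high surrogate left unpaired */
                low = parse_hex4(s + 2);
                if (low < 0xDC00 || low > 0xDFFF)
                    return false;       /* second half is not a low surrogate */
                s += 6;
            }
        }
        else if (s[0] == '\\' && s[1] != '\0')
            s += 2;                     /* skip any other two-char escape */
        else
            s++;
    }
    return true;
}
```

With this sketch, "caf\u00e9" and "\ud83d\ude00" pass in any database encoding, while "\u12G4" and a lone "\ud83d" are rejected. Whether something like \u0000 should also be refused is a separate policy question that the sketch leaves open.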