David Wheeler wrote:
My understanding is that the nul character is legal in a byte sequence, but if it's not properly escaped, it'll be parsed as the end of the statement. Unfortunately, I think that it's a very tough problem to solve.
No question wrt '\0' bytes -- they would have to be escaped when casting from bytea to text.

The harder issue is that there are apparently many other byte sequences that, while valid in a single-byte encoding, are not valid in one or more multibyte encodings. See this thread:

http://archives.postgresql.org/pgsql-hackers/2002-04/msg00236.php
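To illustrate the kind of problem I mean, here is a small Python sketch (not backend code, just an illustration). The helper name is_valid and the sample bytes are made up for this example; it only shows that the same bytes can be acceptable in a single-byte encoding and rejected in a multibyte one.

```python
def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if data decodes cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: "cafe" with an accented e, as stored in Latin-1.
sample = b"caf\xe9"

latin1_ok = is_valid(sample, "latin-1")  # every byte is a valid Latin-1 character
utf8_ok = is_valid(sample, "utf-8")      # 0xE9 starts a multibyte sequence that never completes

print(latin1_ok, utf8_ok)
```

The byte 0xE9 is a complete character in Latin-1 but only a lead byte in UTF-8, so an unescaped cast of arbitrary bytea to text could produce a value that is invalid for the database encoding.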

This is why all "non-printable characters" are currently escaped (which I think means all bytes > 127). Text, on the other hand, is already known to be valid for a particular encoding, so it doesn't need escaping.
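As a rough sketch of what that escaping looks like, the Python below octal-escapes every byte that is not printable ASCII and doubles backslashes. The function name escape_bytea is hypothetical, and the exact rule is an assumption on my part: here control bytes below 32 (including '\0') are escaped as well as bytes > 127, which may not match the backend byte for byte.

```python
def escape_bytea(data: bytes) -> str:
    """Escape arbitrary bytes into a plain printable-ASCII string.

    Assumed rule (illustrative only): backslash is doubled, printable
    ASCII (32..126) passes through, everything else becomes \\ooo octal.
    """
    out = []
    for b in data:
        if b == 0x5C:               # backslash
            out.append("\\\\")
        elif 32 <= b <= 126:        # printable ASCII, safe in any encoding
            out.append(chr(b))
        else:                       # NUL, control bytes, and bytes > 127
            out.append("\\%03o" % b)
    return "".join(out)


escaped = escape_bytea(b"a\x00b\\\xe9")
print(escaped)
```

Because the result contains only printable ASCII, it should be valid text regardless of which encoding is in use, which is the point of escaping before the cast.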

I'm not sure what happens when the backend encoding and client encoding don't match -- I'd guess there is some probability of invalid byte sequences in that case too.

Joe

