Tom Lane wrote:
A possible answer is to add a verifymbstr to the string literal
converter anytime it has processed a numeric backslash-escape in the
string. Open questions for that are (1) does it have negative effects
for bytea, and if so is there any hope of working around it? (2) how
can we postpone the conversion/test to the parse analysis phase?
Finding out how to do (2) seems to me the only possible answer to (1).
I'll have a look.
Is that going to cover data coming in via COPY? and parameters for
prepared statements?
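
As a rough illustration of the "verify only after a numeric escape" idea above, here is a minimal Python sketch. It is not PostgreSQL code: the helper name unescape_and_verify, the restriction to octal \ooo escapes, and the use of Python's decoder as a stand-in for verifymbstr are all assumptions made for the example.

```python
import re

# Hypothetical helper, not PostgreSQL code. Matches a backslash followed by
# one to three octal digits, e.g. \303.
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{1,3})")


def unescape_and_verify(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Expand octal escapes; re-verify the encoding only if one was seen."""
    saw_numeric_escape = False

    def expand(match):
        nonlocal saw_numeric_escape
        saw_numeric_escape = True
        # Keep only the low 8 bits so the result is always a single byte
        # (an assumption of this sketch).
        return bytes([int(match.group(1), 8) & 0xFF])

    result = _OCTAL_ESCAPE.sub(expand, raw)
    if saw_numeric_escape:
        # Stand-in for verifymbstr: raises UnicodeDecodeError if the bytes
        # are not valid in the given encoding.
        result.decode(encoding)
    return result
```

Under these assumptions, rb"caf\303\251" yields valid UTF-8, while rb"\377" is rejected. That rejection is exactly the bytea concern in question (1): a byte value that is legitimate binary data fails the check if the check runs before the target type is known.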
. for chr() under UTF8, it seems to be generally agreed that the
argument should represent the codepoint and the function should return
the correspondingly encoded character. If so, possibly the argument
should be a bigint to accommodate the full range of possible code
points. It is not clear what the argument should represent for other
multi-byte encodings for any argument higher than 127. Similarly, it is
not clear what ascii() should return in such cases. I would be inclined
just to error out.
In SQL_ASCII I'd argue for allowing 0..255. In actual MB encodings,
OK with throwing error.
I was planning on allowing up to 255 for all single byte encodings too.
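
To make the rules discussed above concrete, here is a minimal Python sketch of the proposed chr() behaviour. The function name chr_sketch, the encoding labels, the small set of single-byte encodings, and the exclusion of surrogates under UTF8 are assumptions for illustration only, not the actual implementation.

```python
# Illustrative subset of single-byte encodings (an assumption of this sketch).
SINGLE_BYTE_ENCODINGS = {"sql_ascii", "latin1"}


def chr_sketch(code: int, server_encoding: str) -> bytes:
    """Return the bytes chr() would produce under the rules discussed."""
    if server_encoding == "utf8":
        # Argument is a Unicode code point; result is its UTF-8 encoding.
        # Rejecting surrogates is an assumption, not stated in the thread.
        if not 0 <= code <= 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError("code point out of range")
        return chr(code).encode("utf-8")
    if server_encoding in SINGLE_BYTE_ENCODINGS:
        # SQL_ASCII and other single-byte encodings: allow 0..255.
        if not 0 <= code <= 255:
            raise ValueError("value out of range for single-byte encoding")
        return bytes([code])
    # Any other multi-byte encoding: error out above 127.
    if not 0 <= code <= 127:
        raise ValueError("value above 127 not supported in this encoding")
    return bytes([code])
```

For example, chr_sketch(0x20AC, "utf8") gives the three-byte UTF-8 form of the euro sign, chr_sketch(233, "latin1") gives a single byte, and chr_sketch(200, "euc_jp") raises an error.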
cheers
andrew