Re: [HACKERS] invalidly encoded strings

Andrew Dunstan Sun, 09 Sep 2007 08:57:28 -0700


Tom Lane wrote:


A possible answer is to add a verifymbstr to the string literal
converter anytime it has processed a numeric backslash-escape in the
string.  Open questions for that are (1) does it have negative effects
for bytea, and if so is there any hope of working around it?  (2) how
can we postpone the conversion/test to the parse analysis phase?
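
To make the proposal concrete, here is a rough Python sketch of the idea, not the actual scanner code. It expands octal and hex backslash escapes and runs an encoding check only when at least one numeric escape was processed. The names unescape_literal and _NUMERIC_ESCAPE are made up for illustration, and bytes.decode() merely stands in for a verifymbstr-style check.

```python
import re

# Hypothetical pattern: octal escapes up to \377 and hex escapes \xHH.
_NUMERIC_ESCAPE = re.compile(rb"\\([0-3][0-7]{2}|[0-7]{1,2}|x[0-9A-Fa-f]{1,2})")


def unescape_literal(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Expand numeric backslash escapes, then verify the result only if
    a numeric escape was actually processed."""
    saw_numeric = False

    def expand(match):
        nonlocal saw_numeric
        saw_numeric = True
        body = match.group(1).decode("ascii")
        if body.startswith("x"):
            value = int(body[1:], 16)
        else:
            value = int(body, 8)
        return bytes([value])

    out = _NUMERIC_ESCAPE.sub(expand, raw)
    if saw_numeric:
        # Stand-in for the verification step: reject invalid sequences.
        try:
            out.decode(encoding)
        except UnicodeDecodeError as exc:
            raise ValueError(
                "invalid byte sequence for encoding " + encoding
            ) from exc
    return out
```

In this sketch a literal such as 'abc\377' is rejected under UTF8, which is exactly the case that matters for bytea in question (1): the same escape is a perfectly good byte there.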

Finding out how to do (2) seems to me the only possible answer to (1).
I'll have a look.

Is that going to cover data coming in via COPY, and parameters for
prepared statements?

. for chr() under UTF8, it seems to be generally agreed that the
argument should represent the codepoint and the function should return
the correspondingly encoded character. If so, possibly the argument
should be a bigint to accommodate the full range of possible code
points. It is not clear what the argument should represent for other
multi-byte encodings for any argument higher than 127. Similarly, it is
not clear what ascii() should return in such cases. I would be inclined
just to error out.

In SQL_ASCII I'd argue for allowing 0..255.  In actual MB encodings,
OK with throwing error.

        


I was planning on allowing up to 255 for all single byte encodings too.
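
Roughly, the behaviour being discussed could be sketched like this in Python. This is only an illustration of the rules in this thread, not an implementation: chr_sketch is a made-up name, and the SINGLE_BYTE set is a small assumed subset of the single-byte server encodings.

```python
# Assumed, partial list of single-byte encodings, for illustration only.
SINGLE_BYTE = {"SQL_ASCII", "LATIN1", "WIN1252", "KOI8R"}


def chr_sketch(code: int, server_encoding: str) -> bytes:
    """Return the bytes chr(code) would produce under the rules discussed."""
    if code < 0:
        raise ValueError("requested character is out of range")

    if server_encoding == "UTF8":
        # The argument is a code point; return its UTF-8 encoding.
        if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError("requested character is not a valid code point")
        return chr(code).encode("utf-8")

    if server_encoding in SINGLE_BYTE:
        # Any single byte value is allowed.
        if code > 255:
            raise ValueError("requested character too large for encoding")
        return bytes([code])

    # Other multi-byte encodings: only the ASCII range is well defined,
    # so anything above 127 is an error.
    if code > 127:
        raise ValueError("requested character too large for encoding")
    return bytes([code])
```

For example, chr_sketch(0x20AC, "UTF8") gives the three-byte sequence for the euro sign, while chr_sketch(200, "EUC_JP") raises an error.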

cheers

andrew

