Re: [HACKERS] invalidly encoded strings

Tom Lane Mon, 10 Sep 2007 18:43:36 -0700

Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> If you regard the unicode code point as simply a number, why not
> regard the multibyte characters as a number too?


Because there's a standard specifying the Unicode code points *as
numbers*.  The mapping from those numbers to UTF8 strings (and other
representations) is well-defined by the standard.

> Also I'm wondering you what we should do with different
> backend/frontend encoding combo.

Nothing.  chr() has always worked with reference to the database
encoding, and we should keep it that way.

BTW, it strikes me that there is another hole that we need to plug in
this area, and that's the convert() function.  Being able to create
a value of type text that is not in the database encoding is simply
broken.  Perhaps we could make it work on bytea instead (providing
a cast from text to bytea but not vice versa), or maybe we should just
forbid the whole thing if the database encoding isn't SQL_ASCII.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [HACKERS] invalidly encoded strings

Reply via email to