Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

MauMau Fri, 20 Sep 2013 17:35:23 -0700

From: "Robert Haas" <[email protected]>

On Thu, Sep 19, 2013 at 7:58 PM, Tatsuo Ishii <[email protected]> wrote:

What about limiting the use of NCHAR to databases that have the same
encoding or a "compatible" encoding (one for which an encoding
conversion is defined)? This way, NCHAR text can be automatically
converted to the database encoding on the server side, so we can treat
NCHAR exactly the same as CHAR afterward.  I suppose the encoding used
for NCHAR should be defined at initdb time or at database creation
(if we allow this, we need to add a new column to record which encoding
is used for NCHAR).


For example, "CREATE TABLE t1(t NCHAR(10))" will succeed if NCHAR is
UTF-8 and the database encoding is UTF-8. It will even succeed if NCHAR
is SHIFT-JIS and the database encoding is UTF-8, because there is a
conversion between UTF-8 and SHIFT-JIS. However, it will not succeed if
NCHAR is SHIFT-JIS and the database encoding is ISO-8859-1, because
there's no conversion between them.
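
The rule proposed above can be sketched roughly as follows. This is only
an illustration, not code from the patch: the CONVERSIONS set is a toy
stand-in for the server's conversion catalog, and nchar_allowed() is a
hypothetical name.

```python
# Toy stand-in for the server's list of defined encoding conversions.
# The entries are illustrative assumptions, not read from a real catalog.
CONVERSIONS = {
    ("SJIS", "UTF8"), ("UTF8", "SJIS"),
    ("LATIN1", "UTF8"), ("UTF8", "LATIN1"),
}


def nchar_allowed(nchar_encoding: str, db_encoding: str) -> bool:
    """Return True if an NCHAR column could be created under the proposal.

    Allowed when the two encodings are identical, or when a conversion
    from the NCHAR encoding to the database encoding is defined.
    """
    if nchar_encoding == db_encoding:
        return True
    return (nchar_encoding, db_encoding) in CONVERSIONS


# The three cases from the example above.
same = nchar_allowed("UTF8", "UTF8")          # same encoding
convertible = nchar_allowed("SJIS", "UTF8")   # conversion is defined
unrelated = nchar_allowed("SJIS", "LATIN1")   # no conversion in the toy table
```

With this toy table, the first two checks pass and the third fails, which
matches the three CREATE TABLE cases described above.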


I think the point here is that, at least as I understand it, encoding
conversion and sanitization happens at a very early stage right now,
when we first receive the input from the client.  If the user sends a
string of bytes as part of a query or bind placeholder that's not
valid in the database encoding, it's going to error out before any
type-specific code has an opportunity to get control.   Look at
textin(), for example.  There's no encoding check there.  That means
it's already been done at that point.  To make this work, someone's
going to have to figure out what to do about *that*.  Until we have a
sketch of what the design for that looks like, I don't see how we can
credibly entertain more specific proposals.
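
To make the ordering problem concrete, here is a small model of it. It
is a deliberate simplification and not the server's actual code path:
receive_query() and nchar_in() are hypothetical names, and Python's
bytes.decode() stands in for the server-side validation step.

```python
calls = []  # records which type input functions actually ran


def nchar_in(text: str) -> str:
    """Hypothetical type input function; like textin(), it does no
    encoding check of its own."""
    calls.append("nchar_in")
    return text


def receive_query(raw: bytes, db_encoding: str) -> str:
    # Step 1: the incoming bytes are validated against the database
    # encoding. Invalid input raises UnicodeDecodeError right here.
    text = raw.decode(db_encoding)
    # Step 2: only now does type-specific code get control.
    return nchar_in(text)


# SHIFT-JIS bytes sent to a UTF-8 database are rejected in step 1,
# so the NCHAR input function never sees them.
sjis_bytes = "日本語".encode("shift_jis")
try:
    receive_query(sjis_bytes, "utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

In this model the error is raised before nchar_in() runs, so an NCHAR
type could not rescue the input by itself. Any design would have to
change or bypass step 1 for values destined for NCHAR columns.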

OK, I see your point. Let's consider that design. I'll learn the code
regarding this. Does anybody, especially Tatsuo san, Tom san, Peter san,
have any good idea?


Regards
MauMau



--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
