From: "Robert Haas" <robertmh...@gmail.com>
On Thu, Sep 19, 2013 at 7:58 PM, Tatsuo Ishii <is...@postgresql.org>
wrote:
What about limiting the use of NCHAR to databases that have the same
encoding or a "compatible" encoding (one for which an encoding
conversion is defined)? This way, NCHAR text can be automatically
converted from the NCHAR encoding to the database encoding on the
server side, so we can treat NCHAR exactly the same as CHAR afterward.
I suppose the encoding used for NCHAR should be defined at initdb time
or at database creation (if we allow this, we need to add a new column
to record which encoding is used for NCHAR).
For example, "CREATE TABLE t1(t NCHAR(10))" will succeed if NCHAR is
UTF-8 and the database encoding is UTF-8. It will even succeed if NCHAR
is SHIFT-JIS and the database encoding is UTF-8, because there is a
conversion between UTF-8 and SHIFT-JIS. However, it will not succeed if
NCHAR is SHIFT-JIS and the database encoding is ISO-8859-1, because
there is no conversion between them.
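
The rule proposed above can be sketched roughly as follows. This is only
an illustration in Python, not PostgreSQL code: the function name
nchar_allowed and the CONVERSIONS set are hypothetical, and the set lists
just enough pairs to reproduce the three cases in the example.

```python
# Hypothetical, partial set of (source, destination) encoding pairs for
# which a conversion is assumed to be defined. A real server would look
# this up in its own catalog of conversions instead of a hard-coded set.
CONVERSIONS = {
    ("SJIS", "UTF8"), ("UTF8", "SJIS"),
    ("LATIN1", "UTF8"), ("UTF8", "LATIN1"),
}


def nchar_allowed(nchar_encoding: str, db_encoding: str) -> bool:
    """Return True if NCHAR text could be converted to the database encoding."""
    if nchar_encoding == db_encoding:
        return True  # same encoding, no conversion needed
    return (nchar_encoding, db_encoding) in CONVERSIONS


# The three cases from the example above.
print(nchar_allowed("UTF8", "UTF8"))
print(nchar_allowed("SJIS", "UTF8"))
print(nchar_allowed("SJIS", "LATIN1"))
```

Under this sketch, the check would run when the NCHAR column is created,
so a table definition is rejected up front rather than failing later on
individual values.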
I think the point here is that, at least as I understand it, encoding
conversion and sanitization happen at a very early stage right now,
when we first receive the input from the client. If the user sends a
string of bytes as part of a query or bind placeholder that's not
valid in the database encoding, it's going to error out before any
type-specific code has an opportunity to get control. Look at
textin(), for example. There's no encoding check there. That means
it's already been done at that point. To make this work, someone's
going to have to figure out what to do about *that*. Until we have a
sketch of what the design for that looks like, I don't see how we can
credibly entertain more specific proposals.
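
To make the ordering problem concrete, here is a toy model in Python. It
is not how the server is implemented; receive_from_client and
text_input_sketch are hypothetical names, and Python's codecs stand in
for the server's conversion routines. It only shows that when conversion
and validation happen on receipt, input that cannot be represented in the
database encoding fails before any type input function runs.

```python
def receive_from_client(raw: bytes, client_encoding: str, db_encoding: str) -> bytes:
    """Convert and validate client input before any type-specific code runs."""
    # Raises UnicodeDecodeError if raw is not valid in client_encoding.
    text = raw.decode(client_encoding)
    # Raises UnicodeEncodeError if text is not representable in db_encoding.
    return text.encode(db_encoding)


def text_input_sketch(validated: bytes) -> bytes:
    """Stand-in for a type input function: it performs no encoding check."""
    return validated


# Three Japanese characters, written as escapes, encoded as Shift-JIS.
raw = "\u65e5\u672c\u8a9e".encode("shift_jis")

# Database encoding UTF-8: conversion succeeds, then the type code runs.
stored = text_input_sketch(receive_from_client(raw, "shift_jis", "utf-8"))

# Database encoding ISO-8859-1: the error is raised during receipt,
# so the type input function is never reached.
try:
    text_input_sketch(receive_from_client(raw, "shift_jis", "latin-1"))
    reached_type_code = True
except UnicodeEncodeError:
    reached_type_code = False
```

In this model, an NCHAR type with its own encoding would need the check
in receive_from_client to be skipped or deferred for its values, which is
the design question raised above.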
OK, I see your point. Let's consider that design. I'll study the
relevant code. Does anybody, especially Tatsuo san, Tom san, or Peter
san, have any good ideas?
Regards
MauMau