On 03.06.2013 18:27, k...@rice.edu wrote:
On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote:

If I change the strCreate query and add double quotes around the column
name, then the problem disappears. But the original name is already in
lowercase, so I think it should also work without quoting the column name.
Am I missing some setup in either the database or in the use of libpq?

I’m using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit

The database uses:
ENCODING = 'UTF8'
LC_COLLATE = 'English_United Kingdom.1252'
LC_CTYPE = 'English_United Kingdom.1252'

Thanks for any help,

Martin


Hi Martin,

If you do not want the lowercase behavior, you must put double-quotes
around the column name per the documentation:

http://www.postgresql.org/docs/9.2/interactive/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

section 4.1.1.

Regards,
Ken

The original name 'id_äß' is already in lowercase. The backend should leave it 
unchanged IMO.

That is only true in UTF-8, which needs to be double-quoted for a column name,
as you have seen; otherwise the value will be lowercased byte by byte.

He *is* using UTF-8. Or trying to, anyway :-). The downcasing in the backend is supposed to leave bytes with the high bit set alone, i.e. in UTF-8 encoding, it's supposed to leave ä and ß alone.
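To illustrate what "leave bytes with the high bit set alone" means, here is a minimal sketch of ASCII-only case folding. This is not the backend's actual code; downcase_ascii_only is a hypothetical name used only for this illustration.

```cpp
#include <string>

// Hypothetical helper, not the backend's actual code: fold only ASCII A-Z
// to lowercase and copy every other byte through unchanged.
std::string downcase_ascii_only(const std::string &ident)
{
    std::string out = ident;
    for (char &ch : out)
    {
        if (ch >= 'A' && ch <= 'Z')
            ch = static_cast<char>(ch - 'A' + 'a');
        // Bytes with the high bit set (every byte of a multi-byte UTF-8
        // sequence) never fall in the A-Z range, so they are left untouched.
    }
    return out;
}
```

With folding like this, the UTF-8 bytes of ä and ß pass through unchanged, whether or not the identifier is quoted.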

I suspect that the conversion to UTF-8, before the string is sent to the server, is not being done correctly. I'm not sure what's wrong there, but I'd suggest printing the actual byte sequence sent to the server, to check whether it is in fact valid UTF-8, i.e. replace the PQexec() line with something like:

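    /* Keep the converted string alive: calling c_str() on a temporary would
       leave s dangling (assuming ToUtf8 returns its result by value). */
    auto utf8 = ToUtf8(strCreate.c_str());
    const char *s = utf8.c_str();
    int i;
    /* Dump every byte that will be sent to the server as two hex digits. */
    for (i = 0; s[i]; i++)
      printf("%02x", (unsigned char) s[i]);
    printf("\n");
    pResult = PQexec(pConn, s);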

The output should contain the UTF-8 byte sequence for äß, "c3a4c39f".
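If it is easier to compare strings than to read printf output, the same check can be wrapped in a small helper. This is only a sketch; to_hex is a hypothetical name and not part of libpq or of the original program.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render each byte of s as two lowercase hex digits.
std::string to_hex(const std::string &s)
{
    std::string out;
    char buf[3];  // two hex digits plus the terminating NUL
    for (unsigned char byte : s)
    {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(byte));
        out += buf;
    }
    return out;
}
```

Calling to_hex() on the converted query and searching the result for "c3a4c39f" tells you whether äß arrived as UTF-8. If you see "e4df" there instead, the string is still in Windows-1252 and the conversion did not happen.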

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
