Re: [HACKERS] String encoding during connection handshake

2007-12-02 Thread Neil Conway
On Wed, 2007-11-28 at 09:38 -0800, Trevor Talbot wrote: PostgreSQL's problem is that it (and AFAICT POSIX) conflates encoding with locale, when the two are entirely separate concepts. In what way does PostgreSQL conflate encoding with locale? -Neil

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread sulfinu
Ok, that's bad. I've also read crypt.c and md5.c. And what a nightmare C is compared to Java (granted, there's a difference in age of more than 20 years). My guess is that since the char type is one byte long, all char * expressions are actually pointers to arrays of bytes which are transmitted

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Martijn van Oosterhout
On Wed, Nov 28, 2007 at 11:39:33AM +0200, [EMAIL PROTECTED] wrote: During the authentication phase, no such conversion takes place - you were right and I couldn't believe it! In the case when your database name, your user name or password contain non-ASCII characters, you're out of luck if

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread sulfinu
Martijn, :) don't take it personally, I am just trying to obtain confirmation that I understood the problem well. After all, it's just that C has a very outdated notion of chars (and no notion of Unicode). I was naively under the impression that chars have evolved in present-day C. Regarding the

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote: Martijn, :) don't take it personally, I am just trying to obtain confirmation that I understood the problem well. After all, it's just that C has a very outdated notion of chars (and no notion of Unicode). I was naively under the impression that chars have

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Martijn van Oosterhout
On Wed, Nov 28, 2007 at 05:54:05PM +0200, [EMAIL PROTECTED] wrote: Regarding the problem of One True Encoding, the answer seems obvious to me: use only one encoding per database cluster, either UTF-8 or UTF-16 or another Unicode-aware scheme, whichever yields a statistically smaller database

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Trevor Talbot
On 11/28/07, Martijn van Oosterhout [EMAIL PROTECTED] wrote: On Wed, Nov 28, 2007 at 05:54:05PM +0200, [EMAIL PROTECTED] wrote: Regarding the problem of One True Encoding, the answer seems obvious to me: use only one encoding per database cluster, either UTF-8 or UTF-16 or another

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread sulfinu
On Wednesday 28 November 2007, Trevor Talbot wrote: I'm not entirely sure how that's supposed to solve the client authentication issue though. Demanding that clients present auth data in UTF-8 is no different than demanding they present it in the encoding it was entered in originally... Oh

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Trevor Talbot
On 11/28/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Wednesday 28 November 2007, Trevor Talbot wrote: I'm not entirely sure how that's supposed to solve the client authentication issue though. Demanding that clients present auth data in UTF-8 is no different than demanding they

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread sulfinu
On Wednesday 28 November 2007, Alvaro Herrera wrote: [EMAIL PROTECTED] wrote: Martijn, :) don't take it personally, I am just trying to obtain confirmation that I understood the problem well. After all, it's just that C has a very outdated notion of chars (and no notion of Unicode). I

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Gregory Stark
[EMAIL PROTECTED] writes: Yes, you support (and worry about) encodings simply because of a C limitation dating from 1974, if I recall correctly... In Java, for example, a char is a very well defined datum, namely a Unicode point. While in C it can be some char or another (or an error!)

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Trevor Talbot
On 11/28/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Yes, you support (and worry about) encodings simply because of a C limitation dating from 1974, if I recall correctly... In Java, for example, a char is a very well defined datum, namely a Unicode point. While in C it can be some char or

Re: [HACKERS] String encoding during connection handshake

2007-11-28 Thread Kris Jurka
On Wed, 28 Nov 2007, [EMAIL PROTECTED] wrote: I consider this matter closed from my point of view and I have modified the JDBC driver according to my needs. Could you explain in more detail what you've done to the JDBC driver in case it is generally useful or other people have the same

[HACKERS] String encoding during connection handshake

2007-11-27 Thread sulfinu
Hi all. I have read the documentation, searched the mailing lists and inspected the JDBC driver code. I do need to address this question to actual developers. Simply put, what is the client encoding that the server assumes BEFORE the client connection is established, that is, during the

Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread Martijn van Oosterhout
On Tue, Nov 27, 2007 at 02:51:32PM +0200, [EMAIL PROTECTED] wrote: Simply put, what is the client encoding that the server assumes BEFORE the client connection is established, that is, during the authentication phase? I know there's a client_encoding setting on the server side that indicates

Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread sulfinu
On Tuesday 27 November 2007, Martijn van Oosterhout wrote: I was under the impression that the username/password had no encoding, they are Just a Bunch of Bits, i.e. byte[]. I cannot agree with that, simply because Postgres supports (or at least claims to) multi-byte characters. And user names,

Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread Usama Munir
On Tuesday 27 November 2007, Martijn van Oosterhout wrote: I was under the impression that the username