Re: [HACKERS] [GENERAL] trouble with to_char('L')

Bruce Momjian Tue, 20 Apr 2010 07:03:55 -0700

Magnus Hagander wrote:
> > Another idea is to use GetLocaleInfoW() [1], that is win32 native locale
> > functions, instead of the libc one. It returns locale characters in wide
> > chars, so we can safely convert them as UTF16->UTF8->db. But it requires
> > an additional branch in our locale codes only for Windows.
> 
> If we can go UTF16->db directly, it might be a good idea. If we're
> going via UTF8 anyway, I doubt it's going to be worth it.
> 
> Let's work off what we have now to start with at least. Bruce, can you
> comment on that thing about the extra parameter? And UTF8?


I do like the idea of using UTF16 directly because that would eliminate
our need to even set LC_CTYPE for Win32 in this routine.  That would
also eliminate any need to refer to the encoding for numeric/monetary,
so we could get rid of the odd case where the database encoding is UTF8
but the numeric/monetary locale settings have to use a non-UTF8 encoding.
For example, the original bug report has these locale settings:

        http://archives.postgresql.org/pgsql-general/2009-04/msg00829.php

        psql (PostgreSQL) 8.3.7
        
        server_version 8.3.7
        server_encoding UTF8
        client_encoding win1252
        lc_numeric Finnish, Finland
        lc_monetary Finnish, Finland

but really needed to use "Finnish_Finland.1252":

        http://archives.postgresql.org/pgsql-general/2009-04/msg00859.php
        
        However, I noticed that both lc_collate and lc_ctype are set to
        Finnish_Finland.1252 by the installer. Should I have just run initdb
        with --locale fi_FI.UTF8 at the very start? The to_char('L') works
        fine with a database with win1252 encoding.

Of course, that still does not work with our current CVS code if the
database encoding is UTF8, which is what we are trying to fix now.

I am not even sure how users set these things properly, but I assume the
installer does all that magic.  And, of course, if someone manually runs
initdb on Windows, they can easily get these settings wrong.

Magnus, if I remember correctly, all our non-UTF8 to UTF8 conversion
on Win32 already has to pass through UTF16 as an intermediate step, so
going to UTF16 directly seems fine.
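
To make the UTF16 step concrete, here is a rough, portable sketch of
turning a UTF16 string into UTF8.  It is only an illustration, not
PostgreSQL code: utf16_to_utf8() is a hypothetical helper name, and on
Win32 the wide string would presumably come from something like
GetLocaleInfoW() with LOCALE_SCURRENCY, which is not called here so the
sketch compiles anywhere.  Real code would more likely use the platform
or backend conversion routines instead of hand-rolling this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of PostgreSQL): convert a NUL-terminated
 * UTF-16 string to NUL-terminated UTF-8.
 *
 * Returns the number of bytes written, not counting the terminating NUL,
 * or -1 if the input has an unpaired surrogate or dst is too small.
 */
int
utf16_to_utf8(const uint16_t *src, unsigned char *dst, size_t dstlen)
{
    size_t      out = 0;

    assert(src != NULL && dst != NULL);

    if (dstlen == 0)
        return -1;              /* no room even for the terminator */

    while (*src != 0)
    {
        uint32_t    cp = *src++;
        size_t      nbytes;

        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            /* high surrogate: the next unit must be a low surrogate */
            uint32_t    low = *src;

            if (low < 0xDC00 || low > 0xDFFF)
                return -1;
            src++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
            return -1;          /* low surrogate without a high one */

        nbytes = (cp < 0x80) ? 1 : (cp < 0x800) ? 2 : (cp < 0x10000) ? 3 : 4;

        /* keep one byte free for the terminating NUL */
        if (out + nbytes + 1 > dstlen)
            return -1;

        if (nbytes == 1)
            dst[out++] = (unsigned char) cp;
        else if (nbytes == 2)
        {
            dst[out++] = (unsigned char) (0xC0 | (cp >> 6));
            dst[out++] = (unsigned char) (0x80 | (cp & 0x3F));
        }
        else if (nbytes == 3)
        {
            dst[out++] = (unsigned char) (0xE0 | (cp >> 12));
            dst[out++] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char) (0x80 | (cp & 0x3F));
        }
        else
        {
            dst[out++] = (unsigned char) (0xF0 | (cp >> 18));
            dst[out++] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char) (0x80 | (cp & 0x3F));
        }
    }

    dst[out] = '\0';
    return (int) out;
}
```

With the euro sign (U+20AC) as input, which is the sort of currency
symbol a Finnish monetary locale would be expected to report, this
yields the three bytes E2 82 AC.  The point is that the result does not
depend on LC_CTYPE or on the code page of the numeric/monetary locale at
all.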

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
