OK, I've been spreading rumours about fixing the internationalization
problems, so let me make it a bit clearer.  Here are the problems that
need to be fixed:

- Only one locale per process possible.

- Only one gettext-language per process possible.

- lc_collate and lc_ctype need to be held fixed in the entire cluster.

- Gettext relies on iconv character set conversion, which in turn relies
  on lc_ctype.  Because lc_ctype is fixed for the whole cluster (see the
  previous item), this leads to a complete screw-up in the server.

- The locale is fixed per cluster, but the encoding is fixed per
  database.  The two are unaware of each other and don't get along.

- No support for upper/lower with multibyte encodings.

- Implementation of Unicode horribly incomplete.

These are all dependent on each other and sort of flow into each other.
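
To illustrate the upper/lower item, here is a minimal sketch in Python
(used only as a stand-in; it is not backend code).  It assumes a UTF-8
encoded string and shows that case mapping done byte by byte only touches
ASCII, while mapping on decoded characters handles the multibyte letter:

```python
# Byte-wise case mapping only understands ASCII bytes, so the two-byte
# UTF-8 sequence for "é" passes through unchanged.
text = "café"
raw = text.encode("utf-8")

bytewise = raw.upper().decode("utf-8")  # ASCII-only mapping on raw bytes
charwise = text.upper()                 # mapping on decoded characters

print(bytewise)
print(charwise)
```

The same kind of mismatch shows up whenever a single-byte case routine is
applied to data in a multibyte encoding.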

Here is a proposed ordering of steps toward improving the situation:

1. Take out the character set conversion routines from the backend and
make them a library of their own.  This could possibly be modelled after
iconv, but not necessarily.  Or we might conclude that we can just use
iconv in the first place.

2. Reimplement gettext on top of the library from step 1 and allow
switching of language and encoding at run time.

3. Implement the Unicode collation algorithm and character classification
routines so that they work with the library from step 1.  Use them in
place of the system locale routines.

4. Allow choice of locale per database.  (This should be fairly easy after
3.)

5. Allow choice of locale per column and implement collation coercion
according to SQL standard.
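
A rough sketch of what steps 1 and 2 could look like together, written in
Python purely for illustration.  Everything here is hypothetical: the
names `convert`, `CATALOGS`, and `Translator` are made up, the catalogs
are toy data rather than real .mo files, and I am assuming messages are
stored in UTF-8.  The point is only that language and encoding become
per-session state instead of per-process state:

```python
# Hypothetical sketch; none of these names exist in the backend.

def convert(data: bytes, from_enc: str, to_enc: str) -> bytes:
    """Convert between encodings without consulting the process locale."""
    return data.decode(from_enc).encode(to_enc)

# Toy message catalogs keyed by language, stored as UTF-8 (an assumption).
CATALOGS = {
    "de": {"invalid value": "ungültiger Wert".encode("utf-8")},
    "fr": {"invalid value": "valeur invalide".encode("utf-8")},
}

class Translator:
    """Holds language and target encoding for one session."""

    def __init__(self, language: str, encoding: str):
        self.language = language
        self.encoding = encoding

    def translate(self, msgid: str) -> bytes:
        message = CATALOGS.get(self.language, {}).get(msgid)
        if message is None:
            # No translation available: fall back to the original text.
            return msgid.encode(self.encoding)
        return convert(message, "utf-8", self.encoding)

session = Translator("de", "latin-1")
print(session.translate("invalid value"))

# Switch language and encoding at run time, with no global state involved.
session.language, session.encoding = "fr", "utf-8"
print(session.translate("invalid value"))
```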

This could easily take a long time, but I feel that even if we have to
stop after 2., 3., or 4. at feature freeze, we'd be a lot further along.
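
To give a feel for what step 3 buys us, here is a deliberately crude
Python sketch of a sort key that does not depend on the system locale.
This is NOT the Unicode collation algorithm; the function name `sort_key`
and the two-level scheme (base letters first, exact string as tie-breaker)
are assumptions made only for illustration:

```python
import unicodedata

def sort_key(s: str):
    """Crude two-level key: accent-stripped, case-folded text, then the raw string."""
    decomposed = unicodedata.normalize("NFD", s)
    # Drop combining marks so accented letters sort next to their base letter.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), s)

words = ["zebra", "éclair", "apple", "Eclair"]

print(sorted(words))                # plain code point order
print(sorted(words, key=sort_key))  # locale-independent, accent-aware order
```

A real implementation would use the full collation tables, but the
ordering no longer changes with whatever lc_collate happens to be set.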

Comments?  Anything else that needs fixing?

-- 
Peter Eisentraut   [EMAIL PROTECTED]

