Zdenek Kotala <[EMAIL PROTECTED]> writes:
> On Solaris I got following problematic locales: [...]
I tried this program on Mac OS X 10.4.10 (the current release) and found
that what the OS mostly returns is just the encoding portion of the
locale name. For instance:
sv_SE.ISO8859-15 ... ISO8859-15 - OK
sv_SE.UTF-8 ... UTF-8 - OK
tr_TR ... - NO MATCH
tr_TR.ISO8859-9 ... ISO8859-9 - OK
tr_TR.UTF-8 ... UTF-8 - OK
uk_UA ... - NO MATCH
uk_UA.ISO8859-5 ... ISO8859-5 - OK
uk_UA.KOI8-U ... KOI8-U - NO MATCH
uk_UA.UTF-8 ... UTF-8 - OK
zh_CN ... - NO MATCH
zh_CN.eucCN ... eucCN - OK
zh_CN.GB18030 ... GB18030 - NO MATCH
zh_CN.GB2312 ... GB2312 - OK
zh_CN.GBK ... GBK - NO MATCH
zh_CN.UTF-8 ... UTF-8 - OK
zh_HK ... - NO MATCH
zh_HK.Big5HKSCS ... Big5HKSCS - NO MATCH
zh_HK.UTF-8 ... UTF-8 - OK
zh_TW ... - NO MATCH
zh_TW.Big5 ... Big5 - NO MATCH
zh_TW.UTF-8 ... UTF-8 - OK
C ... US-ASCII - NO MATCH
POSIX ... US-ASCII - NO MATCH
They didn't *quite* hard-wire it that way, as evidenced by the C/POSIX
results, but certainly the empty-string results are entirely useless.
Perhaps we should file a bug with Apple. However, some poking around
in /usr/share/locale indicates that there's a consistent interpretation
to be made:
g42:/usr/share/locale tgl$ ls -l ??_??/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 af_ZA/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
-r--r--r-- 1 root wheel 3272 Mar 20 2005 am_ET/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 be_BY/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 bg_BG/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 ca_ES/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 cs_CZ/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 da_DK/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 de_AT/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 de_CH/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 de_DE/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 el_GR/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 en_AU/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 17 Apr 26 2006 en_CA/LC_CTYPE@ -> ../UTF-8/LC_CTYPE
(etc etc)
The only one that's not actually a symlink to the standard UTF-8 ctype
is am_ET/LC_CTYPE, which is identical to am_ET.UTF-8/LC_CTYPE.
So I think we can get away with something like
#ifdef __darwin__
if (strlen(sys) == 0)
// assume UTF8
#endif
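Spelled out a bit more, the idea might look like the sketch below. This is
only an illustration, not the actual patch: the helper names
codeset_or_default() and locale_codeset() are made up here, and it assumes
the encoding is obtained via setlocale() plus nl_langinfo(CODESET).
__darwin__ is taken from the fragment above and is assumed to be defined by
the build; __APPLE__ is checked as well because the compilers on Mac OS X
predefine it.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: apply the empty-codeset fallback to a raw
 * nl_langinfo(CODESET) result.  assume_utf8 is nonzero on platforms
 * where an empty codeset is believed to mean UTF-8.
 */
static const char *
codeset_or_default(const char *sys, int assume_utf8)
{
    if (assume_utf8 && strlen(sys) == 0)
        return "UTF-8";
    return sys;
}

/*
 * Hypothetical helper: copy the codeset name of the named locale into buf.
 * Returns 0 on success, -1 if the locale cannot be queried or selected.
 */
static int
locale_codeset(const char *locale_name, char *buf, size_t buflen)
{
    char        saved[256];
    const char *cur;
    int         assume_utf8 = 0;

#if defined(__darwin__) || defined(__APPLE__)
    /* Darwin reports "" for locales that carry no encoding suffix */
    assume_utf8 = 1;
#endif

    /* Remember the current LC_CTYPE setting so it can be restored */
    cur = setlocale(LC_CTYPE, NULL);
    if (cur == NULL)
        return -1;
    snprintf(saved, sizeof(saved), "%s", cur);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    /* Copy out before restoring; the nl_langinfo() result may be overwritten */
    snprintf(buf, buflen, "%s",
             codeset_or_default(nl_langinfo(CODESET), assume_utf8));

    setlocale(LC_CTYPE, saved);
    return 0;
}
```

With this, a locale such as tr_TR that currently yields an empty string
would be reported as UTF-8 on Darwin, while results that already carry an
encoding name (including the US-ASCII returned for C/POSIX) pass through
untouched.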
I suppose we'll need a few more hacks like this as the beta-test results
begin to roll in ...
regards, tom lane