> JiangWei <[EMAIL PROTECTED]> writes:
> >         LANG=zh_CN.UTF-8
> > [ set client_encoding to LATIN1 and provoke an error ]
> 
> OK, I can reproduce the crash after initdb'ing with that LANG setting
> (in an nls-enabled build).  The postmaster log fills with a whole lot
> of occurrences of
> 
> [garbled]:  [garbled Chinese message text] UTF-8 [garbled] 0x00e9
> [garbled]:  [garbled Chinese message text] UTF-8 [garbled] 0x00e8
> [garbled]:  [garbled Chinese message text] UTF-8 [garbled] 0x00e8
> [garbled]:  [garbled Chinese message text] UTF-8 [garbled] 0x00e8
> [garbled]:  ERRORDATA_STACK_SIZE exceeded
> 
> Tracing through the dump shows that the error-handling code is
> recursively producing this warning while trying to translate the word
> WARNING to LATIN1.  The zh_CN.po file shows the translation as
> 
> #: utils/error/elog.c:1909
> msgid "WARNING"
> msgstr "警告"
> 
> (which apparently is GB2312?)

It seems so. zh_CN.po has the line:

"Content-Type: text/plain; charset=GB2312\n"

Which means that whoever wrote the file at least intended it to be
GB2312. However, please note that GB2312 is a character set, not an
encoding. In reality the file seems to be encoded in EUC-CN. I have
confirmed this by examining the msgstr bytes above (警告): they are
correct EUC-CN byte sequences. It is possible that the file is not
written in EUC-CN, but I think that is unlikely.
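
For illustration, here is a rough Python sketch of the kind of byte-level
check I mean. It is only a structural test (ASCII bytes, or pairs of bytes
that both fall in 0xA1..0xFE), not a full validation against the GB2312
table. The sample bytes are an assumption: they are not copied from
zh_CN.po but obtained by transcoding the UTF-8 bytes shown in the gdb dump
with Python's "gb2312" codec, which produces EUC-CN. The function name is
made up for this example.

```python
def looks_like_euc_cn(data: bytes) -> bool:
    """Structural check only: ASCII, or two-byte sequences in 0xA1..0xFE."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte, one byte wide.
            i += 1
        elif (0xA1 <= b <= 0xFE and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Lead byte and trail byte are both in the EUC-CN range.
            i += 2
        else:
            return False
    return True


# Assumption: derive the sample from the UTF-8 bytes in the gdb dump
# (octal 350 255 246 345 221 212) instead of reading the .po file.
utf8_bytes = bytes([0o350, 0o255, 0o246, 0o345, 0o221, 0o212])
sample = utf8_bytes.decode("utf-8").encode("gb2312")

print(sample.hex())                   # the four EUC-CN bytes
print(looks_like_euc_cn(sample))      # structurally valid EUC-CN
print(looks_like_euc_cn(utf8_bytes))  # the UTF-8 form fails the check
```

The UTF-8 form fails because it contains the byte 0x91, which is outside
the 0xA1..0xFE range, so the two encodings are easy to tell apart here.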

> and what's actually getting passed to
> utf8_to_iso8859_1() is
> 
> (gdb) x/6o str
> 0x8b89d8:       0350    0255    0246    0345    0221    0212
> 
> I have no idea if this is a correct UTF8 transliteration of the GB2312
> phrase --- can anyone confirm?

As far as I can tell from utils/mb/Unicode/euc_cn_to_utf8.map, the
translation above seems to be correct. BTW, who does the translation
from EUC-CN to UTF-8? Maybe gettext()?
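
The same check can be sketched in Python. Note the assumptions: this uses
Python's own codec tables, not euc_cn_to_utf8.map, so it only shows that
the dumped bytes are well-formed UTF-8, that they survive a round trip
through EUC-CN, and that they have no LATIN1 equivalent. The helper name
is made up for this example.

```python
# Octal bytes copied from the gdb "x/6o str" output quoted above.
dumped = bytes([0o350, 0o255, 0o246, 0o345, 0o221, 0o212])

# Raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = dumped.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)

# Round trip through EUC-CN (Python's "gb2312" codec) and back to UTF-8.
euc_cn = text.encode("gb2312")
round_trip = euc_cn.decode("gb2312").encode("utf-8")
print(round_trip == dumped)


def convertible_to_latin1(s: str) -> bool:
    """Return True if every character of s exists in ISO 8859-1."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(convertible_to_latin1(text))       # Chinese text: not convertible
print(convertible_to_latin1("WARNING"))  # untranslated ASCII: convertible
```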
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> But anyway, if this is Chinese then it's
> hardly surprising that there would be no LATIN1 equivalent.  And then
> trying to report the problem gets us into a new instance of the same
> problem.  Even the code that's supposed to stop error recursion doesn't
> get us out of it.
> 
> It seems to me that there basically is no graceful solution to this sort
> of mismatch.  It might be possible to kluge things so that we disable
> NLS once we've recursed too many times in error processing, but that's
> surely pretty ugly.  What would be a lot more user-friendly would be to
> refuse the attempt to set client_encoding to something that can't handle
> our error message encoding, but I don't know what a reasonable set of
> restrictions would be.
> 
> Comments?
> 
>                       regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 
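
To make the "disable NLS once we've recursed too many times" idea above
concrete, here is a toy Python model. It is a sketch under stated
assumptions, not how elog.c works: the Reporter class, MAX_DEPTH, the
one-entry CATALOG and the message text are all hypothetical, and the real
backend is C.

```python
MAX_DEPTH = 5  # hypothetical limit, standing in for ERRORDATA_STACK_SIZE

# Hypothetical one-entry message catalog (the zh_CN translation of WARNING).
CATALOG = {"WARNING": "\u8b66\u544a"}


class Reporter:
    def __init__(self, client_encoding: str):
        self.client_encoding = client_encoding
        self.depth = 0
        self.sent = []  # byte strings that would go to the client

    def translate(self, msgid: str) -> str:
        # Once we are too deep, stop translating and fall back to the
        # untranslated ASCII msgid, which any client encoding can carry.
        if self.depth > MAX_DEPTH:
            return msgid
        return CATALOG.get(msgid, msgid)

    def report(self, severity: str, message: str) -> None:
        self.depth += 1
        try:
            line = f"{self.translate(severity)}:  {message}"
            try:
                payload = line.encode(self.client_encoding)
            except UnicodeEncodeError:
                # The conversion failure is itself reported as a warning,
                # which is where the recursion comes from.
                self.report("WARNING", "unconvertible character in message")
                payload = line.encode(self.client_encoding, errors="replace")
            self.sent.append(payload)
        finally:
            self.depth -= 1


reporter = Reporter("latin-1")
reporter.report("WARNING", "something went wrong")
print(len(reporter.sent))
print(reporter.sent[0])
```

Without the depth check in translate(), every nested report would hit the
same encoding failure and the recursion would never end. With it, the
innermost report goes out untranslated and the outer ones unwind.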

