> JiangWei <[EMAIL PROTECTED]> writes:
> > LANG=zh_CN.UTF-8
> > [ set client_encoding to LATIN1 and provoke an error ]
>
> OK, I can reproduce the crash after initdb'ing with that LANG setting
> (in an nls-enabled build). The postmaster log fills with a whole lot
> of occurrences of
>
> ������: ��������������������� UTF-8 ������ 0x00e9
> ������: ��������������������� UTF-8 ������ 0x00e8
> ������: ��������������������� UTF-8 ������ 0x00e8
> ������: ��������������������� UTF-8 ������ 0x00e8
> ���������������������������������: ERRORDATA_STACK_SIZE exceeded
>
> Tracing through the dump shows that the error-handling code is
> recursively producing this warning while trying to translate the word
> WARNING to LATIN1. The zh_CN.po file shows the translation as
>
> #: utils/error/elog.c:1909
> msgid "WARNING"
> msgstr "����"
>
> (which apparently is GB2312?)

It seems so. zh_CN.po has the line:

"Content-Type: text/plain; charset=GB2312\n"

which means that whoever wrote the file at least intended it to be
GB2312. However, please note that GB2312 is a character set, not an
encoding. In reality the file seems to be encoded in EUC-CN. I confirmed
this just by checking that the bytes above (����) are valid EUC-CN byte
sequences. It is possible that the file is not written in EUC-CN, but I
think that is very unlikely.

> and what's actually getting passed to
> utf8_to_iso8859_1() is
>
> (gdb) x/6o str
> 0x8b89d8: 0350 0255 0246 0345 0221 0212
>
> I have no idea if this is a correct UTF8 transliteration of the GB2312
> phrase --- can anyone confirm?

Judging from utils/mb/Unicode/euc_cn_to_utf8.map, the translation above
seems to be correct.

BTW, who does the translation from EUC-CN to UTF-8? Maybe gettext()?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> But anyway, if this is Chinese then it's
> hardly surprising that there would be no LATIN1 equivalent. And then
> trying to report the problem gets us into a new instance of the same
> problem. Even the code that's supposed to stop error recursion doesn't
> get us out of it.
>
> It seems to me that there basically is no graceful solution to this sort
> of mismatch. It might be possible to kluge things so that we disable
> NLS once we've recursed too many times in error processing, but that's
> surely pretty ugly. What would be a lot more user-friendly would be to
> refuse the attempt to set client_encoding to something that can't handle
> our error message encoding, but I don't know what a reasonable set of
> restrictions would be.
>
> Comments?
>
> regards, tom lane
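
For anyone who wants to double-check the byte sequence from the gdb dump without reading the .map file, here is a minimal Python sketch. It is not part of the PostgreSQL tree and only illustrates the check. It assumes that Python's "gb2312" codec produces the EUC-CN byte form, and the names GDB_OCTAL_BYTES, decode_utf8 and can_encode are made up for this example. It decodes the six bytes as UTF-8, re-encodes the result to see whether it fits GB2312, and then tests whether the same text can be represented in LATIN1 at all.

```python
# Bytes shown by gdb's `x/6o str`, written here as octal literals.
GDB_OCTAL_BYTES = [0o350, 0o255, 0o246, 0o345, 0o221, 0o212]


def decode_utf8(octets):
    """Decode a list of byte values as UTF-8 (raises if the bytes are malformed)."""
    return bytes(octets).decode("utf-8")


def can_encode(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


text = decode_utf8(GDB_OCTAL_BYTES)

# Assumption: Python's "gb2312" codec emits EUC-CN bytes (two bytes per character).
euc_cn = text.encode("gb2312")

print([f"U+{ord(ch):04X}" for ch in text])       # code points of the decoded string
print(euc_cn.hex(" "))                           # the same text as EUC-CN bytes
print("fits LATIN1:", can_encode(text, "latin-1"))
```

If the decode step succeeds and the LATIN1 check reports False, that matches the situation described in this thread: the string is well-formed UTF-8, but a conversion to LATIN1 has nothing to map it to.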