Oops. The problem is here: TParserInit allocates half as much memory as needed. This happens when sizeof(wchar_t) < sizeof(pg_wchar) and the locale is C on a non-Windows box. On Windows, the encoding must also be non-UTF. So all p_is* functions are broken in this case, because they work with wrong data.

pg_mb2wchar_with_len() converts server-encoded strings to pg_wchar strings. But pg_wchar is typedef'd as unsigned int, which is not the same as wchar_t, at least on Windows (unsigned short).
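To make the size mismatch concrete, here is a minimal sketch of the buffer-sizing issue. It is not the TParserInit code: demo_pg_wchar and both helper functions are hypothetical stand-ins, and the only fact taken from the thread is that pg_wchar is an unsigned int while wchar_t may be narrower.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for pg_wchar, which the thread describes as
 * a typedef for unsigned int. */
typedef unsigned int demo_pg_wchar;

/* Bytes for nchars characters plus a terminator when the buffer is
 * sized by wchar_t only. This is too small if pg_wchar values are
 * later stored in it and sizeof(wchar_t) < sizeof(demo_pg_wchar). */
static size_t
demo_size_by_wchar_t(size_t nchars)
{
    return (nchars + 1) * sizeof(wchar_t);
}

/* Bytes for the same buffer when it is sized by the wider of the two
 * element types, so that either kind of string fits. */
static size_t
demo_size_by_widest(size_t nchars)
{
    size_t elem = sizeof(wchar_t) > sizeof(demo_pg_wchar)
        ? sizeof(wchar_t)
        : sizeof(demo_pg_wchar);

    return (nchars + 1) * elem;
}
```

With a 2-byte wchar_t and a 4-byte unsigned int, the first function returns exactly half of the second, which matches the "half as much memory" symptom.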
mbstowcs/wcstombs don't work with the C locale on other OSes either, so that's not needed.

I modified it corresponding to the change in char2wchar() so that wchar2char(char2wchar(x)) becomes x. Though I'm not sure if it is
I don't see a way to produce a correct result from char2wchar with the C locale and sizeof(wchar_t) = 2.

If there's an effective function like pg_wchar2mb_with_len() which converts wchar_t strings to server-encoded strings, we had better simply call it for char2wchar().
In summary, I suggest removing C-locale support from the char2wchar function; tsearch's parser should call pg_mb2wchar_with_len() directly in the case of the C locale and a multibyte encoding. In all other places, char2wchar is called only for non-C locales.
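The dispatch proposed in the summary could be sketched roughly as follows. This is only an illustration, not the attached patch: the enum, the function name, and its parameters are hypothetical, and the single-byte branch is an assumption that a single-byte encoding needs no wide conversion at all.

```c
#include <stdbool.h>

/* Hypothetical labels for the conversion path the parser would take. */
typedef enum
{
    DEMO_USE_BYTES,         /* assumed: single-byte encoding, no wide strings */
    DEMO_USE_PG_MB2WCHAR,   /* C locale + multibyte: pg_mb2wchar_with_len() */
    DEMO_USE_CHAR2WCHAR     /* non-C locale: char2wchar() */
} demo_conv;

/* Choose a conversion path from the locale and the maximum number of
 * bytes per character in the server encoding. */
static demo_conv
demo_choose_conversion(bool locale_is_c, int max_encoding_len)
{
    if (max_encoding_len <= 1)
        return DEMO_USE_BYTES;

    if (locale_is_c)
        return DEMO_USE_PG_MB2WCHAR;

    return DEMO_USE_CHAR2WCHAR;
}
```

With this split, char2wchar() is never reached in the C locale, so it no longer has to handle that case.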
Please test the attached patch.

--
Teodor Sigaev
E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/
clocale.patch.gz
Description: Unix tar archive