On Wed, Jun 10, 2015 at 10:09:38AM -0400, Tom Lane wrote: > Noah Misch <n...@leadboat.com> writes: > > I can reproduce this with "initdb --locale=C", > > postgresql-9.4.3-1-windows-binaries.zip (32-bit), Windows 7 x64, and the > > Windows ANSI code page set to CP936. (Choose "Chinese (Simplified, PRC)" in > > Control Panel -> Region and Language -> Administrative -> Language for > > non-Unicode programs.) It is neither necessary nor sufficient to change > > Control Panel -> Region and Language -> Formats -> Format. Binaries from > > postgresql-9.4.3-1-windows-x64-binaries.zip do not exhibit the problem. > > Note > > that CP936 is a PG_ENCODING_IS_CLIENT_ONLY() encoding. > > Hm. I could understand getting encoding difficulties in that environment, > but it's hard to see why they'd manifest like this. Can you trace through > pg_perm_setlocale and figure out why it's reporting failure?
A faster test is to set LC_CTYPE=C in the environment and run "postgres --version". The root cause is a bug my commit 5f538ad introduced at the start of the 9.4 cycle. pg_perm_setlocale() now calls pg_bind_textdomain_codeset(), which calls setlocale(LC_CTYPE, NULL). POSIX permits that to clobber all previous setlocale() return values, which it did here[1]. The ensuing putenv("LC_CTYPE=<garbage bytes>") at the end of pg_perm_setlocale() fails under Windows ANSI code page 936, because the garbage bytes often aren't a valid CP936 string. I would expect the same symptom on other multibyte Windows locales. While Windows was the bellwether, harm potential is greater on non-Windows systems. pg_perm_setlocale() sets the LC_CTYPE environment variable to help PL/Perl avoid clobbering the process locale; see plperl_init_interp() comments. However, that function has bespoke code for Windows, on which setting the environment variable doesn't help. I don't know which other platforms invalidate previous setlocale() return values on setlocale(LC_CTYPE, NULL). Therefore, I propose committing the attached diagnostic patch and reverting it after about one buildfarm cycle. It will make affected configurations fail hard, and then I'll have a notion about the prevalence of damage to expect in the field. The actual fix is trivial, attached second. This is for back-patch to 9.4. [1] It does so in 32-bit "release" (non-debug), NLS builds done under Visual Studio 2012 and Visual Studio 2013. The official binaries used VS2013. The symptoms are slightly different under VS2012. I did not test earlier versions. Debug builds and 64-bit builds were unaffected.
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c index 4be735e..d33081b 100644 --- a/src/backend/utils/adt/pg_locale.c +++ b/src/backend/utils/adt/pg_locale.c @@ -58,6 +58,7 @@ #include "catalog/pg_collation.h" #include "catalog/pg_control.h" #include "mb/pg_wchar.h" +#include "utils/builtins.h" #include "utils/hsearch.h" #include "utils/memutils.h" #include "utils/pg_locale.h" @@ -148,6 +149,7 @@ pg_perm_setlocale(int category, const char *locale) char *result; const char *envvar; char *envbuf; + char orig_result[LC_ENV_BUFSIZE]; #ifndef WIN32 result = setlocale(category, locale); @@ -173,6 +175,7 @@ pg_perm_setlocale(int category, const char *locale) if (result == NULL) return result; /* fall out immediately on failure */ + strlcpy(orig_result, result, sizeof(orig_result)); /* * Use the right encoding in translated messages. Under ENABLE_NLS, let @@ -231,6 +234,15 @@ pg_perm_setlocale(int category, const char *locale) } snprintf(envbuf, LC_ENV_BUFSIZE - 1, "%s=%s", envvar, result); + if (strcmp(orig_result, result) != 0) + { + char hex[2 * LC_ENV_BUFSIZE + 1]; + + hex_encode(result, Min(1 + strlen(result), LC_ENV_BUFSIZE), hex); + hex[sizeof(hex) - 1] = '\0'; + elog(FATAL, "setlocale() result %s clobbered to 0x%s", + orig_result, hex); + } if (putenv(envbuf)) return NULL;
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c index 4be735e..84215e0 100644 --- a/src/backend/utils/adt/pg_locale.c +++ b/src/backend/utils/adt/pg_locale.c @@ -183,6 +183,12 @@ pg_perm_setlocale(int category, const char *locale) */ if (category == LC_CTYPE) { + static char save_lc_ctype[LC_ENV_BUFSIZE]; + + /* copy setlocale() return value before callee invokes it again */ + strlcpy(save_lc_ctype, result, sizeof(save_lc_ctype)); + result = save_lc_ctype; + #ifdef ENABLE_NLS SetMessageEncoding(pg_bind_textdomain_codeset(textdomain(NULL))); #else
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers