On Wed, Jun 10, 2015 at 10:09:38AM -0400, Tom Lane wrote:
> Noah Misch <n...@leadboat.com> writes:
> > I can reproduce this with "initdb --locale=C",
> > postgresql-9.4.3-1-windows-binaries.zip (32-bit), Windows 7 x64, and the
> > Windows ANSI code page set to CP936.  (Choose "Chinese (Simplified, PRC)" in
> > Control Panel -> Region and Language -> Administrative -> Language for
> > non-Unicode programs.)  It is neither necessary nor sufficient to change
> > Control Panel -> Region and Language -> Formats -> Format.  Binaries from
> > postgresql-9.4.3-1-windows-x64-binaries.zip do not exhibit the problem.  
> > Note
> > that CP936 is a PG_ENCODING_IS_CLIENT_ONLY() encoding.
> 
> Hm.  I could understand getting encoding difficulties in that environment,
> but it's hard to see why they'd manifest like this.  Can you trace through
> pg_perm_setlocale and figure out why it's reporting failure?

A faster test is to set LC_CTYPE=C in the environment and run "postgres
--version".  The root cause is a bug my commit 5f538ad introduced at the start
of the 9.4 cycle.  pg_perm_setlocale() now calls pg_bind_textdomain_codeset(),
which calls setlocale(LC_CTYPE, NULL).  POSIX permits that to clobber all
previous setlocale() return values, which it did here[1].  The ensuing
putenv("LC_CTYPE=<garbage bytes>") at the end of pg_perm_setlocale() fails
under Windows ANSI code page 936, because the garbage bytes often aren't a
valid CP936 string.  I would expect the same symptom on other multibyte
Windows locales.

While Windows was the bellwether, harm potential is greater on non-Windows
systems.  pg_perm_setlocale() sets the LC_CTYPE environment variable to help
PL/Perl avoid clobbering the process locale; see plperl_init_interp()
comments.  However, that function has bespoke code for Windows, on which
setting the environment variable doesn't help.  I don't know which other
platforms invalidate previous setlocale() return values on setlocale(LC_CTYPE,
NULL).  Therefore, I propose committing the attached diagnostic patch and
reverting it after about one buildfarm cycle.  It will make affected
configurations fail hard, and then I'll have a notion about the prevalence of
damage to expect in the field.

The actual fix is trivial, attached second.  This is for back-patch to 9.4.


[1] It does so in 32-bit "release" (non-debug), NLS builds done under Visual
Studio 2012 and Visual Studio 2013.  The official binaries used VS2013.  The
symptoms are slightly different under VS2012.  I did not test earlier
versions.  Debug builds and 64-bit builds were unaffected.
diff --git a/src/backend/utils/adt/pg_locale.c 
b/src/backend/utils/adt/pg_locale.c
index 4be735e..d33081b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -58,6 +58,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
@@ -148,6 +149,7 @@ pg_perm_setlocale(int category, const char *locale)
        char       *result;
        const char *envvar;
        char       *envbuf;
+       char            orig_result[LC_ENV_BUFSIZE];
 
 #ifndef WIN32
        result = setlocale(category, locale);
@@ -173,6 +175,7 @@ pg_perm_setlocale(int category, const char *locale)
 
        if (result == NULL)
                return result;                  /* fall out immediately on 
failure */
+       strlcpy(orig_result, result, sizeof(orig_result));
 
        /*
         * Use the right encoding in translated messages.  Under ENABLE_NLS, let
@@ -231,6 +234,15 @@ pg_perm_setlocale(int category, const char *locale)
        }
 
        snprintf(envbuf, LC_ENV_BUFSIZE - 1, "%s=%s", envvar, result);
+       if (strcmp(orig_result, result) != 0)
+       {
+               char            hex[2 * LC_ENV_BUFSIZE + 1];
+
+               hex_encode(result, Min(1 + strlen(result), LC_ENV_BUFSIZE), 
hex);
+               hex[sizeof(hex) - 1] = '\0';
+               elog(FATAL, "setlocale() result %s clobbered to 0x%s",
+                        orig_result, hex);
+       }
 
        if (putenv(envbuf))
                return NULL;
diff --git a/src/backend/utils/adt/pg_locale.c 
b/src/backend/utils/adt/pg_locale.c
index 4be735e..84215e0 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -183,6 +183,12 @@ pg_perm_setlocale(int category, const char *locale)
         */
        if (category == LC_CTYPE)
        {
+               static char save_lc_ctype[LC_ENV_BUFSIZE];
+
+               /* copy setlocale() return value before callee invokes it again 
*/
+               strlcpy(save_lc_ctype, result, sizeof(save_lc_ctype));
+               result = save_lc_ctype;
+
 #ifdef ENABLE_NLS
                
SetMessageEncoding(pg_bind_textdomain_codeset(textdomain(NULL)));
 #else
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to