Juan José Santamaría Flecha <[email protected]>
writes:
> On Sun, Mar 29, 2020 at 3:29 AM Tom Lane <[email protected]> wrote:
>> The reason for the hack, per the comments, is that VS2015
>> omits a codepage field from the result of _create_locale();
>> and some optimism is expressed therein that Microsoft might
>> undo that oversight in future. Has this been fixed in more
>> recent VS versions? If not, can we find another, more robust
>> way to do it?
> While working on another issue I have seen this issue reproduce in VS2019.
> So no, it has not been fixed.
Oh well, I figured that was too optimistic :-(
> Please find attached a patch that provides a better detection of the "utf8"
> cases.
In general, I think the problem is that we might be dealing with a
Unix-style locale string, in which the encoding name might be quite
a few other things besides "utf8". But actually your patch works
for that too, since what's going to happen next is we'll search the
encoding_match_list[] for a match. I do suggest being a bit more
paranoid about what's a codepage number though, as attached.
(Untested, since I lack a Windows environment, but it's pretty
straightforward code.)
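To make the suffix rule easy to try outside a Windows build, here is a
minimal standalone sketch of just that part. The function name
encoding_from_locale_suffix() is made up for illustration and does not
exist in chklocale.c; the logic is meant to mirror the non-GetLocaleInfoEx()
branch of the patch, nothing more.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of chklocale.c): return a malloc'd
 * encoding name taken from whatever follows the last dot in a locale
 * string.  Returns NULL if there is no dot or if malloc fails.
 */
static char *
encoding_from_locale_suffix(const char *ctype)
{
    const char *suffix = strrchr(ctype, '.');
    size_t      ln;
    char       *r;

    if (suffix == NULL)
        return NULL;

    suffix++;                   /* skip the dot itself */
    ln = strlen(suffix);
    r = malloc(ln + 3);         /* "CP" + suffix + terminating NUL */
    if (r == NULL)
        return NULL;

    /* All digits: treat it as a Windows codepage number. */
    if (strspn(suffix, "0123456789") == ln)
        sprintf(r, "CP%s", suffix);
    else
        strcpy(r, suffix);      /* Unix-style name, passed through as-is */

    return r;
}
```

With this, "English_United States.1252" should come back as "CP1252",
while "en_US.utf8" should come back as plain "utf8", which is then left
for the encoding_match_list[] lookup to recognize or reject.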
regards, tom lane
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index c9c680f..9e3c6db 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -239,25 +239,44 @@ win32_langinfo(const char *ctype)
 	{
 		r = malloc(16); /* excess */
 		if (r != NULL)
-			sprintf(r, "CP%u", cp);
+		{
+			/*
+			 * If the return value is CP_ACP that means no ANSI code page is
+			 * available, so only Unicode can be used for the locale.
+			 */
+			if (cp == CP_ACP)
+				strcpy(r, "utf8");
+			else
+				sprintf(r, "CP%u", cp);
+		}
 	}
 	else
 #endif
 	{
 		/*
-		 * Locale format on Win32 is <Language>_<Country>.<CodePage> . For
-		 * example, English_United States.1252.
+		 * Locale format on Win32 is <Language>_<Country>.<CodePage>. For
+		 * example, English_United States.1252. If we see digits after the
+		 * last dot, assume it's a codepage number. Otherwise, we might be
+		 * dealing with a Unix-style locale string; Windows' setlocale() will
+		 * take those even though GetLocaleInfoEx() won't, so we end up here.
+		 * In that case, just return what's after the last dot and hope we can
+		 * find it in our table.
 		 */
 		codepage = strrchr(ctype, '.');
 		if (codepage != NULL)
 		{
-			int ln;
+			size_t ln;
 			codepage++;
 			ln = strlen(codepage);
 			r = malloc(ln + 3);
 			if (r != NULL)
-				sprintf(r, "CP%s", codepage);
+			{
+				if (strspn(codepage, "0123456789") == ln)
+					sprintf(r, "CP%s", codepage);
+				else
+					strcpy(r, codepage);
+			}
 		}
 	}