On 21/06/2023 01:02, Joe Conway wrote:
On 6/19/23 19:30, Heikki Linnakangas wrote:
On 18/06/2023 21:27, Joe Conway wrote:
With the patch you're proposing, do we now have a coding rule that you
must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to
setlocale()? If so, you missed a few spots: pg_perm_setlocale,
pg_bind_textdomain_codeset, and cache_locale_time.

Well I was not proposing such a rule (trying to stay narrowly focused on
the demonstrated issue) but I suppose it might make sense. Anywhere we
use setlocale() we are depending on subsequent locale operations to use
the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it
ought to be pretty cheap.

The current locale affects a lot of other things than localeconv()
calls. For example, LC_MESSAGES affects all strerror() calls. Do we need
to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror()
calls too?

That seems heavy handed

Yet I think that's exactly where this is heading. See this case (for gettext() rather than strerror()):

postgres=# set lc_messages ='sv_SE.UTF8';
SET
postgres=# this prints syntax error in Swedish;
FEL:  syntaxfel vid eller nära "this"
LINE 1: this prints syntax error in Swedish;
        ^
postgres=# load 'plperl';
LOAD
postgres=# set lc_messages ='en_GB.utf8';
SET
postgres=# this *should* print syntax error in English;
FEL:  syntaxfel vid eller nära "this"
LINE 1: this *should* print syntax error in English;
        ^

I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
returning from the perl interpreter, instead of before setlocale()
calls, if we want all Postgres code to run with the global locale. Not
sure how much performance overhead that would have.

I don't see how that is practical, or at least it does not really
address the issue. I think any loaded shared library could cause the
same problem by running newlocale() + uselocale() on init. Perhaps I
should go test that theory though.

Any shared library could do that, that's true. Any shared library could also call 'chdir'. But most shared libraries don't. I think it's the responsibility of the extension that loads the shared library, plperl in this case, to make sure it doesn't mess up the environment for the postgres backend.

Testing the patch, I bumped into this:

postgres=# create or replace function finnish_to_number() returns
numeric as $$ select to_number('1,23', '9D99'); $$ language sql set
lc_numeric to 'fi_FI.utf8';
CREATE FUNCTION
postgres=# DO LANGUAGE 'plperlu' $$
use POSIX qw(setlocale LC_NUMERIC);
use locale;

setlocale LC_NUMERIC, "fi_FI.utf8";

$n = 5/2;   # Assign numeric 2.5 to $n

spi_exec_query('SELECT finnish_to_number()');

$a = " $n"; # Locale-dependent conversion to string
elog(NOTICE, "half five is $n");       # Locale-dependent output
$$;
NOTICE:  half five is 2,5
DO
postgres=# select to_char(now(), 'Day');
WARNING:  could not determine encoding for locale "en_GB.UTF-8": codeset
is "ANSI_X3.4-1968"
     to_char
-----------
    Tuesday
(1 row)

Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the
rug from under perl?

libperl is fine in this case. But cache_locale_time() also calls setlocale(), and your patch didn't add the "uselocale(LC_GLOBAL_LOCALE)" there.

It's a valid concern that "uselocale(LC_GLOBAL_LOCALE)" could pull the rug from under perl. I tried to find issues like that, by calling locale-dependent functions in plperl, with SQL functions that call "uselocale(LC_GLOBAL_LOCALE)" via PGLC_localeconv() in between. But I couldn't find any case where the perl code would misbehave. I guess libperl calls uselocale() before any locale-dependent function, but I didn't look very closely.

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to