Branch: refs/heads/smoke-me/khw-env
  Home:   https://github.com/Perl/perl5
  Commit: 8749dd46d2e966183cc4d36ea2fb0c26bac0a01d
      
https://github.com/Perl/perl5/commit/8749dd46d2e966183cc4d36ea2fb0c26bac0a01d
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M embedvar.h
    M handy.h
    M inline.h
    M intrpvar.h
    M locale.c
    M makedef.pl
    M mg.c
    M perl.c
    M perl.h
    M pod/perlvar.pod
    M proto.h
    M sv.c

  Log Message:
  -----------
  Add ability to emulate thread-safe locale operations

Locale information was originally global for an entire process.  Later,
it was realized that different threads could want to be running in
different locales.  Windows added this ability, and POSIX 2008 followed
suit (though using a completely different API).  When available, perl
automatically uses these capabilities.

But many platforms have neither, or their implementation, such as on
Darwin, is buggy.  This commit adds the capability for Perl programs to
operate as if the platform were thread-safe.

This implementation is based on the observation that the underlying
locale matters only to relatively few libc calls, and only during their
execution.  It can be anything at all at any other time.  perl keeps
what the proper locale should be for each category in a per-thread
array.  Each locale-dependent operation must be wrapped in mutex
lock/unlock operations.  The lock additionally compares what libc knows
the locale to be, and what it should be for this thread at this time,
and changes the actual locale to the proper value if necessary.  That's
all that is needed.
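
The mechanism described above can be sketched roughly as follows.  This
is only an illustration under stated assumptions, not the code in
locale.c: the names wanted, thread_set_locale, emulated_lock and
emulated_unlock are hypothetical, only three categories are tracked, and
a POSIX mutex stands in for whatever the platform provides.

```c
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical names throughout; this is not perl's actual code. */
enum { N_CATEGORIES = 3 };
static const int categories[N_CATEGORIES] = { LC_CTYPE, LC_NUMERIC, LC_TIME };

/* What each category should be for the current thread. */
static _Thread_local const char *wanted[N_CATEGORIES] = { "C", "C", "C" };

/* One process-wide mutex guards every locale-dependent libc call. */
static pthread_mutex_t locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Record the locale this thread wants; libc is not touched yet.
 * The caller must keep 'name' alive for as long as it is in use. */
static void thread_set_locale(int index, const char *name)
{
    assert(index >= 0 && index < N_CATEGORIES);
    wanted[index] = name;
}

/* Take the mutex, then bring the process-wide locale in line with what
 * this thread wants.  Returns how many categories had to be switched. */
static int emulated_lock(void)
{
    int switched = 0;

    pthread_mutex_lock(&locale_mutex);
    for (int i = 0; i < N_CATEGORIES; i++) {
        const char *actual = setlocale(categories[i], NULL);
        if (actual == NULL || strcmp(actual, wanted[i]) != 0) {
            if (setlocale(categories[i], wanted[i]) != NULL)
                switched++;
        }
    }
    return switched;
}

/* Release the mutex; the global locale may now be changed by others. */
static void emulated_unlock(void)
{
    pthread_mutex_unlock(&locale_mutex);
}
```

A caller would bracket each locale-dependent libc call with
emulated_lock() and emulated_unlock().  Between such calls the global
locale is free to be whatever the last thread left it as.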

This commit adds macros to perl.h, for example "MBTOWC_LOCK_", that
expand to do the mutex lock, and change the global locale to the
expected value.  On perls built without this emulation capability, they
are no-ops.  All code in the perl core (unless I've missed something)
is changed to use these macros (there weren't actually many places that
needed this).  Thus, any pure perl program will automatically become
locale-thread-safe under this Configuration.
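
As a rough illustration of macros that do the locking when emulation is
compiled in and vanish otherwise, consider the sketch below.  The build
switch EMULATE_THREAD_SAFE_LOCALES, the MY_-prefixed macros, and the
helper functions are all hypothetical names chosen for this example;
perl's real definitions live in perl.h and differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical build switch and helper names, for illustration only. */
#ifdef EMULATE_THREAD_SAFE_LOCALES
   void my_locale_lock(void);     /* lock mutex, fix up global locale */
   void my_locale_unlock(void);   /* unlock mutex */
#  define MY_MBTOWC_LOCK    my_locale_lock()
#  define MY_MBTOWC_UNLOCK  my_locale_unlock()
#else
   /* Without emulation the macros expand to nothing useful. */
#  define MY_MBTOWC_LOCK    ((void) 0)
#  define MY_MBTOWC_UNLOCK  ((void) 0)
#endif

/* A locale-dependent libc call wrapped in the lock/unlock pair. */
static int locked_mbtowc(wchar_t *pwc, const char *s, size_t n)
{
    assert(s != NULL);   /* a NULL 's' would query shift state instead */

    MY_MBTOWC_LOCK;
    int len = mbtowc(pwc, s, n);
    MY_MBTOWC_UNLOCK;

    return len;
}
```

The calling code looks the same in both builds, which is what allows the
core (and XS code) to use the macros unconditionally.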

In order for XS code to also become locale-thread-safe, it must use
these macros to wrap calls to locale-dependent functions.  Relatively
few modules call such functions.  For example, the only one I found that
ships with the perl core is Time::Piece, and it has more fundamental
issues with running under threads than this.  I am preparing pull
requests for it.

Thus, this is not completely transparent to code, the way native
thread-safe locale handling is.  Therefore ${^SAFE_LOCALES} returns 2
(instead of 1) for this type of thread-safety.

Another deficiency compared to the native thread safety is when a thread
calls a non-perl library that accesses the locale.  The typical example is
Gtk (though this particular application can be configured to not be
problematic).  With the native safe threads, everything works as long as
only one such thread is used per Perl program.  That thread would then
be the only one operating in the global locale, hence there are no
conflicts.  With this emulation, all threads are operating in the global
locale, and mutexes would have to be used to prevent conflicts.  To
minimize those, the code added in this commit restores the global locale,
when it is through, to the state it was in when it started.

A major concern is the performance impact.  This is after all trading
speed for accuracy.  lib/locale_threads.t is noticeably slower when this
is being used.  But that test has multiple threads constantly performing
locale-dependent operations.  I don't notice any change with the rest of
the test suite.  In pure perl, this only comes into play while in the
scope of 'use locale' or when using some of the few POSIX:: functions
that are locale-dependent.  And to some extent when formatting, but the
regular overhead there should dwarf what this adds.

This commit leaves this feature off by default.  The next commit changes
that for the next few 5.39 development releases, so we can see if there
is actually an issue.


  Commit: 533be7e935dadffc0138f89bf9b9b8433103a01f
      
https://github.com/Perl/perl5/commit/533be7e935dadffc0138f89bf9b9b8433103a01f
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  DEBUG Lv to U


  Commit: 0689950acba5f2bb99e831a857a0e91b692dbe51
      
https://github.com/Perl/perl5/commit/0689950acba5f2bb99e831a857a0e91b692dbe51
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  extra debug


  Commit: 3f14c5c7c1961a7e2b969e8f5c0c0085985235e0
      
https://github.com/Perl/perl5/commit/3f14c5c7c1961a7e2b969e8f5c0c0085985235e0
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  f some emul old implement


  Commit: bd02866a65f526baaca56ea02b6cd3e7bbd9a17b
      
https://github.com/Perl/perl5/commit/bd02866a65f526baaca56ea02b6cd3e7bbd9a17b
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  more emul locks


  Commit: cfc5e946e931ecf97c2f90305836c1cc823cde08
      
https://github.com/Perl/perl5/commit/cfc5e946e931ecf97c2f90305836c1cc823cde08
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "more emul locks"

This reverts commit 4733a1674423ee47b33eb0ee1882e1bf39faa1a6.


  Commit: 5720c727567f65525fe863545750f7e8cf9834b6
      
https://github.com/Perl/perl5/commit/5720c727567f65525fe863545750f7e8cf9834b6
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  langinfo lock


  Commit: 83a0df5a79f260118e110f36fde6e67b87719a09
      
https://github.com/Perl/perl5/commit/83a0df5a79f260118e110f36fde6e67b87719a09
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "langinfo lock"

This reverts commit acaff35d7ed83830fb36c149aafede5cdf400061.


  Commit: c6b37a681598c2e81300c7dcb6d575148510bd98
      
https://github.com/Perl/perl5/commit/c6b37a681598c2e81300c7dcb6d575148510bd98
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  lock mask


  Commit: c19a5302ea24f51c834dd44ef2a176760674144d
      
https://github.com/Perl/perl5/commit/c19a5302ea24f51c834dd44ef2a176760674144d
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "lock mask"

This reverts commit 3fd528c9d5d5b9c05dc1c697e61570b81811fb95.


  Commit: ae22053e5893507f9f9a19a05ea858171ccf25e3
      
https://github.com/Perl/perl5/commit/ae22053e5893507f9f9a19a05ea858171ccf25e3
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Maybe comment'


  Commit: 32916021f5e825caa6a5e8c3e223597f80ab3aa4
      
https://github.com/Perl/perl5/commit/32916021f5e825caa6a5e8c3e223597f80ab3aa4
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  emul assertion


  Commit: 50254d2c9627eaaf5d13b3836ac96a16c58bee5d
      
https://github.com/Perl/perl5/commit/50254d2c9627eaaf5d13b3836ac96a16c58bee5d
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c
    M makedef.pl

  Log Message:
  -----------
  Experimentally enable per-thread locale emulation

This is set to end in 5.39.10, but will give us field experience in the
meantime.


  Commit: 9cf5146689e3163a1839938cfd32846248a2ba4e
      
https://github.com/Perl/perl5/commit/9cf5146689e3163a1839938cfd32846248a2ba4e
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M makedef.pl
    M perl.h

  Log Message:
  -----------
  Don't do thread-safe locales emulation on mingw

MingW when compiled with the Universal C runtime (UCRT) is thread-safe
with respect to locales, just as VS 2015 and later MSVCRT compilations
are.

However, versions not using UCRT cannot be compiled to emulate
thread-safe locale.  I'm pretty sure this is due to a bug in the libc
strftime() function, having spent a bunch of hours working on this.

It often fails lib/locale_threads.t when using the emulation, but not
always.  The failure is always in strftime().

What made me think it could be perl is another characteristic of the
failures.  lib/locale_threads.t works by, in each thread, setting each
available locale category to a locale, different from any other category
in that thread, and as different as possible from the locale for the
corresponding category in any other thread.  For example thread 0 might
have LC_CTYPE set to locale X, LC_NUMERIC to Y, LC_TIME to Z, etc.
Thread 1 would use a locale for LC_CTYPE, as different from X as
possible, meaning executing the same operation on thread 0 and thread 1
would yield different expected results.  (It goes to some lengths to
calculate the biggest distance in the results.)  Similarly LC_NUMERIC
would have something almost completely different from Y; and so on.

Then each thread executes a batch of iterations.  Each iteration runs
all the operations I could find that perl uses that apply to LC_CTYPE,
and all the ones that apply to each of the other categories.  And
verifies that all the results are as expected.

Simultaneously, the other threads are executing their batch.  It is
verifying that there is no bleed-through from one thread to another.  If
the threads all have the same results as the other threads, we couldn't
detect if there is real bleed-through or not.  This is solved by making
the results for each category as different as possible from any other
thread currently executing.

However, this isn't good enough.  Every so many iterations, each thread
changes to use a new set of locales.  This verifies that the locales can
be changed in a thread without that bleeding through to other threads.

And thread 0 is special.  It harvests the other threads as they finish,
and keeps going for a while.  This is to catch bugs in thread
completion, of which we've had a few.
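
The bleed-through check described above can be shown in miniature.  The
sketch below is a simulation under loud assumptions: a plain int stands
in for the process-wide locale, a POSIX mutex for the locale lock, and
all names (worker, run_worker, count_bleed_through) are hypothetical.
It is not how lib/locale_threads.t is written; it only shows why giving
each thread a distinct expected result makes interference detectable.

```c
#include <assert.h>
#include <pthread.h>

enum { N_THREADS = 4, N_ITERATIONS = 10000 };

static pthread_mutex_t setting_mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_setting;   /* stand-in for the process-wide locale */

typedef struct {
    int expected;     /* value only this thread should ever observe */
    int mismatches;   /* times another thread's value was seen instead */
} worker;

static void *run_worker(void *arg)
{
    worker *w = arg;

    for (int i = 0; i < N_ITERATIONS; i++) {
        pthread_mutex_lock(&setting_mutex);
        shared_setting = w->expected;    /* "switch to my locale" */
        int observed = shared_setting;   /* "locale-dependent call" */
        pthread_mutex_unlock(&setting_mutex);

        if (observed != w->expected)
            w->mismatches++;
    }
    return NULL;
}

/* Returns total mismatches across all threads; 0 means no bleed-through. */
static int count_bleed_through(void)
{
    pthread_t tids[N_THREADS];
    worker workers[N_THREADS];
    int total = 0;

    for (int t = 0; t < N_THREADS; t++) {
        workers[t].expected = t + 1;     /* distinct per thread */
        workers[t].mismatches = 0;
        int rc = pthread_create(&tids[t], NULL, run_worker, &workers[t]);
        assert(rc == 0);
        (void) rc;
    }
    for (int t = 0; t < N_THREADS; t++) {
        pthread_join(tids[t], NULL);
        total += workers[t].mismatches;
    }
    return total;
}
```

If every thread expected the same value, a mismatch could never be
observed even if the locking were broken, which is why the expected
values are made distinct.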

MingW's failures all occur, when they occur, on the first iteration
following a switch to a new set of locales.  That is suspiciously like
it is a race condition in cleaning up from the previous setting.  But it
isn't necessarily the first test in the first iteration of the new set.
It can be the 10th or so test.  I added enough debugging statements to
convince me that it isn't perl.

This is the failing code in locale.c:

        STRFTIME_LOCK;
        int len = strftime(buf, bufsize, fmt, mytm);
        STRFTIME_UNLOCK;

The returned 'buf' is not always correct.

The LOCK/UNLOCK macros on MingW, with thread-safe emulation enabled, call
EnterCriticalSection(), and set the locales for the categories that
affect strftime() to the proper locale.  Just to be sure, I tested
setting LC_ALL to the correct value.  While in its uninterruptible (by
other locale handling code anyway) section, strftime() fills buf with
the result for the current locale (which STRFTIME_LOCK has set).

I added print statements within the critical section thusly

        STRFTIME_LOCK;
        DEBUG_U(PerlIO_printf(Perl_debug_log,
                              "calling strftime(%s), LC_ALL=%s\n",
                              fmt, setlocale(LC_ALL, NULL)));
        int len = strftime(buf, bufsize, fmt, mytm);
        DEBUG_U(PerlIO_printf(Perl_debug_log,
                              "return=%s, LC_ALL=%s\n",
                              buf, setlocale(LC_ALL, NULL)));
        STRFTIME_UNLOCK;

On this platform, setlocale() expands to _wsetlocale(), a Windows libc
call.

Here's what they showed for one failure.

        calling strftime(%b), LC_ALL=Hungarian_Hungary.1250
        return=marc., LC_ALL=Hungarian_Hungary.1250

The 'a' in the Hungarian for March is supposed to be a U+00E1, with an
acute accent, so this is wrong.

strftime() also is passed a pointer to a struct tm, which is filled in
with various integers which indicate in this case which month the %b is
supposed to return.  That it is returning something very much like 'márc.'
indicates those integers are correct.

Not shown in the example above are the other prints I added to verify
that we are indeed in a critical section.  I didn't see a way to
actually test for this via a libc call, but one could use strace and
wade through the output.  But there are print statements that print out
immediately before entering a critical section, and immediately after
leaving it.  I verified that those prints indicate this code is in a
critical section.

I note that this box doesn't actually have very many locales, so that the
distance between the results of various threads isn't all that large.
Pretty much all the locales are CP 1250, 1251, 1252, and 1257, and no
UTF-8 ones, so all locales are single byte.  None of them map \xE1 into
plain 'a', which is what we are seeing returned, so the cleanup theory
seems wrong.  Sometimes the return is '?' or a series of them,
indicating that the returned character is mojibake.

None of the locales I saw had 'marc\.' as a possible return.  It appears
only here in the entire trace of all threads.  This makes it again less
likely that it is a cleanup issue.  But where did it come from?  I
don't know.  The value for the C locale is 'Mar', so it didn't come from
there.

The localeconv() function is also broken in this Configuration.  We long
ago figured out a workaround for that.  I tried that same workaround for
strftime(), and it didn't help.


  Commit: ad5b70d4620a2538818420a7261a8028549c3984
      
https://github.com/Perl/perl5/commit/ad5b70d4620a2538818420a7261a8028549c3984
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M makedef.pl
    M perl.h

  Log Message:
  -----------
  Revert "Don't do thread-safe locales emulation on mingw"

This reverts commit 4e4dfa1146e1f389110d001587ccb0fadec4323b.


  Commit: db79b2fb94f8a50218bb346e3f56af81915f1356
      
https://github.com/Perl/perl5/commit/db79b2fb94f8a50218bb346e3f56af81915f1356
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M perl.h

  Log Message:
  -----------
  XXX perl.h maybe drop


  Commit: aaae182fa4e74f21d146f8f5f0bc2aa01456a845
      
https://github.com/Perl/perl5/commit/aaae182fa4e74f21d146f8f5f0bc2aa01456a845
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M makedef.pl

  Log Message:
  -----------
  makedef.pl: PL_cur_locale_obj is only POSIX 2008 multiplicity


  Commit: 632b431d605f4a2fd1a0813b665a6752df561579
      
https://github.com/Perl/perl5/commit/632b431d605f4a2fd1a0813b665a6752df561579
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M lib/locale_threads.t

  Log Message:
  -----------
  locale_threads.t: Better handle weird locales

The previous code was generating bunches of uninitialized variable
warnings, due to 1) not checking for definedness early; and 2) not
reevaluating the loop termination condition each time, which is needed
because a potential splice can shorten the array.

This only happens, I believe, on MingW not using UCRT.


  Commit: 7e5f8483e1ad555a5b174d6c307bd6b15960dfb3
      
https://github.com/Perl/perl5/commit/7e5f8483e1ad555a5b174d6c307bd6b15960dfb3
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M lib/locale_threads.t

  Log Message:
  -----------
  Revert "locale_threads.t: Skip on OpenBSD and DragonFly threaded builds"

This reverts commit 1d74e8214dd53cf0fa9e8c5aab3e6187685eadcd, as they
have been modified


  Commit: e8061c1277560da8ac6b9ab1140b313eb5cde0fd
      
https://github.com/Perl/perl5/commit/e8061c1277560da8ac6b9ab1140b313eb5cde0fd
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Debug uselocale


  Commit: 84620d7c1f56ae1d4c3c55247495715ed2586dbd
      
https://github.com/Perl/perl5/commit/84620d7c1f56ae1d4c3c55247495715ed2586dbd
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M pp.c

  Log Message:
  -----------
  pp_study: hook


  Commit: 1f94716672536c5b13a4342eb1eceeeee30adfeb
      
https://github.com/Perl/perl5/commit/1f94716672536c5b13a4342eb1eceeeee30adfeb
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M sv.c

  Log Message:
  -----------
  sv.c need to check for pv in sv in sv_setpvf


  Commit: 18838991d9aa9b816f43a95ef9bd64d21f937b0c
      
https://github.com/Perl/perl5/commit/18838991d9aa9b816f43a95ef9bd64d21f937b0c
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M sv.c

  Log Message:
  -----------
  perlapi: Add detail to sv_setpv_bufsize()


  Commit: d467aca704305b6962c029b1b0515e2270cd31c4
      
https://github.com/Perl/perl5/commit/d467aca704305b6962c029b1b0515e2270cd31c4
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M sv_inline.h

  Log Message:
  -----------
  Add comment to sv_setpv_freshbuf


  Commit: f63b2e13d40fc335e81962c50b88fb5ccb46f2de
      
https://github.com/Perl/perl5/commit/f63b2e13d40fc335e81962c50b88fb5ccb46f2de
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M perl.h

  Log Message:
  -----------
  locale mutexes: Win32 are general without simulating

We can get rid of the simulation needed for other platforms.


  Commit: 7138ebead8e7e906dd1452e85490e925ba31a644
      
https://github.com/Perl/perl5/commit/7138ebead8e7e906dd1452e85490e925ba31a644
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Two loop indices are confined to an enum

They don't take on all possible unsigned values.  Create a macro to do
the casting necessary for some compilers
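
A minimal sketch of what such a macro might look like, assuming an enum
of category indices.  The enum, its members, and the macro name
FOR_EACH_CAT are hypothetical and chosen for illustration; the macro
actually added to locale.c may well be shaped differently.

```c
#include <assert.h>

/* Hypothetical enum of locale category indices. */
typedef enum {
    CAT_CTYPE,
    CAT_NUMERIC,
    CAT_TIME,
    CAT_COUNT      /* one past the last real category */
} cat_index;

/* Some compilers object to arithmetic directly on an enum-typed
 * variable, so the step goes through int and is cast back explicitly. */
#define FOR_EACH_CAT(i)                                   \
    for (cat_index i = (cat_index) 0;                     \
         i < CAT_COUNT;                                   \
         i = (cat_index) ((int) i + 1))

/* Walks every category once and returns how many were visited. */
static int count_categories(void)
{
    int n = 0;

    FOR_EACH_CAT(i) {
        assert(i >= CAT_CTYPE && i < CAT_COUNT);
        n++;
    }
    return n;
}
```

Keeping the casts inside one macro means individual loops stay free of
them, and the loop variable keeps its enum type for any compiler checks
that depend on it.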


  Commit: b4ab381516c01874da3d3ce33c8907b64d302f44
      
https://github.com/Perl/perl5/commit/b4ab381516c01874da3d3ce33c8907b64d302f44
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Use macro created in previous commit

This is a more general macro than these others, which can be rewritten
more simply in terms of it.


  Commit: 26d316e71da0aedbc7e571d3517b9e8fc67c3f37
      
https://github.com/Perl/perl5/commit/26d316e71da0aedbc7e571d3517b9e8fc67c3f37
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M perl.h

  Log Message:
  -----------
  perl.h: Clarify comment


  Commit: 1031e6609ca852996f0d480a01af4725aabd586b
      
https://github.com/Perl/perl5/commit/1031e6609ca852996f0d480a01af4725aabd586b
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  White


  Commit: 56f8657753b12555ab2f33def970e97e70b7f72f
      
https://github.com/Perl/perl5/commit/56f8657753b12555ab2f33def970e97e70b7f72f
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  fundamental


  Commit: 11054a81ce39318220fb27625fd13072f1727dde
      
https://github.com/Perl/perl5/commit/11054a81ce39318220fb27625fd13072f1727dde
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M locale.c
    M proto.h

  Log Message:
  -----------
  immediate use


  Commit: decc83b511188b73f517c4caeaff0eb7ab6d8577
      
https://github.com/Perl/perl5/commit/decc83b511188b73f517c4caeaff0eb7ab6d8577
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  more immed


  Commit: 804f633e518a2105f9b462aac7f3aa1ee718c836
      
https://github.com/Perl/perl5/commit/804f633e518a2105f9b462aac7f3aa1ee718c836
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M locale.c
    M proto.h

  Log Message:
  -----------
  add is_cur_locale_utf8


  Commit: 484eabfe0a2b9c4024434ebfc434f4a937b9a8c1
      
https://github.com/Perl/perl5/commit/484eabfe0a2b9c4024434ebfc434f4a937b9a8c1
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M proto.h

  Log Message:
  -----------
  embed.fnc: Don't compile undefined function

This function now is defined only when USE_LOCALE is #defined; move its
specification in embed.fnc accordingly


  Commit: c392c121cffe8f3723fbe3666f13cb1967d5a886
      
https://github.com/Perl/perl5/commit/c392c121cffe8f3723fbe3666f13cb1967d5a886
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  move subs around


  Commit: 520ad38d9fbce8338bc8123083cf43aaf8688479
      
https://github.com/Perl/perl5/commit/520ad38d9fbce8338bc8123083cf43aaf8688479
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Comments, white space


  Commit: fb8e184c45f6102623f614c9bc76a3d4e2b2c504
      
https://github.com/Perl/perl5/commit/fb8e184c45f6102623f614c9bc76a3d4e2b2c504
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  final


  Commit: ca693a0cbfcec33ec6cc4a236a69af9a388a69bc
      
https://github.com/Perl/perl5/commit/ca693a0cbfcec33ec6cc4a236a69af9a388a69bc
  Author: Karl Williamson <k...@cpan.org>
  Date:   2024-02-07 (Wed, 07 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  MULT


Compare: https://github.com/Perl/perl5/compare/cda83987bff8...ca693a0cbfce
