[Perl/perl5] 61ee02: perl.h: Make sure PERL_IMPLICIT_CONTEXT doesn't co...

2023-01-12 Thread Karl Williamson via perl5-changes
  Branch: refs/heads/smoke-me/khw-env
  Home:   https://github.com/Perl/perl5
  Commit: 61ee02bf8688aa6e23cad5cb213994ca7ad2c046
  
https://github.com/Perl/perl5/commit/61ee02bf8688aa6e23cad5cb213994ca7ad2c046
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M perl.h

  Log Message:
  ---
  perl.h: Make sure PERL_IMPLICIT_CONTEXT doesn't come back

This is an obsolete name, retained for back compat with CPAN.  Make sure
the core doesn't have it defined.


  Commit: 97692bac670ba7ef25084a90fd1c205345ceb97e
  
https://github.com/Perl/perl5/commit/97692bac670ba7ef25084a90fd1c205345ceb97e
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M pp.c

  Log Message:
  ---
  pp.c: Need to lock NUMERIC category only

This was doing a general locale lock, but only LC_NUMERIC is needed, and
a future commit will want to know that it is specifically LC_NUMERIC
that is affected.


  Commit: a7f744d9597bc05ea183e3fad96a7584a8ce2e56
  
https://github.com/Perl/perl5/commit/a7f744d9597bc05ea183e3fad96a7584a8ce2e56
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M t/porting/customized.dat
M vutil.c

  Log Message:
  ---
  vutil.c: Clean up white space

Change tabs to blanks; Fix indentation; chomp trailing white space

Remove some blank lines that don't contribute to readability


  Commit: 74c25c9b86b3620a0d0863af17f8d657b851ed1d
  
https://github.com/Perl/perl5/commit/74c25c9b86b3620a0d0863af17f8d657b851ed1d
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M cpan/Archive-Tar/t/02_methods.t

  Log Message:
  ---
  XXX skip Archive-Tar because of symlinks


  Commit: 283da3c8036233dd901a36de5dfce030a894ec73
  
https://github.com/Perl/perl5/commit/283da3c8036233dd901a36de5dfce030a894ec73
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M t/porting/cmp_version.t

  Log Message:
  ---
  XXX skip cmp_version.t because of sym links


  Commit: f0beb5459e711f923d1907abbcefd9e46e50
  
https://github.com/Perl/perl5/commit/f0beb5459e711f923d1907abbcefd9e46e50
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M perl.h

  Log Message:
  ---
  XXX temp to test broken lconv on non-Windows


  Commit: 0ee842a6baa055195ac81ce876d6c5ce8c2cfc6e
  
https://github.com/Perl/perl5/commit/0ee842a6baa055195ac81ce876d6c5ce8c2cfc6e
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M cpan/Sys-Syslog/t/syslog-inet-udp.t
M cpan/Sys-Syslog/t/syslog.t

  Log Message:
  ---
  XXX skip syslog tests because fail without LC_TIME


  Commit: 7f4fb2860d57038141c06ab8d888fa50269b4d4c
  
https://github.com/Perl/perl5/commit/7f4fb2860d57038141c06ab8d888fa50269b4d4c
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M Configure

  Log Message:
  ---
  XXX Configure temporary to get no_nl, etc working


  Commit: 43ed0b2bf22553e7cd8043a829e5b7183d30d7d4
  
https://github.com/Perl/perl5/commit/43ed0b2bf22553e7cd8043a829e5b7183d30d7d4
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M Configure

  Log Message:
  ---
  Regenerate Configure after metaconfig backports applied


  Commit: 6857600ac705c345d9e1854db9d9b0caa631fc5c
  
https://github.com/Perl/perl5/commit/6857600ac705c345d9e1854db9d9b0caa631fc5c
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M Configure
M config_h.SH
M uconfig.h

  Log Message:
  ---
  Regenerate Configure after rm thread-safe nl_langinfo_l


  Commit: aa3f6f1d3f7faef378e8dfc0bb270b20d9003bad
  
https://github.com/Perl/perl5/commit/aa3f6f1d3f7faef378e8dfc0bb270b20d9003bad
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M Configure
M Cross/config.sh-arm-linux
M Cross/config.sh-arm-linux-n770
M Porting/config.sh
M config_h.SH
M configure.com
M metaconfig.h
M plan9/config_sh.sample
M uconfig.h
M uconfig.sh
M uconfig64.sh
M win32/config.gc
M win32/config.vc

  Log Message:
  ---
  Regenerate Configure after LC_ALL


  Commit: 5a4add6971a5424fe6d1d7f3c3929f08f86f6025
  
https://github.com/Perl/perl5/commit/5a4add6971a5424fe6d1d7f3c3929f08f86f6025
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M Configure
M Cross/config.sh-arm-linux
M Cross/config.sh-arm-linux-n770
M Porting/config.sh
M config_h.SH
M configure.com
M metaconfig.h
M plan9/config_sh.sample
M uconfig.h
M uconfig.sh
M uconfig64.sh
M win32/config.gc
M win32/config.vc

  Log Message:
  ---
  Revert "Regenerate Configure after LC_ALL"

This reverts commit 

[Perl/perl5] 52bccf: regcomp_internal.h: Fix leak in regex tests

2023-01-12 Thread Karl Williamson via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 52bccf632810b58fa7086ef36a1a71d732c5549c
  
https://github.com/Perl/perl5/commit/52bccf632810b58fa7086ef36a1a71d732c5549c
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp_internal.h

  Log Message:
  ---
  regcomp_internal.h: Fix leak in regex tests

Commit fe5492d916201ce31a107839a36bcb1435fe7bf0 introduced leaks when a
regex compilation fails.  This commit uses the standard method we have
to deal with these kinds of things.




[Perl/perl5] 94e919: POSIX.pod: Clarify mbtowc(), wctomb() pod

2023-01-12 Thread Karl Williamson via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 94e919c3f7ce3a89aefc27850b210a9769c65681
  
https://github.com/Perl/perl5/commit/94e919c3f7ce3a89aefc27850b210a9769c65681
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M ext/POSIX/lib/POSIX.pod

  Log Message:
  ---
  POSIX.pod: Clarify mbtowc(), wctomb() pod




[Perl/perl5] 65be4a: Fix PerlEnv_putenv threaded compilation on Windows

2023-01-12 Thread Karl Williamson via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 65be4a0ded9e67be17d2c559c1c0dc8d3d746ed6
  
https://github.com/Perl/perl5/commit/65be4a0ded9e67be17d2c559c1c0dc8d3d746ed6
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M embed.fnc
M embed.h
M inline.h
M iperlsys.h
M proto.h

  Log Message:
  ---
  Fix PerlEnv_putenv threaded compilation on Windows

A second compilation of a workspace would fail.  The first one would
succeed because miniperl was being used, which isn't threaded.




[Perl/perl5] 2dc676: Don't panic if can't destroy mutex during global d...

2023-01-12 Thread Karl Williamson via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 2dc676e9ce23feb0ae948ea62f7e4301931d2188
  
https://github.com/Perl/perl5/commit/2dc676e9ce23feb0ae948ea62f7e4301931d2188
  Author: Karl Williamson 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M thread.h

  Log Message:
  ---
  Don't panic if can't destroy mutex during global destruction

It's going to be destroyed anyway; this just obscures what the real
failure might be.




[Perl/perl5] dcf3fa: Syntax highlight configpm

2023-01-12 Thread Elvin Aslanov via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: dcf3fa421fa60f3cfa5535313b765af4dc49ba36
  
https://github.com/Perl/perl5/commit/dcf3fa421fa60f3cfa5535313b765af4dc49ba36
  Author: Elvin Aslanov 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M configpm

  Log Message:
  ---
  Syntax highlight configpm

For enhancing the source readability in GitHub:

cf. https://raw.githubusercontent.com/Perl/perl5/blead/PACKAGING




[Perl/perl5] 32c009: t/re/re_rests - extend test to show more buffers

2023-01-12 Thread Yves Orton via perl5-changes
  Branch: refs/heads/yves/curlyx_curlym
  Home:   https://github.com/Perl/perl5
  Commit: 32c009ba5d904b97fa291aa857234dd663694b2c
  
https://github.com/Perl/perl5/commit/32c009ba5d904b97fa291aa857234dd663694b2c
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M t/re/re_tests

  Log Message:
  ---
  t/re/re_rests - extend test to show more buffers

This is a tricky test, showing more buffers makes it a bit easier
to understand if you break it. (Guess what I did?)


  Commit: a560ea0be847f8d00ecae70b4894fd3fe7165737
  
https://github.com/Perl/perl5/commit/a560ea0be847f8d00ecae70b4894fd3fe7165737
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp.c
M regcomp.h
M regcomp_internal.h
M t/re/pat.t
M t/re/reg_mesg.t

  Log Message:
  ---
  regcomp.c - increase size of CURLY nodes so the min/max is a I32

This allows us to resolve a test inconsistency between CURLYX and CURLY
and CURLYM. We use I32 because the existing count logic uses -1 and
this keeps everything unsigned compatible.


  Commit: cd38d640c233998e5a998a6f53ff668369fd3168
  
https://github.com/Perl/perl5/commit/cd38d640c233998e5a998a6f53ff668369fd3168
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp_internal.h
M regcomp_study.c

  Log Message:
  ---
  regcomp_study.c - Add a way to disable CURLYX optimisations

Also break up the condition so there is one condition per line so
it is more readable, and fold repeated binary tests together. This
makes it more obvious what the expression is doing.


  Commit: 76e1f20f1d80d8d1bca7e1a4b7410dfe21354764
  
https://github.com/Perl/perl5/commit/76e1f20f1d80d8d1bca7e1a4b7410dfe21354764
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp_debug.c
M regcomp_study.c
M t/re/pat_re_eval.t

  Log Message:
  ---
  regcomp_study.c - disable CURLYX optimizations when EVAL has been seen anywhere

Historically we disabled CURLYX optimizations when they
*contained* an EVAL, on the assumption that the optimization might
affect how many times, etc, the eval was called. However, this is
also true for CURLYX with evals *afterwards*. If the CURLYN or CURLYM
optimization can prune off the search space, then an eval afterwards
will be affected. And when you take into account GOSUB, it means that
an eval in front might be affected by an optimization after it.

So for now we disable CURLYN and CURLYM in any pattern with an EVAL.


  Commit: 995106349af81a044b298cf9c93b5903acf4670c
  
https://github.com/Perl/perl5/commit/995106349af81a044b298cf9c93b5903acf4670c
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regexec.c

  Log Message:
  ---
  regexec.c - rework CLOSE_CAPTURE() macro to take a rex argument

This allows it to be used in contexts where rex isn't set up under
this name.


  Commit: d8f65a38e2cd399eb371be91874931737919938b
  
https://github.com/Perl/perl5/commit/d8f65a38e2cd399eb371be91874931737919938b
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp.c
M regcomp.h

  Log Message:
  ---
  regcomp.h - get rid of EXTRA_STEP defines

They are unused these days.


  Commit: 568942115c3335bae354da4a9a9e7d8f89eeeaee
  
https://github.com/Perl/perl5/commit/568942115c3335bae354da4a9a9e7d8f89eeeaee
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp.c

  Log Message:
  ---
  regcomp.c - add whitespace to binary operation

The tight & is hard to read.


  Commit: 9998e79469c31c02a4a7fb5b394df5c57d6a299e
  
https://github.com/Perl/perl5/commit/9998e79469c31c02a4a7fb5b394df5c57d6a299e
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M regcomp_trie.c

  Log Message:
  ---
  regcomp_trie.c - use the indirect types so we are safe to changes

We shouldn't assume that a TRIEC is a regcomp_charclass. We have a per
opcode type exactly for this type of use, so let's use it.


  Commit: b223d00a98b4a766af19d523037f9b6a8789f43c
  
https://github.com/Perl/perl5/commit/b223d00a98b4a766af19d523037f9b6a8789f43c
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M pod/perldebguts.pod
M pp_ctl.c
M regcomp.c
M regcomp.h
M regcomp.sym
M regcomp_debug.c
M regexec.c
M regexp.h
M regnodes.h
M t/re/pat.t
M t/re/pat_rt_report.t
M t/re/re_tests

  Log Message:
  ---
  regcomp.c - Resolve issues clearing buffers in CURLYX (MAJOR-CHANGE)

CURLYX doesn't reset capture buffers properly. It is possible
for multiple buffers to be defined at once with values from
different iterations of the loop, which doesn't make sense really.

An example is this:

  "foobarfoo"=~/((foo)|(bar))+/

after this matches $1 should equal $2 and $3 should 

[Perl/perl5]

2023-01-12 Thread Yves Orton via perl5-changes
  Branch: refs/heads/yves/re_capture
  Home:   https://github.com/Perl/perl5


[Perl/perl5] 1052f3: test.pl - add support for rtriming fresh perl output

2023-01-12 Thread Yves Orton via perl5-changes
  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 1052f3d04f55d33ed5952af431fb91ccbcf6669f
  
https://github.com/Perl/perl5/commit/1052f3d04f55d33ed5952af431fb91ccbcf6669f
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M t/test.pl

  Log Message:
  ---
  test.pl - add support for rtriming fresh perl output

This makes it easier to do regexp debug tests, where we don't care
about trailing whitespace.

It also fixes the line number reporting for fresh_perl_is() and
fresh_perl_like() so that it shows the actual place where the line
number is located, and it changes the relevant code to work properly
with external $Level overrides.


  Commit: 17419a88100044035ee6dd9f8947f7d411d94863
  
https://github.com/Perl/perl5/commit/17419a88100044035ee6dd9f8947f7d411d94863
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M handy.h

  Log Message:
  ---
  handy.h - add NewCopy() macro to combine New and Copy.
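
As a rough illustration of what "combine New and Copy" means, here is a minimal sketch in plain C. It is not the definition from handy.h: the macro name SKETCH_NEW_COPY, its argument order, and the helper dup_ints are hypothetical, and malloc/memcpy stand in for the core's own allocation and copy macros.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the idea behind NewCopy(): allocate room for
 * n elements of type t in dst, then copy n elements from src into it.
 * This is NOT the handy.h definition. n must be greater than zero here. */
#define SKETCH_NEW_COPY(src, dst, n, t)                     \
    do {                                                    \
        (dst) = (t *)malloc((size_t)(n) * sizeof(t));       \
        assert((dst) != NULL);                              \
        memcpy((dst), (src), (size_t)(n) * sizeof(t));      \
    } while (0)

/* Example helper: duplicate an int array with the combined macro.
 * The caller owns the returned buffer and must free() it. */
static int *dup_ints(const int *src, size_t n)
{
    int *copy;
    SKETCH_NEW_COPY(src, copy, n, int);
    return copy;
}
```

The point of folding both steps into one macro is that the element count and type are written once, so the allocation size and the copy size cannot drift apart.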


  Commit: fe5492d916201ce31a107839a36bcb1435fe7bf0
  
https://github.com/Perl/perl5/commit/fe5492d916201ce31a107839a36bcb1435fe7bf0
  Author: Yves Orton 
  Date:   2023-01-12 (Thu, 12 Jan 2023)

  Changed paths:
M embed.fnc
M embed.h
M mg.c
M proto.h
M regcomp.c
M regcomp_debug.c
M regcomp_internal.h
M regexec.c
M regexp.h
M t/re/pat_advanced.t
M t/re/re_tests

  Log Message:
  ---
  regcomp.c etc - rework branch reset so it works properly

Branch reset was hacked in without much thought about how it might interact
with other features. Over time we added named capture and recursive patterns
with GOSUB, but I guess because branch reset is somewhat esoteric we didn't
notice the accumulating issues related to it.

The main problem was my original hack used a fairly simple device to give
multiple OPEN/CLOSE opcodes the same target buffer id. When it was introduced
this was fine. When GOSUB was added later however, we overlooked that this
broke a key part of the book-keeping for GOSUB.

A GOSUB regop needs to know where to jump to, and which close paren to stop
at. However the structure of the regexp program can change from the time the
regop is created. This means we keep track of every OPEN/CLOSE regop we
encounter during parsing, and when something is inserted into the middle of
the program we make sure to move the offsets we store for the OPEN/CLOSE data.
This is essentially keyed and scaled to the number of parens we have seen.
When branch reset is used however the number of OPEN/CLOSE regops is more than
the number of logical buffers we have seen, and we only move one of the
OPEN/CLOSE buffers that is in the branch reset. Which of course breaks things.
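
The offset book-keeping described above can be sketched roughly as follows. This is only an illustration under assumed names (open_table, record_open, shift_after_insert are hypothetical and do not correspond to identifiers in regcomp.c). It keeps one slot per OPEN regop actually emitted, which is the property that was lost when several OPEN regops shared a single buffer id.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARENS 16

/* Hypothetical table: the program offset of each OPEN regop, one slot
 * per regop emitted during parsing (not one slot per logical buffer). */
struct open_table {
    size_t offset[MAX_PARENS];
    size_t count;               /* number of slots in use */
};

/* Record a newly parsed OPEN regop and return its slot index. */
static size_t record_open(struct open_table *t, size_t offset)
{
    assert(t->count < MAX_PARENS);
    t->offset[t->count] = offset;
    return t->count++;
}

/* `size` units were inserted at program position `at`: every recorded
 * regop at or after that position moves up by `size`. */
static void shift_after_insert(struct open_table *t, size_t at, size_t size)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->offset[i] >= at)
            t->offset[i] += size;
    }
}
```

If the table were instead indexed by the shared buffer id, two OPEN regops inside a branch reset would compete for one slot, and only one of them would be moved by the shift.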

Another issue with branch reset is that it creates weird artifacts like this:
/(?|(?<a>a)|(?<b>b))(?)(?)/ where the (?) actually maps to the (?<a>a)
capture buffer because they both have the same id. Another case is that you
cannot check if $+{b} matched and $+{a} did not, because conceptually they
were the same buffer under the hood.

These bugs are now fixed. The "aliasing" of capture buffers to each other is
now done virtually, and under the hood each capture buffer is distinct. We
introduce the concept of a "logical parno" which is the user visible capture
buffer id, and keep it distinct from the true capture buffer id. Most of the
internal logic uses the "true parno" for its business, so a bunch of problems
go away, and we keep maps from logical to physical parnos, and vice versa,
along with a map that gives us the "next physical parno with the same
logical parno". Thus we can quickly skip through the physical capture buffers
to find the one that matched. This means we also have to introduce a
logical_total_parens as well, to complement the already existing total_parens.
The latter refers to the true number of capture buffers. The former represents
the logical number visible to the user.

It is helpful to consider the following table:

  Logical:    $1      $2     $3      $2     $3     $4      $2  $5
  Physical:   1       2      3       4      5      6       7   8
  Next:       0       4      5       7      0      0       0   0
  Pattern:   /(pre)(?|(?<a>a)(?<b>b)|(?<c>c)(?<d>d)(?<e>e)|(?))(post)/

The names are mapped to physical buffers. So $+{b} will show what is in
physical buffer 3. But $3 will show whichever of buffer 3 or 5 matched.
Similarly @{^CAPTURE} will contain 5 elements, not 8. But %+ will contain all
6 named buffers.

Since the need to map these values is rare, we only store these maps when they
are needed and branch reset has been used; when they are NULL it is assumed
that physical and logical buffers are identical.
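
To make the lookup concrete, here is a small sketch of resolving a logical buffer number to the physical buffer that matched, using the values from the table above. The struct, array, and function names are hypothetical and are not the fields added to the regexp structure. The `matched` array is an assumed stand-in for "this physical buffer holds a capture".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical maps, indexed from 1 (slot 0 is unused). */
struct parno_maps {
    const int *logical_to_physical; /* NULL when branch reset is not used */
    const int *next_same_logical;   /* 0 terminates the chain */
};

/* Values taken from the table: logical $1..$5, physical 1..8. */
static const int sketch_l2p[]  = { 0, 1, 2, 3, 6, 8 };
static const int sketch_next[] = { 0, 0, 4, 5, 7, 0, 0, 0, 0 };

/* Return the physical buffer that matched for `logical`, or 0 if none.
 * matched[p] is nonzero when physical buffer p captured something. */
static int find_matched_physical(const struct parno_maps *m,
                                 int logical, const int *matched)
{
    int p;

    /* No maps stored: physical and logical numbers are identical. */
    if (m->logical_to_physical == NULL)
        return matched[logical] ? logical : 0;

    /* Walk the chain of physical buffers sharing this logical number. */
    p = m->logical_to_physical[logical];
    while (p != 0 && !matched[p])
        p = m->next_same_logical[p];
    return p;
}
```

With the second branch of the alternation matched (physical buffers 4, 5 and 6 set), asking for logical 3 starts at physical 3, finds it unset, and follows the chain to physical 5.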

Currently the way this change is implemented will likely break plug in regexp
engines because they will be missing the new logical_total_parens field at
the very least. Given that the perl internals code is somewhat poorly
abstracted from the regexp engine,