Re: [pcre-dev] Capture not reset inside recursion

2021-06-06 Thread Giuseppe D'Angelo via Pcre-dev
Hi ND,

On Sun, 6 Jun 2021 at 16:09, ND via Pcre-dev wrote:
>
>
> On 2021-06-06 05:53, Zoltán Herczeg wrote:
> > ND I think you have found a pretty nice Perl bug, maybe you could report
> > it to them.
>
> Zoltan, thank you for the great investigation.
> Now I am sure it looks like a Perl bug.
>
> Everybody, feel free to report it. My English is bad and I have great
> difficulty with reporting and further conversation.

I've done it here https://github.com/Perl/perl5/issues/18865 . Kudos
to you and Zoltán for the analysis, and thank you very much for
contributing to Perl and PCRE :)

Cheers,
-- 
Giuseppe D'Angelo

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 


Re: [pcre-dev] Capture not reset inside recursion

2021-06-06 Thread ND via Pcre-dev


On 2021-06-06 05:53, Zoltán Herczeg wrote:
> ND I think you have found a pretty nice Perl bug, maybe you could report
> it to them.


Zoltan, thank you for the great investigation.
Now I am sure it looks like a Perl bug.

Everybody, feel free to report it. My English is bad and I have great
difficulty with reporting and further conversation.




Re: [pcre-dev] Capture not reset inside recursion

2021-06-06 Thread Philip Hazel via Pcre-dev
I agree with Zoltan. I do not think this is a bug.

Regards,
Philip


On Sat, 5 Jun 2021 at 23:43, ND via Pcre-dev wrote:

>
> Here is pcretest listing:
>
>
> PCRE2 version 10.35 2020-05-09
> /(?:(a)?\1)+/
> aaa
>   0: aaa
>
>
> Expected result:
>   0: aa
>
> Perl result:
>   0: aa


Re: [pcre-dev] Capture not reset inside recursion

2021-06-06 Thread Zoltán Herczeg
I did more investigation:

Perl:
/(?:(?:(a)b)?\1)+/ matches abaa
/(?:(?:(ab))?\1)+/ does not match ababab

Both of these pattern / input pairs match in PCRE2. I am pretty sure that Perl 
rewrites (?:(P))? to ((?:P)?), which is valid in some cases, but not in all 
cases. ND, I think you have found a pretty nice Perl bug; maybe you could report 
it to them.
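
Why the two forms are not interchangeable can be seen outside Perl too. The following is only a sketch in TypeScript (ECMAScript regex semantics, not Perl's or PCRE2's), and groupOne is a hypothetical helper written for this note. When the optional part matches nothing, (?:(P))? leaves group 1 unset, whereas ((?:P)?) sets group 1 to the empty string. In Perl and PCRE2 (default options) a backreference to an unset group fails, while a backreference to an empty group matches, so the rewrite can change the result whenever \1 follows. Whether Perl really performs this rewrite is a hypothesis from this thread, not something the code confirms.

```typescript
// Return capture group 1 after a successful match (hypothetical helper).
function groupOne(pattern: RegExp, subject: string): string | undefined {
  const m: RegExpExecArray | null = pattern.exec(subject);
  if (m === null) {
    throw new Error("pattern did not match");
  }
  return m[1];
}

// Optional wrapper around the group: the group never participates.
const optionalAroundGroup = groupOne(/^(?:(a))?$/, "");
// Group around the optional part: the group participates and captures "".
const groupAroundOptional = groupOne(/^((?:a)?)$/, "");

console.log(optionalAroundGroup); // undefined
console.log(groupAroundOptional === ""); // true
```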

Regards,
Zoltan



Re: [pcre-dev] Capture not reset inside recursion

2021-06-05 Thread Zoltán Herczeg
The title is misleading; that feature is a JavaScript thing:

/(?:(a)b|\1)+/ matches aba in Perl, but not in JavaScript.
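
For reference, the JavaScript behaviour can be checked with a short TypeScript sketch. ECMAScript resets the capture groups inside a quantified group at the start of every iteration, lets a backreference to an unset group match the empty string, and rejects an iteration that matches nothing. The helper describeMatch is a hypothetical name used only here. The last call runs the pattern from the original report; it returns aa in JavaScript, though via these reset rules rather than via whatever Perl does.

```typescript
// Report the overall match and capture group 1 (hypothetical helper).
function describeMatch(pattern: RegExp, subject: string): string {
  const m: RegExpExecArray | null = pattern.exec(subject);
  if (m === null) {
    return "no match";
  }
  const group1: string | undefined = m[1];
  return `0: ${m[0]} 1: ${group1 === undefined ? "<unset>" : group1}`;
}

// Second iteration: (a)b fails at the final "a"; group 1 has been reset, so
// \1 can only match the empty string, and an empty iteration is rejected.
console.log(describeMatch(/(?:(a)b|\1)+/, "aba")); // 0: ab 1: a
console.log(/^(?:(a)b|\1)+$/.test("aba")); // false

// The pattern from the original report, for comparison.
console.log(describeMatch(/(?:(a)?\1)+/, "aaa")); // 0: aa 1: a
```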

Anyway, it looks like the problem here is that ()? clears the capturing bracket 
in Perl when the empty case is selected, while PCRE2 restores its previous value.

Matching /(?:(a)??b)+/ against abb shows the same difference: the capturing 
bracket is empty in Perl, while it is set to a in PCRE2.
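
As a third data point, here is a small TypeScript sketch of the same pattern under ECMAScript rules. It is an illustration only, and captureReport and CaptureReport are hypothetical names. JavaScript resets group 1 at the start of the second iteration, where the lazy (a)?? is skipped, so the group ends up unset. That agrees with the Perl result described above, not with PCRE2's.

```typescript
interface CaptureReport {
  whole: string;
  group1: string | undefined;
}

// Collect the overall match and group 1 (hypothetical helper).
function captureReport(pattern: RegExp, subject: string): CaptureReport | null {
  const m: RegExpExecArray | null = pattern.exec(subject);
  return m === null ? null : { whole: m[0], group1: m[1] };
}

// Iteration 1 must take (a) to reach the first b; iteration 2 skips (a)??.
const lazyReport = captureReport(/(?:(a)??b)+/, "abb");
console.log(lazyReport?.whole); // abb
console.log(lazyReport?.group1); // undefined
```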

Even more interesting, /(?:(?:(a))??\1)+/ also matches only aa, although 
the body of the ?? should not be matched in the second iteration.

Let's do some debugging:
Match /(?:(?{ print "<$1>" })(?:(a))??(?{ print "[$1]" })\1)+/ to aaa

Output:
<>[][a][][a]

In the second iteration, the capturing bracket contains a before the ?? is 
executed, and is reset to nothing afterwards.

You will not believe this, but /(?:(?:(?{ print "!" })(a))?\1)+/ matches aaa, 
just as PCRE2 does. The code block should have zero effect on the matching, yet 
it disables something (probably an optimization) and the pattern works as expected.

Is this a Perl bug?

Regards,
Zoltan
 


[pcre-dev] Capture not reset inside recursion

2021-06-05 Thread ND via Pcre-dev



Here is pcretest listing:


PCRE2 version 10.35 2020-05-09
/(?:(a)?\1)+/
aaa
 0: aaa


Expected result:
 0: aa

Perl result:
 0: aa
