[issue34294] re.finditer and lookahead bug

2019-01-19 Thread Ma Lin


Ma Lin  added the comment:

Serhiy Storchaka lost his sight.
Please stop any work and rest, because your left eye will have more burden, and 
your mental burden will make it worse.
Go to hospital ASAP.

If any other core developer wants to review this patch, I would like to give a 
detailed explanation; the logic is not very complicated.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin


Ma Lin  added the comment:

I tried to fix it; feel free to create a new PR if you don't want this one.

PR 11546 raises a small question: should `state->data_stack` be deallocated as well?

FYI, function `state_reset(SRE_STATE* state)` in file `_sre.c`:
https://github.com/python/cpython/blob/d4f9cf5545d6d8844e0726552ef2e366f5cc3abd/Modules/_sre.c#L340-L352




[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch, patch, patch
pull_requests: +11162, 11163, 11164
stage:  -> patch review




[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +11162
stage:  -> patch review




[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch, patch
pull_requests: +11162, 11163
stage:  -> patch review




[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin


Ma Lin  added the comment:

Here is a simplified test case; it seems the `state` is not reset properly.

Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47)
>>> import re
>>> re.findall(r"(?=(<\w+>)(<\w+>)?)", "")
[('', ''), ('', '')]

Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28)
>>> import re
>>> re.findall(r"(?=(<\w+>)(<\w+>)?)", "")
[('', ''), ('', '')]
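
One way to probe the "state is not reset" idea is to compare a single finditer() scan with independent match() calls at every offset, on the assumption that each separate call starts from a clean state. This is only a sketch: the input string below is hypothetical (the strings in the archived transcript above were lost), and the helper names are made up for illustration. The comparison is valid only because the pattern is a lone lookahead, so every match is zero-width.

```python
import re

# Hypothetical input: the original string was lost from the archived message.
TEXT = "<aaa><bbb>"
PATTERN = re.compile(r"(?=(<\w+>)(<\w+>)?)")


def groups_via_finditer(pattern, text):
    """Capture groups as reported by one finditer() scan over text."""
    return [m.groups() for m in pattern.finditer(text)]


def groups_via_fresh_matches(pattern, text):
    """Capture groups from an independent match() call at every offset.

    Only equivalent to finditer() for patterns whose matches are all
    zero-width, such as a pattern made of a single lookahead.
    """
    results = []
    for pos in range(len(text) + 1):
        m = pattern.match(text, pos)
        if m is not None:
            results.append(m.groups())
    return results


if __name__ == "__main__":
    print("finditer:", groups_via_finditer(PATTERN, TEXT))
    print("fresh   :", groups_via_fresh_matches(PATTERN, TEXT))
```

If the two lists differ, a capture group has probably been carried over from an earlier match within the scan.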




[issue34294] re.finditer and lookahead bug

2018-09-10 Thread Ma Lin


Ma Lin  added the comment:

This bug silently generates wrong results, so I suggest marking it as a release 
blocker for 3.7.1.

--
nosy: +Ma Lin




[issue34294] re.finditer and lookahead bug

2018-07-31 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

➜  cpython git:(70d56fb525) ✗ ./python.exe
Python 3.7.0a2+ (tags/v3.7.0a2-341-g70d56fb525:70d56fb525, Jul 31 2018, 
21:58:10)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
➜  cpython git:(70d56fb525) ✗ ./python.exe -c 'import re; print([m.groupdict() 
for m in re.finditer(r"(?=<(?P\w+)/?>(?:(?P.+?))?)", 
"")])'
[{'tag': 'test', 'text': ''}, {'tag': 'foo2', 'text': ''}]


➜  cpython git:(e69fbb6a56) ✗ ./python.exe
Python 3.7.0a2+ (tags/v3.7.0a2-340-ge69fbb6a56:e69fbb6a56, Jul 31 2018, 
22:12:06)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
➜  cpython git:(e69fbb6a56) ✗ ./python.exe -c 'import re; print([m.groupdict() 
for m in re.finditer(r"(?=<(?P\w+)/?>(?:(?P.+?))?)", 
"")])'
[{'tag': 'test', 'text': ''}, {'tag': 'foo2', 'text': None}]

Does this have something to do with commit 
70d56fb52582d9d3f7c00860d6e90570c6259371 (bpo-25054, bpo-1647489)?
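
The comparison of two adjacent commits above could be automated with a small check script that reports its result through the exit status. This is a rough sketch, not the command actually used: the pattern is the simplified one from later in this thread, and the input string and the name `check` are hypothetical.

```python
import re
import sys

# Hypothetical input; the strings in the archived commands above were stripped.
PATTERN = r"(?=(<\w+>)(<\w+>)?)"
TEXT = "<aaa><bbb>"
# The second match sees only one tag, so its optional group should be unset.
EXPECTED = [("<aaa>", "<bbb>"), ("<bbb>", None)]


def check():
    """Return 0 if every match reports the expected groups, else 1."""
    actual = [m.groups() for m in re.finditer(PATTERN, TEXT)]
    return 0 if actual == EXPECTED else 1


if __name__ == "__main__":
    print(sys.version)
    status = check()
    if status:
        # A non-zero exit status marks this build as showing the bug.
        sys.exit(status)
```

Saved under a name such as check_lookahead.py (a made-up file name), it could be passed to `git bisect run` together with a step that rebuilds the interpreter at each revision; exit status 0 marks the revision good and 1 marks it bad.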


Thanks




[issue34294] re.finditer and lookahead bug

2018-07-31 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak




[issue34294] re.finditer and lookahead bug

2018-07-31 Thread Serhiy Storchaka


Change by Serhiy Storchaka :


--
assignee:  -> serhiy.storchaka
nosy: +serhiy.storchaka




[issue34294] re.finditer and lookahead bug

2018-07-31 Thread beardypig


New submission from beardypig :

I am experiencing an issue with the following regex when using finditer. 

(?=<(?P\w+)/?>(?:(?P.+?))?)", "

(I know it's not the best method of dealing with HTML, and this is a simplified 
version)

For example:

[m.groupdict() for m in 
re.finditer(r"(?=<(?P\w+)/?>(?:(?P.+?))?)", 
"")]

In Python 2.7, 3.5, and 3.6 it returns

[{'tag': 'test', 'text': ''}, {'tag': 'foo2', 'text': None}]

But starting with 3.7 it returns

[{'tag': 'test', 'text': ''}, {'tag': 'foo2', 'text': ''}]

The "text" group appears to be a copy of the previous "text" group.


Some other examples:

"Hello" => [{'tag': 'test', 'text': 'Hello'}, {'tag': 
'foo', 'text': 'Hello'}] (expected: [{'tag': 'test', 'text': 'Hello'}, {'tag': 
'foo', 'text': None}])
"Hello" => [{'tag': 'test', 'text': 'Hello'}, 
{'tag': 'foo', 'text': 'Hello'}, {'tag': 'foo', 'text': None}] (expected: 
[{'tag': 'test', 'text': 'Hello'}, {'tag': 'foo', 'text': None}, {'tag': 'foo', 
'text': None}])

--
components: Regular Expressions
messages: 322771
nosy: beardypig, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.finditer and lookahead bug
type: behavior
versions: Python 3.7, Python 3.8
