Karthikeyan Singaravelan <tir.kar...@gmail.com> added the comment:

Copy-paste of the contents of the text file:

In the re module there is an experimental feature called Scanner.
Some unexpected behavior was found while working with it.
Here is an example:

>>> re.Scanner([(r'\w+=(\d+);', lambda s,g: s.match.group(1))]).scan('x=5;')
(['5;'], '')

The obvious error is the trailing semicolon in the result: capturing group 1
is (\d+) and should return only '5'.

Adding a dummy rule at the beginning seems to solve that issue:

>>> re.Scanner([('z', None), (r'\w+=(\d+);', lambda s,g:
...     s.match.group(1))]).scan('x=5;')
(['5'], '')

Adding a capturing group around \w+ also returns the correct answer:

>>> re.Scanner([('z', None), (r'(\w+)=(\d+);', lambda s,g:
...     s.match.group(1))]).scan('x=5;')
(['x'], '')

But then, if I ask for the second group, the problem appears again:

>>> re.Scanner([('z', None), (r'(\w+)=(\d+);', lambda s,g:
...     s.match.group(2))]).scan('x=5;')
(['5;'], '')
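
For convenience, the four cases above can be collected into one script. This
is only a sketch: the helper name scan_group and the results list are made up
for this report and are not part of the re module. The script prints whatever
the running interpreter returns instead of assuming a particular result, since
the output may differ between Python versions.

```python
import re


def scan_group(prefix_rules, pattern, group, text='x=5;'):
    """Scan `text` with one rule whose action returns match.group(group).

    scan_group is a hypothetical helper for this report, not an re API.
    `prefix_rules` is a list of (pattern, action) rules placed before it.
    """
    def action(scanner, token):
        # scanner.match is the match object of the rule that just fired.
        return scanner.match.group(group)

    scanner = re.Scanner(prefix_rules + [(pattern, action)])
    return scanner.scan(text)


# (rules placed before the real rule, pattern, group requested by the action)
cases = [
    ([], r'\w+=(\d+);', 1),
    ([('z', None)], r'\w+=(\d+);', 1),
    ([('z', None)], r'(\w+)=(\d+);', 1),
    ([('z', None)], r'(\w+)=(\d+);', 2),
]

results = []
for prefix_rules, pattern, group in cases:
    result = scan_group(prefix_rules, pattern, group)
    results.append(result)
    print(len(prefix_rules), pattern, group, result)
```

Each printed line shows the number of dummy rules, the pattern, the requested
group, and the (tokens, remainder) pair returned by scan().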

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40259>
_______________________________________