On 23/04/2021 01:53, Andy AO wrote:
After upgrading from Python 3.6.8 to Python 3.9.0, my unit tests
revealed a significant change in the behavior of re.split().
I looked at the relevant documentation, namely the Changelog
<https://docs.python.org/3/whatsnew/changelog.html> and re - Regular
expression operations - Python 3.9.4 documentation
<https://docs.python.org/3/library/re.html?highlight=re%20search#re.split>,
but found no mention of the change.
    number = '123'

    def test_Asterisk_quantifier_with_capture_group(self):
        resultList = re.split(r'(\d*)', self.number)
        if platform.python_version() == '3.6.8':
            self.assertEqual(resultList, ['', '123', ''])
        else:
            self.assertEqual(resultList, ['', '123', '', '', ''])
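For anyone who wants to reproduce this, here is a self-contained sketch of the test above. The class name SplitBehaviourTest is made up for illustration, and the version check uses sys.version_info instead of comparing against the exact string '3.6.8', on the assumption that the change landed in 3.7 (as the re.split docs note suggests).

```python
import re
import sys
import unittest


class SplitBehaviourTest(unittest.TestCase):  # hypothetical class name
    number = '123'

    def test_asterisk_quantifier_with_capture_group(self):
        result = re.split(r'(\d*)', self.number)
        # Assumption: the behaviour differs from 3.7 onwards, so compare
        # version tuples rather than one exact version string.
        if sys.version_info >= (3, 7):
            expected = ['', '123', '', '', '']
        else:
            expected = ['', '123', '']
        self.assertEqual(result, expected)
```

Saved in a file, this can be run with python -m unittest.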
Hi Andy,
That's interesting. The old result is less surprising, but of course
both are technically correct: the 4th element in the result is an empty
string, which your regexp matches.
The oldest version of Python I had lying around to test is 3.7; that has
the same behaviour as 3.9.
I suspect that this behaviour is related to the following note in the
docs for re.split:
    Changed in version 3.7: Added support of splitting on a pattern
    that could match an empty string.
(Your pattern can match an empty string, so I suppose it wasn't
technically supported in 3.6?)
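A rough way to see where the extra elements come from, assuming Python 3.7 or later: list the spans the pattern matches, then compare them with the split result. The variable names here are only for illustration.

```python
import re

text = '123'

# The pattern matches twice: the digits at 0-3, then an empty string
# at the very end of the input (3-3).
spans = [m.span() for m in re.finditer(r'(\d*)', text)]
print(spans)  # [(0, 3), (3, 3)]

# On Python 3.7+ re.split() splits on both matches. Each split point
# contributes the captured group, with the text between the split
# points (empty here) around it.
parts = re.split(r'(\d*)', text)
print(parts)  # on Python 3.7+: ['', '123', '', '', '']
```

The second, empty match accounts for the two additional empty strings at the end of the list.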
-- Thomas
I feel that this clearly does not match the description of the function
in the re.split documentation. It is also strange that after replacing
* with +, the behavior is still the same as in 3.6.8.
1. Why is this change not in the documentation? Or did I simply fail to
find it?
2. Why did the behavior change this way? Was a bug introduced, or was it
a bug fix?
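For comparison, a minimal sketch of the two quantifiers side by side, assuming Python 3.7 or later. The likely reason + is unaffected is that \d+ can never match an empty string, so the 3.7 change to empty matches does not apply to it.

```python
import re

number = '123'

# \d* can match the empty string, so on 3.7+ the empty match at the
# end of the input produces an additional split.
star_parts = re.split(r'(\d*)', number)

# \d+ needs at least one digit, so there is no empty match to split on.
plus_parts = re.split(r'(\d+)', number)

print(star_parts)  # on Python 3.7+: ['', '123', '', '', '']
print(plus_parts)  # ['', '123', '']
```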