On 23/04/2021 01:53, Andy AO wrote:
Upgrading from Python 3.6.8 to Python 3.9.0 and executing unit tests
revealed a significant change in the behavior of re.split().

But looking at the relevant documentation, the Changelog
<https://docs.python.org/3/whatsnew/changelog.html> and re - Regular
expression operations - Python 3.9.4 documentation
<https://docs.python.org/3/library/re.html?highlight=re%20search#re.split>,
I found no mention of such a change.

number = '123'

def test_Asterisk_quantifier_with_capture_group(self):
    resultList = re.split(r'(\d*)', self.number)
    if platform.python_version() == '3.6.8':
        self.assertEqual(resultList, ['', '123', ''])
    else:
        self.assertEqual(resultList, ['', '123', '', '', ''])


Hi Andy,

That's interesting. The old result is less surprising, but of course both are technically correct, since the 4th element in the result (an empty string) matches your regexp.

The oldest version of Python I had lying around to test is 3.7; that has the same behaviour as 3.9.

I suspect that this behaviour is related to the following note in the docs for re.split:


Changed in version 3.7: Added support of splitting on a pattern that could match an empty string.


(your pattern can match an empty string, so I suppose it wasn't technically supported in 3.6?)
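To make the empty match visible, here is a minimal sketch, assuming Python 3.7 or later. It uses the same pattern and input as the test above; the variable names are only illustrative. As far as I can tell, the extra elements come from a second, empty match of \d* at the end of the string:

```python
import re

# All non-overlapping matches of \d* in '123': the digits themselves,
# then an empty match at the very end of the string.
matches = re.findall(r'\d*', '123')

# On Python 3.7+, re.split() splits on that trailing empty match too.
# Because the pattern has a capture group, the captured text of each
# match (including the empty one) is included in the result.
parts = re.split(r'(\d*)', '123')

print(matches)
print(parts)
```

On 3.7+ the split result has five elements: the text before the first match, the captured '123', the (empty) text between the two matches, the captured empty string, and the (empty) remainder.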


-- Thomas




I feel that this is clearly not in line with the description of the
function in the split documentation, and it is also strange that after
replacing * with +, the behavior is still the same as in 3.6.8.
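For comparison, a short sketch of the + variant mentioned above, using the same input (variable names are illustrative). Since \d+ needs at least one digit, it can never match the empty string, so the 3.7 change regarding empty matches should not affect it:

```python
import re

# \d+ requires at least one digit, so there is no empty match at the
# end of the string and only one split point.
parts_plus = re.split(r'(\d+)', '123')

print(parts_plus)
```

This gives the same three-element list that 3.6.8 produced for the * pattern.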

    1. Why is this change not in the documentation? Or did I just fail to
    find it?
    2. Why did the behavior change this way? Was a bug introduced, or was it
    a bug fix?

--
https://mail.python.org/mailman/listinfo/python-list
