Serhiy Storchaka added the comment:
This is an old rule. \w{2,}-(?=\w{2,}) means that a single letter shouldn't be
separated. But there was a bug in such a simple regex: it splits a word after a
non-word character (in particular an apostrophe or a hyphen) if that character
is followed by word characters and a hyphen. There were attempts to fix this bug
in issue596434 and issue965425, but they missed the cases where a non-word
character occurs inside a word.
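
To illustrate the rule and the bug described above, here is a minimal sketch.
It uses only the hyphen rule as a stand-in pattern; this is an assumption made
for illustration, since the real textwrap regex has more alternatives. The
names OLD_HYPHEN_RULE and split_chunks are hypothetical and do not appear in
textwrap.

```python
import re

# Simplified stand-in for the old hyphen rule (an assumption, not the actual
# textwrap source): break after a hyphen only when at least two word
# characters stand on each side of it.
OLD_HYPHEN_RULE = re.compile(r'(\w{2,}-(?=\w{2,}))')


def split_chunks(text):
    """Split text at the points the simplified rule allows, dropping empties."""
    return [chunk for chunk in OLD_HYPHEN_RULE.split(text) if chunk]


# A single letter is not separated from the rest of the word.
print(split_chunks("e-mail"))              # ['e-mail']
print(split_chunks("well-known"))          # ['well-', 'known']

# The bug: the match can start right after the apostrophe, so the word is
# also split there, although no break was intended at that point.
print(split_chunks("what-d'you-call-it"))  # ["what-d'", 'you-', 'call-', 'it']
```

In the last call the chunk "what-d'" ends right after the apostrophe, which is
the kind of unwanted split inside a word that the comment refers to.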
Originally I had assigned this issue only to 3.5 because I supposed that the
solution would need either new features in re or backward-incompatible changes
to the word splitting algorithm. But the solution I found doesn't require
3.5-only features, doesn't change the interface, and fixes both performance and
behavior bugs. So I think it should be applied to the maintained releases too.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue22687>
_______________________________________