Serhiy Storchaka added the comment:

This is an old rule, \w{2,}-(?=\w{2,}): a single letter shouldn't be separated.
But there was a bug in such a simple regex: it splits a word after a non-word
character (in particular an apostrophe or hyphen) if that character is followed
by word characters and a hyphen. There were attempts to fix this bug in
issue596434 and issue965425, but they missed the cases where a non-word
character occurs inside a word.
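
A minimal sketch of the behavior described above, assuming the lookahead is
closed as (?=\w{2,}) and that the pattern is applied on its own with re.split.
The names NAIVE_HYPHEN_RE and naive_split are hypothetical and used only for
illustration; this is not the actual pattern or code used by textwrap.

```python
import re

# Hypothetical stand-in for the "simple regex" rule: break after a hyphen
# only when at least two word characters sit on each side of it.
NAIVE_HYPHEN_RE = re.compile(r"(\w{2,}-(?=\w{2,}))")


def naive_split(text):
    """Split text into chunks at the break points found by the naive rule."""
    # re.split keeps the captured group; drop the empty strings it may emit.
    return [chunk for chunk in NAIVE_HYPHEN_RE.split(text) if chunk]


# Intended behavior: break after the hyphen of an ordinary compound word.
print(naive_split("well-known"))         # ['well-', 'known']

# Intended behavior: a single letter is not separated.
print(naive_split("e-mail"))             # ['e-mail']

# The bug: the match starts at "roll-", so the word is also split right
# after the apostrophe that precedes it.
print(naive_split("rock'n'roll-style"))  # ["rock'n'", 'roll-', 'style']
```

The unwanted break appears because the regex can begin matching in the middle
of a word, directly after any non-word character.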

Originally I had assigned this issue only to 3.5 because I supposed that the
solution would need either new features in re or backward-incompatible changes
to the word splitting algorithm. But the solution I found doesn't require
3.5-only features, doesn't change the interface, and fixes both performance and
behavior bugs. So I think it should be applied to the maintained releases too.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22687>
_______________________________________