Serhiy Storchaka added the comment:

This is an old rule. \w{2,}-(?=\w{2,}) -- a single letter shouldn't be separated. But there was a bug in this simple regex: it splits a word after a non-word character (in particular an apostrophe or hyphen) if that character is followed by word characters and a hyphen. There were attempts to fix this bug in issue596434 and issue965425, but they missed the cases where a non-word character occurs inside a word.
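To make the failure mode concrete, here is a minimal sketch that applies only the simple rule quoted above. It is not the regex actually used by textwrap; the helper name split_on_simple_rule and the sample strings are assumptions chosen for illustration.

```python
import re

# The simplified rule from the comment: at least two word characters, a
# hyphen, and then (as a lookahead) at least two more word characters.
# This is an illustrative pattern, not textwrap's actual wordsep_re.
SIMPLE_RULE = re.compile(r'(\w{2,}-(?=\w{2,}))')


def split_on_simple_rule(text):
    """Split text into chunks wherever the simple rule matches.

    Hypothetical helper: the capturing group keeps each matched piece
    in the result, and empty strings left by re.split are dropped.
    """
    return [chunk for chunk in SIMPLE_RULE.split(text) if chunk]


# A plain hyphenated word is split after the hyphen, as intended.
print(split_on_simple_rule("well-known"))          # ['well-', 'known']

# A single letter before the hyphen is not separated.
print(split_on_simple_rule("e-mail"))              # ['e-mail']

# The bug: the match for "you-" starts right after the apostrophe,
# so the word "d'you" is cut in the middle.
print(split_on_simple_rule("what-d'you-call-it"))  # ["what-d'", 'you-', 'call-', 'it']
```

In the last call the pattern cannot match "what-" because only one word character follows the hyphen, yet it does match "you-" even though that begins inside a word, directly after the apostrophe. That is the kind of split the comment describes.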
Originally I assigned this issue only to 3.5 because I supposed that the solution would need either new features in re or backward-incompatible changes to the word-splitting algorithm. But the solution I found doesn't require 3.5-only features, doesn't change the interface, and fixes both performance and behavior bugs. So I think it should be applied to maintained releases too.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22687>
_______________________________________