Tim Peters <t...@python.org> added the comment:
We can't change defaults without superb reason - Python has millions of users, and changing the output of code "that works" is almost always a non-starter. Improvements to the docs are welcome.

In your example, try running this code after using autojunk=True:

    pending = ""
    for ch in first:
        if ch in sm.bpopular:
            if pending:
                print(repr(pending))
                pending = ""
        else:
            pending += ch
    print(repr(pending))

That shows how `first` is effectively broken into tiny pieces, given that the "popular" characters act like walls. Here's the start of the output:

    '\nUN'
    'QUESTR'
    'NG\nL'
    'x'
    'f'
    '.'
    'L'
    'b'
    "'"
    'x'
    'v'
    '1500'
    ','

and on & on. 'QUESTR' is the longest common contiguous substring remaining.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46667>
_______________________________________
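
For readers without the original strings at hand, here is a minimal, self-contained sketch of the same "wall" effect. The data is made up for illustration: `first` and `second` are hypothetical stand-ins for the reporter's inputs, and `split_on_popular` is a helper name invented here that collects the pieces in a list instead of printing them. The only library features relied on are the `autojunk` argument and the `bpopular` attribute of `difflib.SequenceMatcher`.

```python
import difflib


def split_on_popular(text, popular):
    """Return the runs of `text` left when `popular` characters act as walls."""
    pieces = []
    pending = ""
    for ch in text:
        if ch in popular:
            # A popular character ends the current run (if there is one).
            if pending:
                pieces.append(pending)
                pending = ""
        else:
            pending += ch
    if pending:
        pieces.append(pending)
    return pieces


# Hypothetical data: the second sequence needs at least 200 elements before
# the autojunk heuristic is applied, so it is padded with 300 copies of "I".
first = "UNIQUESTRING"
second = first + "I" * 300

# With autojunk=True, "I" occurs often enough in `second` to count as popular.
sm = difflib.SequenceMatcher(None, first, second, autojunk=True)
pieces = split_on_popular(first, sm.bpopular)
print(sorted(sm.bpopular))
print(pieces)

# With autojunk=False, no character is treated as popular.
sm_off = difflib.SequenceMatcher(None, first, second, autojunk=False)
print(sm_off.bpopular)
```

In this toy case only "I" ends up in `sm.bpopular`, so `first` is cut into 'UN', 'QUESTR' and 'NG', which mirrors the first few pieces in the output quoted in the comment. With `autojunk=False` the set is empty and nothing is cut.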