Tim Peters <[email protected]> added the comment:
We can't change defaults without superb reason - Python has millions of users,
and changing the output of code "that works" is almost always a non-starter.
Improvements to the docs are welcome.
In your example, try running this code after building `sm` with autojunk=True:
    pending = ""
    for ch in first:
        if ch in sm.bpopular:
            if pending:
                print(repr(pending))
            pending = ""
        else:
            pending += ch
    print(repr(pending))
That shows how `first` is effectively broken into tiny pieces, because the
"popular" characters act like walls. Here's the start of the output:
'\nUN'
'QUESTR'
'NG\nL'
'x'
'f'
'.'
'L'
'b'
"'"
'x'
'v'
'1500'
','
and on & on. 'QUESTR' is the longest common contiguous substring remaining.
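For anyone who wants to try this without the data from the report, here is a rough self-contained sketch of the same experiment. The strings `filler`, `first` and `second` are made up for illustration and are not the inputs from the issue; they are only built so that 'I' and '.' occur often enough in `second` to be treated as popular. The loop is the one above, collecting pieces in a list instead of printing them.

```python
from difflib import SequenceMatcher

# Made-up inputs (NOT the strings from the issue). `second` is longer than
# 200 elements, and 'I' and '.' each make up far more than 1% of it, so the
# autojunk heuristic should classify both as popular.
filler = "I." * 150
first = "UNIQUESTRING" + filler
second = filler + "UNIQUESTRING"

sm = SequenceMatcher(None, first, second, autojunk=True)
popular = sm.bpopular  # elements of `second` that autojunk treats as popular

# Same loop as above, but collecting the pieces instead of printing them.
pieces = []
pending = ""
for ch in first:
    if ch in popular:
        if pending:
            pieces.append(pending)
        pending = ""
    else:
        pending += ch
if pending:
    pieces.append(pending)

print(sorted(popular))
print(pieces)

# Compare the longest match found with and without the heuristic.
with_junk = sm.find_longest_match(0, len(first), 0, len(second))
plain = SequenceMatcher(None, first, second, autojunk=False)
without_junk = plain.find_longest_match(0, len(first), 0, len(second))
print(with_junk.size, without_junk.size)
```

With these particular inputs the pieces should come out as 'UN', 'QUESTR' and 'NG', and the longest match reported with autojunk on should be much shorter than with it off, since the long run of popular characters can no longer start a match on its own.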
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue46667>
_______________________________________