On Tue, 9 Dec 2008 at 22:15, eliben wrote:
On Dec 10, 4:12 am, [EMAIL PROTECTED] wrote:
On Mon, 8 Dec 2008 at 23:46, eliben wrote:
This is about Python 2.5.2 - I don't know if there were fixes to this
module in 2.6/3.0.
I think I ran into a bug with the difflib.SequenceMatcher class,
specifically its ratio() method. The following:
SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
returns 0.0
while the same call with 500 replaced by 100 returns .99... something.
Looking at the code of SequenceMatcher, there's some caching going on
when the sequences are longer than 200 elements (and indeed, I can
reproduce the bug above 200 but not below). Can anyone confirm that
this misbehaves and suggest a workaround?
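For reference, the 200-element threshold matches a heuristic in difflib that treats very frequent elements of the second sequence as junk once that sequence has at least 200 items. Later Python versions (2.7.1 and 3.2 onward, so not the 2.5.2 discussed here) let you turn it off with the autojunk argument. The following is only a sketch of that switch: the helper name similarity and the input sequences are made up for illustration and are a variant of the reported case, not the exact sequences above.

```python
from difflib import SequenceMatcher


def similarity(a, b, autojunk=True):
    """Hypothetical helper: SequenceMatcher ratio with autojunk made explicit."""
    # autojunk exists only in Python 2.7.1+ / 3.2+; it is not available in 2.5.2.
    return SequenceMatcher(None, a, b, autojunk=autojunk).ratio()


# Illustrative variant of the reported input (an assumption, not the original
# sequences): the second sequence has 500 items, all of them the same value.
a = [4] + [10] * 500
b = [10] * 500

with_heuristic = similarity(a, b)                      # heuristic enabled (default)
without_heuristic = similarity(a, b, autojunk=False)   # heuristic disabled

# Same shape, but below the 200-item threshold, so the heuristic does not apply.
short = similarity([4] + [10] * 100, [10] * 100)

print(with_heuristic, without_heuristic, short)
```

On CPython 3, the first value should be 0.0 while the other two are close to 1. Whether a given input is affected depends on the difflib version and on whether any non-frequent element is left to anchor a match, so results for other sequences may differ.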
Python 2.5.2 (r252:60911, Sep 29 2008, 20:34:04)
[GCC 4.3.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from difflib import SequenceMatcher
>>> SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
0.99900299102691925
Strange. I could reproduce the problem both on ActiveState Python
2.5.2 for Windows, and in the online Try Python evaluator:
http://try-python.mired.org/
My system is Gentoo, which installs Python from source. Maybe Gentoo
applies patches that the binary releases don't have.
--RDM