New submission from floyd:

I suspect many users of difflib call SequenceMatcher in the following way
(where a and b often have different lengths):

if difflib.SequenceMatcher(None, a, b).quick_ratio() >= threshold:

However, for this use case the current quick_ratio does more work than 
necessary, because it always computes the full value before the comparison. 
Therefore I propose adding an optimized variant, quick_ratio_ge, which would 
be called like this:

if difflib.SequenceMatcher.quick_ratio_ge(None, a, b, threshold):

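Because an upper bound on the ratio can be computed from the lengths of a and 
b alone (the ratio can never exceed 2 * min(len(a), len(b)) / (len(a) + len(b))), 
this function could return False immediately whenever that bound is already 
below threshold, which makes it much faster in many cases.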

An example of how quick_ratio_ge could be implemented is attached.
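
For readers without the attachment, here is a minimal sketch of the idea. It is 
not the attached file: it is written as a free function rather than a 
SequenceMatcher method, and the name and argument order are only assumptions 
taken from the proposal above. It first applies the length-only bound (the same 
bound that the existing real_quick_ratio() returns) and only then falls back 
to the multiset count that quick_ratio() is based on.

```python
import difflib
from collections import Counter


def quick_ratio_ge(a, b, threshold):
    """Hypothetical helper: True if quick_ratio() of a and b is >= threshold."""
    la, lb = len(a), len(b)
    total = la + lb
    if total == 0:
        # difflib defines the ratio of two empty sequences as 1.0.
        return 1.0 >= threshold
    # The number of matches cannot exceed the shorter length, so this is an
    # upper bound on quick_ratio(). If even the bound is too small, stop here.
    if 2.0 * min(la, lb) / total < threshold:
        return False
    # Otherwise count the multiset intersection, as quick_ratio() does.
    matches = sum((Counter(a) & Counter(b)).values())
    return 2.0 * matches / total >= threshold


if __name__ == "__main__":
    a, b, threshold = "abcd", "bcde", 0.7
    expected = difflib.SequenceMatcher(None, a, b).quick_ratio() >= threshold
    print(quick_ratio_ge(a, b, threshold) == expected)
```

The early exit pays off most when a and b differ strongly in length, since no 
element counting is done at all in that case.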

----------
components: Library (Lib)
files: difflib_SequenceMatcher_quick_ratio_ge.py
messages: 244840
nosy: floyd
priority: normal
severity: normal
status: open
title: difflib.SequenceMatcher faster quick_ratio with lower bound specification
type: enhancement
Added file: http://bugs.python.org/file39625/difflib_SequenceMatcher_quick_ratio_ge.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24384>
_______________________________________