[
https://issues.apache.org/jira/browse/LUCENE-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599799#action_12599799
]
Cédrik LIME commented on LUCENE-1183:
-------------------------------------
All of Bob's FuzzyTermEnum patch is in my patch. I only left out a few small
optimizations that brought little benefit but hurt code readability. In other
words, if you commit my patch, you will have nearly all (99.9%) of LUCENE-691.
I think this is an important patch for Lucene 2.4, as it brings vast
performance improvements in fuzzy search (no hard numbers, sorry).
> TRStringDistance uses way too much memory (with patch)
> ------------------------------------------------------
>
> Key: LUCENE-1183
> URL: https://issues.apache.org/jira/browse/LUCENE-1183
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/*
> Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3
> Reporter: Cédrik LIME
> Assignee: Otis Gospodnetic
> Priority: Minor
> Attachments: FuzzyTermEnum.patch, TRStringDistance.java,
> TRStringDistance.patch
>
> Original Estimate: 0.17h
> Remaining Estimate: 0.17h
>
> The implementation of TRStringDistance is based on version 2.1 of
> org.apache.commons.lang.StringUtils#getLevenshteinDistance(String, String),
> which uses an un-optimized implementation of the Levenshtein Distance
> algorithm (it uses way too much memory). Please see Bug 38911
> (http://issues.apache.org/bugzilla/show_bug.cgi?id=38911) for more
> information.
> The commons-lang implementation has been heavily optimized as of version 2.2
> (3x speed-up). I have ported the new implementation to TRStringDistance.
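
For readers unfamiliar with the memory issue described above, here is a
minimal sketch of the usual fix: instead of allocating a full
(n+1) x (m+1) matrix, keep only two rows of the Levenshtein table and swap
them on each pass. This is not the code from the attached patch or from
commons-lang; the class name TwoRowLevenshtein and the method name distance
are hypothetical and chosen only for illustration.

```java
public final class TwoRowLevenshtein {

    private TwoRowLevenshtein() {
        // static utility, not meant to be instantiated
    }

    /**
     * Computes the Levenshtein distance between s and t while holding only
     * two int arrays sized to the shorter string, rather than a full matrix.
     */
    public static int distance(String s, String t) {
        if (s == null || t == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        // Make s the shorter string so the two rows are as small as possible.
        if (s.length() > t.length()) {
            String tmp = s;
            s = t;
            t = tmp;
        }
        int n = s.length();
        int m = t.length();
        if (n == 0) {
            return m; // distance is the number of insertions needed
        }

        int[] prev = new int[n + 1]; // previous row of the table
        int[] curr = new int[n + 1]; // row currently being filled

        // Row 0: turning a prefix of s into the empty string costs i deletions.
        for (int i = 0; i <= n; i++) {
            prev[i] = i;
        }

        for (int j = 1; j <= m; j++) {
            char tj = t.charAt(j - 1);
            curr[0] = j;
            for (int i = 1; i <= n; i++) {
                int cost = (s.charAt(i - 1) == tj) ? 0 : 1;
                int viaGap = Math.min(curr[i - 1] + 1, prev[i] + 1);
                curr[i] = Math.min(viaGap, prev[i - 1] + cost);
            }
            // Swap the rows instead of allocating a new array on each pass.
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        // After the final swap, prev holds the last completed row.
        return prev[n];
    }
}
```

With this layout the extra memory grows with the length of the shorter
string only, which matters in fuzzy search because the distance is computed
once per candidate term.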
--
This message is automatically generated by JIRA.