MoreLikeThis queries are very slow compared to other search types
-----------------------------------------------------------------
Key: LUCENE-1690
URL: https://issues.apache.org/jira/browse/LUCENE-1690
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/*
Affects Versions: 2.4.1
Reporter: Richard Marr
Priority: Minor
The MoreLikeThis object performs term frequency lookups for every query. From
my testing, those lookups seem to account for the majority of the time spent
in MoreLikeThis searches.
For some (I'd venture many) applications it's not necessary for term statistics
to be looked up every time. A fairly naive opt-in caching mechanism tied to the
life of the MoreLikeThis object would allow applications to cache term
statistics for the duration that suits them.
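A minimal sketch of what such an opt-in cache could look like follows. It is
not the patch itself: the class name CachingTermStats, its methods, and the
"field + text" cache key are all hypothetical, and the expensive lookup is
passed in as a plain function so the example stays independent of any Lucene
version. In a real application that function would presumably wrap something
like IndexReader.docFreq(Term).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntBiFunction;

/**
 * Hypothetical opt-in cache for per-term statistics. It lives exactly as long
 * as the object that owns it, so the application decides how long cached
 * values are kept by deciding how long it keeps that object around.
 */
final class CachingTermStats {
    // The expensive lookup: (field, term text) -> document frequency.
    // Assumption: in Lucene this would delegate to the index reader.
    private final ToIntBiFunction<String, String> lookup;
    private final Map<String, Integer> cache = new HashMap<>();
    private final boolean cachingEnabled;
    private int lookupsPerformed;

    CachingTermStats(ToIntBiFunction<String, String> lookup, boolean cachingEnabled) {
        this.lookup = lookup;
        this.cachingEnabled = cachingEnabled;
    }

    /** Returns the statistic for a term, consulting the cache only if enabled. */
    int docFreq(String field, String text) {
        if (!cachingEnabled) {
            lookupsPerformed++;
            return lookup.applyAsInt(field, text);
        }
        // NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
        String key = field + '\u0000' + text;
        Integer cached = cache.get(key);
        if (cached == null) {
            lookupsPerformed++;
            cached = lookup.applyAsInt(field, text);
            cache.put(key, cached);
        }
        return cached;
    }

    /** Number of times the underlying (expensive) lookup was actually called. */
    int lookupsPerformed() {
        return lookupsPerformed;
    }

    /** Drops all cached values, e.g. after the index has been reopened. */
    void clear() {
        cache.clear();
    }
}
```

The trade-off is staleness: cached values no longer reflect the index once
documents are added or deleted, which is why the cache is opt-in and can be
cleared or discarded along with its owner.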
I've got this working in my test code, and I'll put together a patch file when
I get a minute. In my testing this improved performance by a factor of around
10.