[ https://issues.apache.org/jira/browse/LUCENENET-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Digy updated LUCENENET-190:
---------------------------
Attachment: TermInfosReader.rar
With this patch, the TermInfosReader cache can be enabled or disabled through
the application's config file, for example:
{code}
<configuration>
  <appSettings>
    <add key="EnableTermInfosReaderCache" value="false"/>
  </appSettings>
</configuration>
{code}
DIGY
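For illustration only, here is a minimal sketch of how code might read such a setting. The helper name TermInfosReaderSettings.IsCacheEnabled is hypothetical, and the fallback to "enabled" when the key is missing or unparsable is an assumption; the attached patch may handle both differently.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper, not taken from the attached patch.
public static class TermInfosReaderSettings
{
    public const string Key = "EnableTermInfosReaderCache";

    // Takes the settings collection as a parameter so it can be exercised
    // without a config file. In an application one would typically pass
    // System.Configuration.ConfigurationManager.AppSettings.
    public static bool IsCacheEnabled(NameValueCollection appSettings)
    {
        // The indexer returns null when the key is absent.
        string raw = appSettings[Key];

        bool enabled;
        // Assumption: a missing or unparsable value leaves the cache enabled.
        return bool.TryParse(raw, out enabled) ? enabled : true;
    }
}
```

With the config shown above, TermInfosReaderSettings.IsCacheEnabled(ConfigurationManager.AppSettings) would return false under these assumptions.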
> 2.4.0 Performance in TermInfosReader term caching (New implementation of
> SimpleLRUCache)
> ----------------------------------------------------------------------------------------
>
> Key: LUCENENET-190
> URL: https://issues.apache.org/jira/browse/LUCENENET-190
> Project: Lucene.Net
> Issue Type: Improvement
> Environment: v2.4.0
> Reporter: Digy
> Priority: Minor
> Attachments: cache_Gen2.PNG, SimpleLRUCache.rar, TermInfosReader.rar
>
>
> Below is the mail from Michael Garski about the performance of
> TermInfosReader term caching. It would be good to have a faster LRUCache
> implementation in Lucene.Net.
> DIGY
> {quote}
> Doug did an amazing job of porting 2.4.0, doing it mostly on his own!
> Hooray Doug!
> We are using the committed version of 2.4.0 in production and I wanted to
> share a performance issue we discovered and what we've done to work around
> it. From the Java Lucene change log: "LUCENE-1195: Improve term lookup
> performance by adding a LRU cache to the TermInfosReader. In performance
> experiments the speedup was about 25% on average on mid-size indexes with
> ~500,000 documents for queries with 3 terms and about 7% on larger indexes
> with ~4.3M documents."
> The Java implementation uses a LinkedHashMap within the class
> org.apache.lucene.util.cache.SimpleLRUCache, which is very efficient at
> maintaining the cache. As there is no equivalent collection in .NET, the
> current 2.4.0 port uses a combination of a LinkedList to maintain LRU state
> and a HashTable to provide lookups. While this implementation works,
> maintaining the LRU state via the LinkedList creates a fair amount of
> overhead and can result in a significant reduction of performance, most
> likely attributable to the LinkedList.Remove method being O(n). As each thread
> maintains its own cache of 1024 terms, this overhead in performing the
> removal is a drain on performance.
> At this time we have disabled the cache in the method
> TermInfosReader.TermInfo Get(Term term, bool useCache) by always setting the
> useCache parameter to false inside the body of the method. After doing this
> we saw performance return to the 2.3.2 levels. I have not yet had the
> opportunity to experiment with other implementations within the
> SimpleLRUCache to address the performance issue. One approach that
> might solve the issue is to use the HashedLinkedList<T> class provided in the
> C5 collection library [http://www.itu.dk/research/c5/].
> Michael
> Michael Garski
> Search Architect
> MySpace.com
> www.myspace.com/michaelgarski
> {quote}
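As a rough illustration of the problem described in the quoted mail, the sketch below pairs a Dictionary with a LinkedList<T> but stores the list node in the dictionary, so that moving an entry to the front uses LinkedList<T>.Remove(LinkedListNode<T>), which is O(1), instead of LinkedList<T>.Remove(T), which searches the list. The class name LruCacheSketch and its members are hypothetical; this is not the implementation attached to this issue, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an LRU cache with O(1) lookup, update and eviction.
public class LruCacheSketch<TKey, TValue>
{
    private readonly int capacity;
    // Maps each key to its node in the recency list.
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> map;
    // Most recently used entry first, least recently used entry last.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCacheSketch(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
        map = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>(capacity);
    }

    public int Count { get { return map.Count; } }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (!map.TryGetValue(key, out node))
        {
            value = default(TValue);
            return false;
        }
        // Removing by node reference avoids scanning the list.
        order.Remove(node);
        order.AddFirst(node);
        value = node.Value.Value;
        return true;
    }

    public void Put(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (map.TryGetValue(key, out node))
        {
            // Existing key: drop the old node, a fresh one is added below.
            order.Remove(node);
        }
        else if (map.Count >= capacity)
        {
            // Cache full: evict the least recently used entry.
            LinkedListNode<KeyValuePair<TKey, TValue>> oldest = order.Last;
            order.RemoveLast();
            map.Remove(oldest.Value.Key);
        }
        map[key] = order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }
}
```

Whether this removes the slowdown reported above would need to be measured; the mail itself only says the O(n) removal is the most likely cause.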