[jira] Commented: (LUCENE-2075) Share the Term -> TermInfo cache across threads

Michael McCandless (JIRA) Tue, 17 Nov 2009 17:37:04 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779247#action_12779247
 ]


Michael McCandless commented on LUCENE-2075:
--------------------------------------------

bq. We should probably compare with Google's (it's Apache 2 licensed, so why not)

Well, that's just hosted on code.google.com (i.e. it's not "Google's"), and 
reading its description it sounds sort of experimental (though they do state 
that they created a "Production Version").  That made me a bit nervous; 
however, it does sound like people use it in "production".

I think FastLRUCache is probably best for Lucene, because it scales up well 
with a high number of threads.  My guess is that its extra cost at low hit 
rates is negligible for Lucene, but I'll run some perf tests.

It looks like ConcurrentLRUCache (used by FastLRUCache, though the latter also 
does other Solr-specific things) is the right low-level one to use for Lucene?

> Share the Term -> TermInfo cache across threads
> -----------------------------------------------
>
>                 Key: LUCENE-2075
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2075
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1
>
>
> Right now each thread creates its own (thread-private) SimpleLRUCache,
> holding up to 1024 terms.
> This is rather wasteful, since if a high number of threads come
> through Lucene, you're multiplying the RAM usage.  You're also
> cutting way back on the likelihood of a cache hit (except for the known
> multiple times we look up a term within a query, which uses one thread).
> In NRT search we often open new SegmentReaders (on tiny segments), and
> each thread must then spend CPU/RAM creating & populating its own cache.
> Now that we are on 1.5 we can use java.util.concurrent.*, eg
> ConcurrentHashMap.  One simple approach could be a double-barrel LRU
> cache, using 2 maps (primary, secondary).  You check the cache by
> first checking primary; if that's a miss, you check secondary and if
> you get a hit you promote it to primary.  Once primary is full you
> clear secondary and swap them.
> Or... any other suggested approach?
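
For illustration, the double-barrel idea from the description could be sketched 
roughly as below.  This is only a minimal sketch, not a committed patch: the 
class name DoubleBarrelCache, its methods, and the choice to synchronize only 
the swap are assumptions made for this example.  It tolerates benign races 
(an entry may occasionally be lost during a swap), which seems acceptable for 
a cache, since a lost entry only costs an extra lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a double-barrel LRU cache; names are illustrative only.
final class DoubleBarrelCache<K, V> {
    private final int maxSize;
    // Both maps are thread-safe; the references are volatile so a swap
    // becomes visible to readers without locking on the read path.
    private volatile Map<K, V> primary = new ConcurrentHashMap<>();
    private volatile Map<K, V> secondary = new ConcurrentHashMap<>();

    DoubleBarrelCache(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be > 0");
        }
        this.maxSize = maxSize;
    }

    // Check primary first; on a miss, check secondary and promote a hit.
    V get(K key) {
        V value = primary.get(key);
        if (value == null) {
            value = secondary.get(key);
            if (value != null) {
                put(key, value); // promote to primary
            }
        }
        return value;
    }

    // Insert into primary; once primary is full, clear secondary and swap.
    void put(K key, V value) {
        primary.put(key, value);
        if (primary.size() >= maxSize) {
            swap();
        }
    }

    private synchronized void swap() {
        if (primary.size() < maxSize) {
            return; // another thread already swapped
        }
        Map<K, V> old = secondary;
        secondary = primary; // the full map becomes the fallback
        old.clear();         // drop the oldest generation of entries
        primary = old;       // start filling an empty primary
    }
}
```

With this shape, an entry that is never looked up again survives at most two 
swaps, while anything touched between swaps is carried forward by promotion.  
Total memory is bounded by roughly 2 * maxSize entries.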

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


