[ 
https://issues.apache.org/jira/browse/LUCENE-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782069#action_12782069
 ] 

Michael McCandless commented on LUCENE-2075:
--------------------------------------------

OK, as best I can tell, the reason linear scan looks so much faster
with the new cache is some kind of odd GC problem when you use
LinkedHashMap... the DoubleBarrelLRUCache doesn't tickle GC in this
way.

If you turn on -verbose:gc when running the bench, you see this
healthy GC behavior during warmup (running wildcard "*"):

{code}
[GC 262656K->15659K(1006848K), 0.0357409 secs]
[GC 278315K->15563K(1006848K), 0.0351360 secs]
[GC 278219K->15595K(1006848K), 0.0150112 secs]
[GC 278251K->15563K(1006848K), 0.0054310 secs]
{code}
All minor collections, all fairly fast, all rather effective (~270 MB
down to ~15 MB).
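
To make the "how effective was each collection" reading concrete, here is a minimal sketch that parses one -verbose:gc line in the format quoted in this comment and computes how much heap the collection reclaimed. The class name GcLogLine and its methods are hypothetical helpers, not part of Lucene or the JDK, and the regex only covers this exact log format (other JVMs and flags print different layouts).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: one line such as "[GC 262656K->15659K(1006848K), 0.0357409 secs]". */
final class GcLogLine {
    // Groups: kind, heap used before (KB), heap used after (KB), total heap (KB), pause seconds.
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;      // true for "Full GC" (major collection), false for a minor one
    final long beforeKB;     // heap in use before the collection
    final long afterKB;      // heap in use after the collection
    final long heapKB;       // total heap size reported in parentheses
    final double seconds;    // pause time

    private GcLogLine(boolean full, long beforeKB, long afterKB, long heapKB, double seconds) {
        this.full = full;
        this.beforeKB = beforeKB;
        this.afterKB = afterKB;
        this.heapKB = heapKB;
        this.seconds = seconds;
    }

    /** Parses a single log line; throws if the line is not in the expected format. */
    static GcLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized GC line: " + line);
        }
        return new GcLogLine(
            m.group(1).equals("Full GC"),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Double.parseDouble(m.group(5)));
    }

    /** KB freed by this collection. */
    long reclaimedKB() {
        return beforeKB - afterKB;
    }

    /** Fraction of the in-use heap that this collection freed (0.0 to 1.0). */
    double reclaimedFraction() {
        return (double) reclaimedKB() / beforeKB;
    }
}
```

For the first warmup line above, GcLogLine.parse(...).reclaimedFraction() comes out at roughly 0.94, i.e. the minor collection frees about 94% of what was in use.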

But then when the test gets to the *N query:

{code}
[GC 323520K->33088K(1022272K), 0.0377057 secs]
[GC 338432K->78536K(990592K), 0.1830592 secs]
[GC 344776K->118344K(1006336K), 0.1205320 secs]
[GC 384584K->158080K(987264K), 0.2340810 secs]
[GC 400640K->194264K(979136K), 0.2139520 secs]
[GC 436824K->230488K(989760K), 0.2017131 secs]
[GC 463192K->266501K(969152K), 0.1932188 secs]
[GC 499205K->301317K(989632K), 0.1918106 secs]
[GC 530437K->335541K(990080K), 0.1907594 secs]
[GC 564661K->369749K(990528K), 0.1905007 secs]
[GC 599445K->404117K(990208K), 0.1922657 secs]
[GC 633813K->438477K(991680K), 0.1994350 secs]
[GC 670157K->474250K(991040K), 0.2073795 secs]
[GC 705930K->508842K(992832K), 0.2061273 secs]
[GC 742570K->543770K(991936K), 0.1980306 secs]
[GC 777498K->578634K(994560K), 0.1975961 secs]
[GC 815818K->614010K(993664K), 0.2042480 secs]
[GC 851194K->649434K(996096K), 0.1940145 secs]
[GC 889754K->686551K(995264K), 0.1991030 secs]
[Full GC 686551K->18312K(995264K), 0.1838671 secs]
[GC 258632K->54088K(997120K), 0.0735258 secs]
[GC 296456K->90280K(996288K), 0.1382187 secs]
[GC 332648K->126512K(998592K), 0.1427443 secs]
[GC 371888K->163096K(997824K), 0.1472803 secs]
{code}

The minor collections are not nearly as effective -- way too many
objects are for some reason being marked as live (even though they are
not) and promoted to the older generation, thus making the minor
collection much more costly and also requiring major collection every
so often.

Now here's the really crazy thing: if I move the *N query up to be the
first query the benchmark runs, GC is healthy:

{code}
[GC 323868K->17216K(1027840K), 0.0060598 secs]
[GC 322496K->17128K(1006016K), 0.0062586 secs]
[GC 322408K->17160K(1027712K), 0.0008879 secs]
[GC 321672K->17192K(1027776K), 0.0003269 secs]
[GC 321704K->18669K(1028608K), 0.0012964 secs]
[GC 324205K->18741K(1027968K), 0.0104134 secs]
[GC 324277K->18613K(1029632K), 0.0083720 secs]
[GC 326261K->18677K(1029056K), 0.0003520 secs]
{code}

And the query runs about as fast as w/ the new cache.

So..... somehow, running the other queries sets object state up to
confuse GC later.  I'm pretty sure it's the linking that the
LinkedHashMap (in SimpleLRUCache) is doing, because if I forcefully
turn off all caching, GC acts healthy again, and that query runs as
fast as it does w/ the patch.
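
For readers unfamiliar with the linking in question, here is a minimal sketch of an LRU cache built on LinkedHashMap. It is an illustration only: the class name LinkedLruSketch is hypothetical and this is not the source of Lucene's SimpleLRUCache. With accessOrder set to true, LinkedHashMap keeps every entry on a doubly linked list and relinks an entry to the tail on each get, so long-lived and freshly created entries end up pointing at each other.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache on top of LinkedHashMap; not Lucene's actual SimpleLRUCache. */
final class LinkedLruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxSize;

    LinkedLruSketch(int maxSize) {
        // Third argument accessOrder=true: iteration order is least recently
        // accessed first, and every get() moves the entry to the end of the list.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true evicts
        // the least recently used entry (the head of the linked list).
        return size() > maxSize;
    }
}
```

With a maxSize of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because the read moved "a" behind "b" in the list.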

DoubleBarrelLRUCache doesn't tickle GC in this way, so the *N query
runs fast with it.

Sheesh!!


> Share the Term -> TermInfo cache across threads
> -----------------------------------------------
>
>                 Key: LUCENE-2075
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2075
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: ConcurrentLRUCache.java, LUCENE-2075.patch, 
> LUCENE-2075.patch, LUCENE-2075.patch, LUCENE-2075.patch, LUCENE-2075.patch, 
> LUCENE-2075.patch, LUCENE-2075.patch, LUCENE-2075.patch, LUCENE-2075.patch, 
> LUCENE-2075.patch
>
>
> Right now each thread creates its own (thread private) SimpleLRUCache,
> holding up to 1024 terms.
>
> This is rather wasteful, since if there are a high number of threads
> that come through Lucene, you're multiplying the RAM usage.  You're
> also cutting way back on likelihood of a cache hit (except the known
> multiple times we look up a term within-query, which uses one thread).
>
> In NRT search we open new SegmentReaders (on tiny segments) often,
> and each thread must then spend CPU/RAM creating & populating its own
> cache for them.
>
> Now that we are on Java 1.5 we can use java.util.concurrent.*, eg
> ConcurrentHashMap.  One simple approach could be a double-barrel LRU
> cache, using 2 maps (primary, secondary).  You check the cache by
> first checking primary; if that's a miss, you check secondary and if
> you get a hit you promote it to primary.  Once primary is full you
> clear secondary and swap them.
>
> Or... any other suggested approach?
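
The double-barrel approach described in the issue can be sketched as follows. This is a minimal illustration under stated assumptions, not the DoubleBarrelLRUCache from the attached patch: the class name DoubleBarrelSketch is hypothetical, and for simplicity it guards two plain HashMaps with synchronized methods instead of using ConcurrentHashMap as the issue suggests, which sidesteps the question of how to swap the maps safely without a lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-map ("double-barrel") cache; not the class from the LUCENE-2075 patch. */
final class DoubleBarrelSketch<K, V> {
    private final int maxSize;
    private Map<K, V> primary = new HashMap<K, V>();
    private Map<K, V> secondary = new HashMap<K, V>();

    DoubleBarrelSketch(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
    }

    /** Checks primary first; on a miss, checks secondary and promotes a hit to primary. */
    synchronized V get(K key) {
        V value = primary.get(key);
        if (value == null) {
            value = secondary.get(key);
            if (value != null) {
                put(key, value); // promote to primary
            }
        }
        return value;
    }

    /** Adds to primary; once primary is full, clears secondary and swaps the two maps. */
    synchronized void put(K key, V value) {
        primary.put(key, value);
        if (primary.size() >= maxSize) {
            secondary.clear();
            Map<K, V> tmp = primary;
            primary = secondary;  // the emptied map becomes the new primary
            secondary = tmp;      // the full map is kept for one more generation
        }
    }
}
```

Note that there is no per-entry linked list here: recency is tracked only at the granularity of "which of the two maps holds the entry", and eviction happens in bulk when secondary is cleared.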


