[
https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573065#action_12573065
]
Michael Busch commented on LUCENE-1195:
---------------------------------------
{quote}
Unfortunately, it needs to be... no getting around it.
{quote}
You're right, and I'm stupid :)
Actually, what I meant was that the get() and put() methods don't need to
be synchronized if the underlying data structure, i.e. the LinkedHashMap
I'm using, is thread-safe; otherwise it might return inconsistent data.
But the LinkedHashMap is not, unless I decorate it with
Collections.synchronizedMap(). Do you know which is faster: using the
synchronized map, or making get() and put() synchronized? Probably
there's not much difference, because the decorator that
Collections.synchronizedMap() returns essentially does the same thing?
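For reference, here is a minimal sketch of the kind of LRU cache an access-ordered LinkedHashMap gives you, decorated with Collections.synchronizedMap() for thread safety. The class and method names are hypothetical, not taken from the patch; the cache size of 20 just echoes the experiment mentioned in the issue description.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch {

    // Access-ordered LinkedHashMap: passing accessOrder=true to the
    // constructor makes iteration order least-recently-accessed first,
    // and removeEldestEntry() evicts the eldest entry on put() once
    // the cache exceeds maxSize.
    public static <K, V> Map<K, V> newLruCache(final int maxSize) {
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
        // The decorator synchronizes every map method on a single mutex,
        // which is roughly equivalent to declaring get() and put()
        // synchronized on the owning object.
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = newLruCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a", so "b" is now the eldest entry
        cache.put("c", 3); // exceeds maxSize=2, evicting "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Either way the locking is coarse-grained; the decorator just saves writing the synchronized wrappers by hand.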
> Performance improvement for TermInfosReader
> -------------------------------------------
>
> Key: LUCENE-1195
> URL: https://issues.apache.org/jira/browse/LUCENE-1195
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.4
>
> Attachments: lucene-1195.patch
>
>
> Currently we have a bottleneck for multi-term queries: the dictionary lookup
> is done twice for each term. The first time is in Similarity.idf(), where
> searcher.docFreq() is called; the second is when the posting list is opened
> (TermDocs or TermPositions).
> The dictionary lookup is not cheap, so a significant performance improvement
> is possible here if we avoid the second lookup. An easy way to do this is to
> add a small LRU cache to TermInfosReader.
> I ran some performance experiments with an LRU cache size of 20 and a
> mid-size index of 500,000 documents from Wikipedia. Here are some test
> results:
> 50,000 AND queries with 3 terms each:
> old: 152 secs
> new (with LRU cache): 112 secs (26% faster)
> 50,000 OR queries with 3 terms each:
> old: 175 secs
> new (with LRU cache): 133 secs (24% faster)
> For bigger indexes this patch will probably have less impact; for smaller
> ones, more.
> I will attach a patch soon.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]