[jira] [Commented] (SOLR-12366) Avoid SlowAtomicReader.getLiveDocs -- it's slow

Yonik Seeley (JIRA) Sat, 02 Jun 2018 12:10:09 -0700


    [ https://issues.apache.org/jira/browse/SOLR-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499147#comment-16499147 ]


Yonik Seeley commented on SOLR-12366:
-------------------------------------

Nice catch, this stuff has been broken forever!
 Looking back, I think Lucene originally did not expose enough to work 
per-segment, so MultiReader.isDeleted(int doc) did a binary search on every 
call. Once we gained the ability to operate per-segment, some code was never 
converted.
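
To make the cost difference concrete, here is a minimal sketch in plain Java. 
It is a toy model, not Lucene code: the class LiveDocsSketch, the docBase 
array and the boolean[][] standing in for per-segment live docs are all made 
up for illustration. It only shows the shape of the problem: the composite 
view pays a binary search per lookup, while the per-segment loop does not.

```java
import java.util.Arrays;

/** Toy model, not Lucene code: composite lookup vs. per-segment iteration. */
class LiveDocsSketch {

    /**
     * Composite-style lookup: each call binary-searches docBase to find the
     * segment that owns globalDoc. Assumes docBase[0] == 0 and no empty segments.
     */
    static boolean isLiveSlow(int[] docBase, boolean[][] segLive, int globalDoc) {
        int idx = Arrays.binarySearch(docBase, globalDoc);
        if (idx < 0) {
            idx = -idx - 2; // (insertion point) - 1: last segment starting before globalDoc
        }
        return segLive[idx][globalDoc - docBase[idx]];
    }

    /** Counts live docs through the composite view: one binary search per doc. */
    static int countLiveSlow(int[] docBase, boolean[][] segLive, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (isLiveSlow(docBase, segLive, doc)) live++;
        }
        return live;
    }

    /** Counts live docs segment by segment: plain array reads, no search. */
    static int countLivePerSegment(boolean[][] segLive) {
        int live = 0;
        for (boolean[] segment : segLive) {
            for (boolean isLive : segment) {
                if (isLive) live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        boolean[][] segLive = { {true, false, true}, {false, true}, {true, true, true, false} };
        int[] docBase = {0, 3, 5}; // first global doc id of each segment
        System.out.println(countLiveSlow(docBase, segLive, 9));
        System.out.println(countLivePerSegment(segLive));
    }
}
```

Both methods return the same count; the difference is only in how much work 
each lookup does, which is what matters when a caller touches every doc.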
{quote}IMO some callers of SolrIndexSearcher.getSlowAtomicReader should change 
to use MultiFields to avoid the temptation to have a LeafReader that has many 
slow methods.
{quote}
MultiFields has slow methods as well, and if you look at the histories, many 
places already used MultiFields.getDeletedDocs earlier on (and were later 
replaced with the equivalent?).
 For example, commit 6ffc159b40 changed getFirstMatch to use 
MultiFields.getDeletedDocs (which may not have been a bug, since it was 
probably equivalent at the time).

Anyway, I think perhaps we should throw an exception from any method in 
SlowCompositeReaderWrapper that does a binary search. I don't think we need a 
full Reader implementation here.
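
A rough sketch of that fail-fast idea, in plain Java. Everything here is 
hypothetical (FailFastReaderSketch, CompositeView, the Bits interface and 
leafLiveDocs are stand-ins, not the real SlowCompositeReaderWrapper API); it 
only illustrates refusing the slow composite call while leaving the 
per-segment path available.

```java
import java.util.List;

/** Toy model of a composite wrapper that refuses its slow methods. */
class FailFastReaderSketch {

    /** Stand-in for a live-docs bit set; hypothetical, not Lucene's Bits. */
    interface Bits {
        boolean get(int index);
    }

    /** Hypothetical composite view over per-segment live docs. */
    static final class CompositeView {
        private final List<Bits> perSegmentLiveDocs;

        CompositeView(List<Bits> perSegmentLiveDocs) {
            this.perSegmentLiveDocs = perSegmentLiveDocs;
        }

        /** Would need a binary search per lookup, so fail loudly instead. */
        Bits getLiveDocs() {
            throw new UnsupportedOperationException(
                "composite getLiveDocs is slow; use per-segment live docs instead");
        }

        /** The supported path: callers iterate segment by segment. */
        List<Bits> leafLiveDocs() {
            return perSegmentLiveDocs;
        }
    }

    public static void main(String[] args) {
        Bits firstSegment = i -> i != 1; // doc 1 of this segment is deleted
        CompositeView view = new CompositeView(List.of(firstSegment));
        System.out.println(view.leafLiveDocs().get(0).get(0));
        try {
            view.getLiveDocs();
        } catch (UnsupportedOperationException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The point of throwing rather than delegating is that a slow call site shows 
up as a test failure instead of a quiet performance regression.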

A variable name change for "SolrIndexSearcher.leafReader" would really be 
welcome too... it's a bad name.  We've been bitten by the naming before as 
well: see SOLR-9592.

> Avoid SlowAtomicReader.getLiveDocs -- it's slow
> -----------------------------------------------
>
>                 Key: SOLR-12366
>                 URL: https://issues.apache.org/jira/browse/SOLR-12366
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>             Fix For: 7.4
>
>         Attachments: SOLR-12366.patch, SOLR-12366.patch, SOLR-12366.patch, 
> SOLR-12366.patch
>
>
> SlowAtomicReader is of course slow, and its getLiveDocs (based on MultiBits) 
> is slow as it uses a binary search for each lookup.  There are various places 
> in Solr that use SolrIndexSearcher.getSlowAtomicReader and then get the 
> liveDocs.  Most of these places ought to work with SolrIndexSearcher's 
> getLiveDocs method.



