[jira] Commented: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

Shalin Shekhar Mangar (JIRA) Fri, 26 Jun 2009 03:25:32 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724460#action_12724460
 ]


Shalin Shekhar Mangar commented on LUCENE-1630:
-----------------------------------------------

Something is still wrong, I'm seeing:

SEVERE: java.lang.NullPointerException
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:242)
        at org.apache.lucene.search.Searcher.search(Searcher.java:173)
        at 
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1103)
        at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:880)
        at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
        at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)

I did some digging around. It seems that before r787772 (LUCENE-1630), the 
BooleanQuery#scorer used to return a null scorer only if the subScorer was null 
*and* the clause was required. With trunk, the new BooleanQuery#scorer method 
returns null if the sub scorer is null without checking if the clause was 
required. But I'm not sure if that is the only problem. When I changed 
BooleanQuery to the old behavior, I still see the same NPE but this time it is 
coming from DisjunctionMaxQuery#scorer due to subScorer.nextDoc() != 
DocIdSetIterator.NO_MORE_DOCS always being false. The subScorer in this case 
was ExactPhraseScorer.

> Mating Collector and Scorer on doc Id orderness
> -----------------------------------------------
>
>                 Key: LUCENE-1630
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1630
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>         Attachments: LUCENE-1630-2.patch, LUCENE-1630.patch, 
> LUCENE-1630.patch, LUCENE-1630.patch, LUCENE-1630.patch, LUCENE-1630.patch, 
> LUCENE-1630.patch
>
>
> This is a spin off of LUCENE-1593. This issue proposes to expose appropriate 
> API on Scorer and Collector such that one can create an optimized Collector 
> based on a given Scorer's doc-id orderness and vice versa. Copied from 
> LUCENE-1593, here is the list of changes:
> # Deprecate Weight and create QueryWeight (abstract class) with a new 
> scorer(reader, scoreDocsInOrder), replacing the current scorer(reader) 
> method. QueryWeight implements Weight, while score(reader) calls 
> score(reader, false /* out-of-order */) and scorer(reader, scoreDocsInOrder) 
> is defined abstract.
> #* Also add QueryWeightWrapper to wrap a given Weight implementation. This 
> one will also be deprecated, as well as package-private.
> #* Add to Query variants of createWeight and weight which return QueryWeight. 
> For now, I prefer to add a default impl which wraps the Weight variant 
> instead of overriding in all Query extensions, and in 3.0 when we remove the 
> Weight variants - override in all extending classes.
> # Add to Scorer isOutOfOrder with a default to false, and override in BS to 
> true.
> # Modify BooleanWeight to extend QueryWeight and implement the new scorer 
> method to return BS2 or BS based on the number of required scorers and 
> setAllowOutOfOrder.
> # Add to Collector an abstract _acceptsDocsOutOfOrder_ which returns 
> true/false.
> #* Use it in IndexSearcher.search methods, that accept a Collector, in order 
> to create the appropriate Scorer, using the new QueryWeight.
> #* Provide a static create method to TFC and TSDC which accept this as an 
> argument and creates the proper instance.
> #* Wherever we create a Collector (TSDC or TFC), always ask for out-of-order 
> Scorer and check on the resulting Scorer isOutOfOrder(), so that we can 
> create the optimized Collector instance.
> # Modify IndexSearcher to use all of the above logic.
> The only class I'm worried about, and would like to verify with you, is 
> Searchable. If we want to deprecate all the search methods on IndexSearcher, 
> Searcher and Searchable which accept Weight and add new ones which accept 
> QueryWeight, we must do the following:
> * Deprecate Searchable in favor of Searcher.
> * Add to Searcher the new QueryWeight variants. Here we have two choices: (1) 
> break back-compat and add them as abstract (like we've done with the new 
> Collector method) or (2) add them with a default impl to call the Weight 
> versions, documenting these will become abstract in 3.0.
> * Have Searcher extend UnicastRemoteObject and have RemoteSearchable extend 
> Searcher. That's the part I'm a little bit worried about - Searchable 
> implements java.rmi.Remote, which means there could be an implementation out 
> there which implements Searchable and extends something different than 
> UnicastRemoteObject, like Activeable. I think there is very small chance this 
> has actually happened, but would like to confirm with you guys first.
> * Add a deprecated, package-private, SearchableWrapper which extends Searcher 
> and delegates all calls to the Searchable member.
> * Deprecate all uses of Searchable and add Searcher instead, defaulting the 
> old ones to use SearchableWrapper.
> * Make all the necessary changes to IndexSearcher, MultiSearcher etc. 
> regarding overriding these new methods.
> One other optimization that was discussed in LUCENE-1593 is to expose a 
> topScorer() API (on Weight) which returns a Scorer that its score(Collector) 
> will be called, and additionally add a start() method to DISI. That will 
> allow Scorers to initialize either on start() or score(Collector). This was 
> proposed mainly because of BS and BS2 which check if they are initialized in 
> every call to next(), skipTo() and score(). Personally I prefer to see that 
> in a separate issue, following that one (as it might add methods to 
> QueryWeight).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] Commented: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

Reply via email to