[ 
https://issues.apache.org/jira/browse/LUCENE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736431#comment-14736431
 ] 

Adrien Grand commented on LUCENE-6785:
--------------------------------------

I think the API is more usable this way. The only concern I might have is that 
we don't have as much flexibility for rewriting as we used to since we can't 
make decisions for rewriting based on the impl of the rewritten queries, since 
we only know about the rewritten queries once we have the weights and dropping 
weights could be problematic since they can be expensive to create due to 
statistics collection. I am not sure how much of an issue this would be but eg. 
I think Terry's patch on LUCENE-6787 would be impacted. An alternative to the 
current patch might be to do something like this:

{noformat}
Index: lucene/core/src/java/org/apache/lucene/search/Query.java
===================================================================
--- lucene/core/src/java/org/apache/lucene/search/Query.java    (revision 
1701820)
+++ lucene/core/src/java/org/apache/lucene/search/Query.java    (working copy)
@@ -57,18 +57,32 @@
   /**
    * Expert: Constructs an appropriate Weight implementation for this query.
    * <p>
-   * Only implemented by primitive queries, which re-write to themselves.
+   * This method will first {@link #rewrite(IndexReader)} the query and then
+   * construct a concrete {@link Weight} on the rewritten query using
+   * {@link #doCreateWeight(IndexSearcher, boolean)}.
    *
    * @param needsScores   True if document scores ({@link Scorer#score}) or 
match
    *                      frequencies ({@link Scorer#freq}) are needed.
    */
-  public Weight createWeight(IndexSearcher searcher, boolean needsScores) 
throws IOException {
-    throw new UnsupportedOperationException("Query " + this + " does not 
implement createWeight");
+  public final Weight createWeight(IndexSearcher searcher, boolean 
needsScores) throws IOException {
+    Query query = this;
+    final IndexReader reader = searcher.getIndexReader();
+    for (Query rewrittenQuery = query.rewrite(reader); rewrittenQuery != query;
+         rewrittenQuery = query.rewrite(reader)) {
+      query = rewrittenQuery;
+    }
+    return query.doCreateWeight(searcher, needsScores);
   }
 
-  /** Expert: called to re-write queries into primitive queries. For example,
-   * a PrefixQuery will be rewritten into a BooleanQuery that consists
-   * of TermQuerys.
+  /** Construct a Weight for this {@link Query}. In case this {@link Query} 
object
+   *  always {@link #rewrite(IndexReader) rewrites} to a different Query
+   *  implementation, it is safe to implement this method by just throwing an
+   *  exception as it will never get called. */
+  protected abstract Weight doCreateWeight(IndexSearcher searcher, boolean 
needsScores);
+
+  /** Internal: called to re-write queries into primitive queries. This method
+   *  only exists as an implementation detail of
+   *  {@link #createWeight(IndexSearcher, boolean)}. Do not call directly.
    */
   public Query rewrite(IndexReader reader) throws IOException {
     return this;
{noformat}

We would still have the same flexibility for rewriting and consumers of the 
Query API would not have the trap anymore that they need to rewrite queries 
first. However I agree it's not as clean... 

> Consider merging Query.rewrite() into Query.createWeight()
> ----------------------------------------------------------
>
>                 Key: LUCENE-6785
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6785
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>         Attachments: LUCENE-6785.patch
>
>
> Prompted by the discussion on LUCENE-6590.
> Query.rewrite() is a bit of an oddity.  You call it to create a query for a 
> specific IndexSearcher, and to ensure that you get a query implementation 
> that has a working createWeight() method.  However, Weight itself already 
> encapsulates the notion of a per-searcher query.
> You also need to repeatedly call rewrite() until the query has stopped 
> rewriting itself, which is a bit trappy - there are a few places (in 
> highlighting code for example) that just call rewrite() once, rather than 
> looping round as IndexSearcher.rewrite() does.  Most queries don't need to be 
> called multiple times, however, so this seems a bit redundant.  And the ones 
> that do currently return un-rewritten queries can be changed simply enough to 
> rewrite them.
> Finally, in pretty much every case I can find in the codebase, rewrite() is 
> called purely as a prelude to createWeight().  This means, in the case of for 
> example large BooleanQueries, we end up cloning the whole query structure, 
> only to throw it away immediately.
> I'd like to try removing rewrite() entirely, and merging the logic into 
> createWeight(), simplifying the API and removing the trap where code only 
> calls rewrite once.  What do people think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to