[ 
https://issues.apache.org/jira/browse/LUCENE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704561#action_12704561
 ] 

Eks Dev commented on LUCENE-1518:
---------------------------------

IMO, it is really not all that important to make Filter and Query the same 
(that is just one way to achieve the goal). 

The basic problem we are trying to solve is adding a Filter directly to a 
BooleanQuery and making later optimizations easier. Wrapping with CSQ 
(ConstantScoreQuery) just adds another layer between the Lucene search 
machinery and the Filter, making these optimizations harder.

On the other hand, I must accept that conceptually Filter and Query are "the 
same"; together they support the following options:
1. Pure boolean model: you do not care about scores (today we can do this only 
via CSQ, as a Filter cannot be added to a BooleanQuery)
2. Mixed boolean and ranked: you have to define the Filter's contribution to 
the document score (CSQ)
3. Pure ranked: no filters, everything gets scored (the same as 2.)

Ideally, as a user, I would define only a Query (Filter-based or not) and, for 
each clause in my Query, call 
Query.setScored(true/false) or useConstantScore(double score); 
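
To make the idea above concrete, here is a minimal sketch of a per-clause 
scoring switch. Everything in it is hypothetical: ScoringSketch, Clause and 
the Scoring enum are not Lucene classes, documents are plain ints, and only 
the method names setScored and useConstantScore are taken from the text above. 
It covers required (MUST-style) clauses only.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.IntPredicate;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch only: none of these types exist in Lucene.
final class ScoringSketch {
    enum Scoring { SCORED, UNSCORED, CONSTANT }

    /** A clause is a match test plus a policy for how it contributes to the score. */
    static final class Clause {
        private final IntPredicate matcher;
        private final IntToDoubleFunction rawScore;
        private Scoring scoring = Scoring.SCORED;
        private double constant;

        Clause(IntPredicate matcher, IntToDoubleFunction rawScore) {
            this.matcher = matcher;
            this.rawScore = rawScore;
        }

        /** false makes the clause behave like a filter: it restricts, but adds no score. */
        Clause setScored(boolean scored) {
            scoring = scored ? Scoring.SCORED : Scoring.UNSCORED;
            return this;
        }

        /** Every matching document gets the same fixed contribution. */
        Clause useConstantScore(double score) {
            scoring = Scoring.CONSTANT;
            constant = score;
            return this;
        }

        boolean matches(int doc) {
            return matcher.test(doc);
        }

        double contribution(int doc) {
            switch (scoring) {
                case SCORED:   return rawScore.applyAsDouble(doc);
                case CONSTANT: return constant;
                default:       return 0.0; // UNSCORED
            }
        }
    }

    /** Conjunction of required clauses: empty if any clause rejects the document. */
    static OptionalDouble score(List<Clause> required, int doc) {
        double sum = 0.0;
        for (Clause c : required) {
            if (!c.matches(doc)) {
                return OptionalDouble.empty();
            }
            sum += c.contribution(doc);
        }
        return OptionalDouble.of(sum);
    }
}
```

With all clauses unscored this is the pure boolean model, with a mix it is 
option 2, and with all clauses scored it is pure ranking, so the three options 
become settings of one switch rather than separate class hierarchies.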

Also, I should be able to say, "Dear Lucene, please materialize this 
'Query_Filter' for me, as I would like to have it cached, and please store only 
DocIds" (a Filter today). Maybe also leave open the possibility of caching the 
scores of the documents as well. 

The concept is one thing and optimization is another. From the optimization 
point of view, we have a couple of decisions to make:

- Whether the DocId set supports random access, yes or no (my "Materialized 
Query")
- Whether a clause should be scored, should not be scored, or should get a 
constant score

So, for each "Query" we need to decide/support:

- scoring {yes, no, constant}, and 
- an option to "materialize the Query" (that is how we create Filters today)
- these Materialized Queries (aka Filters) should be able to tell us whether 
they support random access and whether they cache only doc ids or scores as 
well
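
A rough sketch of what such a self-describing materialized query could look 
like, and how a caller might use the answers to pick an intersection strategy. 
All names here (MaterializeSketch, MaterializedQuery, BitSetDocs, SortedDocs, 
countBoth) are made up for illustration and are not Lucene API; doc ids are 
plain ints and the lead clause is assumed to deliver its matches in sorted 
order.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.PrimitiveIterator;

// Hypothetical sketch only: none of these types exist in Lucene.
final class MaterializeSketch {
    /** A cached result set that can describe its own capabilities. */
    interface MaterializedQuery {
        boolean supportsRandomAccess();
        boolean cachesScores();
        PrimitiveIterator.OfInt docIds();   // ascending doc ids
        boolean contains(int doc);          // only valid with random access
    }

    /** Bit set backed: random access is cheap, scores are not kept. */
    static final class BitSetDocs implements MaterializedQuery {
        private final BitSet bits = new BitSet();
        BitSetDocs(int... docs) {
            for (int d : docs) bits.set(d);
        }
        public boolean supportsRandomAccess() { return true; }
        public boolean cachesScores() { return false; }
        public PrimitiveIterator.OfInt docIds() { return bits.stream().iterator(); }
        public boolean contains(int doc) { return bits.get(doc); }
    }

    /** Sorted array backed: iteration only, scores are not kept. */
    static final class SortedDocs implements MaterializedQuery {
        private final int[] ids;
        SortedDocs(int... docs) {
            ids = docs.clone();
            Arrays.sort(ids);
        }
        public boolean supportsRandomAccess() { return false; }
        public boolean cachesScores() { return false; }
        public PrimitiveIterator.OfInt docIds() { return Arrays.stream(ids).iterator(); }
        public boolean contains(int doc) {
            throw new UnsupportedOperationException("iteration only");
        }
    }

    /** Counts docs present in both inputs, choosing the strategy from the capability flag. */
    static int countBoth(int[] sortedLead, MaterializedQuery filter) {
        int count = 0;
        if (filter.supportsRandomAccess()) {
            // Probe the filter once per candidate produced by the lead clause.
            for (int doc : sortedLead) {
                if (filter.contains(doc)) count++;
            }
            return count;
        }
        // No random access: walk both sorted sequences in step.
        PrimitiveIterator.OfInt it = filter.docIds();
        int i = 0;
        while (i < sortedLead.length && it.hasNext()) {
            int f = it.nextInt();
            while (i < sortedLead.length && sortedLead[i] < f) i++;
            if (i < sortedLead.length && sortedLead[i] == f) {
                count++;
                i++;
            }
        }
        return count;
    }
}
```

Both implementations return false from cachesScores(); a variant that also 
stored a score per document would fit the same interface.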


Nothing useful in this email, just thinking aloud; sometimes it helps :)






> Merge Query and Filter classes
> ------------------------------
>
>                 Key: LUCENE-1518
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1518
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1518.patch
>
>
> This issue presents a patch, that merges Queries and Filters in a way, that 
> the new Filter class extends Query. This would make it possible, to use every 
> filter as a query.
> The new abstract filter class would contain all methods of 
> ConstantScoreQuery, deprecate ConstantScoreQuery. If somebody implements the 
> Filter's getDocIdSet()/bits() methods he has nothing more to do, he could 
> just use the filter as a normal query.
> I do not want to completely convert Filters to ConstantScoreQueries. The idea 
> is to combine Queries and Filters in such a way, that every Filter can 
> automatically be used at all places where a Query can be used (e.g. also 
> alone a search query without any other constraint). For that, the abstract 
> Query methods must be implemented and return a "default" weight for Filters 
> which is the current ConstantScore Logic. If the filter is used as a real 
> filter (where the API wants a Filter), the getDocIdSet part could be directly 
> used, the weight is useless (as it is currently, too). The constant score 
> default implementation is only used when the Filter is used as a Query (e.g. 
> as direct parameter to Searcher.search()). For the special case of 
> BooleanQueries combining Filters and Queries the idea is, to optimize the 
> BooleanQuery logic in such a way, that it detects if a BooleanClause is a 
> Filter (using instanceof) and then directly uses the Filter API and not take 
> the burden of the ConstantScoreQuery (see LUCENE-1345).
> Here some ideas how to implement Searcher.search() with Query and Filter:
> - User runs Searcher.search() using a Filter as the only parameter. As every 
> Filter is also a ConstantScoreQuery, the query can be executed and returns 
> score 1.0 for all matching documents.
> - User runs Searcher.search() using a Query as the only parameter: No change, 
> all is the same as before
> - User runs Searcher.search() using a BooleanQuery as parameter: If the 
> BooleanQuery does not contain a Query that is subclass of Filter (the new 
> Filter) everything as usual. If the BooleanQuery only contains exactly one 
> Filter and nothing else the Filter is used as a constant score query. If 
> BooleanQuery contains clauses with Queries and Filters the new algorithm 
> could be used: The queries are executed and the results filtered with the 
> filters.
> For the user this has the main advantage: That he can construct his query 
> using a simplified API without thinking about Filters or Queries, you can 
> just combine clauses together. The scorer/weight logic then identifies the 
> cases to use the filter or the query weight API. Just like the query 
> optimizer of a RDB.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


