[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700944#action_12700944 ]

Jason Rutherglen commented on LUCENE-1345:

Even though Paul's patch doesn't pass a test, it sounds like it can still be benchmarked? Part of the goal of the patch is to make certain queries faster? Benchmark results could help with how we approach optimizing LUCENE-1345.

Allow Filter as clause to BooleanQuery

Key: LUCENE-1345
URL: https://issues.apache.org/jira/browse/LUCENE-1345
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Reporter: Paul Elschot
Priority: Minor
Fix For: 2.9
Attachments: booleansetperf.txt, DisjunctionDISI.java, DisjunctionDISI.patch, DisjunctionDISI.patch, LUCENE-1345-Filter+Query-merge.patch, LUCENE-1345.patch, LUCENE-1345.patch, LUCENE-1345.patch, OpenBitSetIteratorExperiment.java, TestIteratorPerf.java, TestIteratorPerf.java

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700961#action_12700961 ]

Paul Elschot commented on LUCENE-1345:

The interesting thing to benchmark is filtered queries. One could do this by adding the filter as a required clause to a BooleanQuery in IndexSearcher, and then checking whether filtered queries are faster with that implementation. This part should work normally with the current patch.

If that turns out to make a real difference, one might also consider deprecating all the Searcher methods that take a Filter argument, and pointing to the preferred alternative, a Filter as a clause to BooleanQuery, in the javadocs.

Now, if I could find the time to get this last bug out of the current patch...
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662907#action_12662907 ]

Paul Elschot commented on LUCENE-1345:

To add a Filter as a clause to a BooleanQuery, I would prefer not to give it a Weight. Instead I'd like the addition of a required Filter to behave exactly like the current Searcher.search(Query, Filter) API.

That also touches another point: backward compatibility with BooleanQuery and Searcher.

It's certainly possible to add scoring behaviour to a Filter when it is added to a BooleanQuery. A default score value could be used, and also a default coordination behaviour.

In principle it is also possible to add a disjunction of Filters to a BooleanQuery, even with a minimum number of required filters. For this case a score value does make sense.

Required Filters and prohibited Filters could be added to a BooleanQuery without scoring behaviour. In fact, for prohibited Queries the score value is never used, so one might even constrain prohibited clauses to be Filters only.

Most, if not all, of the scoring behaviour for Filters that has been discussed so far can be obtained by wrapping a Filter in a ConstantScoreQuery and adding that to a BooleanQuery. So I think it would be cleaner to keep the scoring yes/no distinction between Queries and Filters.

If a simplified interface is desired, it could use any of the available options, for example always wrapping a Filter in a ConstantScoreQuery and then composing a BooleanQuery only from Query clauses.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662984#action_12662984 ]

Earwin Burrfoot commented on LUCENE-1345:

What about a complete merge of filters/queries, and deciding whether to score/use constant score/don't score when adding a query to BooleanQuery (or an AND/OR/NOT alternative)?

Something along the lines of:

boolQuery.add(new TermQuery(..), SHOULD, NO_SCORE)
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662993#action_12662993 ]

Paul Elschot commented on LUCENE-1345:

This:

bq. boolQuery.add(new TermQuery(..), SHOULD, NO_SCORE)

can be done (with the patch here applied) by:

boolQuery.add(new QueryWrapperFilter(new TermQuery(..)), SHOULD)

I'll post a working version of the patch within a few days. It's better to discuss working code than ideas only.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662998#action_12662998 ]

Marvin Humphrey commented on LUCENE-1345:

"(SHOULD cannot be used for filters as clauses)."

It doesn't have to be that way. In KS, QueryFilter is a Query, which you can add as a clause to an ORQuery or a RequiredOptionalQuery. Docs which match only the QueryFilter are fed to the HitCollector with a score of 0.0.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663021#action_12663021 ]

Marvin Humphrey commented on LUCENE-1345:

Uwe Schindler:

"Maybe I should create a new JIRA issue out of my suggestion to merge Filters and Queries? In my opinion, this is something nice to have in 3.0."

I agree with this tack, having taken it in KS. However, I don't think we have consensus yet on the best approach, so perhaps it would be beneficial to hash things out on the mailing list first.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663055#action_12663055 ]

Doug Cutting commented on LUCENE-1345:

Uwe:

"Maybe I should create a new JIRA issue out of my suggestion to merge Filters and Queries?"

+1 to creating a new issue and +1 to the idea.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663070#action_12663070 ]

Uwe Schindler commented on LUCENE-1345:

I created and linked a new issue, LUCENE-1518, that handles the merge suggestion. I also included all of my relevant comments there.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662766#action_12663075 ]

Paul Elschot commented on LUCENE-1345:

Ok, I'll wait for LUCENE-1518.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662766#action_12662766 ]

Uwe Schindler commented on LUCENE-1345:

Here is a nice idea for how to merge Filters and Queries:

Why not just combine ConstantScoreQuery and the current abstract Filter API into a new Filter class? This would make it possible to use every filter as a query. The new abstract Filter class would contain all methods of ConstantScoreQuery, and it would even be backwards compatible. Somebody who implements the filter's getDocIdSet()/bits() methods has nothing more to do; he can just use the filter as a normal query.

For some performance improvements when combining more than one filter in a BooleanQuery (e.g. ANDing/ORing the iterators, filtering, ...), the code of BooleanQuery could use instanceof.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662782#action_12662782 ]

Paul Elschot commented on LUCENE-1345:

Uwe,

The point here is to let BooleanQuery also take care of the filtering logic without doing any extra score computations. For example, that involves changing ConjunctionScorer to accept not only Scorers but also DocIdSetIterators, to use these DocIdSetIterators together with the Scorers to skip to the next matching document, and to use only the Scorers to compute the score value.

What is the point of adding a score value to Filters, when that score value has to be ignored during query search?
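The conjunction described in this comment can be sketched in isolation. What follows is a minimal, self-contained Java sketch and not the patch itself: `DocIterator`, `ToyScorer` and `search` are hypothetical stand-ins for DocIdSetIterator, Scorer and ConjunctionScorer, and the requirement of at least one scoring clause is an assumption of the sketch. Every clause takes part in skipping to the next common document, but only the scorers contribute to the score.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy types; none of these are Lucene classes.
final class FilteredConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Iterates over sorted matching doc ids; knows nothing about scores. */
    static class DocIterator {
        private final int[] docs;
        private int pos = 0;

        DocIterator(int... sortedDocs) { this.docs = sortedDocs; }

        /** Moves to the first doc id >= target and returns it, or NO_MORE_DOCS. */
        int advance(int target) {
            while (pos < docs.length && docs[pos] < target) pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    /** A doc iterator that can also score the doc it is positioned on. */
    static final class ToyScorer extends DocIterator {
        private final float score;

        ToyScorer(float score, int... sortedDocs) { super(sortedDocs); this.score = score; }

        float score() { return score; }
    }

    /** Returns doc id -> summed score for docs matched by every scorer and every filter. */
    static Map<Integer, Float> search(List<ToyScorer> scorers, List<DocIterator> filters) {
        if (scorers.isEmpty()) {
            // Assumption of this sketch: at least one clause must provide a score.
            throw new IllegalArgumentException("need at least one scoring clause");
        }
        List<DocIterator> all = new ArrayList<>(scorers);
        all.addAll(filters);
        Map<Integer, Float> hits = new LinkedHashMap<>();
        int candidate = 0;
        while (true) {
            boolean agreed;
            do { // leapfrog: scorers and filters alike push the candidate forward
                agreed = true;
                for (DocIterator it : all) {
                    int doc = it.advance(candidate);
                    if (doc > candidate) { candidate = doc; agreed = false; }
                }
            } while (!agreed && candidate != NO_MORE_DOCS);
            if (candidate == NO_MORE_DOCS) return hits;
            float sum = 0f;
            for (ToyScorer s : scorers) sum += s.score(); // filters add nothing here
            hits.put(candidate, sum);
            candidate++;
        }
    }

    public static void main(String[] args) {
        List<ToyScorer> scorers = new ArrayList<>();
        scorers.add(new ToyScorer(2.0f, 1, 3, 5, 7));
        scorers.add(new ToyScorer(0.5f, 3, 4, 5, 7));
        List<DocIterator> filters = new ArrayList<>();
        filters.add(new DocIterator(3, 7, 9));
        System.out.println(search(scorers, filters)); // prints {3=2.5, 7=2.5}
    }
}
```

Document 5 matches both scorers but is rejected by the filter, and the scores of documents 3 and 7 are the same as they would be without the filter clause.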
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662783#action_12662783 ]

Uwe Schindler commented on LUCENE-1345:

The idea behind the patch was to merge the code of filters and queries. Further optimizations can now remove the score calculation from the filter code. With my patch you are now able to add filters to BooleanQueries or to execute them directly using Searcher.search, because they are subclasses of Query. Further optimizations may now remove the score computation completely if the given query extends Filter (if (query instanceof Filter) do something else).
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662786#action_12662786 ]

Paul Elschot commented on LUCENE-1345:

bq. Further optimizations may now remove the score computation completely if the given query extends Filter (if (query instanceof Filter) do something else)

Such further optimization is precisely the idea of the original patch here, but without making Filter a subclass of Query.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662790#action_12662790 ]

Uwe Schindler commented on LUCENE-1345:

I know this. My idea was just to take the burden of thinking about Filters and Queries away from the developer of Lucene applications. In my opinion, the terms Query and Filter should be merged. The logic behind BooleanQuery or Searcher should simply work out the best way to optimize what the user wants to do.

Maybe I should create a new JIRA issue out of my suggestion to merge Filters and Queries? In my opinion, this is something nice to have in 3.0.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662795#action_12662795 ]

Paul Elschot commented on LUCENE-1345:

bq. In my opinion, the terms Query and Filter should be merged.

There is a clear distinction between the two terms. QueryWrapperFilter changes a Query into a Filter and ConstantScoreQuery changes a Filter into a Query. The first one removes the scoring by upcasting a Scorer to a DocIdSetIterator, and the second one adds a constant score to a DocIdSetIterator to create a Scorer.
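The two conversions described here can be illustrated with toy types. This is a minimal sketch under stated assumptions, not Lucene code: `DocIterator` and `ToyScorer` are hypothetical stand-ins for DocIdSetIterator and Scorer, and `dropScores`, `withConstantScore` and `over` are names invented for the illustration. One direction is a plain upcast that forgets the score; the other wraps an iterator and reports the same constant for every document.

```java
// Hypothetical toy types; none of these are Lucene classes.
final class QueryFilterBridgeSketch {
    /** Stand-in for a doc id iterator: matches only. */
    interface DocIterator {
        boolean next();
        int doc();
    }

    /** Stand-in for a scorer: an iterator that can also score the current doc. */
    interface ToyScorer extends DocIterator {
        float score();
    }

    /** Query to Filter direction: the scorer is simply viewed as an iterator (an upcast). */
    static DocIterator dropScores(ToyScorer scorer) {
        return scorer;
    }

    /** Filter to Query direction: every match gets the same constant score. */
    static ToyScorer withConstantScore(final DocIterator it, final float constant) {
        return new ToyScorer() {
            public boolean next() { return it.next(); }
            public int doc() { return it.doc(); }
            public float score() { return constant; }
        };
    }

    /** Helper for the demo: iterates over a fixed, sorted list of doc ids. */
    static DocIterator over(final int... docs) {
        return new DocIterator() {
            private int pos = -1;
            public boolean next() { return ++pos < docs.length; }
            public int doc() { return docs[pos]; }
        };
    }

    public static void main(String[] args) {
        ToyScorer scorer = withConstantScore(over(2, 5, 8), 1.0f);
        while (scorer.next()) {
            System.out.println(scorer.doc() + " scored " + scorer.score());
        }
    }
}
```

No work is done in `dropScores`: the score is lost only because the narrower type offers no way to ask for it.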
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662796#action_12662796 ]

Uwe Schindler commented on LUCENE-1345:

{quote}
bq. In my opinion, the terms Query and Filter should be merged.

There is a clear distinction between the two terms. QueryWrapperFilter changes a Query into a Filter and ConstantScoreQuery changes a Filter into a Query. The first one removes the scoring by upcasting a Scorer to a DocIdSetIterator, and the second one adds a constant score to a DocIdSetIterator to create a Scorer.
{quote}

You are right, but for a Lucene user the distinction between the two terms is always a problem. If both were combined, the user would have less to think about. It would make life easier and hide some work from the user. The catch is the fine differences between the two, but for the general user whose indexes are not large enough for the difference to matter, it would make things easier.

How about merging Filters and Queries, and then thinking about optimizations in the code of BooleanQuery to identify the use cases where the scoring can be removed and where a constant score is needed?

There are two cases where the two different types cause problems:

- User (A) wants to use my contrib TrieRangeQuery/-Filter and just execute a Query that returns the documents that match the range. The problem for this user is how to implement this: use a MatchAllDocsQuery and filter the results with TrieRangeFilter, or use ConstantScoreQuery to combine both? Which is faster?
- User (B) wants to filter some documents using a normal Filter. If he uses the standard Query+Filter combination of Searcher.search(), he must first decide which part of the combination should be the filter and which should be the query. Maybe he has a TrieRangeQuery (the query variant, using a ConstantScore on the Filter) as the query and wants to combine it with another query. With the new code that detects the type of both clauses, BooleanQuery could choose to execute the TermQueries as normal scoring queries and filter the results using the Filter given as a clause.

Both tasks could easily be combined if Query and Filter were the same. User (A) would not need to create a constant score query on the Trie filter; he could just use it with Searcher.search() as a Query. If he wants to add some normal term queries from a query parser to it, he would use a BooleanQuery to combine both. The BooleanQuery code would then find out that one of the clauses is a Filter, and would *not* use the ConstantScore code to filter the result but just the normal filter code.

For the user it is simpler: he would always create a TrieRangeQueryFilter combination and let BooleanQuery choose which query execution strategy to use.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662797#action_12662797 ]

Uwe Schindler commented on LUCENE-1345:

An additional case: user (A) uses a BooleanQuery and adds just the Filter to it and nothing more (no TermQueries and so on). In this case, the ConstantScore algorithm must be used! But for the end user the API is always identical.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662799#action_12662799 ]

Uwe Schindler commented on LUCENE-1345:

Here are some ideas for how to implement search() with Query and Filter:

- User runs Searcher.search() using a Filter as the only parameter. As every Filter is also a ConstantScoreQuery, the query can be executed and returns score 1.0 for all matching documents.
- User runs Searcher.search() using a Query as the only parameter: no change, everything is the same as before.
- User runs Searcher.search() using a BooleanQuery as parameter: if the BooleanQuery does not contain a Query that is a subclass of Filter (the new Filter), everything works as usual. If the BooleanQuery contains exactly one Filter and nothing else, the Filter is used as a constant score query. If the BooleanQuery contains clauses with both Queries and Filters, the new algorithm could be used: the queries are executed and the results are filtered with the filters.

I hope this explains how I would implement the combined Filters and Queries.
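The BooleanQuery cases listed above amount to a small dispatch on the kinds of clauses present. The sketch below only illustrates that decision and is not taken from any patch: `ClauseKind`, `Strategy` and `choose` are hypothetical names. The comment only specifies the case of exactly one Filter and nothing else; treating several Filters without any Query the same way is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; this only models the decision, not the search itself.
final class ClauseStrategySketch {
    enum ClauseKind { QUERY, FILTER }

    enum Strategy { NORMAL_SCORING, CONSTANT_SCORE, SCORE_THEN_FILTER }

    static Strategy choose(List<ClauseKind> clauses) {
        if (clauses.isEmpty()) {
            throw new IllegalArgumentException("no clauses");
        }
        boolean hasQuery = clauses.contains(ClauseKind.QUERY);
        boolean hasFilter = clauses.contains(ClauseKind.FILTER);
        if (!hasFilter) {
            return Strategy.NORMAL_SCORING; // everything as usual
        }
        if (!hasQuery) {
            // Assumption: several filters without a query are handled like a single one.
            return Strategy.CONSTANT_SCORE;
        }
        return Strategy.SCORE_THEN_FILTER; // score with the queries, restrict with the filters
    }

    public static void main(String[] args) {
        System.out.println(choose(Arrays.asList(ClauseKind.QUERY, ClauseKind.QUERY)));
        System.out.println(choose(Arrays.asList(ClauseKind.FILTER)));
        System.out.println(choose(Arrays.asList(ClauseKind.QUERY, ClauseKind.FILTER)));
    }
}
```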
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662814#action_12662814 ]

Paul Elschot commented on LUCENE-1345:

If there is no score value for each matching document, it is not possible to order the results to be presented to a user. Because of that, I don't want BooleanQuery to run correctly without at least one normal query clause to provide a score to order the results.

Putting this another way: TopDocs (and the meanwhile deprecated Hits) make no sense without at least one normal query clause.

There are cases when all results are needed, and in that case one normally uses the HitCollector API for a Query. For Filters one could provide a MatchCollector API that collects all hits, providing only the document numbers and no score (unlike HitCollector, which also receives a score). This was part of the earlier versions of the patches at LUCENE-584 that introduced the new Filter API, but it was dropped because it was new functionality that was not really needed at the time.

The same MatchCollector API can also be provided for Query searching. At that point, there is no difference between Query and Filter.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662820#action_12662820 ]

Uwe Schindler commented on LUCENE-1345:

But where is the problem then? You only mean that BooleanQueries with only a Filter clause are not sortable. In my opinion this is not a real problem: I do, for example, use ConstantScoreQuery with a TrieRangeFilter as the only query constraint. All results are returned with score=1, which is similar to a classic SQL database. The default order of equal scores is index order, so it makes sense for TopDocs. And if you additionally add a SortField it makes even more sense :)

So why not implement the three cases as described in my last message and use a ConstantScoreQuery if the Filter is alone in the BooleanQuery? In this case all options can be used with a simple API, with no difference between Filters and Queries.

The MatchCollector API is, in my opinion, also not needed. In the case of a constant score (=1.0), why not simply call collect(docId, 1.0f)? Why do we need a new API just because the score is not needed? Just define 1.0 as the score (like ConstantScoreQuery does).
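The two collector shapes under discussion, and the way each can be adapted to the other, can be sketched as follows. This is an illustration only: `ToyHitCollector` mirrors the collect(doc, score) callback mentioned in the thread, while `ToyMatchCollector`, `ignoringScores` and `withScoreOne` are hypothetical names; the MatchCollector API from the early LUCENE-584 patches is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy interfaces; none of these are Lucene classes.
final class CollectorSketch {
    /** Callback that receives a doc id together with its score. */
    interface ToyHitCollector {
        void collect(int doc, float score);
    }

    /** Score-free callback, in the spirit of the proposed MatchCollector. */
    interface ToyMatchCollector {
        void collect(int doc);
    }

    /** A scored search can feed a score-free collector by dropping the score. */
    static ToyHitCollector ignoringScores(final ToyMatchCollector matches) {
        return (doc, score) -> matches.collect(doc);
    }

    /** A score-free search can feed a scored collector by reporting a constant 1.0. */
    static ToyMatchCollector withScoreOne(final ToyHitCollector hits) {
        return doc -> hits.collect(doc, 1.0f);
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        ToyMatchCollector matches = withScoreOne((doc, score) -> seen.add(doc + ":" + score));
        matches.collect(4);
        matches.collect(9);
        System.out.println(seen); // prints [4:1.0, 9:1.0]
    }
}
```

Either adapter is a one-liner, which is the substance of the disagreement: a separate score-free API saves passing a dummy value, while a constant score keeps a single collector interface.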
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662821#action_12662821 ] Marvin Humphrey commented on LUCENE-1345: -

Paul Elschot:

bq. To prepare searching in Lucene the following 'transformations' are done: Query -> Weight -> Scorer and Filter -> DocIdSetIterator

bq. I've never seen the KS classes

KS's Search classes used to be pretty direct ports from the Lucene Search hierarchy -- because when I was doing the work I had so much trouble grokking it that I felt there was no choice but to cargo cult. :) Since then, many changes have been made. Here are some that are germane to this discussion:

* Filter has been eliminated, and filtering subclasses have been made into subclasses of Query.
* Weight has been made a subclass of Query and renamed to Compiler.
* BooleanQuery has been replaced by the foursome of ANDQuery, ORQuery, NOTQuery, and RequiredOptionalQuery, all of which descend from the common parent PolyQuery, and which compile to scorers roughly akin to those from Lucene's BooleanScorer2 (i.e. they implement Scorer.skipTo()).

By making both Filter and Weight/Compiler subclasses of Query, it became possible to simplify the Searchable/Searcher interface considerably.

bq. the downside of using ANDQuery (KS) for filtering is that it has to provide a score value, which somehow must be ignored during search.

Good point, thanks! Right now, QueryFilter, NOTQuery, MatchAllQuery, and such all just provide a score of 0.0. ANDScorer adds the scores of its subscorers together, so there's no direct effect on the final score. However, the Similarity.coord() bonus is affected because the number of clauses has increased. That might be considered a bug. Beyond that, there's Scorer-compile-time optimization work to do along the lines of what Uwe proposes.
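The coord effect mentioned above can be illustrated with a little arithmetic. The sketch below assumes a Lucene-style boolean query in which the summed clause scores are multiplied by overlap / maxOverlap, the ratio that Lucene's DefaultSimilarity.coord() returns; whether KS applies the factor in exactly this way is an assumption, and CoordDemo and its methods are hypothetical names.

```java
final class CoordDemo {
    /** overlap / maxOverlap, the ratio Lucene's DefaultSimilarity.coord() returns. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Sums the scores of the matching clauses and applies the coord factor. */
    static float score(float[] matchingScores, int totalClauses) {
        float sum = 0f;
        for (float s : matchingScores) {
            sum += s;
        }
        return sum * coord(matchingScores.length, totalClauses);
    }

    public static void main(String[] args) {
        // Two scoring clauses, of which one matches with score 0.6.
        float withoutFilter = score(new float[] {0.6f}, 2);
        // The same query plus a filtering clause that matches and scores 0.0.
        float withFilter = score(new float[] {0.6f, 0.0f}, 3);
        System.out.println(withoutFilter + " vs " + withFilter);
    }
}
```

The filtering clause adds nothing to the sum, yet the factor moves from 1/2 to 2/3, so the final score changes even though the filter contributes a score of 0.0.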
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662823#action_12662823 ] Uwe Schindler commented on LUCENE-1345: --- Paul Elschot: Just for clarification: I do not want to completely convert Filters to ConstantScoreQueries. The idea was to combine Queries and Filters in such a way that every Filter can automatically be used in all places where a Query can be used (e.g. also alone, as a search query without any other constraint). For that, the abstract Query methods must be implemented and return a default weight for Filters, which is the current ConstantScore logic. If the filter is used as a real filter (where the API wants a Filter), the getDocIdSet part could be used directly; the weight is useless (as it is currently, too). The constant score default implementation is only used when the Filter is used as a Query (e.g. as direct parameter to Searcher.search()). For the special case of BooleanQueries combining Filters and Queries, the idea is to optimize the BooleanQuery logic in such a way that it detects if a BooleanClause is a Filter (using instanceof) and then directly uses the Filter API, without taking on the burden of the ConstantScoreQuery. There is only one special case: if the Filter is used alone in the BooleanQuery, then it must be executed as a ConstantScoreQuery, but only in this case. The problems with sorting are in my opinion not relevant: if the score is identical (e.g. 1.0f), the results come in index order (this is how it appears to me). In this case TopDocs lists the docs with lower docIds first. For the user this has the main advantage that he can construct his query using a simplified API without thinking about Filters or Queries; you can just combine clauses together. The scorer/weight logic then identifies the cases in which to use the filter or the query weight API. Just like the query optimizer of an RDB :-) Is this clear now?
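The dispatch described above might look roughly like the following sketch. All class names here (SketchQuery, SketchFilter, ClausePlanner) are hypothetical and only model the decision, not Lucene's BooleanQuery. The sketch also assumes that several filters without any scoring clause are treated like a filter that is alone, which goes slightly beyond what the comment states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical clause type; in this sketch a filter is just a special kind of query. */
class SketchQuery {
    final String name;

    SketchQuery(String name) {
        this.name = name;
    }
}

/** Hypothetical filter clause that can produce doc ids but has no score of its own. */
class SketchFilter extends SketchQuery {
    SketchFilter(String name) {
        super(name);
    }
}

final class ClausePlanner {
    /** Describes, per clause, which execution path would be chosen. */
    static List<String> plan(List<SketchQuery> clauses) {
        boolean hasScoringClause = false;
        for (SketchQuery c : clauses) {
            if (!(c instanceof SketchFilter)) {
                hasScoringClause = true;
            }
        }
        List<String> plan = new ArrayList<String>();
        for (SketchQuery c : clauses) {
            if (!(c instanceof SketchFilter)) {
                plan.add(c.name + ": scorer");
            } else if (hasScoringClause) {
                // Some other clause supplies the score, so the filter only restricts the doc ids.
                plan.add(c.name + ": doc-id iterator, no score");
            } else {
                // No scoring clause at all: fall back to a constant score.
                plan.add(c.name + ": constant score 1.0");
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        System.out.println(plan(Arrays.<SketchQuery>asList(
                new SketchQuery("title:lucene"), new SketchFilter("date-range"))));
        System.out.println(plan(Arrays.<SketchQuery>asList(new SketchFilter("date-range"))));
    }
}
```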
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662656#action_12662656 ] Marvin Humphrey commented on LUCENE-1345: - Paul Elschot, a while back: It would also allow to get rid of Filter in most of the search api, as any Filter can just be added to a BooleanQuery. In KS svn trunk (and potentially in Lucy), there is no Filter; all classes that perform filtering are just subclasses of Query which you're expected to apply using an ANDQuery. Can you think of any downside to that model? (Would it be possible to retrofit Lucene to use it in 3.0?) The motivation was the same as the one you articulate: to simplify the search API. (Hmm...Thinking out loud: DeletionsFilter as a subclass of Query...)
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662657#action_12662657 ] Michael McCandless commented on LUCENE-1345: {quote} (Hmm...Thinking out loud: DeletionsFilter as a subclass of Query...) {quote} +1
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662673#action_12662673 ] John Wang commented on LUCENE-1345: --- Filters by definition (afaik) do not participate in scoring. Since score gathering is done at the BooleanQuery level, does it mean BooleanQuery would need to do an instanceof check to see if a clause is a Filter? Or do we always hardcode filters with score 0? This is also dangerous if people augment scores at the HitCollector level, or if the score gathering logic changes to something not as straightforward as summing. My two cents.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662685#action_12662685 ] Paul Elschot commented on LUCENE-1345: --

Marvin,

bq. In KS svn trunk (and potentially in Lucy), there is no Filter; all classes that perform filtering are just subclasses of Query which you're expected to apply using an ANDQuery. Can you think of any downside to that model?

In Lucene the class model is that Scorer extends DocIdSetIterator by some methods involved with document score values. To prepare searching in Lucene the following 'transformations' are done: Query -> Weight -> Scorer and Filter -> DocIdSetIterator

I've never seen the KS classes, but on the face of it, the downside of using ANDQuery (KS) for filtering is that it has to provide a score value, which somehow must be ignored during search.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662687#action_12662687 ] Paul Elschot commented on LUCENE-1345: --

John, Michael,

bq. Given the perf number improvements we see, can we consider up the priority?

I think most of the performance improvements that John posted can be moved into trunk without the addition of Filter as a clause to BooleanQuery, so I'd rather let these go first.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662627#action_12662627 ] John Wang commented on LUCENE-1345: ---

Added perf comparisons of boolean set iterators with the current scorers. See patch.

System: Ubuntu, java version 1.6.0_11, Intel Core2 Duo 2.44 GHz

Run times in milliseconds (10 runs each):

new                    470  534  450  443  444  445  449  441  444  445   total 4565
old                    529  491  428  549  427  424  420  424  423  422   total 4537
New/Old Time 4565/4537 (100.61715%)

OrDocIdSetIterator    1138 1106 1065 1066 1065 1067 1072 1118 1065 1069   total 10831
DisjunctionMaxScorer  1914 1981 1861 1893 1886 1885 1887 1889 1891 1888   total 18975
Or/DisjunctionMax Time 10831/18975 (57.080368%)

OrDocIdSetIterator    1079 1075 1076 1093 1077 1074 1078 1075 1074 1074   total 10775
DisjunctionSumScorer  1398 1322 1320 1305 1304 1301 1304 1300 1301 1317   total 13172
Or/DisjunctionSum Time 10775/13172 (81.80231%)

AndDocIdSetIterator    330  336  298  299  310  298  298  334  298  299   total 3100
ConjunctionScorer      332  307  302  350  300  304  305  303  303  299   total 3105
And/Conjunction Time 3100/3105 (99.83897%)
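For readers who want to recompute the percentages in the message above, the arithmetic is just one total divided by the other. The small sketch below uses the reported totals; RatioDemo and percentOf are hypothetical names, and it uses double arithmetic, so the last digits may differ slightly from the figures in the message.

```java
final class RatioDemo {
    /** Returns the first time as a percentage of the second. */
    static double percentOf(long firstMs, long secondMs) {
        return 100.0 * firstMs / secondMs;
    }

    public static void main(String[] args) {
        // Totals in milliseconds, taken from the message above.
        System.out.println(String.format("New/Old            %.2f%%", percentOf(4565, 4537)));
        System.out.println(String.format("Or/DisjunctionMax  %.2f%%", percentOf(10831, 18975)));
        System.out.println(String.format("Or/DisjunctionSum  %.2f%%", percentOf(10775, 13172)));
        System.out.println(String.format("And/Conjunction    %.2f%%", percentOf(3100, 3105)));
    }
}
```

A value below 100% means the first variant needed less time; on these totals only the two Or comparisons show a clear difference, while New/Old and And/Conjunction are within about 1% of each other.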
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12662632#action_12662632 ] John Wang commented on LUCENE-1345: --- Given the perf number improvements we see, can we consider up the priority?
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12618513#action_12618513 ] Yonik Seeley commented on LUCENE-1345: --

Eks, I just tried your first TestIteratorPerf.java myself, and comparison with zero was faster (as expected) by about 8%. I commented everything out except for testNew for simplicity.

Original testNew:
{code}
$ java -server -cp . TestIteratorPerf
new milliseconds=2883
new milliseconds=3289
new milliseconds=3148
new milliseconds=3195
new milliseconds=3149
new milliseconds=3179
new milliseconds=3180
new milliseconds=3164
new milliseconds=3179
new milliseconds=3164
new total milliseconds=31530
{code}

Modified testNew:
// while(-1!=(doc=it.next())){
while((doc=it.next()) >= 0)
{code}
$ java -server -cp . TestIteratorPerf
new milliseconds=2806
new milliseconds=2899
new milliseconds=2915
new milliseconds=2899
new milliseconds=2914
new milliseconds=2899
new milliseconds=2899
new milliseconds=3040
new milliseconds=2899
new milliseconds=2930
new total milliseconds=29100
{code}

System: WinXP, Pentium4, java version 1.5.0_11
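The two loop forms compared above can be put side by side in a small self-contained sketch. It is not the TestIteratorPerf code from the attachment: SentinelLoopDemo and ArrayDocIterator are hypothetical, and the sketch only shows that both conditions visit the same documents as long as valid doc ids are never negative. It makes no claim about which form is faster.

```java
final class SentinelLoopDemo {
    /** Hypothetical iterator over ascending doc ids; next() returns -1 when exhausted. */
    static final class ArrayDocIterator {
        private final int[] docs;
        private int pos = 0;

        ArrayDocIterator(int[] docs) {
            this.docs = docs;
        }

        int next() {
            return pos < docs.length ? docs[pos++] : -1;
        }
    }

    /** Original form: loop until the sentinel value -1 is returned. */
    static long sumUntilMinusOne(int[] docs) {
        ArrayDocIterator it = new ArrayDocIterator(docs);
        long sum = 0;
        int doc;
        while (-1 != (doc = it.next())) {
            sum += doc;
        }
        return sum;
    }

    /** Modified form: loop while the returned doc id is non-negative. */
    static long sumWhileNonNegative(int[] docs) {
        ArrayDocIterator it = new ArrayDocIterator(docs);
        long sum = 0;
        int doc;
        while ((doc = it.next()) >= 0) {
            sum += doc;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] docs = {2, 5, 9};
        System.out.println(sumUntilMinusOne(docs) + " " + sumWhileNonNegative(docs));
    }
}
```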
Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
then we conclude, comparison with 0 is faster :) Maybe something on my XP machine was running in the background that I had not noticed, stealing cycles; on Windows this cannot be easily controlled. Or, when I tested it the other day, I used comparison with -1: while((doc=it.next()) > -1). Could that make any difference? Looks like it! I just read mails here. Wow, I can dump asm now, easily! This is fun... I will have to dig out my old x86 references; I must admit I am very, very rusty on CPU development in the past years (10+)... I used to be cool a long, long time ago :) Only good news today: I learned something from you, I can dump asm from hotspot, we have Fieldable solved, ... great, I can go to sleep now :) - Original Message From: Yonik Seeley (JIRA) [EMAIL PROTECTED] To: java-dev@lucene.apache.org Sent: Wednesday, 30 July, 2008 11:25:31 PM Subject: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
as a matter of fact, you can: keeping literals on the left-hand side prevents some ugly accidental assignments, so at the end of the day you have more time to speed things up instead of chasing bugs :) cheers Hoss, good to see you are following this

- Original Message From: Chris Hostetter [EMAIL PROTECTED] To: java-dev@lucene.apache.org Sent: Tuesday, 29 July, 2008 3:13:12 AM Subject: Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery

: Eks: just for grins, you can sometimes save a single cycle by changing
: id==-1 to id<0 (many operations on x86 automatically set status

can you save anymore if you use 0>id ? :)

-Hoss
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617726#action_12617726 ] Eks Dev commented on LUCENE-1345: - Yonik, this would probably work fine for int values (on my CPU), I have tried it on long values and this was significantly slower on this test... it boils down again to what is the CPU we are optimizing for :)
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617822#action_12617822 ] Yonik Seeley commented on LUCENE-1345: -- bq. I have tried it on long values and this was significantly slower on this test. Huh... I bet it's the test. It's probably so simple that everything is inlined and the comparison with -1 is being optimized away entirely (since a compare instruction is the same speed... doesn't matter if one is checking for equality or for less).
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617836#action_12617836 ] Eks Dev commented on LUCENE-1345: - bq. comparison with -1 is being optimized away entirely I do not think so: how could the compiler optimize away the only condition that stops the loop? The loop would never finish, or am I misreading something here? Anyhow, the test is so simple that the compiler can take a completely different direction from the real case. I guess a much better test (without too much effort!) would be to take something like OpenBitSetIterator, make one Iterator implementation with the sentinel approach, and then compare... this test is really just a dumb loop, but on the other hand it isolates the difference between the two approaches...
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617603#action_12617603 ] Eks Dev commented on LUCENE-1345: -

great! I will look into it in more detail at the weekend. I have moved this part to the constructor in my local copy, and it passes all tests:

+if (disiDocQueue == null) {
+  initDisiDocQueue();
+}

It is in next() and skipTo(), practically the same as reported in https://issues.apache.org/jira/browse/LUCENE-1145; with this, 1145 can be closed
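The change described above, building the queue once in the constructor instead of checking for null on every next() and skipTo() call, can be sketched as follows. This is not the DisjunctionDISI patch: EagerInitDisjunction and Cursor are hypothetical, and a java.util.PriorityQueue over plain int arrays stands in for the disiDocQueue of sub-iterators.

```java
import java.util.PriorityQueue;

final class EagerInitDisjunction {
    /** Cursor over one ascending doc-id array (hypothetical stand-in for a sub-iterator). */
    private static final class Cursor implements Comparable<Cursor> {
        final int[] docs;
        int pos = 0;

        Cursor(int[] docs) {
            this.docs = docs;
        }

        int doc() {
            return docs[pos];
        }

        public int compareTo(Cursor other) {
            return Integer.compare(doc(), other.doc());
        }
    }

    private final PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>();
    private int current = -1;

    /** The queue is filled here, once, so next() needs no null check. */
    EagerInitDisjunction(int[]... subDocs) {
        for (int[] docs : subDocs) {
            if (docs.length > 0) {
                queue.add(new Cursor(docs));
            }
        }
    }

    /** Returns the next doc id matched by any sub-iterator, or -1 when exhausted. */
    int next() {
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            int doc = top.doc();
            if (++top.pos < top.docs.length) {
                queue.add(top); // re-insert with its new current doc id
            }
            if (doc != current) {
                current = doc;
                return doc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        EagerInitDisjunction d = new EagerInitDisjunction(new int[] {1, 4}, new int[] {2, 4, 7});
        int doc;
        while ((doc = d.next()) >= 0) {
            System.out.println(doc);
        }
    }
}
```

The trade-off of eager initialization is that the setup cost is paid even if the iterator is never advanced; in exchange, the per-call branch disappears from the hot loop.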
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617604#action_12617604 ] Paul Elschot commented on LUCENE-1345: -- 20090729 is the date here, the attachment is dated 20080728, never mind. As to the sentinel for doc()/next() in the TestIteratorPerf patch: this will need some real Scorers/DocIdSetIterators to see actual JIT compiler inlining in both cases. In the patch, the Old and New classes are local private classes, which are much easier to inline than separate (non-final) public classes.
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617606#action_12617606 ] Paul Elschot commented on LUCENE-1345: -- Indeed, it makes sense to add the changes from LUCENE-1145 here. I remembered some discussion about this, but not that there was an issue open...
Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
from what I can tell, this just makes it harder for the new approach, but you never know before you try it in production ... I just wanted to see if it could lead anywhere before spending real time on it - Original Message From: Paul Elschot (JIRA) [EMAIL PROTECTED] To: java-dev@lucene.apache.org Sent: Tuesday, 29 July, 2008 12:44:31 AM Subject: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617631#action_12617631 ] Yonik Seeley commented on LUCENE-1345: -- Eks: just for grins, you can sometimes save a single cycle by changing id==-1 to id<0 (many operations on x86 automatically set status flags, hence comparison to zero can often be free). Not sure if the Java optimizer will catch it though, and, if it does, whether it would actually rise above the noise level. Allow Filter as clause to BooleanQuery -- Key: LUCENE-1345 URL: https://issues.apache.org/jira/browse/LUCENE-1345 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Paul Elschot Priority: Minor Attachments: DisjunctionDISI.patch, DisjunctionDISI.patch, LUCENE-1345.patch, LUCENE-1345.patch, TestIteratorPerf.java
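The idea of replacing an `id == -1` test with `id < 0` can be sketched as follows. This assumes that valid doc ids are never negative and that -1 is the only negative value the iterator returns, so the two exit tests are interchangeable. The class and method names are hypothetical, and whether the JIT emits different machine code for the two loops is not something this sketch verifies.

```java
/** Hypothetical sketch of two equivalent sentinel checks; names are illustrative only. */
final class SentinelCompareSketch {

    /** Returns the ids in order, then -1 once exhausted. */
    private static final class ArrayDocIterator {
        private final int[] docs;
        private int pos = 0;

        ArrayDocIterator(int[] docs) { this.docs = docs; }

        int nextDoc() { return pos < docs.length ? docs[pos++] : -1; }
    }

    /** Stops on an exact match with the sentinel value. */
    static int countWithEquals(int[] docs) {
        ArrayDocIterator it = new ArrayDocIterator(docs);
        int count = 0;
        while (true) {
            int id = it.nextDoc();
            if (id == -1) break;   // compare against the constant -1
            count++;
        }
        return count;
    }

    /** Stops on any negative value, which is a comparison against zero. */
    static int countWithSignTest(int[] docs) {
        ArrayDocIterator it = new ArrayDocIterator(docs);
        int count = 0;
        while (true) {
            int id = it.nextDoc();
            if (id < 0) break;     // sign test; 0 is still a valid doc id
            count++;
        }
        return count;
    }
}
```

The sign test is only safe while no other negative value can appear; if the sentinel were ever changed to a large positive constant, only the equality form would carry over unchanged.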
Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
: Eks: just for grins, you can sometimes save a single cycle by changing : id==-1 to id<0 (many operations on x86 automatically set status Can you save any more if you use 0>id? :) -Hoss
[jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617220#action_12617220 ] Paul Elschot commented on LUCENE-1345: -- Thanks for the DisjunctionDISI.patch. I had just continued, but I hadn't come that far yet. I'll be off quite irregularly in the next month, I'll try and attach here when there's real progress. Allow Filter as clause to BooleanQuery -- Key: LUCENE-1345 URL: https://issues.apache.org/jira/browse/LUCENE-1345 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Paul Elschot Priority: Minor Attachments: DisjunctionDISI.patch, LUCENE-1345.patch
Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery
Hi Paul, it sounds so familiar. I too like playing with Lucene, it is fun, but I have not found a formula for a 25-hour day (waking up one hour earlier does not work for me for some strange reason). The only other person so interested in these Filter-like issues is Yonik, but I guess he also has some big fish to fry in the Solr world... Nobody is in a hurry with this one; when it gets done, it will be finished ;) Anyway, I will have a look at what you did so far when I find a few hours to spare... cheers, eks - Original Message From: Paul Elschot (JIRA) [EMAIL PROTECTED] To: java-dev@lucene.apache.org Sent: Saturday, 26 July, 2008 11:50:31 PM Subject: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery [ https://issues.apache.org/jira/browse/LUCENE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12617220#action_12617220 ] Paul Elschot commented on LUCENE-1345: -- Thanks for the DisjunctionDISI.patch. I had just continued, but I hadn't come that far yet. I'll be off quite irregularly in the next month, I'll try and attach here when there's real progress.