[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011402#comment-14011402 ]

Peter Keegan commented on SOLR-5831:

Hi Joel,

I'm not sure why I didn't see this problem until now, but this PostFilter doesn't work after being cached. When the ScoreScaleFilter is retrieved from the cache, the docSet is null and a new PostFilter collector is created, but the Collector's 'setScorer' method isn't called. As a result, the 'collect' method throws NPE (scorer.score()).

What do I need to do to keep the query from rerunning? Can the scorer be saved with the ScoreScaleFilter instead of the ScoreCollector?

Thanks,
Peter

Scale score PostFilter
----------------------

Key: SOLR-5831
URL: https://issues.apache.org/jira/browse/SOLR-5831
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 4.7
Reporter: Peter Keegan
Assignee: Joel Bernstein
Priority: Minor
Fix For: 4.9
Attachments: SOLR-5831.patch, SOLR-5831.patch, SOLR-5831.patch, SOLR-5831.patch, SOLR-5831.patch, TestScaleScoreQParserPlugin.patch

The ScaleScoreQParserPlugin is a PostFilter that performs score scaling. This is an alternative to using a function query wrapping a scale() wrapping a query(). For example:

select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!query v=$qq}

The problem with this query is that it has to scale every hit. Usually, only the returned hits need to be scaled, but there may be use cases where the number of hits to be scaled is greater than the returned hit count, but less than or equal to the total hit count.

Sample syntax:

fq={!scalescore+l=0.0 u=1.0 maxscalehits=1 func=sum(product(sscore(),0.75),product(field(myfield),0.25))}

l=0.0 u=1.0      //Scale scores to values between 0-1, inclusive
maxscalehits=1   //The maximum number of result scores to scale (-1 = all hits, 0 = results 'page' size)
func=...         //Apply the composite function to each hit. The scaled score value is accessed by the 'score()' value source

All parameters are optional. The defaults are:

l=0.0 u=1.0
maxscalehits=0 (result window size)
func=(null)

Note: this patch is not complete, as it contains no test cases and may not conform to all the guidelines in http://wiki.apache.org/solr/HowToContribute. I would appreciate any feedback on the usability and implementation.

--
This message was sent by Atlassian JIRA (v6.2#6252)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
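For readers following the issue description, here is a minimal sketch of the arithmetic it describes: raw scores are mapped linearly onto [l, u], and a composite function then blends the scaled score with a field value using the 0.75/0.25 weights from the sample syntax. This is not the code from SOLR-5831.patch. The class and method names (ScoreScalingSketch, scale, combine) are hypothetical, the handling of identical scores is an assumption, and maxscalehits is not modeled.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the SOLR-5831 patch.
class ScoreScalingSketch {

    /** Linearly maps the observed [min, max] of the input scores onto [l, u]. */
    static float[] scale(float[] scores, float l, float u) {
        if (scores.length == 0) {
            return new float[0];
        }
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        float range = max - min;
        float[] out = new float[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Assumption: if every score is identical, map them all to the lower bound.
            out[i] = (range == 0f) ? l : l + (scores[i] - min) / range * (u - l);
        }
        return out;
    }

    /** Mirrors func=sum(product(<scaled score>,0.75),product(field(myfield),0.25)). */
    static float combine(float scaledScore, float fieldValue) {
        return 0.75f * scaledScore + 0.25f * fieldValue;
    }

    public static void main(String[] args) {
        float[] raw = {2.0f, 4.0f, 6.0f};
        float[] scaled = scale(raw, 0.0f, 1.0f);
        System.out.println(Arrays.toString(scaled));
        System.out.println(combine(scaled[1], 1.0f));
    }
}
```

Because the minimum and maximum must be known before any single score can be scaled, the number of hits fed into scale() drives the cost, which is the concern the maxscalehits parameter addresses.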
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011568#comment-14011568 ]

Joel Bernstein commented on SOLR-5831:

If your Query class implements PostFilter and ScoreFilter then the SolrIndexSearcher will make sure the scorer is present during docset retrieval. ScoreFilter is just a flag, no methods.

It's little things like this that make me believe this would be better as a pluggable collector. SOLR-5973 is now committed. You can also see a pluggable collector example with SOLR-6088.
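The marker-interface idea in this exchange can be illustrated with a self-contained sketch. Everything below is a hypothetical stand-in written for this note: the nested Scorer, Collector and ScoreFilter types only model the roles named in the thread and are not the Lucene or Solr classes, and rebuildDocSet is an invented method, not SolrIndexSearcher code. The sketch shows why a collector that calls scorer.score() fails when setScorer was never invoked, and how a method-less flag interface lets the caller decide to supply the scorer.

```java
// Hypothetical model of the behavior discussed above; not Solr source code.
class ScoreFilterSketch {

    interface Scorer { float score(); }

    /** Marker interface: no methods, it only flags that a scorer is required. */
    interface ScoreFilter { }

    interface Collector {
        void setScorer(Scorer scorer);
        float collect(int doc);
    }

    static class ScalingCollector implements Collector {
        private Scorer scorer; // stays null unless setScorer is called

        @Override public void setScorer(Scorer scorer) { this.scorer = scorer; }

        @Override public float collect(int doc) {
            // Throws NullPointerException if no scorer was supplied.
            return scorer.score();
        }
    }

    /** A filter that does not advertise that it needs scores. */
    static class PlainFilter {
        Collector newCollector() { return new ScalingCollector(); }
    }

    /** The same filter, now carrying the marker. */
    static class ScoringFilter extends PlainFilter implements ScoreFilter { }

    /** Invented stand-in for the doc set regeneration step. */
    static float rebuildDocSet(PlainFilter filter, Scorer scorer) {
        Collector collector = filter.newCollector();
        if (filter instanceof ScoreFilter) {
            collector.setScorer(scorer); // only flagged filters receive a scorer
        }
        return collector.collect(0);
    }

    public static void main(String[] args) {
        Scorer fixed = () -> 1.5f;
        System.out.println(rebuildDocSet(new ScoringFilter(), fixed));
        try {
            rebuildDocSet(new PlainFilter(), fixed);
        } catch (NullPointerException e) {
            System.out.println("NPE without the marker interface");
        }
    }
}
```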
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011635#comment-14011635 ]

Peter Keegan commented on SOLR-5831:

Thanks, that was a simple fix. But if the results are coming from the cache, why does the PostFilter collection have to be rerun?

I totally agree that there are a lot of little details that make it tricky to implement a PostFilter. For the short term, we'll likely go to production with it, though, since we're running on 4.6.1. Can the pluggable collector framework be patched into 4.6.1? (when I looked at it a while ago, it didn't seem so)

Peter
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011723#comment-14011723 ]

Joel Bernstein commented on SOLR-5831:

The main DocSet isn't cached. So, if you pull a DocList from the QueryResultCache, Solr needs to regenerate the DocSet for faceting etc...
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989486#comment-13989486 ]

Joel Bernstein commented on SOLR-5831:

Peter,

We can add a test with multiple segments by committing between updates in the test case. The QueryResultCache issue I'll have to review more closely to see how you're using it.

Before I do that though... I'm getting close to committing SOLR-5973. How would you feel about contributing your score scaling Collector instead of the PostFilter version? There are some compelling reasons for this:

1) Response times as a function of hit count are much better with the Collector.
2) It would make a great first example of how to use the new pluggable collector framework. Right now I don't have another example.

Let me know your thoughts.

Joel
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989491#comment-13989491 ]

Joel Bernstein commented on SOLR-5831:

Chris,

I haven't had a chance to review the patch you're working on closely. I did see that it makes some changes in the lucene core library. One of the nice things about the approach in this ticket is that it accomplishes score scaling without making any changes to Solr or lucene's core libraries.

In my last post I mentioned using a pluggable Collector rather than a PostFilter, in conjunction with SOLR-5973. Using the collector seems to be very fast and to scale very well.

Joel
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989704#comment-13989704 ]

Peter Keegan commented on SOLR-5831:

Hi Joel,

I'll work on adding some multi-segment unit tests soon.

I haven't looked closely at SOLR-5973 yet, so I don't know how much effort it will be to change the score scaling collector from the old framework to the new. In the old framework, the plugin extends TopDocsCollector and overrides 'topDocs' to change the ordering, and is not particularly generic. I'll take a look, though.

Peter
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987005#comment-13987005 ]

Joel Bernstein commented on SOLR-5831:

I went through the latest patch and it looks really good.

The main pitfalls with the CollapsingQParserPlugin dealt with how the DummyScorer behaved with the different types of ranking collectors. You mentioned a secondary sort issue that was fixed in the latest patch. I'm not sure if this was related to the use of the DummyScorer. To account for this, let's add some tests that confirm proper result ordering under the different sorting conditions.

I should have some time to install the patch and work with it next week.
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987037#comment-13987037 ]

Chris Russell commented on SOLR-5831:

Hi.

I have uploaded a patch that is designed to improve the performance of the scale function by allowing it to only score documents that match the filters. I am curious how my patch compares with your posted benchmarks from March 11th. Would you be willing to apply it and run them?

LUCENE-5637
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956630#comment-13956630 ]

Joel Bernstein commented on SOLR-5831:

Hi Peter,

I haven't forgotten about this ticket. I've got one ticket ahead of this to finish for Solr 4.8 and then I'll work with you to try to get this ticket ready for Solr 4.9.

Joel
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951285#comment-13951285 ]

Peter Keegan commented on SOLR-5831:

Here's a first pass at some unit tests. The (unscaled) Lucene scores are generated via a 'boost' query that uses a field containing a random value. 3 of the 4 PostFilter parameters are tested. I'll do the 4th next week. I just wanted to have something to post before the week ended.
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947115#comment-13947115 ]

Joel Bernstein commented on SOLR-5831:

Hi Peter,

It looks like you haven't updated the patch? I'll take a look and see if I can find some random test examples. I'll also have some time in the next couple of days to help out with the tests.

Joel
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947250#comment-13947250 ]

Peter Keegan commented on SOLR-5831:

Hi Joel,

I've updated the patch, which includes fixes for hashCode() and isEqual(), and I added unit tests to QueryEqualityTest.java. I'm still stuck on how to determine the proper result window size for the QueryResultCache.

I will work on unit tests, too, with your help, when you have the time.

Thanks,
Peter
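Since the thread ties the hashCode() and equality fixes to cache behavior, a short sketch may help show why they matter: a cache keyed on a query object only finds an earlier entry when two queries built from the same parameters compare equal and hash identically. The class below is a hypothetical value type holding the four parameters from the issue description (l, u, maxscalehits, func). It is not the ScaleScoreQParserPlugin query class, and the HashMap merely stands in for a query cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical cache key; not the query class from the SOLR-5831 patch.
final class ScaleScoreKeySketch {
    private final float l;
    private final float u;
    private final int maxScaleHits;
    private final String func; // may be null, matching the default func=(null)

    ScaleScoreKeySketch(float l, float u, int maxScaleHits, String func) {
        this.l = l;
        this.u = u;
        this.maxScaleHits = maxScaleHits;
        this.func = func;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ScaleScoreKeySketch)) return false;
        ScaleScoreKeySketch that = (ScaleScoreKeySketch) o;
        // Float.compare keeps equals consistent with the boxed hashing used below.
        return Float.compare(l, that.l) == 0
                && Float.compare(u, that.u) == 0
                && maxScaleHits == that.maxScaleHits
                && Objects.equals(func, that.func);
    }

    @Override
    public int hashCode() {
        // Every field compared in equals also contributes to the hash.
        return Objects.hash(l, u, maxScaleHits, func);
    }

    public static void main(String[] args) {
        Map<ScaleScoreKeySketch, String> cache = new HashMap<>();
        cache.put(new ScaleScoreKeySketch(0.0f, 1.0f, 0, null), "cached result");
        // A second, separately constructed key with the same parameters finds the entry.
        System.out.println(cache.get(new ScaleScoreKeySketch(0.0f, 1.0f, 0, null)));
        // Changing any parameter yields a different key and therefore a miss.
        System.out.println(cache.get(new ScaleScoreKeySketch(0.0f, 1.0f, -1, null)));
    }
}
```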
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938515#comment-13938515 ]

Peter Keegan commented on SOLR-5831:

Hi Joel,

Thanks for reviewing the code. I have a few questions about the issues you found:

1. I've fixed the hashCode() and equals() methods you referred to in your #1. Is this the same issue as your #3?
2. Do you have any suggestions on how to determine the proper result window size for the QueryResultCache?
3. Are there any unit tests that use randomized testing that I could study for use in this patch?

Thanks,
Peter

Scale score PostFilter
----------------------

    Key: SOLR-5831
    URL: https://issues.apache.org/jira/browse/SOLR-5831
    Project: Solr
    Issue Type: Improvement
    Components: search
    Affects Versions: 4.7
    Reporter: Peter Keegan
    Priority: Minor
    Attachments: SOLR-5831.patch
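As a rough illustration of the randomized testing asked about in this thread, the sketch below generates random scores, applies min-max scaling into [l, u], and checks that the results stay in range and keep their relative order. It is not taken from the patch or from any existing Solr test: the class MinMaxScaling and its methods are hypothetical names, and it uses plain java.util.Random rather than Lucene's randomized test framework. Mapping every score to l when all scores are equal is also an assumption.

```java
import java.util.Random;

// Hypothetical helper, not part of the SOLR-5831 patch.
final class MinMaxScaling {

    // Scale scores linearly so the smallest maps to lower and the largest to upper.
    // Assumption: when all scores are equal, every score maps to lower.
    static float[] scale(float[] scores, float lower, float upper) {
        float[] out = new float[scores.length];
        if (scores.length == 0) {
            return out;
        }
        float min = scores[0];
        float max = scores[0];
        for (float s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        float range = max - min;
        for (int i = 0; i < scores.length; i++) {
            out[i] = range == 0f
                    ? lower
                    : lower + (upper - lower) * (scores[i] - min) / range;
        }
        return out;
    }

    // Run the scaling on random inputs and verify two properties:
    // every scaled score lies in [0, 1], and ordering between neighbours is preserved.
    static boolean checkRandomized(long seed, int iterations) {
        Random rnd = new Random(seed);
        for (int iter = 0; iter < iterations; iter++) {
            int n = 1 + rnd.nextInt(1000);
            float[] scores = new float[n];
            for (int i = 0; i < n; i++) {
                scores[i] = rnd.nextFloat() * 100f;
            }
            float[] scaled = scale(scores, 0f, 1f);
            for (int i = 0; i < n; i++) {
                if (scaled[i] < 0f || scaled[i] > 1f) {
                    return false;
                }
                if (i + 1 < n && scores[i] <= scores[i + 1] && scaled[i] > scaled[i + 1]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Fixing the seed keeps a failing run reproducible, which is the main thing a randomized test needs in order to be debuggable.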
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1399#comment-1399 ]

Peter Keegan commented on SOLR-5831:

Here is a comparison of ScaleScorePostFilter and a Configurable Collector that does the same score scaling with the function 'hard wired' in the collector. I had to hand-merge some of the patch from SOLR-4465 into the 4.6.1 branch. My tests show that the Collector is faster:

1. SolrMeter test @ 20 QPS:

   Custom Collector with maxscalehits=1:
     Median response time: 25 ms
     Average response time: 135 ms
     Load average: 2.3

   PostFilter with maxscalehits=1:
     Median response time: 30 ms
     Average response time: 190 ms
     Load average: 3.3

2. Typical response times as a function of hit count:

   # hits    Collector    PostFilter
   ------    ---------    ----------
   80K       12           35
   230K      20           80
   330K      25           123
   720K      35           275
   1.1M      32           390

The difference in the response times is likely due to the hits being collected twice by the PostFilter (once by the PostFilter and once by the delegate collector), whereas the Custom Collector only collects the hits once.

Scale score PostFilter
----------------------

    Key: SOLR-5831
    URL: https://issues.apache.org/jira/browse/SOLR-5831
    Project: Solr
    Issue Type: Improvement
    Components: search
    Affects Versions: 4.7
    Reporter: Peter Keegan
    Priority: Minor
    Attachments: SOLR-5831.patch
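To make the "collected twice" explanation concrete, here is a simplified model of a post-filter that buffers every hit in a first pass (it cannot scale until it has seen the minimum and maximum score) and then replays the scaled hits to a delegate in a second pass. This is only a sketch under stated assumptions: HitSink and BufferingScaler are hypothetical names standing in for Lucene's Collector and the patch's collector, the maxscalehits limit is not modelled, and nothing here is taken from the patch itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-in for a collector: receives one hit at a time.
interface HitSink {
    void collect(int doc, float score);
}

// Hypothetical model of a scaling post-filter. Each hit is handled twice:
// once when it is buffered here, and once when it is replayed to the delegate.
final class BufferingScaler implements HitSink {
    private final HitSink delegate;
    private final float lower;
    private final float upper;
    private final List<Integer> docs = new ArrayList<>();
    private final List<Float> scores = new ArrayList<>();
    private float min = Float.POSITIVE_INFINITY;
    private float max = Float.NEGATIVE_INFINITY;

    BufferingScaler(HitSink delegate, float lower, float upper) {
        this.delegate = delegate;
        this.lower = lower;
        this.upper = upper;
    }

    // First pass: remember the hit and track the score range.
    @Override
    public void collect(int doc, float score) {
        docs.add(doc);
        scores.add(score);
        min = Math.min(min, score);
        max = Math.max(max, score);
    }

    // Second pass: scale each buffered score into [lower, upper] and hand it on.
    void finish() {
        float range = max - min;
        for (int i = 0; i < docs.size(); i++) {
            float s = scores.get(i);
            float scaled = range == 0f
                    ? lower
                    : lower + (upper - lower) * (s - min) / range;
            delegate.collect(docs.get(i), scaled);
        }
    }
}
```

A collector with the scaling built in would not need the replay step, which is consistent with the single collection pass described for the Custom Collector above.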
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930313#comment-13930313 ]

Peter Keegan commented on SOLR-5831:

> How is this performing compared to using the scale() function?

No comparison. I'm running Solr on a 4-vCPU EC2 instance and tested with SolrMeter. On a production index (1.6 million docs) and production queries at a leisurely rate of 10 QPS:

1. scale() with function query:
     Median response time: 3000 ms
     Average response time: 8000 ms
     Load average: double digits

2. PostFilter with maxscalehits=0 (rows=50):
     Median response time: 18 ms
     Average response time: 108 ms
     Load average: 1

3. PostFilter with maxscalehits=1:
     Median response time: 21 ms
     Average response time: 120 ms
     Load average: 1

4. PostFilter with maxscalehits=-1 (scale all hits):
     Worse than #1. Most queries timed out. This is not surprising, since the PriorityQueue size is often huge from high hit counts, and all hits are delegated.

Regarding the QueryResultCache, are there any suggestions on how to determine its size in the context of the PostFilter?

Thanks,
Peter

Scale score PostFilter
----------------------

    Key: SOLR-5831
    URL: https://issues.apache.org/jira/browse/SOLR-5831
    Project: Solr
    Issue Type: Improvement
    Components: search
    Affects Versions: 4.7
    Reporter: Peter Keegan
    Priority: Minor
    Attachments: SOLR-5831.patch
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925227#comment-13925227 ]

Joel Bernstein commented on SOLR-5831:

Peter,

I was able to do a first review of the code before heading out on vacation. Very cool piece of code. How is this performing compared to using the scale() function?

The following issues were in early versions of the CollapsingQParserPlugin, so you can look at the most recent version to see how they were resolved:

1) The ScoreScaleFilter class needs to have only the instance variables that are needed for the hashCode() and equals() methods; otherwise there'll be all kinds of bugs with the Solr caches. So any work you're doing in the constructor of this class and hanging onto needs to be moved to the getFilterCollector() method.

2) The DummyScore also needs to implement the docID() method. Pretty simple to do; check the latest CollapsingQParserPlugin to see how this is handled.

3) I think getting this working with the QueryResultCache will be important. Early versions of the CollapsingQParserPlugin didn't do this, but standard grouping didn't either, so it wasn't a downgrade in functionality for FieldCollapsing. But people who use this feature will be surprised if the QueryResultCache stops working. So hashCode() and equals() will need to be implemented.

4) The value source needs a proper context (rcontext in the code). The latest version of the CollapsingQParserPlugin demonstrates this as well.

Also, having good tests will be important and probably somewhat tricky to write. Using some form of randomized testing would be good to ensure that random scores get normalized properly.

I'll check in on this when I get back from vacation.

Joel

Scale score PostFilter
----------------------

    Key: SOLR-5831
    URL: https://issues.apache.org/jira/browse/SOLR-5831
    Project: Solr
    Issue Type: Improvement
    Components: search
    Affects Versions: 4.7
    Reporter: Peter Keegan
    Priority: Minor
    Attachments: SOLR-5831.patch
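Point 1 of the review (keep only the fields that define equality on the cached object, and build per-request state on demand) can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch or from CollapsingQParserPlugin: ScaleParams and ScaleState are hypothetical names, newCollectorState() merely plays the role that getFilterCollector() plays in the review comment, and no Solr classes are used so that the example stays self-contained.

```java
import java.util.Objects;

// Hypothetical stand-in for a cacheable filter query. It holds only the
// request parameters, and all of them take part in equals() and hashCode().
final class ScaleParams {
    private final float lower;
    private final float upper;
    private final int maxScaleHits;
    private final String func; // may be null when no function is given

    ScaleParams(float lower, float upper, int maxScaleHits, String func) {
        this.lower = lower;
        this.upper = upper;
        this.maxScaleHits = maxScaleHits;
        this.func = func;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ScaleParams)) {
            return false;
        }
        ScaleParams p = (ScaleParams) o;
        return Float.compare(lower, p.lower) == 0
                && Float.compare(upper, p.upper) == 0
                && maxScaleHits == p.maxScaleHits
                && Objects.equals(func, p.func);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lower, upper, maxScaleHits, func);
    }

    // Per-request working state is created fresh on every call and is never
    // stored on this object, so a cached instance carries no stale state.
    ScaleState newCollectorState() {
        return new ScaleState();
    }
}

// Hypothetical mutable state that belongs to a single search request.
final class ScaleState {
    float min = Float.POSITIVE_INFINITY;
    float max = Float.NEGATIVE_INFINITY;
    int collected = 0;
}
```

Two requests with the same parameters then produce equal keys with equal hash codes, which is what a cache lookup relies on, while each request still gets its own working state.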
[jira] [Commented] (SOLR-5831) Scale score PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924693#comment-13924693 ]

Joel Bernstein commented on SOLR-5831:

Peter,

Lots of good stuff here. It's going to take me some time to fully review it. I'm on vacation next week, but when I get back I'll take a close look.

Joel

Scale score PostFilter
----------------------

    Key: SOLR-5831
    URL: https://issues.apache.org/jira/browse/SOLR-5831
    Project: Solr
    Issue Type: Improvement
    Components: search
    Affects Versions: 4.7
    Reporter: Peter Keegan
    Priority: Minor
    Attachments: SOLR-5831.patch