[jira] [Commented] (SOLR-5831) Scale score PostFilter

Joel Bernstein (JIRA) Sun, 09 Mar 2014 09:05:53 -0700

    [ https://issues.apache.org/jira/browse/SOLR-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925227#comment-13925227 ]


Joel Bernstein commented on SOLR-5831:
--------------------------------------

Peter,

I was able to do a first review of the code before heading out on vacation.

Very cool piece of code. How is this performing compared to using the scale() 
function?

The following issues were in early versions of the CollapsingQParserPlugin, so 
you can look at the most recent version to see how they were resolved:

1) The ScoreScaleFilter class should only have instance variables that are 
needed for the hashCode() and equals() methods; otherwise there will be all 
kinds of bugs with the Solr caches. So any state you build in the constructor 
of this class and hang onto needs to be moved to the getFilterCollector() method.

2) The DummyScore also needs to implement the docID() method. Pretty simple to 
do, check the latest CollapsingQParserPlugin to see how this is handled.

3) I think getting this working with the QueryResultCache will be important. 
Early versions of the CollapsingQParserPlugin didn't do this, but standard 
grouping didn't either, so it wasn't a downgrade in functionality for 
FieldCollapsing. But people who use this feature will be surprised if the 
QueryResultCache stops working. So hashCode() and equals() will need to be 
implemented.

4) The value source needs a proper context (rcontext in the code). Latest 
version of the CollapsingQParserPlugin demonstrates this as well.
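
Points 1 and 3 could be sketched roughly as follows. This is a stand-alone 
illustration, not the code from the patch and not the Solr PostFilter API: 
the names ScaleFilterSketch, Collector and newCollector() are hypothetical, 
with newCollector() standing in for getFilterCollector(). The min/max scaling 
inside the collector is also an assumption about what the patch computes.

```java
import java.util.Objects;

// Hypothetical stand-in for the filter query class. It is not the class from
// the patch and does not use any Solr API.
final class ScaleFilterSketch {
    // Only the parsed request parameters live here; they define cache identity.
    private final float lower;
    private final float upper;
    private final int maxScaleHits;
    private final String func; // may be null when no function is given

    ScaleFilterSketch(float lower, float upper, int maxScaleHits, String func) {
        this.lower = lower;
        this.upper = upper;
        this.maxScaleHits = maxScaleHits;
        this.func = func;
    }

    // Per-search working state is created here and never stored on the filter,
    // so an instance used as a cache key carries nothing over between requests.
    Collector newCollector() {
        return new Collector(lower, upper);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ScaleFilterSketch)) return false;
        ScaleFilterSketch other = (ScaleFilterSketch) o;
        return Float.compare(lower, other.lower) == 0
            && Float.compare(upper, other.upper) == 0
            && maxScaleHits == other.maxScaleHits
            && Objects.equals(func, other.func);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lower, upper, maxScaleHits, func);
    }

    // Mutable state that belongs to exactly one search.
    static final class Collector {
        private final float lower;
        private final float upper;
        private float min = Float.POSITIVE_INFINITY;
        private float max = Float.NEGATIVE_INFINITY;

        Collector(float lower, float upper) {
            this.lower = lower;
            this.upper = upper;
        }

        // Track the score range seen so far.
        void collect(float score) {
            min = Math.min(min, score);
            max = Math.max(max, score);
        }

        // Map a score into [lower, upper] using the collected range.
        float scale(float score) {
            float range = max - min;
            if (!(range > 0f)) return lower; // no spread, or nothing collected
            return lower + (score - min) / range * (upper - lower);
        }
    }
}
```

Two filters built from the same parameters compare equal and hash the same, 
no matter how many collectors have been handed out, which is what a cache 
keyed on the query object relies on.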

Also, having good tests will be important, and they will probably be somewhat 
tricky to write. Using some form of randomized testing would be good to ensure 
that random scores get normalized properly.
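
A minimal sketch of what such a randomized check might look like, in plain 
Java with no Solr or Lucene test framework. ScaleSketch, scale() and 
checkRandom() are hypothetical names, and the min/max scaling formula is an 
assumption about the intended normalization, not taken from the patch.

```java
import java.util.Random;

// Hypothetical helper, independent of Solr: min/max scaling plus a seeded
// randomized check of its basic properties.
final class ScaleSketch {

    // Scale scores linearly so the smallest maps to lower and the largest to upper.
    static float[] scale(float[] scores, float lower, float upper) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        float range = max - min;
        float[] out = new float[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // When all scores are identical there is no spread; use lower.
            out[i] = range > 0f ? lower + (scores[i] - min) / range * (upper - lower) : lower;
        }
        return out;
    }

    // One randomized trial; the seed makes a failure reproducible.
    static void checkRandom(long seed) {
        Random r = new Random(seed);
        float[] scores = new float[1 + r.nextInt(1000)];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = r.nextFloat() * 100f;
        }
        float lower = r.nextFloat();
        float upper = lower + 1f + r.nextFloat();
        float[] scaled = scale(scores, lower, upper);
        float eps = 1e-4f; // tolerance for float rounding at the upper bound

        for (int i = 0; i < scaled.length; i++) {
            if (scaled[i] < lower - eps || scaled[i] > upper + eps) {
                throw new AssertionError("out of range, seed=" + seed + " index=" + i);
            }
            // Scaling must preserve the relative order of neighbouring scores.
            if (i > 0 && scores[i - 1] < scores[i] && scaled[i - 1] > scaled[i]) {
                throw new AssertionError("order not preserved, seed=" + seed + " index=" + i);
            }
        }
    }
}
```

Running checkRandom() over a range of seeds, and printing the seed on failure, 
gives repeatable coverage of arbitrary score distributions.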

I'll check in on this when I get back from vacation.

Joel

  




> Scale score PostFilter
> ----------------------
>
>                 Key: SOLR-5831
>                 URL: https://issues.apache.org/jira/browse/SOLR-5831
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 4.7
>            Reporter: Peter Keegan
>            Priority: Minor
>         Attachments: SOLR-5831.patch
>
>
> The ScaleScoreQParserPlugin is a PostFilter that performs score scaling.
> This is an alternative to using a function query wrapping a scale() wrapping 
> a query(). For example:
> select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!query v=$qq}
> The problem with this query is that it has to scale every hit. Usually, only 
> the returned hits need to be scaled,
> but there may be use cases where the number of hits to be scaled is greater 
> than the returned hit count,
> but less than or equal to the total hit count.
> Sample syntax:
> fq={!scalescore+l=0.0 u=1.0 maxscalehits=10000 func=sum(product(sscore(),0.75),product(field(myfield),0.25))}
> l=0.0 u=1.0           //Scale scores to values between 0-1, inclusive
> maxscalehits=10000    //The maximum number of result scores to scale (-1 = all hits, 0 = results 'page' size)
> func=...              //Apply the composite function to each hit. The scaled score value is accessed by the 'score()' value source
> All parameters are optional. The defaults are:
> l=0.0 u=1.0
> maxscalehits=0 (result window size)
> func=(null)
>  
> Note: this patch is not complete, as it contains no test cases and may not 
> conform 
> to all the guidelines in http://wiki.apache.org/solr/HowToContribute. 
>  
> I would appreciate any feedback on the usability and implementation.



