Re: Payloads with Phrase queries

Raghuveer Kancherla Tue, 15 Dec 2009 03:31:53 -0800

The interesting thing I am noticing is that the scoring works fine for a
phrase query like "solr rocks".
This led me to look at which query I am using for a single term.
It turns out that I am using PayloadTermQuery, taking a cue from the
SOLR-1485 patch.


I changed this to BoostingTermQuery (I read somewhere that it is
deprecated, but I was just experimenting), and the scoring now seems to
work as expected for a single term.
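
As a sanity check on the debug output quoted below, each explain entry is
just a product of three factors: the btq value (tf times scorePayload), idf,
and fieldNorm. Here is a minimal sketch of that arithmetic in plain Java, with
no Lucene dependency. The class and method names (ExplainCheck, btq,
fieldWeight) are made up for illustration, and the numbers are copied from the
explain entries for documents 2 and 4.

```java
public class ExplainCheck {

    // btq node of the explain output: tf multiplied by the payload score.
    public static double btq(double tf, double payloadScore) {
        return tf * payloadScore;
    }

    // fieldWeight node of the explain output: btq * idf * fieldNorm.
    public static double fieldWeight(double btq, double idf, double fieldNorm) {
        return btq * idf * fieldNorm;
    }

    public static void main(String[] args) {
        double tf = 0.70710677;   // tf(phraseFreq=0.5), same for both documents
        double idf = 0.81767845;  // idf(payloadTest: solr=5), same for both

        // Both documents carry payload 2 on "solr", so scorePayload is 20.0;
        // only fieldNorm differs between them.
        double doc2 = fieldWeight(btq(tf, 20.0), idf, 0.625);
        double doc4 = fieldWeight(btq(tf, 20.0), idf, 1.0);

        System.out.printf("doc 2: %.4f%n", doc2);
        System.out.printf("doc 4: %.4f%n", doc4);
    }
}
```

With these inputs the two products come to roughly 7.23 for document 2 and
11.56 for document 4, matching the explain entries. The only factor that
differs is fieldNorm (0.625 versus 1.0).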

Now, the important question is: what is the payload version of a TermQuery?

Regards
Raghu


On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla <
raghuveer.kanche...@aplopio.com> wrote:

> Hi,
> Thanks everyone for the responses, I am now able to get both phrase queries
> and term queries to use payloads.
>
> However, the score value for each document (and consequently, the
> ordering of documents) is coming out wrong.
>
> In the Solr output appended below, document 4 has a higher score than
> document 2 (see the debug part). The results section shows a wrong score
> (the payload value I am returning from my custom similarity class), and
> the ordering is wrong because of it. Can someone explain this?
>
> My custom query parser is pasted here http://pastebin.com/m9f21565
>
> In the similarity class, I return 10.0 if payload is 1 and 20.0 if payload
> is 2. For everything else I return 1.0.
>
> {
>  'responseHeader':{
>   'status':0,
>   'QTime':2,
>   'params':{
>       'fl':'*,score',
>       'debugQuery':'on',
>       'indent':'on',
>
>
>       'start':'0',
>       'q':'solr',
>       'qt':'aplopio',
>       'wt':'python',
>       'fq':'',
>       'rows':'10'}},
>  'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[
>
>
>       {
>        'payloadTest':'solr|2 rocks|1',
>        'id':'2',
>        'score':20.0},
>       {
>        'payloadTest':'solr|2',
>        'id':'4',
>        'score':20.0},
>
>
>       {
>        'payloadTest':'solr|1 rocks|2',
>        'id':'1',
>        'score':10.0},
>       {
>        'payloadTest':'solr|1 rocks|1',
>        'id':'3',
>        'score':10.0},
>
>
>       {
>        'payloadTest':'solr',
>        'id':'5',
>        'score':1.0}]
>  },
>  'debug':{
>   'rawquerystring':'solr',
>   'querystring':'solr',
>
>
>   'parsedquery':'PayloadTermQuery(payloadTest:solr)',
>   'parsedquery_toString':'payloadTest:solr',
>   'explain':{
>       '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1), product 
> of:\n  14.142136 = (MATCH) btq, product of:\n    0.70710677 = 
> tf(phraseFreq=0.5)\n    20.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=1)\n',
>
>
>       '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3), product 
> of:\n  14.142136 = (MATCH) btq, product of:\n    0.70710677 = 
> tf(phraseFreq=0.5)\n    20.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=3)\n',
>
>
>       '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0), product 
> of:\n  7.071068 = (MATCH) btq, product of:\n    0.70710677 = 
> tf(phraseFreq=0.5)\n    10.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=0)\n',
>
>
>       '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2), product 
> of:\n  7.071068 = (MATCH) btq, product of:\n    0.70710677 = 
> tf(phraseFreq=0.5)\n    10.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=2)\n',
>
>
>       '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4), product 
> of:\n  0.70710677 = (MATCH) btq, product of:\n    0.70710677 = 
> tf(phraseFreq=0.5)\n    1.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=4)\n'},
>
>
>   'QParser':'BoostingTermQParser',
>   'filter_queries':[''],
>   'parsed_filter_queries':[],
>   'timing':{
>       'time':2.0,
>       'prepare':{
>        'time':1.0,
>
>
>        'org.apache.solr.handler.component.QueryComponent':{
>         'time':1.0},
>        'org.apache.solr.handler.component.FacetComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.MoreLikeThisComponent':{
>
>
>         'time':0.0},
>        'org.apache.solr.handler.component.HighlightComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.StatsComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.DebugComponent':{
>
>
>         'time':0.0}},
>       'process':{
>        'time':1.0,
>        'org.apache.solr.handler.component.QueryComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.FacetComponent':{
>
>
>         'time':0.0},
>        'org.apache.solr.handler.component.MoreLikeThisComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.HighlightComponent':{
>         'time':0.0},
>
>
>        'org.apache.solr.handler.component.StatsComponent':{
>         'time':0.0},
>        'org.apache.solr.handler.component.DebugComponent':{
>         'time':1.0}}}}}
>
> On Thu, Dec 10, 2009 at 5:48 PM, AHMET ARSLAN <iori...@yahoo.com> wrote:
>
>>
>> > I was looking through some lucene
>> > source codes and found the following class
>> > org.apache.lucene.search.payloads.PayloadSpanUtil
>> >
>> > There is a function named queryToSpanQuery in this class.
>> > Is this the
>> > preferred way to convert a PhraseQuery to
>> > PayloadNearQuery?
>>
>> The queryToSpanQuery method does not return a PayloadNearQuery.
>>
>> You need to override getFieldQuery(String field, String queryText, int
>> slop) of SolrQueryParser or QueryParser.
>>
>> This code is adapted from the Lucene in Action book (2nd edition),
>> Section 6.3.4, "Allowing ordered phrase queries":
>>
>> protected Query getFieldQuery(String field, String queryText, int slop)
>>         throws ParseException {
>>
>>     Query orig = super.getFieldQuery(field, queryText, slop);
>>
>>     // Only phrase queries are rewritten; everything else passes through.
>>     if (!(orig instanceof PhraseQuery)) return orig;
>>
>>     PhraseQuery pq = (PhraseQuery) orig;
>>     Term[] terms = pq.getTerms();
>>     SpanQuery[] clauses = new SpanQuery[terms.length];
>>
>>     // Wrap each phrase term in a payload-aware span query.
>>     for (int i = 0; i < terms.length; i++) {
>>         clauses[i] = new PayloadTermQuery(terms[i],
>>                 new AveragePayloadFunction());
>>     }
>>     // inOrder = true keeps the terms in phrase order.
>>     return new PayloadNearQuery(clauses, slop, true);
>> }
>>
>>
>> > Also, are there any performance considerations while using
>> > a PayloadNearQuery instead of a PhraseQuery?
>>
>> I don't think there will be significant performance difference.
>>
>>
>>
>>
>
