Hi List,
Apologies for such a long message. I have tried to include everything that
you might need to know to answer my question.

I am having difficulty understanding what AveragePayloadFunction is doing,
and how. Here is my example:

Title:Human|9 pineal|5 luteinizing hormone receptors.
Text:The presence of luteinizing hormone receptors in human|9 pineal|5
glands from five females and three males, ranging in age from 61-89 yr, was
examined by in situ hybridization and immunocytochemistry. The results
demonstrated the presence of these receptors at the mRNA|7 and protein
levels in all the pineal|5 glands examined. Pineal|5 gland luteinizing
hormone receptors could potentially be involved in the regulation of
melatonin|7 synthesis.

3 is for class A
5 is for class B
7 is for class C
9 is for class D
These are the payloads stored in the index (the number after "|" on each
term). At search time I use these values only to identify a term's class,
and I return 3 for terms in the selected class.

I am using WhitespaceTokenizer and LowerCaseFilter. In my similarity class
(PayloadSearchSimilarity, shown further down), I map the payload so that, if
I am interested in class A, scorePayload returns 3 only for terms in class A
and 1 for everything else. I decide a term's class by checking its payload
value.
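
To make the convention above concrete, here is a minimal sketch in plain Java
with no Lucene dependency. The class PayloadConventionSketch and its methods
are hypothetical names used only for illustration, and the 4-byte big-endian
float layout is an assumption about how the payload is stored, not a
statement about the actual index.

```java
import java.nio.ByteBuffer;
import java.util.Locale;

public class PayloadConventionSketch {

    /** Returns the lowercased term of a token such as "pineal|5". */
    public static String termOf(String token) {
        int bar = token.lastIndexOf('|');
        String term = bar < 0 ? token : token.substring(0, bar);
        return term.toLowerCase(Locale.ROOT);
    }

    /** Returns the payload bytes, or null when the token has no "|value" suffix. */
    public static byte[] payloadOf(String token) {
        int bar = token.lastIndexOf('|');
        if (bar < 0) {
            return null;
        }
        float value = Float.parseFloat(token.substring(bar + 1));
        // Assumption: payload stored as a 4-byte big-endian float.
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    /** Scoring rule from the post: 3 if the payload marks the class of interest, else 1. */
    public static float score(String semantic, byte[] bytes) {
        if (bytes == null) {
            return 1f; // term without payload
        }
        float payload = ByteBuffer.wrap(bytes).getFloat();
        float wanted;
        switch (semantic) {
            case "A": wanted = 3f; break;
            case "B": wanted = 5f; break;
            case "C": wanted = 7f; break;
            case "D": wanted = 9f; break;
            default: return 1f;
        }
        return payload == wanted ? 3f : 1f;
    }

    public static void main(String[] args) {
        String title = "Human|9 pineal|5 luteinizing hormone receptors.";
        for (String token : title.split("\\s+")) {
            byte[] payload = payloadOf(token);
            System.out.println(termOf(token)
                    + " A=" + score("A", payload)
                    + " B=" + score("B", payload));
        }
    }
}
```

Note that in this sketch the query terms "luteinizing" and "hormone" carry no
payload at all, so they fall into the "no payload" branch.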

Now, I query for "luteinizing hormone" using PayloadNearQuery with a slop of
5, once with interest in class B and once with interest in class A.

*Result of Class A interest:*

Explain: 10.97332 = (MATCH) sum of:
  2.5589073 = (MATCH) weight(payloadNear([AbstractText:luteinizing,
AbstractText:hormone], 5, true) in 5362133), product of:
    0.68000716 = queryWeight(payloadNear([AbstractText:luteinizing,
AbstractText:hormone], 5, true)), product of:
      14.045828 = idf(AbstractText:  luteinizing=15481 hormone=164637)
      0.048413463 = queryNorm
    3.7630591 = (MATCH) fieldWeight(AbstractText:payloadNear([luteinizing,
hormone], 5, true) in 5362133), product of:
      2.4494898 = PayloadNearQuery, product of:
        0.8164966 = tf(phraseFreq=0.6666667)
        *3.0 = AveragePayloadFunction(...)*
      14.045828 = idf(AbstractText:  luteinizing=15481 hormone=164637)
      0.109375 = fieldNorm(field=AbstractText, doc=5362133)
  8.4144125 = (MATCH) weight(payloadNear([ArticleTitle:luteinizing,
ArticleTitle:hormone], 5, true) in 5362133), product of:
    0.7332054 = queryWeight(payloadNear([ArticleTitle:luteinizing,
ArticleTitle:hormone], 5, true)), product of:
      15.144659 = idf(ArticleTitle:  hormone=86980 luteinizing=9765)
      0.048413463 = queryNorm
    11.476201 = (MATCH) fieldWeight(ArticleTitle:payloadNear([luteinizing,
hormone], 5, true) in 5362133), product of:
      1.7320508 = PayloadNearQuery, product of:
        0.57735026 = tf(phraseFreq=0.33333334)
       * 3.0 = AveragePayloadFunction(...)*
      15.144659 = idf(ArticleTitle:  hormone=86980 luteinizing=9765)
      0.4375 = fieldNorm(field=ArticleTitle, doc=5362133)
---------------------------------------------------------------------

*Result of Class B Interest:*

Explain: 3.657773 = (MATCH) sum of:
  0.85296905 = (MATCH) weight(payloadNear([AbstractText:luteinizing,
AbstractText:hormone], 5, true) in 5362133), product of:
    0.68000716 = queryWeight(payloadNear([AbstractText:luteinizing,
AbstractText:hormone], 5, true)), product of:
      14.045828 = idf(AbstractText:  luteinizing=15481 hormone=164637)
      0.048413463 = queryNorm
    1.254353 = (MATCH) fieldWeight(AbstractText:payloadNear([luteinizing,
hormone], 5, true) in 5362133), product of:
      0.8164966 = PayloadNearQuery, product of:
        0.8164966 = tf(phraseFreq=0.6666667)
        *1.0 = AveragePayloadFunction(...)*
      14.045828 = idf(AbstractText:  luteinizing=15481 hormone=164637)
      0.109375 = fieldNorm(field=AbstractText, doc=5362133)
  2.804804 = (MATCH) weight(payloadNear([ArticleTitle:luteinizing,
ArticleTitle:hormone], 5, true) in 5362133), product of:
    0.7332054 = queryWeight(payloadNear([ArticleTitle:luteinizing,
ArticleTitle:hormone], 5, true)), product of:
      15.144659 = idf(ArticleTitle:  hormone=86980 luteinizing=9765)
      0.048413463 = queryNorm
    3.8254004 = (MATCH) fieldWeight(ArticleTitle:payloadNear([luteinizing,
hormone], 5, true) in 5362133), product of:
      0.57735026 = PayloadNearQuery, product of:
        0.57735026 = tf(phraseFreq=0.33333334)
       * 1.0 = AveragePayloadFunction(...)*
      15.144659 = idf(ArticleTitle:  hormone=86980 luteinizing=9765)
      0.4375 = fieldNorm(field=ArticleTitle, doc=5362133)
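
The arithmetic in the two explain trees can be re-checked with a few lines of
plain Java. This is only a sketch: the class and method names are
hypothetical, the numbers are copied from the explain output above, and
tf = sqrt(phraseFreq) is the DefaultSimilarity formula. It shows that the
AveragePayloadFunction factor (3.0 versus 1.0) is the only term that differs
between the two runs.

```java
public class ExplainArithmeticSketch {

    /** DefaultSimilarity term-frequency factor: sqrt(freq). */
    public static double tf(double phraseFreq) {
        return Math.sqrt(phraseFreq);
    }

    /** fieldWeight = (tf * averagePayload) * idf * fieldNorm, as in the explain tree. */
    public static double fieldWeight(double phraseFreq, double averagePayload,
                                     double idf, double fieldNorm) {
        return tf(phraseFreq) * averagePayload * idf * fieldNorm;
    }

    /** weight = queryWeight * fieldWeight, where queryWeight = idf * queryNorm. */
    public static double weight(double phraseFreq, double averagePayload,
                                double idf, double fieldNorm, double queryNorm) {
        return idf * queryNorm * fieldWeight(phraseFreq, averagePayload, idf, fieldNorm);
    }

    public static void main(String[] args) {
        double queryNorm = 0.048413463;
        for (double avg : new double[] {3.0, 1.0}) {
            double text = weight(0.6666667, avg, 14.045828, 0.109375, queryNorm);
            double title = weight(0.33333334, avg, 15.144659, 0.4375, queryNorm);
            System.out.println("average payload " + avg + ": AbstractText=" + text
                    + " ArticleTitle=" + title + " total=" + (text + title));
        }
    }
}
```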

As I understand it, when I am interested in class A I should get 1 from
AveragePayloadFunction, because there is no class A term in the text, so
every term scores 1. When I am interested in class B, there is a class B
term (pineal|5) in the "Title" field, so I expected AveragePayloadFunction
to return 3. The explain output shows the opposite: 3.0 for class A and 1.0
for class B.
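
For reference, this is the averaging behaviour I am assuming, written as a
plain-Java sketch rather than Lucene's actual code: the scorePayload values
of the payloads seen in the matching spans are summed and divided by their
count, and the result is 1 when no payloads were seen. The class and method
names are hypothetical.

```java
public class AveragePayloadSketch {

    /**
     * Averages the per-payload scores collected for one document.
     * Assumption: with no payloads seen, the neutral value 1 is returned.
     */
    public static float average(float[] payloadScores) {
        if (payloadScores.length == 0) {
            return 1f;
        }
        float sum = 0f;
        for (float score : payloadScores) {
            sum += score;
        }
        return sum / payloadScores.length;
    }

    public static void main(String[] args) {
        // One term of the class of interest (3) and one other term (1).
        System.out.println(average(new float[] {3f, 1f}));
        // No payloads on the matched terms.
        System.out.println(average(new float[] {}));
    }
}
```

Under this assumption, only payloads that are actually passed to scorePayload
for the matched spans contribute to the average.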

I do not understand what is going on. Maybe I am not understanding exactly
what AveragePayloadFunction does.

My similarity class is as follows:

import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.DefaultSimilarity;

public class PayloadSearchSimilarity extends DefaultSimilarity {

    private static final long serialVersionUID = 1L;

    // Class of interest for the current search: "A", "B", "C" or "D".
    public static String semantic;

    @Override
    public float scorePayload(int docId, String fieldName, int start, int end,
                              byte[] bytes, int offset, int length) {
        if (bytes == null) {
            // The term has no payload.
            return 1;
        }
        float payload = PayloadHelper.decodeFloat(bytes, offset);
        // I am now returning the same value for all semantic types so that we
        // can compare the scores. It was changed after we showed it to Dietrich.
        if (semantic.equals("A") && payload == 3) {
            return 3;
        } else if (semantic.equals("B") && payload == 5) {
            return 3;
        } else if (semantic.equals("C") && payload == 7) {
            System.out.println("Semantic:" + semantic);
            return 3;
        } else if (semantic.equals("D") && payload == 9) {
            System.out.println("Semantic:" + semantic);
            return 3;
        }
        // The term's class does not match the class of interest.
        return 1;
    }
}

I am really puzzled, and I would be grateful for any help.

I look forward to hearing from you.
Many thanks,
Shyama

--
View this message in context: 
http://lucene.472066.n3.nabble.com/PayloadNearQuery-and-AveragePayloadFunction-tp3710454p3710454.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
