No solutions to the problem? OK. I'll look into making the changes in the source code, and if I succeed I'll share them here for feedback.
Thanks

On Tue, Nov 8, 2011 at 5:06 PM, Samarendra Pratap <samarz...@gmail.com> wrote:

> Hi Chris,
> Thanks for the insight.
>
> 1. "omitTermFreqAndPositions" is very straightforward, but if I omit
> positions I won't be able to serve phrase queries. I had searched for
> this in the past as well, but I finally reached the conclusion that
> there is no such thing as "omitTermFreq" (only). Perhaps because
> frequency is the count of positions of a term, and we cannot discard it
> if the latter is present. :(
> Please point out if I am wrong. And if I really am, that would be
> exactly what I need.
>
> 2. Function query seemed nice (though strange, because I had never used
> it before) and I gave it a few hours, but that too did not seem to solve
> my requirement. The "artificial" score we are generating gets multiplied
> into the rest of the score, which includes the score due to the "cat"
> field as well. (I cannot remove "cat" from "qf" as I have to search
> there.) It is only that I don't want this field's score to depend on
> the matching "tf".
>
> To explain the second point, here is what I did.
> I indexed 4 documents:
>
> doc 1
>   title: chair
>   cat: chair and chair
>
> doc 2
>   title: table
>   cat: chair and chair
>
> doc 3
>   title: chair
>   cat: chair and table
>
> doc 4
>   title: table
>   cat: chair and table
>
> Searching with a simple query
>
> http://localhost:8983/solr/site1/select/?q=chair&qf=title&qf=cat&fl=title,cat,id,score&pf=ttile
>
> gives 4 results (1,3,2,4).
>
> I want documents 1 and 3 to have equal scores, and likewise 2 and 4,
> because the only difference within each pair is the "cat" field's value.
>
> After spending some hours on function queries I finally arrived at the
> following query:
>
> http://localhost:8983/solr/site1/select/?q={!boost%20b=$cat_boost%20v=$main_query}&main_query={!dismax%20qf=%22title%20cat%22%20v=$qry}&cat_boost={!func}map(query({!field%20f=cat%20v=$qry},-1),0,1000,1,0)&qry=chair&qf=title&qf=cat&fl=title,cat,displayid,score&pf=ttile
>
> But debugging the query showed that the boost value ($cat_boost) is
> being multiplied into a value which is itself generated with the help
> of the "cat" field,
> thus resulting in different scores for 1 and 3 (and similarly for 2
> and 4).
>
> 1.2942866 = (MATCH) boost(+(title:chair | cat:chair)~0.01 (),map(query(cat:chair,def=-1.0),0.0,1000.0,1.0)), product of:
>   1.2942866 = (MATCH) sum of:
>     1.2942866 = (MATCH) max plus 0.01 times others of:
>       1.2876587 = (MATCH) weight(title:chair in 0), product of:
>         0.9999818 = queryWeight(title:chair), product of:
>           1.287682 = idf(docFreq=2, maxDocs=4)
>           0.7765751 = queryNorm
>         1.287682 = (MATCH) fieldWeight(title:chair in 0), product of:
>           1.0 = tf(termFreq(title:chair)=1)
>           1.287682 = idf(docFreq=2, maxDocs=4)
>           1.0 = fieldNorm(field=title, doc=0)
>       0.66279614 = (MATCH) weight(cat:chair in 0), product of:
>         0.60328734 = queryWeight(cat:chair), product of:
>           0.7768564 = idf(docFreq=4, maxDocs=4)
>           0.7765751 = queryNorm
>         1.0986409 = (MATCH) fieldWeight(cat:chair in 0), product of:
>           1.4142135 = tf(termFreq(cat:chair)=2)
>           0.7768564 = idf(docFreq=4, maxDocs=4)
>           1.0 = fieldNorm(field=cat, doc=0)
>   1.0 = map(query(cat:chair,def=-1.0)=1.0986409,min=0.0,max=1000.0,target=1.0)
>
> Did I get you wrong?
> I would appreciate it if you could point out any mistake (or
> misinterpretation on my part) in the mail above.
>
> I was thinking there should be some hook or plugin (or anything) which
> could change the score calculation formula *for a particular field*
> only. There is a function in the DefaultSimilarity class,
> "public float tf(float freq)", but it does not take the field name.
> Is there a possibility to look in this direction?
>
> Thank you very much.
>
> On Tue, Nov 8, 2011 at 6:23 AM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
>> : You can write your custom similarity implementation, and override the
>> : /lengthNorm()/ method to return a constant value.
>>
>> The poster already said (twice!)
>> that they have already set omitNorms=true, so lengthNorm won't even
>> be used.
>>
>> Omitting norms (or mucking with norms by modifying the lengthNorm
>> function) only affects the norms portion of the scoring -- the problem
>> being described here is when a document matches the input term more
>> than once: that is an issue of the "term frequency".
>>
>> Setting omitTermFreqAndPositions="true" on your field type will
>> eliminate the term frequency from the equation, and it will become a
>> simple "match or not" factor in your scoring.
>>
>> From the "more than one way to do it" standpoint, another option is to
>> wrap the query in a function that flattens the scores (more
>> fine-grained control, and doesn't require re-indexing, but probably
>> less efficient):
>>
>> q={!boost b=$cat_boost v=$main_query}
>> main_query=...
>> cat_boost={!func}map(map(query({!field f=cat v=$cat},-1),0,10000,5),-1,-1,1)
>> cat=...
>>
>> (note: used nested maps so that non-matches would result in a 1x
>> multiplier, while matches result in a 5x multiplier)
>>
>> -Hoss
>
> --
> Regards,
> Samar

--
Regards,
Samar
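
For anyone following the thread, here is a rough Python sketch that re-derives the numbers in the debug output above. It is not Solr or Lucene code. It assumes the classic DefaultSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), takes queryNorm straight from the explain output, and uses the cat term weight as a stand-in for the value of query(cat:chair). All function names (term_weight, dismax, solr_map, score, flat_tf) are made up for illustration. It suggests why a boost of 1.0 cannot equalize documents 1 and 3: the dismax score being multiplied still contains the tf-dependent "cat" weight, and only a flat tf for that field makes the pair tie.

```python
import math

QUERY_NORM = 0.7765751  # copied from the explain output in the thread
MAX_DOCS = 4            # four indexed documents


def idf(doc_freq, max_docs):
    # Classic DefaultSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def sqrt_tf(freq):
    # Classic DefaultSimilarity tf: square root of the term frequency
    return math.sqrt(freq)


def flat_tf(freq):
    # Hypothetical "match or not" tf, the behaviour the poster wants for "cat"
    return 1.0 if freq > 0 else 0.0


def term_weight(freq, doc_freq, tf=sqrt_tf, field_norm=1.0):
    # weight = queryWeight * fieldWeight, as in the explain tree
    i = idf(doc_freq, MAX_DOCS)
    query_weight = i * QUERY_NORM
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight


def dismax(weights, tie=0.01):
    # "max plus 0.01 times others"
    best = max(weights)
    return best + tie * (sum(weights) - best)


def solr_map(x, lo, hi, target, default=None):
    # Emulates Solr's map(x,min,max,target[,default]) function
    if lo <= x <= hi:
        return target
    return x if default is None else default


def score(title_freq, cat_freq, cat_tf=sqrt_tf):
    # docFreq values from the thread: "chair" is in 2 titles and 4 cat fields
    weights = []
    if title_freq:
        weights.append(term_weight(title_freq, doc_freq=2))
    cat_weight = term_weight(cat_freq, doc_freq=4, tf=cat_tf) if cat_freq else None
    if cat_weight is not None:
        weights.append(cat_weight)
    # Stand-in for query(cat:chair,-1): some positive score on a match, else -1
    cat_query_value = cat_weight if cat_weight is not None else -1.0
    boost = solr_map(cat_query_value, 0, 1000, 1, 0)
    return boost * dismax(weights)


if __name__ == "__main__":
    # (title_freq, cat_freq) for "chair" in each document from the thread
    docs = {1: (1, 2), 2: (0, 2), 3: (1, 1), 4: (0, 1)}
    for doc_id, (t, c) in docs.items():
        print(doc_id, score(t, c), score(t, c, cat_tf=flat_tf))
```

The nested map from Hoss's reply can be checked with the same helper: solr_map(solr_map(v, 0, 10000, 5), -1, -1, 1) gives 5 for any matching score v in [0, 10000] and 1 for the non-match default of -1.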