One more thing: rather than extending the BooleanQuery class to remove the coord factor, can I extend the Similarity class to do it?
The other question is still open: just to be sure, if I disable the coord factor, can I finally compare my BooleanQuery results?

Thanks

On 28 March 2011 10:11, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi Patrick,
>
> You can disable the coord factor in the constructor of BooleanQuery.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> > Sent: Monday, March 28, 2011 10:09 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: comparing lucene scores across queries
> >
> > Hi, thanks for the reply.
> >
> > Yeah, I've read the Similarity class documentation several times, but I
> > need a tip.
> >
> > My queries are BooleanQueries, but they always have the same structure
> > (the same structure as the docs; they are actually docs from the
> > collection): 3 fields.
> >
> > What if I simplify the similarity scores by removing the coord factor and
> > just leaving the cosine similarity, which is comparable?
> >
> > I want to underline the fact that my Boolean queries are just a
> > combination of "field:term" items, and I always have the same 3 fields,
> > with different terms obviously.
> >
> > Thanks
> >
> > On 28 March 2011 10:03, Uwe Schindler <u...@thetaphi.de> wrote:
> >
> > > No, scores are in general not comparable between different queries.
> > > The problem lies in many things:
> > > - Each query has a norm factor that makes it more comparable if they
> > >   are sub clauses of a BooleanQuery. But you are right, this norm factor
> > >   should be the same.
> > > - Some queries like FuzzyQuery rely on the terms in the index and
> > >   those that match the query
> > > - Inside Boolean queries, there is also a coord-factor involved
> > >
> > > If you are always using the same simple type of query (e.g. simple
> > > TermQuery, only with a different term) on the same index, you can
> > > compare the scores. As soon as you are using complex queries (e.g.
> > > several terms compared in a BooleanQuery as QueryParser produces), the
> > > scores are no longer comparable.
> > >
> > > You can read more on all factors that are included in scoring:
> > >
> > > http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Similarity.html
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > http://www.thetaphi.de
> > > eMail: u...@thetaphi.de
> > >
> > > > -----Original Message-----
> > > > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> > > > Sent: Monday, March 28, 2011 9:44 AM
> > > > To: java-user@lucene.apache.org
> > > > Subject: comparing lucene scores across queries
> > > >
> > > > Hi,
> > > >
> > > > sorry, I already asked a few days ago, but I got no reply and I
> > > > really need some help on this.
> > > >
> > > > I'm running several queries against a doc collection. The queries
> > > > are documents of the collection itself; I need to measure how
> > > > similar each document is to the rest of the collection.
> > > >
> > > > Now, Lucene returns a score per query, but I've been told such a
> > > > score is not comparable across queries. Is this correct?
> > > >
> > > > For example, aren't these scores comparable?
> > > > query1, score:8.324234
> > > > query2, score:3.324238
> > > >
> > > > If so, why not? Isn't it the cosine similarity between the query
> > > > vector and the collection's document vectors? I really need a
> > > > comparable measure.
> > > >
> > > > thanks
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
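
For reference, here is a minimal sketch in plain Java (it does not use Lucene) of what the coord factor discussed in this thread contributes. The class and method names are hypothetical. coord() mirrors the formula used by DefaultSimilarity in Lucene 3.x (overlap / maxOverlap); coordDisabled() shows the constant 1 that would effectively apply if a Similarity subclass overrode coord, or if the BooleanQuery were constructed with coord disabled. cosine() is a textbook cosine similarity over "field:term" weights, assumed here for illustration; it is not Lucene's actual scoring formula.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; names are hypothetical and nothing here calls Lucene.
class ScoreComparabilitySketch {

    // Fraction of query clauses matched by a document
    // (the same arithmetic as DefaultSimilarity.coord in Lucene 3.x).
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // What a coord override that neutralises the factor would return.
    static float coordDisabled(int overlap, int maxOverlap) {
        return 1.0f;
    }

    // Textbook cosine similarity between two sparse "field:term" -> weight vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        for (double w : b.values()) {
            normB += w * w;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty vector has no direction
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // A query and a document with the same 3 fields; 2 of 3 terms match.
        Map<String, Double> query = new HashMap<>();
        query.put("title:lucene", 1.0);
        query.put("body:scoring", 1.0);
        query.put("author:smith", 1.0);

        Map<String, Double> doc = new HashMap<>();
        doc.put("title:lucene", 1.0);
        doc.put("body:scoring", 1.0);
        doc.put("author:jones", 1.0);

        System.out.println("coord          = " + coord(2, 3));
        System.out.println("coord disabled = " + coordDisabled(2, 3));
        System.out.println("cosine         = " + cosine(query, doc));
    }
}
```

Removing coord takes away only one query-dependent factor; whether the remaining score is comparable across queries is the question still open in this thread.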