Hi Anshum,

Even that document says that higher frequency implied higher score. My doubt is if the score is based only on the frequency, won't it be inappropriate for Internet based search? For example, if Google did the same thing, when I search for "Microsoft", there is a chance that Google may give some false page as the first one if the frequency of the word "Microsoft" in that page is more than that in www.microsoft.com

That being the case, how can we extend it to be appropriate for such searches (whose result score should not depend merely on the frequency)?

- Ram

Anshum wrote:
Hi msr,

Perhaps this could be useful for you. Lucene implements a modified vector
space model in short.
http://jayant7k.blogspot.com/2006/07/document-scoringcalculating-relevance_08.html

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Wed, Jan 21, 2009 at 4:45 PM, MSR <ms...@iitk.ac.in> wrote:

Hi,

Does Lucene take into consideration anything other than the frequency of
the query words in a document? If it does, what are the other
considerations?
If it is purely based on word frequency, is it appropriate for Internet
based search (where we need to consider reference count also)?

Thank you,
Ram

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to