I think this document is very interesting: http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm
peter > -----Original Message----- > From: Doug Cutting [mailto:[EMAIL PROTECTED]] > Sent: Monday, February 25, 2002 7:15 PM > To: 'Lucene Users List' > Subject: RE: Googlifying lucene querys > > > If you put the title in a separate field from the contents, > and search both > fields, matches in the title will usually be stronger, > without explicit > boosting. This is because the scores are normalized by the > length of the > field, and the title tends to be much shorter than the > contents. So even > without boosting, title matches usually come before contents matches. > > Doug > > > -----Original Message----- > > From: Spencer, Dave [mailto:[EMAIL PROTECTED]] > > Sent: Monday, February 25, 2002 10:22 AM > > To: Lucene Users List > > Subject: RE: Googlifying lucene querys > > > > > > I'm pretty sure google gives priority to the words appearing in the > > title and URL. > > > > I believe sect 4.2.5 says this here: > > http://citeseer.nj.nec.com/cache/papers/cs/13017/http:zSzzSzww > > w-db.stanf > > ord.eduzSzpubzSzpaperszSzgoogle.pdf/brin98anatomy.pdf > > from here: > > http://citeseer.nj.nec.com/brin98anatomy.html > > > > So you have to have Lucene store the title as a separate field. > > > > This is then what you'd have if like me you boost (the caret > > is "boost") > > the title by *5 and the URL by *2: > > > > +(title:george^5.0 url:george^2.0 contents:george) +(title:bush^5.0 > > url:bush^2.0 contents:bush) +(title:white^5.0 url:white^2.0 > > contents:white) +(title:house^5.0 url:house^2.0 contents:house) > > > > > > -----Original Message----- > > From: Ian Lea [mailto:[EMAIL PROTECTED]] > > Sent: Saturday, February 23, 2002 8:15 AM > > To: Lucene Users List > > Subject: Re: Googlifying lucene querys > > > > > > +george +bush +white +house > > > > > > -- > > Ian. > > > > Jari Aarniala wrote: > > > > > > Hello, > > > > > > Despite of the confusing subject ;) my question is > simple. I'm just > > > trying out Lucene for the first time and would like to > know how one > > > would go on implementing the search on the index with the > same logic > > > that Google uses. > > > For example, if the user input is "george bush > white house", > > how > > > do I easily construct a query that searches ALL of the > > words above? If > > I > > > have understood correctly, passing the search string above to the > > > queryParser creates a query that search for ANY of the > words above. > > > > > > Thanks for any help, > > > > -- > > To unsubscribe, e-mail: > > <mailto:[EMAIL PROTECTED]> > > For additional commands, e-mail: > > <mailto:[EMAIL PROTECTED]> > > > > > > -- > > To unsubscribe, e-mail: > <mailto:[EMAIL PROTECTED]> > For additional commands, e-mail: > <mailto:[EMAIL PROTECTED]> > > -- > To unsubscribe, e-mail: > <mailto:[EMAIL PROTECTED]> > For additional commands, e-mail: > <mailto:[EMAIL PROTECTED]> > > -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>