You cannot, in general, structure a Lucene query so that it yields the same document rankings that Google would produce for the same query and document set. The reason is that Google's scoring algorithm includes information about the topology of the pages (i.e., how the pages are linked together), which is a property of the whole collection and not something a query can express. (An overview of what Google does in this regard may be found at http://www.google.com/technology/index.html .) Thus, in order to get Lucene to do "what Google does", you'd have to rewrite large chunks of it.

Joshua

[EMAIL PROTECTED]
Per Obscurius...www.ics.uci.edu/~jmadden
Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.

On Mon, 25 Feb 2002, Spencer, Dave wrote:

> I'm pretty sure google gives priority to the words appearing in the
> title and URL.
>
> I believe sect 4.2.5 says this here:
> http://citeseer.nj.nec.com/cache/papers/cs/13017/http:zSzzSzwww-db.stanford.eduzSzpubzSzpaperszSzgoogle.pdf/brin98anatomy.pdf
> from here:
> http://citeseer.nj.nec.com/brin98anatomy.html
>
> So you have to have Lucene store the title as a separate field.
>
> This is then what you'd have if like me you boost (the caret is "boost")
> the title by *5 and the URL by *2:
>
> +(title:george^5.0 url:george^2.0 contents:george) +(title:bush^5.0
> url:bush^2.0 contents:bush) +(title:white^5.0 url:white^2.0
> contents:white) +(title:house^5.0 url:house^2.0 contents:house)
>
>
> -----Original Message-----
> From: Ian Lea [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, February 23, 2002 8:15 AM
> To: Lucene Users List
> Subject: Re: Googlifying lucene querys
>
>
> +george +bush +white +house
>
>
> --
> Ian.
>
> Jari Aarniala wrote:
> >
> > Hello,
> >
> > Despite of the confusing subject ;) my question is simple. I'm just
> > trying out Lucene for the first time and would like to know how one
> > would go on implementing the search on the index with the same logic
> > that Google uses.
> > For example, if the user input is "george bush white house", how
> > do I easily construct a query that searches ALL of the words above? If I
> > have understood correctly, passing the search string above to the
> > queryParser creates a query that search for ANY of the words above.
> >
> > Thanks for any help,
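
For what it's worth, the two query strings quoted in this thread (Ian's "+george +bush +white +house" and Dave's boosted per-field version) can be generated mechanically from the raw user input. The following is only a minimal sketch of that string building, not anything from Lucene itself: the class name GooglishQuery and both method names are made up for illustration, the field names and boost values are simply the ones from Dave's message, and it assumes the input contains no characters that are special to the query parser (no escaping is done). The resulting string would still have to be handed to the query parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Lucene): builds query strings in the
// syntax shown in the thread. Assumes terms need no escaping.
final class GooglishQuery {

    // "george bush" -> "+george +bush": a leading '+' marks a term as required.
    static String requireAll(String userInput) {
        StringJoiner out = new StringJoiner(" ");
        for (String term : userInput.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                out.add("+" + term);
            }
        }
        return out.toString();
    }

    // For each term, emit one required group that matches the term in any of
    // the given fields; '^' attaches the boost. Fields with boost 1.0 get no
    // suffix. Map iteration order decides the order of fields in each group.
    static String requireAllBoosted(String userInput, Map<String, Double> fieldBoosts) {
        StringJoiner clauses = new StringJoiner(" ");
        for (String term : userInput.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            StringJoiner group = new StringJoiner(" ", "+(", ")");
            for (Map.Entry<String, Double> field : fieldBoosts.entrySet()) {
                String clause = field.getKey() + ":" + term;
                if (field.getValue() != 1.0) {
                    clause += "^" + field.getValue();
                }
                group.add(clause);
            }
            clauses.add(group.toString());
        }
        return clauses.toString();
    }

    // Example boosts taken from Dave's message: title *5, URL *2, contents unboosted.
    static Map<String, Double> exampleBoosts() {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("title", 5.0);
        boosts.put("url", 2.0);
        boosts.put("contents", 1.0);
        return boosts;
    }
}
```

Calling GooglishQuery.requireAllBoosted("george bush white house", GooglishQuery.exampleBoosts()) should give the same four-group query Dave posted. Note that this only affects which documents match and how field hits are weighted; it does nothing about link topology, so the rankings still won't be Google's.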