Hi,

We are implementing a search engine for a large dataset (approximately 50
million HTML pages).
We have indexed each page as several separate fields, such as Title, Body,
Meta text, H1, URL, etc.
Lucene provides the setBoost() function to weight these fields.
What boost values should we use for these fields?
Should the values be set relative to one another?
Are there any standard values?

We've also computed Page Rank for those web pages. What is the best way
to combine the Page Rank information with Lucene's document score?
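
To make the second question concrete, here is a minimal sketch of one
possible combination: a multiplicative blend of the text score with a
log-dampened Page Rank. The class ScoreCombiner, its methods, and the
weight parameter are hypothetical names used only for illustration; none
of them is part of Lucene, and whether this formula is a good choice is
exactly what we are unsure about.

```java
// Hypothetical helper, not part of Lucene: blends a text relevance score
// with a link-based prior such as Page Rank.
final class ScoreCombiner {

    // Dampens the raw Page Rank with a logarithm so that a few very
    // high-ranked pages do not swamp the text score.
    static double dampen(double pageRank) {
        if (pageRank < 0.0) {
            throw new IllegalArgumentException("pageRank must be non-negative");
        }
        return Math.log1p(pageRank);
    }

    // Multiplicative blend: textScore * (1 + weight * ln(1 + pageRank)).
    // A weight of 0 leaves the text score unchanged (assumed tuning knob).
    static double combine(double textScore, double pageRank, double weight) {
        return textScore * (1.0 + weight * dampen(pageRank));
    }

    public static void main(String[] args) {
        // Two pages with the same text score but different Page Rank.
        System.out.println(combine(2.0, 0.0, 0.5));
        System.out.println(combine(2.0, 9.0, 0.5));
    }
}
```

With this form, a page with zero Page Rank keeps its text score as is, and
the weight controls how strongly Page Rank can reorder otherwise similar
results. A linear sum of the two scores would be another option.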

--
Kushal Dave
