Using the doc-id itself as a recency metric is smart thinking. But the weight is actually a sigmoidal function of the document's age (i.e. currentTime - documentIndexingTime), so I can't just use the doc-id itself.

What is the JIRA bug ID for the lazy field capability? I would like to know more about this feature.

Thanks for the help,
Prasen

-----Original Message-----
From: Chuck Williams <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Sun, 18 Jun 2006 07:47:40 -1000
Subject: Re: Recency weightage in Lucene

[EMAIL PROTECTED] wrote on 06/17/2006 10:52 PM:
> I am thinking of modifying Lucene's current ranking algorithm to include the document's recency weightage, so that the most recently modified documents get preference over earlier modified documents, which makes sense for news search.
>
> (I believe) to do this I have to tinker with the TermScorer.score() method and calculate the document score in its while (doc < end) {..} loop. The requirement is that the document's lastModifiedTime is stored in a field of the doc, and extracting this value could be quite expensive on every iteration over its posting stream. One approach could be to store it in a separate file (like Normalization) to avoid the field lookup.
>
> Any other ideas/suggestions? Or has anyone already implemented this?
>

Does recency correlate with the order in which documents are added to your index? If so, then perhaps you can use doc-id as a measure of recency and thereby avoid accessing a stored field. I'm not certain, but based on a quick perusal of the relevant code, it appears that both index opening and segment merging preserve the order of doc-ids. If you take this approach, you should verify this.

If you end up needing a stored field, then be sure to use the lazy fields capability (recently committed) to access it.

Chuck
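
For illustration, the sigmoidal age-based weight discussed in this thread could be sketched roughly as below. This is only a sketch under stated assumptions, not Lucene code: the class RecencyWeight, its methods, the midpoint and scale parameters, and the choice to multiply the raw score by the weight are all hypothetical. The per-document timestamp array stands in for the "separate file" idea, so that scoring does not need a stored-field lookup, and the current time is passed in at query time because the weight depends on age rather than on doc-id order.

```java
// Hypothetical sketch: class, method, and parameter names are illustrative
// assumptions and are not part of Lucene's API.
final class RecencyWeight {
    private final long[] indexedAtMillisByDoc; // one timestamp per doc-id, loaded once
    private final double midpointMillis;       // age at which the weight is exactly 0.5
    private final double scaleMillis;          // larger value gives a gentler falloff

    RecencyWeight(long[] indexedAtMillisByDoc, double midpointMillis, double scaleMillis) {
        if (scaleMillis <= 0) {
            throw new IllegalArgumentException("scaleMillis must be positive");
        }
        this.indexedAtMillisByDoc = indexedAtMillisByDoc.clone();
        this.midpointMillis = midpointMillis;
        this.scaleMillis = scaleMillis;
    }

    /** Logistic decay of age: near 1 for fresh documents, 0.5 at the midpoint, near 0 for old ones. */
    static double sigmoid(double ageMillis, double midpointMillis, double scaleMillis) {
        return 1.0 / (1.0 + Math.exp((ageMillis - midpointMillis) / scaleMillis));
    }

    /** Weight for one doc-id at query time; age is clamped at zero to tolerate clock skew. */
    double weight(int docId, long nowMillis) {
        long age = Math.max(0L, nowMillis - indexedAtMillisByDoc[docId]);
        return sigmoid(age, midpointMillis, scaleMillis);
    }

    /** One possible combination (an assumption): scale the raw relevance score by the weight. */
    double adjust(float rawScore, int docId, long nowMillis) {
        return rawScore * weight(docId, nowMillis);
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        long now = 100 * day;
        long[] indexedAt = { now - 30 * day, now - 7 * day, now };
        RecencyWeight rw = new RecencyWeight(indexedAt, 7.0 * day, 2.0 * day);
        for (int doc = 0; doc < indexedAt.length; doc++) {
            System.out.println("doc " + doc + " weight " + rw.weight(doc, now));
        }
    }
}
```

With these example parameters, a document indexed exactly seven days ago gets weight 0.5, newer documents approach 1, and older ones approach 0. The array lookup is a constant-time operation per posting, which is the point of keeping the timestamps outside the stored fields.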