Recency weightage in Lucene

2006-06-18 Thread PrasenjitM
I am thinking of modifying Lucene's current ranking algorithm to include the document's recency weightage, so that the most recently modified documents get preference over earlier-modified ones, which makes sense for news search. To do this, I believe I have to tinker with TermScorer.score()

[jira] Commented: (LUCENE-605) Make Explanation include information about match/non-match

2006-06-18 Thread paul.elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-605?page=comments#action_12416658 ] paul.elschot commented on LUCENE-605: I like the Boolean for indicating the match. The demo-fix.patch applies cleanly on my working copy, and all tests pass with it.

[jira] Commented: (LUCENE-605) Make Explanation include information about match/non-match

2006-06-18 Thread paul.elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-605?page=comments#action_12416660 ] paul.elschot commented on LUCENE-605: I tried removing the Explanation constructor that is deprecated in the demo-fix.patch. One of the uses of this constructor is in

Re: Results (Re: Survey: Lucene and Java 1.4 vs. 1.5)

2006-06-18 Thread Vic Bancroft
Robert Engels wrote: Do you have any hard numbers to support this? The last time I checked, gcj had minimal improvement over JVM 1.5. In terms of speed, there is not much difference between native code and classes (see sample timings). However, the pragmatic availability of java 5

Re: Soccer-themed question: null fields?

2006-06-18 Thread Chuck Williams
JMA wrote on 06/17/2006 10:16 PM: 1) Is there a way to find a document that has null fields? For example, if I have two fields (FIRST_NAME, LAST_NAME) for World Cup players: FIRST_NAME: Brian LAST_NAME: McBride FIRST_NAME: Agustin LAST_NAME: Delgado FIRST_NAME: Zinha

RE: Results (Re: Survey: Lucene and Java 1.4 vs. 1.5)

2006-06-18 Thread Robert Engels
Are you sure about the JVM numbers? I would think that user + sys must always be real (unless maybe the multiprocessor affects this - i.e. sums the processor time used on each). -Original Message- From: Vic Bancroft [mailto:[EMAIL PROTECTED] Sent: Sunday, June 18, 2006 11:55 AM To:

Re: Recency weightage in Lucene

2006-06-18 Thread prasenjitm
Using the doc-id itself as a recency metric is smart thinking. But the weight is actually a sigmoidal function of the document's age (i.e. currentTime - documentIndexingTime), so I can't just use the doc-id itself. What is the JIRA bug ID for the lazy field capability? Would like to know more
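
For readers following this thread, the sigmoidal weight described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not give the exact function, so the logistic form, the class name RecencyWeight, and the midpointDays and steepness parameters are all hypothetical. Nothing here is part of the Lucene API, and it does not show how the weight would be wired into TermScorer.score().

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class RecencyWeight {
    private static final double MILLIS_PER_DAY = 86_400_000.0;

    private RecencyWeight() {}

    /** Age in days, i.e. currentTime - documentIndexingTime, clamped at zero. */
    static double ageDays(long currentTimeMillis, long indexingTimeMillis) {
        long delta = Math.max(0L, currentTimeMillis - indexingTimeMillis);
        return delta / MILLIS_PER_DAY;
    }

    /**
     * Logistic (sigmoidal) decay: close to 1 for fresh documents, exactly 0.5
     * at midpointDays, and approaching 0 for old documents.
     * midpointDays and steepness are assumed tuning knobs, not values from the thread.
     */
    static double weight(double ageDays, double midpointDays, double steepness) {
        return 1.0 / (1.0 + Math.exp(steepness * (ageDays - midpointDays)));
    }

    /** One possible way to combine the weight with a relevance score: multiply. */
    static double boostedScore(double baseScore, double ageDays,
                               double midpointDays, double steepness) {
        return baseScore * weight(ageDays, midpointDays, steepness);
    }
}
```

Because the weight depends on the current time, it changes between searches, which is consistent with the point above that a fixed value such as the doc-id cannot stand in for it directly. Multiplying the base score is just one choice; an additive combination would also be plausible.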