date:20041126

Are similarity scores computed when using sort?

2004-11-26 Thread Aphinyanaphongs, Yindalon

I have a search application that is very performance-sensitive.  I've looked 
through the IndexSearcher code and haven't been able to determine whether a 
similarity score is calculated when the results are sorted by a numerical 
field value. It would be preferable not to incur the computational cost of 
generating a similarity score if it is never used.

Thanks
Yin

Re: Dutch Analyzer dictionary format?

2004-11-26 Thread Otis Gospodnetic

Judging from everything you've said, the answer is yes.  I don't use
Dutch Analyzer, so I'm not 100% sure about this, but it sounds easy
enough to try.

Otis

--- Twan Kogels <[EMAIL PROTECTED]> wrote:

> Hello all,
> 
> I'm using Lucene to search through a couple of documents to find
> interesting documents. Most documents are in Dutch. I saw that the
> default Snowball stemmer wasn't doing well on text written in a
> foreign language. Luckily I found a Dutch text analyzer in the
> Lucene sandbox project.
>
> I've read the javadoc and found out it needs a stem dictionary. You
> can load this dictionary with the following function:
> DutchAnalyzer.setStemDictionary(File f)
>
> The format needs to be a tab-separated list (word [tab] stem).
>
> To be sure I do everything correctly, I've got a question about the
> dictionary:
> Can I just get:
>
> and convert it to a tab-separated list and then "feed" it to the
> setStemDictionary() function?
> 
> Kind regards,
> Twan Kogels 
> 
> 
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 



Dutch Analyzer dictionary format?

2004-11-26 Thread Twan Kogels

Hello all,

I'm using Lucene to search through a couple of documents to find 
interesting documents. Most documents are in Dutch. I saw that the 
default Snowball stemmer wasn't doing well on text written in a foreign 
language. Luckily I found a Dutch text analyzer in the Lucene sandbox project.

I've read the javadoc and found out it needs a stem dictionary. You can load 
this dictionary with the following function:
DutchAnalyzer.setStemDictionary(File f)

The format needs to be a tab-separated list (word [tab] stem).

To be sure I do everything correctly, I've got a question about the dictionary:
Can I just get:

and convert it to a tab-separated list and then "feed" it to the 
setStemDictionary() function?

Kind regards,
Twan Kogels 
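
For illustration only, here is a minimal sketch of how a word list might be
converted into the tab-separated (word [tab] stem) layout described above. It
assumes, hypothetically, that the source list has one whitespace-separated
word/stem pair per line. The class name, method name, and sample entries are
made up for this example and are not part of Lucene or the Dutch analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StemDictionaryConverter {

    /**
     * Converts lines of the assumed form "word stem" (any whitespace between
     * the two) into "word<TAB>stem" lines. Blank lines and lines that do not
     * contain exactly two tokens are skipped.
     */
    static List<String> toTabSeparated(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                continue; // not a word/stem pair under our assumption
            }
            result.add(parts[0] + "\t" + parts[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample entries, not taken from any real dictionary file.
        List<String> input = Arrays.asList("fietsen fiets", "", "huizen   huis");
        for (String entry : toTabSeparated(input)) {
            System.out.println(entry);
        }
    }
}
```

The resulting lines could then be written to a file and passed to
setStemDictionary(File f). Whether the analyzer accepts such a file as-is
should be checked against its javadoc.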



Re: Too many open files issue

2004-11-26 Thread Doug Cutting

John Wang wrote:
> In the Lucene code, I don't see where the reader specified when
> creating a field is closed. That holds on to the file.
> I am looking at DocumentWriter.invertDocument()

It is closed in a finally clause on line 170, when the TokenStream is 
closed.

Doug
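
As a general illustration of the pattern referred to here (closing a resource
in a finally clause), the following is a minimal sketch using only java.io. It
is not the Lucene DocumentWriter code. CloseInFinally, TrackingReader, and
countTokens are hypothetical names invented for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CloseInFinally {

    /** Reader that records whether close() has been called. */
    static class TrackingReader extends StringReader {
        boolean closed = false;

        TrackingReader(String s) {
            super(s);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    /**
     * Counts whitespace-separated tokens read from the given reader. The
     * reader is closed in the finally clause, so it is released whether the
     * loop finishes normally or read() throws.
     */
    static int countTokens(Reader reader) throws IOException {
        try {
            int count = 0;
            boolean inToken = false;
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    inToken = false;
                } else if (!inToken) {
                    inToken = true;
                    count++;
                }
            }
            return count;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingReader reader = new TrackingReader("too many open files");
        System.out.println("tokens: " + countTokens(reader));
        System.out.println("closed: " + reader.closed);
    }
}
```

If the reader wraps a file, closing it this way releases the file handle as
soon as the consumer is done with it.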
