Re: Too many open files issue
John Wang wrote:
In the Lucene code, I don't see where the reader specified when creating a field is closed. That holds on to the file. I am looking at DocumentWriter.invertDocument().

It is closed in a finally clause on line 170, when the TokenStream is closed.

Doug

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
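For readers unfamiliar with the pattern Doug describes, here is a minimal sketch of a stream that owns its Reader and is closed in a finally clause. This is not Lucene's DocumentWriter code: the WordStream class and the countWords method are hypothetical stand-ins, used only to show why closing the stream also releases the underlying reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CloseInFinallySketch {

    /** Hypothetical stand-in for a token stream that owns its Reader. */
    static class WordStream {
        private final Reader input;
        boolean closed = false;

        WordStream(Reader input) {
            this.input = input;
        }

        /** Returns the next whitespace-delimited word, or null at end of input. */
        String next() throws IOException {
            StringBuilder word = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace((char) c)) {
                    if (word.length() > 0) {
                        break; // end of the current word
                    }
                    // otherwise skip leading whitespace
                } else {
                    word.append((char) c);
                }
            }
            return word.length() > 0 ? word.toString() : null;
        }

        /** Closing the stream closes the Reader it wraps. */
        void close() throws IOException {
            input.close();
            closed = true;
        }
    }

    /** Consumes the stream and always closes it, even if reading fails. */
    static int countWords(WordStream stream) throws IOException {
        int count = 0;
        try {
            while (stream.next() != null) {
                count++;
            }
        } finally {
            stream.close(); // releases the underlying Reader (and any file handle)
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        WordStream stream = new WordStream(new StringReader("too many open files"));
        System.out.println(countWords(stream));
    }
}
```

If the reader wrapped a file, the finally clause would release the file handle whether or not reading completed normally.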
Dutch Analyzer dictionary format?
Hello all,

I'm using Lucene to search through a couple of documents to find the interesting ones. Most of the documents are in Dutch. I saw that the default Snowball stemmer wasn't doing well on text written in a foreign language. Luckily, I found a Dutch text analyzer in the Lucene sandbox project. I've read the javadoc and found out that it needs a stem dictionary. You can load this dictionary with the following function: DutchAnalyzer.setStemDictionary(File f)

The format needs to be a tab-separated list (word [tab] stem). To be sure I do everything correctly, I have a question about the dictionary: can I just get http://snowball.tartarus.org/dutch/diffs.txt, convert it to a tab-separated list, and then feed it to the setStemDictionary() function?

Kind regards,
Twan Kogels
Re: Dutch Analyzer dictionary format?
Judging from everything you've said, the answer is yes. I don't use Dutch Analyzer, so I'm not 100% sure about this, but it sounds easy enough to try.

Otis
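The conversion discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes each line of the downloaded file holds exactly two whitespace-separated columns (word, then stem) and that the file is UTF-8 encoded, neither of which is confirmed in the thread. The class name StemDictionaryConverter and the method toTabSeparated are made up for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class StemDictionaryConverter {

    /**
     * Turns "word   stem" lines into "word<TAB>stem" lines.
     * Assumption: a valid line has exactly two whitespace-separated columns.
     * Blank lines and lines with any other column count are skipped.
     */
    static List<String> toTabSeparated(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue; // blank or malformed line
            }
            out.add(parts[0] + "\t" + parts[1]);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StemDictionaryConverter <input> <output>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        // Assumption: the input is UTF-8; adjust the charset if it is not.
        List<String> converted =
                toTabSeparated(Files.readAllLines(in, StandardCharsets.UTF_8));
        Files.write(out, converted, StandardCharsets.UTF_8);
    }
}
```

The resulting file could then be passed to DutchAnalyzer.setStemDictionary(File f), as described in the original question. It is worth checking a few lines of the output by hand first, since the column layout of the input is assumed here rather than verified.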
Are similarity scores computed when using sort?
I have a search application that is very performance-sensitive. I've looked through the IndexSearcher code, but I haven't been able to work out whether a similarity score is still calculated when the results are sorted by a numerical field value. Basically, I would prefer not to incur the computational cost of generating a similarity score if it is never used.

Thanks,
Yin