I am using Lucene 1.2 (Java 1.4 on Solaris 7) and the XML indexer to index ~24000 small XML documents. The finished, optimized index uses around 340 MB of disk space. The documents are reindexed once a week, and this has worked without any trouble for months. Recently the free space on the hard drive was down to 1.36 GB and the optimization crashed with "no space left on device". Deleting the index directory freed up 1.36 GB.
Questions:
1) Is it normal for the optimization process to require this much extra space?
2) Did I miss an option somewhere to limit the space usage of the optimization process?
3) More philosophically, do I really need the optimization?
Also, in the archives I came across a message about an Ispell-based stemmer, to which Doug Cutting replied:

> > http://www.halyava.ru/do/org.apache.lucene.analysis.zip
>
> This looks great! If I understand correctly, it can be used to quickly build stemmers for lots of languages. For example, the following page lists the location of ispell dictionaries for over 30 languages!
> http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html
> This page should probably be referenced from the documentation.

I have not found the code anywhere on the Lucene site, and the link to the code above no longer works. Does someone have this code, or could the original author please repost it? I am using the French stemmer from Snowball and it does some strange things, like stemming "paris" to "par" and not stemming many verbs properly. I would like to try a different stemmer to see whether it is more usable.
I would also like to take this opportunity to thank the Lucene developers for their work.
Konrad Scherer
