Konrad Scherer wrote:
I am using lucene 1.2 (Java 1.4 on Solaris 7) and the xml indexer to index ~24000 small xml documents. The finished and optimized index uses around 340 MB disk space. The documents are reindexed once a week and this has worked without any trouble for months. Recently the free space on the hard drive was down to 1.36 GB and the optimization crashed due to "no space left on device". Deleting the index directory freed up 1.36 GB.
Question 1) Is it normal for the optimization process to require this much extra space?
Optimization (and indexing in general) works by copying, so it requires around twice the space that the index occupies when optimized. Due to file fragmentation and to the index format, an unoptimized index will actually occupy slightly more space than the same index when optimized.
Another complication is that if the index is being searched while it is modified and optimized, there can be three copies on disk: the one being searched, the penultimate copy that precedes the optimized index, and the optimized index itself. It gets worse if many searchers were opened on different versions of the index, because a version's space cannot be freed as long as that version is open.
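As a rough illustration of the arithmetic above, here is a minimal sketch that estimates whether a disk has room for an optimize. It uses only java.io.File and no Lucene calls. The class name, the method names, and the default "index" path are hypothetical, the copy counts (2 for a plain optimize, 3 with a searcher open on an older version) are approximations taken from the explanation above, and the directory is assumed to be flat. File.getUsableSpace needs Java 6 or later, so it would not run on Java 1.4 as written.

```java
import java.io.File;

class IndexSpaceCheck {
    /** Sums the sizes of the regular files directly inside dir (assumes a flat index directory). */
    public static long directorySize(File dir) {
        long total = 0;
        File[] files = dir.listFiles();
        if (files == null) {
            return 0; // not a directory, or not readable
        }
        for (File f : files) {
            if (f.isFile()) {
                total += f.length();
            }
        }
        return total;
    }

    /** Approximate peak disk usage if `copies` versions of the index exist at once. */
    public static long peakBytes(long indexBytes, int copies) {
        return indexBytes * copies;
    }

    /** True if the free space can hold the (copies - 1) additional versions of the index. */
    public static boolean hasRoom(long indexBytes, int copies, long freeBytes) {
        return freeBytes >= indexBytes * (copies - 1);
    }

    public static void main(String[] args) {
        // Hypothetical default path; pass the real index directory as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "index");
        long size = directorySize(dir);
        long free = dir.getUsableSpace(); // Java 6+
        System.out.println("index bytes: " + size);
        System.out.println("free bytes:  " + free);
        System.out.println("room for optimize (2 copies): " + hasRoom(size, 2, free));
        System.out.println("room with an open searcher (3 copies): " + hasRoom(size, 3, free));
    }
}
```

Because an unoptimized index is somewhat larger than the optimized one, treat the result as a lower bound and leave some margin.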
2) Did I miss an option somewhere to limit the space usage of the optimization process?
If you're searching concurrently, try closing searchers more promptly. You should only need to keep a single searcher open at a time, shared by all queries.
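One way to keep a single shared searcher and close the old one promptly might look like the sketch below. It is not Lucene code: SharedSearcher is a hypothetical holder, and the type parameter S stands in for whatever searcher class is in use, assumed here to expose close() through java.io.Closeable (an interface that does not exist in Java 1.4).

```java
import java.io.Closeable;
import java.io.IOException;

/** Holds the one searcher shared by all queries; S is a stand-in for a searcher type. */
class SharedSearcher<S extends Closeable> {
    private S current;

    /** Returns the searcher all queries should use (null until the first swap). */
    public synchronized S get() {
        return current;
    }

    /**
     * Installs a searcher opened on the new version of the index and closes
     * the previous one, so the files of the old version can be freed.
     */
    public void swap(S fresh) throws IOException {
        S old;
        synchronized (this) {
            old = current;
            current = fresh;
        }
        if (old != null) {
            old.close(); // closed outside the lock so get() is not blocked
        }
    }
}
```

After each weekly reindex the application would open one new searcher and pass it to swap(). This sketch closes the old searcher immediately; if queries may still be running against it, some form of waiting or reference counting would be needed before the close.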
3) More philosophically, do I really need the optimization?
It will definitely make searches faster, but if search performance is not an issue, I wouldn't bother.
Doug
