All,

We are using Java Lucene 2.3.2 to index a fairly large number of documents (roughly 400,000 per day). We have divided the time history into stages of various depths.
Our first stage covers 8 days and our next stage covers 22 days. The index directory for the first stage is approximately 20GB when fully optimized. The index directory of our second stage is over 250GB when optimized. Our third stage (which covers 60 days) is only ~80GB when optimized.

The second-stage index failed an optimization with a disk-full exception (I had to move it to another Lucene machine with a larger disk partition to complete the optimization).

Is there a reason why a 22-day index would be more than 10x the size of an 8-day index when the document indexing rate is fairly constant? Also, is there a way to shrink the index without regenerating it?

Any help/pointers would be greatly appreciated.

Thanks and Regards,
Dan

Dan O'Connor
SVP, Engineering
Acquire Media <http://www.acquiremedia.com/>
e: docon...@acquiremedia.com