Hi Yannis,

I don't think there is anything of that sort in Lucene, but this shouldn't be 
hard to do with a process outside Lucene.  Of course, optimizing an index 
increases its size temporarily, so your external process would have to take 
that into account and play it safe.  You could also set mergeFactor to 1, which 
should keep your index in a fully optimized state if you don't do any deletions, 
and in a near-optimized state if you do.
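To illustrate the "play it safe" part: a minimal sketch of what such an external watchdog might look like, using only the JDK (the class and method names here are hypothetical, not anything in Lucene). It measures the on-disk size of the index directory and treats the cap as reached once the index could no longer be optimized within the budget, on the assumption that optimize can temporarily need roughly another copy of the index:

```java
import java.io.File;

// Hypothetical external watchdog sketch: sums the on-disk size of an index
// directory and reports whether it is still safe to add documents, leaving
// headroom for the temporary growth that optimize() causes.
public class IndexSizeGuard {

    // Recursively sum the sizes of all regular files under dir.
    static long directorySize(File dir) {
        long total = 0;
        File[] entries = dir.listFiles();
        if (entries == null) {
            return 0; // not a directory, or I/O error
        }
        for (File f : entries) {
            total += f.isDirectory() ? directorySize(f) : f.length();
        }
        return total;
    }

    // True if the index, plus headroom for an optimize (assumed here to need
    // up to ~2x the current size while running), still fits under capBytes.
    static boolean withinBudget(File indexDir, long capBytes) {
        long size = directorySize(indexDir);
        return size * 2 <= capBytes;
    }
}
```

When withinBudget returns false, the process would delete the oldest documents (the LRU-by-timestamp step from the original question) before indexing more. The 2x factor is a conservative assumption, not a guarantee from Lucene.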

You should discuss this on the java-user list, though, so I'm CCing that list, 
where you can continue the discussion.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Yannis Pavlidis <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Monday, March 24, 2008 7:33:26 PM
Subject: how to control the disk size of the indices


Hi all,

I wanted to ask the list whether there is an easy and efficient way to manage 
the size (in bytes) of a Lucene index stored on disk.

Basically, I would like to limit Lucene to storing only 100 GB of information. 
When Lucene reaches that limit, I would delete documents (using an LRU 
algorithm based on timestamps), but in no case should the disk space occupied 
by Lucene exceed 100 GB.

I experimented with Lucene 2.3.1, and the only way I could accomplish this was 
by calling the optimize method on the IndexWriter (after the index size 
exceeded the maximum). I was looking for a more performant way to control when 
Lucene merges its segments so as not to exceed the pre-set limit.

Any ideas or suggestions would be highly appreciated.

Thanks in advance,

Yannis.




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]