When writing a unit test that comapres RAMDirectory and FSDirectory performance 
for Lucene in Action I had a very hard time showing that RAMDirectory really is 
faster. :)  Just set your maxBufferedDocs to as high a number as your RAM/heap 
will let you, and pick a mergeFactor that is high, but doesn't get you in 
trouble with open files.

Otis

----- Original Message ----
From: Dan Armbrust <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, June 7, 2006 4:05:49 PM
Subject: Re: IndexWriter.addIndexes & optimization

Benjamin Stein wrote:

> 
> I could probably store the little RAMDirectories to disk as many
> FSDirectories, and then addIndexes() of *all* the FSDirectories at the end
> instead of every time.  That would probably be smart.
> 
> Glad I asked myself!
> 

That was what I was going to suggest - you may also want to benchmark to 
see if the RAMDirectory is buying you anything.  With the data that I am 
indexing on my hardware, I found it to be faster to index to a regular 
FSDirectory that it is to use the RAMDirectory.  Especially if you tweak 
the performance knobs on the indexer so it does its own caching before 
it writes to the Directory.

I do batches of documents to FSDirectories - and then merge all of the 
FSDirectories into a new master index at the end - so I never have to 
optimize during the indexing process.

Dan


-- 
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to