batch indexing

Chandan Tamrakar Sun, 29 Apr 2007 04:53:10 -0700

I am trying to index a huge documents on batches   . Batch size is
parameterized to the application  say X docs , that means it will hold X no.
of


Docs in the RAM before I flush to file system using
IndexWriter.addIndexes(Directory[]) method

 

My question is :

 

Do I need to set mergefactor ? , will it hold default mergefactor docs in
memory before it is written to disk as segment .

(But my application will call indexwriter.addindexes function only after X
no of documents are in memory)

 

If the index sizes are big , at some point of time there might be a out of
memory exceptions , ( yes I could check a memory before another ramdirectory
is being created) But what would be the best solution  ? Is FSDirectory is
better option than Ramdirectory for huge text indexing ? I have roughly 50
GB of fulltext to index?

 

 

Thks in advance.

batch indexing

Reply via email to