Hello all, I am trying to use Lucene for doing incremental indexing of the order of million of records daily using a single machine (P4 2.4Ghz 1 GB RAM). I do get messages updated every few minutes based on which I need to update the index.
I am using a StandardAnalyzer and writing documents using IndexWriter (FSDirectory) using the following structure: Document document = new Document(); document.add(Field.Keyword(INDEX_MESG_ID, LongField.longToString(mesg_id))); document.add(Field.UnStored(INDEX_MESG_TITLE, mesgTitle)); document.add(Field.UnStored(INDEX_MESG_DESCRIPTION,mesgDescription)); document.add(Field.Keyword(INDEX_MESG_DATE,mesgDate)); mesgTitle is string of the order of 20-50 words mesgDescription is a string of the order of 500-1000 words. I profiled my application and the index writer is able to write 16 messages per second, which is too low for the order that I want to achieve. I have tried various options: - increasing mergeFactor and minMergeDocs from 10 to 100 to 1000 didn't help. - Changed indexwriter from FSDirectory to RAMDirectory and later synchronizing changes to FSDirectory. Even this didn't help. It will be great if someone can give me pointers for increasing the efficiency of index writer process. Can someone give me practical examples of how much maximum efficiency can be achieved for similar process on a single machine? Thanks Regards Sunil --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]