On Fri, 5 Jan 2007, Terry Jones wrote:

| While there is a certain overhead with transactions and opening and closing
| an index for every addition, I did notice that there was a fair amount of
| thrashing around in the Lucene directory I/O and got things to be
| considerably faster by batching all updates and doing them in a
| RAMDirectory before adding the RAMDirectory contents to the DBDirectory via
| the addIndexes API.

I've thought about this (after reading the suggestion in Lucene in Action).
I considered having an open RAMDirectory that is always being written to
and which is merged into a FSDirectory whenever a search takes place. That
would be ok for some cases, but not in general. Also, buffering approaches
using RAMDirectory seem not to support transactions - at least not at the
level of single additions to the RAMDirectory. That's something of a
problem for me, but adding some sort of transaction mechanism might work.

The transaction applies to the DBDirectory. Use a RAMDirectory to batch all
changes for a given thread and, once done, merge the RAMDirectory into the
DBDirectory within a single transaction.
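
As a rough illustration of that pattern, here is a minimal sketch that runs
without PyLucene. It is not the PyLucene API: a per-thread Python list plays
the role of the RAMDirectory, a sqlite3 table plays the role of the
DBDirectory, and the class and method names (BatchedIndexer, add, flush,
count) are hypothetical. The point it shows is only that buffered additions
reach the persistent store all together or not at all.

```python
import sqlite3
import threading


class BatchedIndexer:
    """Hypothetical stand-in for the RAMDirectory -> DBDirectory pattern."""

    def __init__(self, path=":memory:"):
        # Shared persistent store (plays the DBDirectory role).
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        # Per-thread buffer (plays the RAMDirectory role).
        self._local = threading.local()
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT)"
            )

    def add(self, doc_id, body):
        # Cheap in-memory append; nothing touches the store yet.
        buf = getattr(self._local, "buffer", None)
        if buf is None:
            buf = self._local.buffer = []
        buf.append((doc_id, body))

    def flush(self):
        # Merge this thread's whole batch inside one transaction.
        batch = getattr(self._local, "buffer", [])
        if not batch:
            return 0
        # The connection context manager commits on success and
        # rolls back if any insert raises.
        with self._lock, self._conn:
            self._conn.executemany("INSERT INTO docs VALUES (?, ?)", batch)
        # Clear the buffer only after a successful commit.
        self._local.buffer = []
        return len(batch)

    def count(self):
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]


if __name__ == "__main__":
    indexer = BatchedIndexer()
    indexer.add("doc-1", "first document")
    indexer.add("doc-2", "second document")
    print(indexer.count())   # buffered additions are not visible yet
    print(indexer.flush())   # number of documents merged
    print(indexer.count())   # now visible in the store
```

With PyLucene the flush step would presumably be the addIndexes call
mentioned above, wrapped in the Berkeley DB transaction that backs the
DBDirectory.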

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
