Re: Transactional Directories

Doug Cutting Mon, 14 Feb 2005 14:05:06 -0800

[ Please ignore my previous message. I somehow hit "Send" before typing anything! ]

Oscar Picasso wrote:

However, with a relatively high number of random insertions, the cost of the
"new IndexWriter / index.close()" performed for each insertion is too high.

Did you measure that? How much slower was it? Did you perform any profiling? Perhaps one could improve this by, e.g., disabling document index buffering, so that indexes are written directly to the final directory in this case, rather than first buffered in a RAMDirectory.

Unfortunately, this is a common case for some kinds of applications, and it is
where a transactional directory would be the most useful.

In such a case you would like to do something like this:
-- case B --
<pseudo-code>
new IndexWriter
 ...
+begin transaction-1
 create/update/delete objects in the database
 index.addDocument (related to the objects)
+ commit
...
+begin transaction-2
 create/update/delete objects in the database
 index.addDocument (related to the objects)
+ commit
...
indexWriter.close()
</pseudo-code>

The benefit would be to protect individual insertions while avoiding the cost
of creating a new IndexWriter each time.

It doesn't work however. Here is my understanding.

Suppose that in case B, transaction-1 fails and transaction-2 succeeds.

So you've got multiple threads? Or are you proceeding in the face of exceptions? Otherwise I would expect that if transaction-1 fails then you'd avoid transaction-2, no?

Also, you'd want to add a flush() call after each addDocument(), since document additions are buffered. But a flush() is just what IndexWriter.close() does, so then things would not be any faster than creating a new IndexWriter for each document.

The bottom line is that there are optimizations to be made when batching additions. Lucene's API is designed to encourage batching, so that these optimizations may be used. If you don't batch, things will be somewhat slower.
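
To make the batching point concrete, here is a minimal sketch of a toy buffering writer. It is not Lucene code: ToyBufferedWriter, BatchingSketch, and all of their methods are hypothetical names invented for illustration, and the model assumes only what is said above, namely that additions are buffered and that close() does a flush. It counts how many flushes each usage pattern triggers and says nothing about real timings.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a buffering index writer. Hypothetical; NOT Lucene's IndexWriter. */
class ToyBufferedWriter {
    private final List<String> buffer = new ArrayList<>();     // pending, in-memory documents
    private final List<String> committed = new ArrayList<>();  // documents "written to the directory"
    private int flushCount = 0;                                // number of non-empty flushes performed

    /** Buffers a document; nothing is committed yet. */
    void addDocument(String doc) {
        buffer.add(doc);
    }

    /** Moves buffered documents to the committed list; models the expensive step. */
    void flush() {
        if (buffer.isEmpty()) {
            return; // nothing pending, so no work is counted
        }
        committed.addAll(buffer);
        buffer.clear();
        flushCount++;
    }

    /** Closing flushes whatever is still buffered. */
    void close() {
        flush();
    }

    int flushCount() {
        return flushCount;
    }

    int committedSize() {
        return committed.size();
    }
}

class BatchingSketch {
    /** Flushes after every document, as a per-insertion commit would require. */
    static int flushPerDocument(List<String> docs) {
        ToyBufferedWriter writer = new ToyBufferedWriter();
        for (String doc : docs) {
            writer.addDocument(doc);
            writer.flush();
        }
        writer.close();
        return writer.flushCount();
    }

    /** Adds all documents first and lets close() flush them in one batch. */
    static int flushOncePerBatch(List<String> docs) {
        ToyBufferedWriter writer = new ToyBufferedWriter();
        for (String doc : docs) {
            writer.addDocument(doc);
        }
        writer.close();
        return writer.flushCount();
    }
}
```

In this model, three documents cost three flushes with flushPerDocument and one flush with flushOncePerBatch. Flushing after every addition gives up exactly the saving that batching was meant to provide.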

Doug

