Re: Deleting a document with an IndexWriter open

2004-07-20 Thread Dmitry Serebrennikov
Doug Cutting wrote: Then you need to ensure that the index has no deletions, and optimize it if it has any, to remove them. This is probably most safely done as the first step, rather than the last. Good point. I didn't think about this. I'm not sure this method has many advantages ove

Re: Deleting a document with an IndexWriter open

2004-07-20 Thread Giulio Cesare Solaroli
So, I was not thinking that much different. :-] Giulio Cesare On Tue, 20 Jul 2004 14:37:11 +0200, Christoph Goller <[EMAIL PROTECTED]> wrote: > Giulio Cesare Solaroli wrote: > > Hi all, > > > > I would like to submit a "think different" approach to this problem > > for evaluation for you develope

Re: Deleting a document with an IndexWriter open

2004-07-20 Thread Christoph Goller
Giulio Cesare Solaroli wrote: Hi all, I would like to submit a "think different" approach to this problem for evaluation for you developers. Would it be possible to just mark the relevant documents as "deleted" (instead of deleting them altogether) with an IndexWriter used for inserting new documen

Re: Deleting a document with an IndexWriter open

2004-07-20 Thread Giulio Cesare Solaroli
Hi all, I would like to submit a "think different" approach to this problem for evaluation for you developers. Would it be possible to just mark the relevant documents as "deleted" (instead of deleting them altogether) with an IndexWriter used for inserting new documents? "marking" a document as

Re: Deleting a document with an IndexWriter open

2004-07-19 Thread Doug Cutting
Dmitry Serebrennikov wrote: Doug Cutting wrote: Dmitry Serebrennikov wrote: So here's a modified sequence of operations, perhaps a bit more efficient than proposed by Christoph: 1) Open an IndexReader for searching - S. Keep it open until the transaction is committed. 2) Open a second IndexReader

Re: Deleting a document with an IndexWriter open

2004-07-19 Thread Dmitry Serebrennikov
Doug Cutting wrote: Dmitry Serebrennikov wrote: So here's a modified sequence of operations, perhaps a bit more efficient than proposed by Christoph: 1) Open an IndexReader for searching - S. Keep it open until the transaction is committed. 2) Open a second IndexReader for deletions - D. 3) Creat

Re: Deleting a document with an IndexWriter open

2004-07-19 Thread Doug Cutting
Dmitry Serebrennikov wrote: So here's a modified sequence of operations, perhaps a bit more efficient than proposed by Christoph: 1) Open an IndexReader for searching - S. Keep it open until the transaction is committed. 2) Open a second IndexReader for deletions - D. 3) Create a filter bitset F

Re: Deleting a document with an IndexWriter open

2004-07-19 Thread Doug Cutting
Christoph Goller wrote: Giulio Cesare Solaroli wrote: is there any architectural reason why an IndexWriter could not delete a document? There are such reasons. Maybe Doug can give additional insight. No, you did a great job of describing the issues. Thanks! Doug -

Re: Deleting a document with an IndexWriter open

2004-07-19 Thread Christoph Goller
Dmitry Serebrennikov wrote: Another solution that works well in some applications is to rely on document number. This number will remain the same for the life of an IndexReader. This number is also always larger for documents added later. So given two documents with the same ID, the one with the

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Dmitry Serebrennikov
Another solution that works well in some applications is to rely on document number. This number will remain the same for the life of an IndexReader. This number is also always larger for documents added later. So given two documents with the same ID, the one with the highest document number is
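A minimal sketch of the document-number idea described above, written in plain Java with no Lucene calls. It assumes the application can list, for one open IndexReader, the application-level ID stored in each document in document-number order, and that the copy with the higher document number is the one to keep. The class name `LatestByDocNumber`, its methods, and the `ids` array are hypothetical and exist only for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class LatestByDocNumber {

    // ids[n] is the application-level ID of the document whose document number is n.
    // Returns, for each ID, the highest document number that carries it.
    static Map<String, Integer> latest(String[] ids) {
        Map<String, Integer> newest = new HashMap<>();
        for (int docNum = 0; docNum < ids.length; docNum++) {
            // A later (larger) document number overwrites an earlier one.
            newest.put(ids[docNum], docNum);
        }
        return newest;
    }

    // Marks every document number that has been superseded by a later
    // document with the same ID. Such a bit set could be used to hide
    // stale copies at search time or to drive a later batch of deletions.
    static BitSet superseded(String[] ids) {
        Map<String, Integer> newest = latest(ids);
        BitSet stale = new BitSet(ids.length);
        for (int docNum = 0; docNum < ids.length; docNum++) {
            if (newest.get(ids[docNum]) != docNum) {
                stale.set(docNum);
            }
        }
        return stale;
    }
}
```

The result is only meaningful for the IndexReader the numbers were read from, since the thread states that document numbers are stable only for the life of a reader.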

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christoph Goller
Giulio Cesare Solaroli wrote: I have been thinking about this for a while, but could not find a reasonable solution. The basic problems are: - where do I (safely) store the index of the documents that need to be deleted? - how can I uniquely identify the Lucene documents that I have to delete,

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Giulio Cesare Solaroli
On Fri, 16 Jul 2004 15:07:11 +0200, Christoph Goller <[EMAIL PROTECTED]> wrote: > Giulio Cesare Solaroli wrote: > > This is the main problem; in my current arrangement, it is quite > > difficult to find out the documents that need to be updated in > > advance; it would have been much easier to fin

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christoph Goller
Giulio Cesare Solaroli wrote: This is the main problem; in my current arrangement, it is quite difficult to find out the documents that need to be updated in advance; it would have been much easier to find out whether every single document was a new entry or a document already present, and thus

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christoph Goller
Christiaan Fluit wrote: Christoph Goller wrote: 1) Keep an IndexReader/Searcher open on your index in order to guarantee read access and a consistent index during the whole process. 2) Open a new IndexReader and delete all the documents that you want to update. 3) Close the IndexReader (makes the d

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christiaan Fluit
Christoph Goller wrote: 1) Keep an IndexReader/Searcher open on your index in order to guarantee read access and a consistent index during the whole process. 2) Open a new IndexReader and delete all the documents that you want to update. 3) Close the IndexReader (makes the deletions visible for any

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Giulio Cesare Solaroli
Hi Christoph, On Fri, 16 Jul 2004 13:50:51 +0200, Christoph Goller <[EMAIL PROTECTED]> wrote: >[snip on good reasons why an IndexWriter can not delete documents] > >> > If you want to do several updates at the same time, the most efficient > way would be to: > > 1) Keep an IndexReader/Searcher

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christoph Goller
Giulio Cesare Solaroli wrote: Dear developers, is there any architectural reason why an IndexWriter could not delete a document? There are such reasons. Maybe Doug can give additional insight. Here is what I think: One reason I see is that there is no such thing as a unique document id in Lucene.

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Giulio Cesare Solaroli
On Fri, 16 Jul 2004 10:00:08 +0200, Christiaan Fluit <[EMAIL PROTECTED]> wrote: > [snip] > > That's exactly what we do. It's not optimal but it works. > > My guess would be that the chosen architecture makes it possible to > query the index while it is simultaneously being updated. I believe > (c

Re: Deleting a document with an IndexWriter open

2004-07-16 Thread Christiaan Fluit
Giulio Cesare Solaroli wrote: is there any architectural reason why an IndexWriter could not delete a document? [snip] In this situation, I try to keep the same IndexWriter open as much as possible, in order to avoid any unnecessary fragmentation of the index. Before indexing any document, I can

Deleting a document with an IndexWriter open

2004-07-16 Thread Giulio Cesare Solaroli
Dear developers, is there any architectural reason why an IndexWriter could not delete a document? I understand that the IndexReader (besides its strange naming for this feature) is the right class to use to delete a document, but this raises a huge problem for me. We add almost 50.000 documen