Good question....

So far, this method has not been carried over to IndexWriter because in general it's not really safe, since there's no way to get an accurate docID from IndexWriter itself.

You can't really "know" when IndexWriter runs a merge that compacts deletes and thus changes docIDs. So, if you open a reader on the side, get a docID you want to delete, and then ask IndexWriter to delete that docID, you may in fact delete the wrong document. In 2.3, where segment merges are now done by a background thread, it's even worse, because a merge could complete and be committed, thus changing docIDs, at any time...
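To make the hazard concrete, here is a toy sketch in plain Java. It is not Lucene code: ToyIndex, its methods, and StaleDocIdDemo are all made up for illustration, and a docID is modeled as nothing more than a position in a list. The only thing it tries to show is how a docID looked up before a merge can point at a different document afterwards.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only, NOT Lucene: a docID is just a slot position in a list. */
class ToyIndex {
    private final List<String> docs = new ArrayList<>();      // slot -> document key
    private final List<Boolean> deleted = new ArrayList<>();  // slot -> deleted flag

    /** Adds a document and returns the docID it has right now. */
    int add(String key) {
        docs.add(key);
        deleted.add(false);
        return docs.size() - 1;
    }

    /** Marks a slot as deleted; other documents keep their docIDs for now. */
    void deleteByDocId(int docId) {
        deleted.set(docId, true);
    }

    /** Simulates a merge: deleted slots are dropped, later documents shift down. */
    void merge() {
        List<String> compacted = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                compacted.add(docs.get(i));
            }
        }
        docs.clear();
        docs.addAll(compacted);
        deleted.clear();
        for (int i = 0; i < docs.size(); i++) {
            deleted.add(false);
        }
    }

    /** Returns the current docID of a live document, or -1 if it is not live. */
    int docIdOf(String key) {
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }

    boolean isLive(String key) {
        return docIdOf(key) >= 0;
    }
}

class StaleDocIdDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("a");
        index.add("b");
        index.add("c");
        index.add("d");

        index.deleteByDocId(index.docIdOf("a")); // an earlier delete, not yet merged away

        int stale = index.docIdOf("c"); // a "reader on the side" looks up c
        index.merge();                  // a merge compacts the delete; docIDs shift
        index.deleteByDocId(stale);     // the stale docID now refers to another document

        System.out.println("c live: " + index.isLive("c")); // prints "c live: true"
        System.out.println("d live: " + index.isLive("d")); // prints "d live: false"
    }
}
```

In this toy run the caller meant to delete "c" but removes "d" instead, because the merge happened between the lookup and the delete. With merges running in a background thread, that window can open at any time.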

See the (fairly complex) discussion here:

    http://markmail.org/message/wxqel3gd6cmavk5a

As of 2.3, the low-level infrastructure for deleting by document ID was added to IndexWriter, but it is not exposed publicly (this was a side effect of LUCENE-1112). It's only used internally, to delete a document if an exception is hit while indexing it. In theory, you could subclass IndexWriter and tap into this infrastructure to delete by docID, but you'd be entering dangerous territory!

Do you have a specific use case in mind here? I think we'd like to make this option available someday in IndexWriter, but doing so now (when there is no way to get a "reliable" docID) seems too dangerous...

Mike

Cam Bazz wrote:

Hello,

How do I delete a specific document from an IndexWriter? I understand there is deleteDocuments(term), which deletes all the documents matching the term. But what if I want to delete one specific document that matches more than one term? I can search for the document with a boolean query and then get the doc id. I know that doc ids are temporary, but can I not use it for the delete?

IndexReader has a delete-by-doc-id method, but I am not sure how to use this when using an IndexWriter.

Best,
C.B.

