Hi Cesar, On 02/11/2008 at 2:19 PM, Cesar Ronchese wrote: > I'm running problems with document deletion. > [...] > This simply doesn't delete anything from the Index. > > //see the code sample: > //"theFieldName" was previously stored as Field.Store.YES and > Field.Index.TOKENIZED. > Term t = new Terms("theFieldName", "theFieldContent"); > objIndexReader.DeleteDocuments(t);
(You have two typos here - "new Term/s/" and /D/eleteDocuments() - I assume that this is just a transcription error, since you must have gotten this code to run...) When you construct a Term instance, no analysis will be performed on "theFieldContent". Since "theFieldName" is TOKENIZED, it was analyzed, and this is likely where the mismatch is occurring. From <http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/index/IndexReader.html#deleteDocuments(org.apache.lucene.index.Term)>: This is useful if one uses a document field to hold a unique ID string for the document. If you're trying to delete documents based on a document ID held as the entire value of a field, then you should be using Field.Index.UN_TOKENIZED. From http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/document/Field.Index.html#UN_TOKENIZED>: Index the field's value without using an Analyzer, so it can be searched. As no analyzer is used the value will be stored as a single term. This is useful for unique Ids like product numbers. > 2) DeleteDocument(numDoc) <== this problem is a woot problem > > [...] > > I mean, if I call objIndexReader.DeleteDocument(0), it will > delete the first document from the entire INDEX, not the > first document in the Hits collection. So, it deleted the > first documents I have inserted some days ago, in previous > indexing sessions. Yes, this is how this method is designed to function. The javadoc description is perhaps too brief: "Deletes the document numbered 'docNum'". As you have discovered, "docNum" is the one-up number assigned internally by Lucene to each document as it is added to the index. > I ask: is there a way to get the correct docNum from the > document retrieved in the Hits collection? Check out Hits.id(int): <http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/search/Hits.html#id(int)> The "id" returned by Hits.id(int) is the same thing as the "docNum" parameter to IndexReader.deleteDocument(int). It sounds like the documentation could benefit from some more discussion of the "docNum"/document "id" feature... Steve --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]