Hi
U can try out the following code to delete document based on KeyWords
import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.index.Term; import org.apache.lucene.index.TermDocs; import org.apache.lucene.index.IndexWriter; import org.apache.lucene.queryParser.QueryParser; import org.apache.lucene.search.Hits; import org.apache.lucene.search.IndexSearcher; import org.apache.lucene.index.IndexReader; import org.apache.lucene.search.Query; import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.Searcher;
public class LuceneDelete { private static final String[] strSTOP_WORDS = { "and", "are" }; private void test() throws Exception { Analyzer objAnalyzer = new StandardAnalyzer(); IndexWriter index = new IndexWriter("index",objAnalyzer, true );
Document objDocument = new Document();
objDocument.add( Field.Keyword("name","Ebrahim Faisal")); objDocument.add( Field.Text("address","Chennai")); objDocument.add( Field.Keyword("designation","Software Engineer")); objDocument.add( Field.UnIndexed("xyz","123 IndexWriter index"));
index.addDocument( objDocument );
objDocument = new Document();
objDocument.add( Field.Keyword("name","John Smith")); objDocument.add( Field.Text("address","Delhi")); objDocument.add( Field.Keyword("designation","Sr. Software Engineer")); objDocument.add( Field.UnIndexed("xyz","456 StandardAnalyzer true"));
index.addDocument( objDocument );
index.optimize(); index.close();
//Logic for deleting
IndexReader objIndexReader = IndexReader.open("index");
TermDocs objTermDocs = objIndexReader.termDocs(new Term("name","Ebrahim Faisal"));
while( objTermDocs.next() ) { int docNum = objTermDocs.doc(); objDocument = objIndexReader.document( docNum ); if( objDocument.get("designation").equalsIgnoreCase("Software Engineer")) { objIndexReader.delete( docNum ); } } objIndexReader.close();
Searcher objIndexSearcher = new IndexSearcher("index");
Query objQuery = null;
objQuery = QueryParser.parse("Delhi", "address" , objAnalyzer);
Hits objHits = objIndexSearcher.search(objQuery);
System.out.println(" objHits "+objHits.length());
for (int nStart = 0; nStart < objHits.length(); nStart++) { objDocument = objHits.doc(nStart); System.out.println(" address "+objDocument.get("address")); } objIndexSearcher.close(); objIndexSearcher = null;
} public static void main(String[] args) throws Exception { new LuceneDelete().test(); } }
E.FAISAL
From: mahaveer jain <[EMAIL PROTECTED]>
Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
To: Lucene Users List <lucene-user@jakarta.apache.org>, Paul <[EMAIL PROTECTED]>
Subject: Re: Deleting index for DB indexing
Date: Thu, 30 Dec 2004 21:17:48 -0800 (PST)
Thanks Paul,
You idea seems to be good. I ll try that. I have one more question. Should the new key what I create have to be keyword ? or Can it be just a column in the index ?
Mahaveer
Paul <[EMAIL PROTECTED]> wrote:
On Thu, 30 Dec 2004 08:36:04 -0800 (PST), mahaveer jain
wrote:
> I am indexing more that 5 tables. And each for them have autoincrement and
> that is the primary key. So if I do find DocNum, it may so happen that it
> may delete document I don't want to delete.
you need to create your own global ID, I had the same problem (but I used a MD5 hashvalue). One solution ist to give each of your tables an internal number and when creating your lucene-documents you add an additional field with something like "dbInternalId*100+dbNumber" so that db-record 5 in table 3 results in 503. when documents from your DB are deleted and you need to update the index you simple create a term which's value is calculated the same way and delete the document with the IndexReader.delete(Term) Instead of calculating you can do string concatenating as well :)
Paul
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
__________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
_________________________________________________________________
The MS Office product suite. Make efficiency a habit. http://www.microsoft.com/india/office/experience/ Simplify your life.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]