Hi,

You can try out the following code to delete documents based on keywords.


import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;

public class LuceneDelete
{
        // Not used below; kept from the original listing.
        private static final String[] strSTOP_WORDS =
        {
                "and",
                "are"
        };

        private void test() throws Exception
        {
                Analyzer objAnalyzer = new StandardAnalyzer();
                // true = create a new index in the "index" directory
                IndexWriter index = new IndexWriter("index", objAnalyzer, true);

                // Keyword fields are indexed as a single untokenized term,
                // so they can later be matched exactly with a Term.
                Document objDocument = new Document();
                objDocument.add(Field.Keyword("name", "Ebrahim Faisal"));
                objDocument.add(Field.Text("address", "Chennai"));
                objDocument.add(Field.Keyword("designation", "Software Engineer"));
                objDocument.add(Field.UnIndexed("xyz", "123 IndexWriter index"));
                index.addDocument(objDocument);

                objDocument = new Document();
                objDocument.add(Field.Keyword("name", "John Smith"));
                objDocument.add(Field.Text("address", "Delhi"));
                objDocument.add(Field.Keyword("designation", "Sr. Software Engineer"));
                objDocument.add(Field.UnIndexed("xyz", "456 StandardAnalyzer true"));
                index.addDocument(objDocument);

                index.optimize();
                index.close();

                // Logic for deleting: find every document whose "name" term matches,
                // then delete only those that also have the expected designation.
                IndexReader objIndexReader = IndexReader.open("index");
                TermDocs objTermDocs = objIndexReader.termDocs(new Term("name", "Ebrahim Faisal"));

                while (objTermDocs.next())
                {
                        int docNum = objTermDocs.doc();
                        objDocument = objIndexReader.document(docNum);
                        if ("Software Engineer".equalsIgnoreCase(objDocument.get("designation")))
                        {
                                objIndexReader.delete(docNum);
                        }
                }
                objTermDocs.close();
                // Closing the reader commits the deletions.
                objIndexReader.close();

                // Search the index again to check what is left.
                Searcher objIndexSearcher = new IndexSearcher("index");
                Query objQuery = QueryParser.parse("Delhi", "address", objAnalyzer);
                Hits objHits = objIndexSearcher.search(objQuery);

                System.out.println(" objHits " + objHits.length());

                for (int nStart = 0; nStart < objHits.length(); nStart++)
                {
                        objDocument = objHits.doc(nStart);
                        System.out.println(" address " + objDocument.get("address"));
                }
                objIndexSearcher.close();
                objIndexSearcher = null;
        }

        public static void main(String[] args) throws Exception
        {
                new LuceneDelete().test();
        }
}




E.FAISAL




From: mahaveer jain <[EMAIL PROTECTED]>
Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
To: Lucene Users List <lucene-user@jakarta.apache.org>, Paul <[EMAIL PROTECTED]>
Subject: Re: Deleting index for DB indexing
Date: Thu, 30 Dec 2004 21:17:48 -0800 (PST)


Thanks Paul,

Your idea seems good; I'll try that. I have one more question: does the new key that I create have to be a Keyword field, or can it be just another column in the index?

Mahaveer

Paul <[EMAIL PROTECTED]> wrote:
On Thu, 30 Dec 2004 08:36:04 -0800 (PST), mahaveer jain
wrote:
> I am indexing more than 5 tables, and each of them has an autoincrement
> column as its primary key. So if I look up the DocNum, it may happen that I
> delete a document I don't want to delete.


You need to create your own global ID. I had the same problem (but I
used an MD5 hash value). One solution is to give each of your tables an
internal number, and when creating your Lucene documents you add an
additional field with something like "dbInternalId*100+dbNumber", so
that DB record 5 in table 3 results in 503. When documents are deleted
from your DB and you need to update the index, you simply create a
term whose value is calculated the same way and delete the document
with IndexReader.delete(Term).
Instead of calculating, you can use string concatenation as well :)

Paul
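
The global ID scheme described above could be sketched roughly as follows. This is only an illustration in plain Java with no Lucene dependency: the class name GlobalDocId, the method names, the ":" separator and the limit of 100 tables are assumptions made for this sketch, not something from the thread. The numeric variant follows the "dbInternalId*100+dbNumber" formula, which only stays unique while table numbers are below 100; the string variant avoids that limit.

```java
public class GlobalDocId {
    // Assumption for this sketch: the multiplier 100 from the thread means
    // table numbers must stay in the range 0..99 to avoid collisions.
    static final int MAX_TABLES = 100;

    /** Numeric variant: record 5 in table 3 gives 5 * 100 + 3 = 503. */
    public static long numericKey(int tableNumber, long recordId) {
        if (tableNumber < 0 || tableNumber >= MAX_TABLES) {
            throw new IllegalArgumentException("tableNumber must be between 0 and 99");
        }
        if (recordId < 0) {
            throw new IllegalArgumentException("recordId must not be negative");
        }
        return recordId * MAX_TABLES + tableNumber;
    }

    /** String variant: table name and record ID joined with a separator. */
    public static String stringKey(String tableName, long recordId) {
        return tableName + ":" + recordId;
    }

    public static void main(String[] args) {
        // Same primary key (5) in two different tables yields two distinct keys.
        System.out.println(numericKey(3, 5));          // prints 503
        System.out.println(numericKey(4, 5));          // prints 504
        System.out.println(stringKey("orders", 5));    // prints orders:5
        System.out.println(stringKey("customers", 5)); // prints customers:5
    }
}
```

With the API used in the listing earlier in this thread, such a key would presumably be added as a Keyword field (the field name "globalId" is hypothetical) so that it is indexed as one untokenized term, and the matching document could then be removed with IndexReader.delete(new Term("globalId", key)).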

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

