Hi All; I have hit a snag in my Lucene integration and don't know what to do.
My company has a content management product. Each time someone changes the directory structure, or a file within it, that portion of the site needs to be re-indexed so the changes are reflected in future searches (indexing must happen at run time). I have written an Indexer class with a static Index() method. The idea is to call the method every time something changes and the index needs to be re-examined. I am hoping the logic put in by Doug Cutting surrounding the UID will make indexing efficient enough to be called this frequently. This class works great when I tested it on my own little site (I have about 2000 files). But when I drop the functionality into the QA environment I get a locking error. I can't access the stack trace; all I can get at is the log file the application writes to. Here is the section my class wrote. It was right in the middle of indexing and, bang, a lock issue. I don't know if the problem is in my code or something in the existing application.

Error message:

ENTER|SearchEventProcessor.visit(ContentNodeDeleteEvent)
|INFO|INDEXING INFO: Start Indexing new content.
|INFO|INDEXING INFO: Index Folder Did Not Exist. Start Creation Of New Index
|INFO|INDEXING INFO: Beginning Incremental update comparisons
    [the line above repeats 17 times in the original log]
|INFO|INDEXING ERROR: Unable to index new content Lock obtain timed out: Lock@/usr/tomcat/jakarta-tomcat-5.0.19/temp/lucene-398fbd170a5457d05e2f4d43210f7fe8-write.lock
|ENTER|UpdateCacheEventProcessor.visit(ContentNodeDeleteEvent)

Here is my code. You will recognize it pretty much as the IndexHTML class from the Lucene demo written by Doug Cutting. I have put in a ton of comments in an attempt to understand what is going on. Any help would be appreciated.

Luke

package com.fbhm.bolt.search;

/*
 * Created on Nov 11, 2004
 *
 * This class will create a single index for the Content
 * Management System (CMS). It contains logic to ensure
 * indexing is done "intelligently".
 * Based on IndexHTML.java from the demo folder that ships with Lucene.
 */
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.demo.HTMLDocument;
import org.pdfbox.searchengine.lucene.LucenePDFDocument;

import com.alaia.common.debug.Trace;
import com.alaia.common.util.AppProperties;

/**
 * @author lshannon
 * Description:<br>
 * This class is used to index a content folder. It contains logic to
 * ensure that only new documents, or documents modified since the last
 * index, are indexed.<br>
 * Based on code written by Doug Cutting in the IndexHTML class found in
 * the Lucene demo.
 */
public class Indexer {
    // true during the deletion pass; this is when the index already exists
    private static boolean deleting = false;
    // reads the existing index
    private static IndexReader reader;
    // writes to the index folder
    private static IndexWriter writer;
    // iterates over the uid terms in the existing index
    private static TermEnum uidIter;

    /*
     * This static method does all the work; the end result is an
     * up-to-date index folder.
     */
    public static void Index() {
        // assume to start that the index has to be created
        boolean create = true;
        // location of the index folder
        String indexFileLocation = AppProperties.getPropertyAsString("bolt.search.siteIndex.index.root");
        // location of the content folder
        String contentFolderLocation = AppProperties.getPropertyAsString("site.root");
        File index = new File(indexFileLocation);
        File root = new File(contentFolderLocation);

        // if the index folder already exists we need an incremental update
        if (index.exists()) {
            Trace.TRACE("INDEXING INFO: An index folder exists at: " + indexFileLocation);
            deleting = true;
            create = false;
            try {
                // this call runs the incremental update (deletion pass)
                indexDocs(root, indexFileLocation, create);
            } catch (Exception e) {
                // we were unable to do the incremental update
                Trace.TRACE("INDEXING ERROR: Unable to execute incremental update " + e.getMessage());
            }
            // after this the index should be current with the content
            Trace.TRACE("INDEXING INFO: Incremental update completed.");
        }
        try {
            // create and configure the writer
            writer = new IndexWriter(index, new StandardAnalyzer(), create);
            writer.mergeFactor = 10000;
            writer.maxFieldLength = 100000;
            try {
                Date start = new Date();
                // call indexDocs again, this time adding new documents
                Trace.TRACE("INDEXING INFO: Start Indexing new content.");
                indexDocs(root, indexFileLocation, create);
                Trace.TRACE("INDEXING INFO: Indexing new content complete.");
                // optimize and close the index
                writer.optimize();
                writer.close();
                Date end = new Date();
                long totalTime = end.getTime() - start.getTime();
                Trace.TRACE("INDEXING INFO: All Indexing Operations Completed in " + totalTime + " milliseconds");
            } catch (Exception e1) {
                // unable to add new documents
                Trace.TRACE("INDEXING ERROR: Unable to index new content " + e1.getMessage());
            }
        } catch (IOException e) {
            Trace.TRACE("INDEXING ERROR: Unable to create IndexWriter " + e.getMessage());
        }
    }

    /*
     * Walk the directory hierarchy in uid order, while keeping the uid
     * iterator from the existing index in sync. Mismatches indicate one
     * of: (a) old documents, to be deleted; (b) unchanged documents, to
     * be left alone; or (c) new documents, to be indexed.
     */
    private static void indexDocs(File file, String index, boolean create) throws Exception {
        if (!create) {
            // the index already exists, so do an incremental update
            Trace.TRACE("INDEXING INFO: Incremental Update Request Confirmed");
            // open the existing index
            reader = IndexReader.open(index);
            // get an enumeration of its uid terms
            uidIter = reader.terms(new Term("uid", ""));
            // walk the content tree; this uses the enumeration above and
            // does all the "smart" indexing
            indexDocs(file);
            // deleting is true every time the index already existed
            if (deleting) {
                // any uids still left in the enumeration have no matching
                // file on disk, so delete those stale documents
                Trace.TRACE("INDEXING INFO: Deleting Old Content Phase Started. All Deleted Docs will be listed.");
                while (uidIter.term() != null && uidIter.term().field() == "uid") {
                    Trace.TRACE("INDEXING INFO: Deleting document " + HTMLDocument.uid2url(uidIter.term().text()));
                    // delete the stale document from the reader
                    reader.delete(uidIter.term());
                    // go to the next term
                    uidIter.next();
                }
                Trace.TRACE("INDEXING INFO: Deleting Old Content Phase Completed");
                // turn off the deleting flag
                deleting = false;
            }
            uidIter.close(); // close uid iterator
            reader.close();  // close existing index
        } else {
            // no existing index, so create one from scratch
            Trace.TRACE("INDEXING INFO: Index Folder Did Not Exist. Start Creation Of New Index");
            indexDocs(file);
        }
    }

    private static void indexDocs(File file) throws Exception {
        if (file.isDirectory()) {
            // recurse into the directory, visiting its entries in sorted
            // (and therefore uid) order, until we hit actual files
            String[] files = file.list();
            Arrays.sort(files);
            for (int i = 0; i < files.length; i++)
                indexDocs(new File(file, files[i]));
        }
        // we have an actual file, so consider the file extension so the
        // correct Document is created
        else if (file.getPath().endsWith(".html")
                || file.getPath().endsWith(".htm")
                || file.getPath().endsWith(".txt")
                || file.getPath().endsWith(".doc")
                || file.getPath().endsWith(".xml")
                || file.getPath().endsWith(".pdf")) {
            // if uidIter is set we are in the midst of an incremental update
            if (uidIter != null) {
                // get the uid for the file we are on
                String uid = HTMLDocument.uid(file);
                // now compare this document to the one we have in the
                // enumeration of terms
                // if the term in the enumeration is less than the term
                // we are on, its document no longer exists on disk and
                // must be deleted (when we are doing the deletion pass)
                Trace.TRACE("INDEXING INFO: Beginning Incremental update comparisons");
                while (uidIter.term() != null
                        && uidIter.term().field() == "uid"
                        && uidIter.term().text().compareTo(uid) < 0) {
                    // delete stale docs
                    if (deleting) {
                        reader.delete(uidIter.term());
                    }
                    uidIter.next();
                }
                // if the terms are equal the document is unchanged;
                // keep it as is
                if (uidIter.term() != null
                        && uidIter.term().field() == "uid"
                        && uidIter.term().text().compareTo(uid) == 0) {
                    uidIter.next();
                }
                // if we are not deleting and the document was not there,
                // we didn't have it in the last index and should add it
                else if (!deleting) {
                    if (file.getPath().endsWith(".pdf")) {
                        Document doc = LucenePDFDocument.getDocument(file);
                        Trace.TRACE("INDEXING INFO: Adding new document to the existing index: " + doc.get("url"));
                        writer.addDocument(doc);
                    } else if (file.getPath().endsWith(".xml")) {
                        Document doc = XMLDocument.Document(file);
                        Trace.TRACE("INDEXING INFO: Adding new document to the existing index: " + doc.get("url"));
                        writer.addDocument(doc);
                    } else {
                        Document doc = HTMLDocument.Document(file);
                        Trace.TRACE("INDEXING INFO: Adding new document to the existing index: " + doc.get("url"));
                        writer.addDocument(doc);
                    }
                }
            } // end of the incremental update branch
            // we are creating a new index, so add all document types
            else {
                if (file.getPath().endsWith(".pdf")) {
                    Document doc = LucenePDFDocument.getDocument(file);
                    Trace.TRACE("INDEXING INFO: Adding a new document to the new index: " + doc.get("url"));
                    writer.addDocument(doc);
                } else if (file.getPath().endsWith(".xml")) {
                    Document doc = XMLDocument.Document(file);
                    Trace.TRACE("INDEXING INFO: Adding a new document to the new index: " + doc.get("url"));
                    writer.addDocument(doc);
                } else {
                    Document doc = HTMLDocument.Document(file);
                    Trace.TRACE("INDEXING INFO: Adding a new document to the new index: " + doc.get("url"));
                    writer.addDocument(doc);
                }
            } // close the else for a new index
        } // close the else-if that handles file types
    } // close the indexDocs method
}

----- Original Message -----
From: "Craig McClanahan" <[EMAIL PROTECTED]>
To: "Jakarta Commons Users List" <[EMAIL PROTECTED]>
Sent: Thursday, November 11, 2004 6:13 PM
Subject: Re: avoiding locking

In order to get any useful help, it would be nice to know what you are
trying to do, and (most importantly) what commons component is giving you
the problem :-). The traditional approach is to put a prefix on your
subject line -- for commons package "foo" it would be:

[foo] avoiding locking

It's also generally helpful to see the entire stack trace, not just the
exception message itself.

Craig

On Thu, 11 Nov 2004 17:27:19 -0500, Luke Shannon <[EMAIL PROTECTED]> wrote:
> What can I do to avoid locking issues?
>
> Unable to execute incremental update Lock obtain timed out:
> Lock@/usr/tomcat/jakarta-tomcat-5.0.19/temp/lucene-398fbd170a5457d05e2f4d43210f7fe8-write.lock
>
> Thanks,
>
> Luke

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
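One thing worth ruling out: Indexer keeps all its state in static fields, so if two CMS event-processor threads (e.g. the SearchEventProcessor handling two ContentNodeDeleteEvents in quick succession) enter Index() at the same time, the second one will time out trying to obtain Lucene's write lock exactly as the log shows. A minimal sketch of a guard that serializes indexing is below; IndexGate and runExclusively are made-up names for illustration, not part of the CMS or Lucene API, and this is only one possible fix (a stale write.lock file left in the Tomcat temp directory by a crashed JVM would produce the same error and has to be removed by hand).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/*
 * Hypothetical guard: serializes calls into the static Indexer so two
 * event-processor threads can never contend for Lucene's write lock.
 */
class IndexGate {
    private static final ReentrantLock LOCK = new ReentrantLock();

    /*
     * Runs the indexing work if the lock can be obtained within the
     * timeout; returns false (caller may retry or skip) otherwise.
     */
    static boolean runExclusively(Runnable indexWork, long timeoutSeconds)
            throws InterruptedException {
        if (!LOCK.tryLock(timeoutSeconds, TimeUnit.SECONDS)) {
            return false; // another thread is already indexing
        }
        try {
            indexWork.run(); // e.g. Indexer.Index()
            return true;
        } finally {
            LOCK.unlock();
        }
    }
}
```

Each event handler would then call `IndexGate.runExclusively(() -> Indexer.Index(), 30L)` instead of `Indexer.Index()` directly, so overlapping events queue up instead of colliding on the write lock.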