Hey all,

Actually, shutdown hooks might not be the best idea, since Lucene is very often used in server-side Java environments. Many app servers throw security errors when a webapp tries to add a shutdown hook, and I've seen WebLogic crash when a webapp registered one. Has anyone else run into this?
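
To make the failure mode concrete, here is a minimal sketch of registering a hook defensively. Runtime.addShutdownHook is documented to throw SecurityException when a security manager denies RuntimePermission("shutdownHooks"), and IllegalStateException when the JVM is already shutting down. The class name SafeShutdownHook, the method tryRegister, and the closeIndex cleanup are hypothetical names made up for illustration; none of them are part of Lucene.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SafeShutdownHook {

    /**
     * Tries to register the given cleanup as a JVM shutdown hook.
     * Returns the hook thread on success, or null if registration was
     * refused, in which case the caller has to run the cleanup itself.
     */
    public static Thread tryRegister(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "index-cleanup-hook");
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            return hook;
        } catch (SecurityException e) {
            // A security manager denied RuntimePermission("shutdownHooks").
            return null;
        } catch (IllegalStateException e) {
            // The JVM is already shutting down; too late to register.
            return null;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean closed = new AtomicBoolean(false);
        // Hypothetical cleanup: a real app would close its index writer here.
        Runnable closeIndex = () -> closed.set(true);

        Thread hook = tryRegister(closeIndex);
        if (hook == null) {
            System.out.println("Shutdown hook refused; close the index explicitly instead.");
        } else {
            System.out.println("Shutdown hook registered.");
        }
    }
}
```

This only covers the case where the container refuses the hook cleanly; it does nothing for a container that misbehaves after accepting one. In a webapp, closing the index from a ServletContextListener's contextDestroyed callback may be a safer alternative.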

This all brings up a key issue with Lucene, which is that there is no good way to recover from errors gracefully. I'd love to see a number of checked exceptions added. For example:

  IndexNotFoundException -- when trying to open an index that doesn't exist
  IndexLockedException   -- when a lock file prevents you from getting an index
  IndexCorruptException  -- maybe this would be thrown when an index appears to be broken?

At the moment, Lucene throws many undocumented IOExceptions and even NullPointerExceptions when an error case comes up. I catch these in my app, but there's really no intelligent way to recover from them.

Adding checked exceptions would be an API change, but it seems worth it. I'd be happy to make a more specific proposal if other people feel this would be a worthwhile direction to go in.

Regards,
Matt

Quoting "Spencer, Dave" <[EMAIL PROTECTED]>:

> Runtime.addShutdownHook:
>
> http://java.sun.com/j2se/1.3/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, March 17, 2002 12:06 AM
> To: Lucene Users List
> Subject: Re: corrupted index
>
> Oh, I just thought of something (wine does body good).
> Perhaps one could use Runtime (the class) to catch the JVM shutdown and
> do whatever is needed to prevent index corruption. I believe there are
> some shutdown hook methods in there that may let you do that. I'm too
> lazy to look up the API docs now, but I remember reading about that
> once, and perhaps it was even mentioned on one of the 2 Lucene mailing
> lists.
>
> On the other hand, it would be great to have a tool that can verify an
> existing index. I don't know enough about the actual file structure
> yet to write something like that, but maybe somebody else has done that
> already or would like to contribute.
>
> Otis
>
> --- "Steven J. Owens" <[EMAIL PROTECTED]> wrote:
> > Otis,
> >
> > > You can remove the .lock file and try re-indexing or continuing
> > > indexing where you left off.
> > > I am not sure about the corrupt index. I have never seen it happen,
> > > and I believe I recall reading some messages from Doug Cutting saying
> > > that index should never be left in an inconsistent state.
> >
> > Obviously never "should" be, but if something's pulling the rug
> > out from under his JRE, changes could be only partially written,
> > right?
> >
> > Or is the writing format in some sense transactionally safe?
> > I've never worked directly on something like this, but I worked at a
> > database software company where they used transaction semantics and a
> > journaling scheme to fake a "bulletproof" file system. Is this how
> > the index-writing code is implemented?
> >
> > In general, I can guess Doug's response - just torch the old
> > index directory and rebuild it; Lucene's indexing is fast enough that
> > you don't need to get clever. This seems to be Doug's stance in
> > general (i.e. "don't get fancy, I already put all the fanciness you'll
> > need into extremely fast indexing and searching"). So far, it seems
> > to work :-).
> >
> > > I could be making this up, though, so I suggest you search through
> > > lucene-user and lucene-dev archives on www.mail-archive.com.
> > > A search for "corrupt" should do it.
> > > Once you figure things out maybe you can post a summary here.
> >
> > I got a little curious, so I went and did the searches. There is
> > exactly one message in each list archive (dev and users) with the
> > keyword "corrupt" in it.
> > The lucene-users instance is irrelevant:
> >
> > http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00557.html
> >
> > The lucene-dev instance is more useful:
> >
> > http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00157.html
> >
> > It's a post from Doug, dated Sept 27, 2001, about adding not just
> > thread-safety but process-safety:
> >
> >   It should be impossible to corrupt an index through the Lucene API.
> >   However if a Lucene process exits unexpectedly it can leave the index
> >   locked. The remedy is simply to, at a time when it is certain that no
> >   processes are accessing the index, remove all lock files.
> >
> > So it sounds like it's worth trying just removing the lock files.
> > Hm, is there a way to come up with a "sanity check" you can run on an
> > index to make sure it's not corrupted? This might be an excellent
> > thing to reassure yourself with: something went wrong? Run a sanity
> > check, if it fails just reindex.
> >
> > Steven J. Owens
> > [EMAIL PROTECTED]

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>