hi Patrick, Mike is the expert in this, but until he gets in, can you add details on the update pattern - note that the DeletionPolicy you describe below is not (afaik) related to the write lock time-out issues you are facing. The DeletionPolicy manages better the interaction between an IndexWriter that deletes old files, and an IndexReader that might still use this file. The write lock, on the hand, just synchronizes between multiple IndexWriter objects attempting to open the same index for write. So, do you have multiple writers? Can you print/describe the writers timing scenario when this time-out problem occur, e.g, something like this w1.open w1.modify w1.close w2.open w2.modify w2.close w3.open w3.modify w3.close w2.open ..... time-out... but w3 closed the index.... so the lock-file was supposed to be removed, why wasn't it? Can write attempt come from different nodes in the cluster? Can you make sure that when "the" writer gets the lock time-out there is indeed no other active writer?
Doron "Patrick Kimber" <[EMAIL PROTECTED]> wrote on 29/06/2007 02:01:08: > Hi, > > We are sharing a Lucene index in a Linux cluster over an NFS > share. We have > multiple servers reading and writing to the index. > > I am getting regular lock exceptions e.g. > Lock obtain timed out: > NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81- > n-write.lock > > - We are using Lucene 2.2.0 > - We are using kernel NFS and lockd is running. > - We are using a modified version of the ExpirationTimeDeletionPolicy > found in the > Lucene test suite: > http://svn.apache. > org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy. > java > I have set the expiration time to 600 seconds (10 minutes). > - We are using the NativeFSLockFactory with the lock folder being > within the index > folder: > /mnt/nfstest/repository/lucene/lock/ > - I have implemented a handler which will pause and retry an > update or delete > operation if a LockObtainFailedException or StaleReaderException is > caught. The > handler will retry the update or delete once every second for > 1 minute before > re-throwing the exception and aborting. > > The issue appears to be caused by a lock file which is not deleted. > The handlers > keep retrying... the process holding the lock eventually aborts... > this deletes the > lock file - any applications still running then continue normally. > > The application does not throw these exceptions when it is run on a > standard Linux > file system or Windows workstation. > > I would really appreciate some help with this issue. The > chances are I am doing > something stupid... but I cannot think what to try next. > > Thanks for your help > > Patrick > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]