Curious, I guess I don't understand the BSD disclaimer. The application should not need to track any of this. The OS should be tracking open FDs and locks for the process, and when it closes an FD on behalf of a process it should also remove the associated locks.
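For what it's worth, the same gotcha the FreeBSD manpage (quoted below) complains about surfaces on the Java side too: the java.nio.channels.FileLock docs warn that on some systems, closing any channel open on a file can release every lock the JVM holds on that file, not just locks taken through that channel. Here is a rough sketch of how an unrelated close can invalidate a lock; the file name and the "library" open are made up for illustration, and the behaviour only shows up on platforms where FileLock is implemented with fcntl-style POSIX locks:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

/**
 * Illustrates the fcntl() caveat discussed below: on systems where FileLock
 * maps to fcntl(), closing *any* descriptor a process holds on a file can
 * drop *all* of that process's locks on the file, even locks taken through a
 * different descriptor/channel.
 *
 * "index.lock" and the second RandomAccessFile stand in for "library code
 * that happens to open the same file"; they are illustrative names only.
 */
public class FcntlLockGotcha {
    public static void main(String[] args) throws Exception {
        File lockFile = new File("index.lock");

        FileChannel lockChannel = new RandomAccessFile(lockFile, "rw").getChannel();
        FileLock lock = lockChannel.lock();                   // lock held via descriptor #1
        System.out.println("lock valid? " + lock.isValid());  // true

        // Some unrelated library routine opens and closes the same file...
        RandomAccessFile library = new RandomAccessFile(lockFile, "r");
        library.close();  // on fcntl-based systems this may silently drop the lock above

        // The FileLock object may still report valid, but the kernel may no
        // longer be enforcing the lock for this process. Hence Sun's advice
        // to use a unique channel per file, and the BSD manpage's complaint.
        System.out.println("lock still reported valid? " + lock.isValid());

        lock.release();
        lockChannel.close();
        lockFile.delete();
    }
}
```

That is the discipline the manpage grumbles about: the process (here the JVM) has to know about every descriptor any library opens on a locked file, which is what Sun's "unique channel per file" recommendation amounts to.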
-----Original Message-----
>From: "Marvin Humphrey (JIRA)" <[EMAIL PROTECTED]>
>Sent: Jan 23, 2007 10:56 PM
>To: java-dev@lucene.apache.org
>Subject: [jira] Commented: (LUCENE-710) Implement "point in time" searching without relying on filesystem semantics
>
>    [ https://issues.apache.org/jira/browse/LUCENE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466911 ]
>
>Marvin Humphrey commented on LUCENE-710:
>----------------------------------------
>
>On Jan 23, 2007, at 2:19 PM, Michael McCandless (JIRA) wrote:
>
>> First do no harm.
>
>If that was really your guiding philosophy, you would never change anything.
>
>> And Sun's Javadocs on the equivalent Java method, File.createNewFile, have a warning about not relying on this for locking:
>>
>> http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html#createNewFile()
>
>That page recommends that you use FileLock instead, which maps to Fcntl on some systems. The FreeBSD manpage on Fcntl uses less delicate language than Sun in pointing out the drawbacks:
>
>    This interface follows the completely stupid semantics of System V and
>    IEEE Std 1003.1-1988 (``POSIX.1'') that require that all locks associated
>    with a file for a given process are removed when any file descriptor for
>    that file is closed by that process. This semantic means that
>    applications must be aware of any files that a subroutine library may
>    access.
>
>Trying to guarantee that kind of discipline from library code severely limits your options.
>
>> This warning is why we created the NativeFSLockFactory for Directory locking in the first place.
>
>Take a look at this bug, which explains how that warning got added:
>
>http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4676183
>
>Read the comment below -- the problem with the "protocol" they warn you against using is with deleteOnExit(), not createNewFile(). I think you're better off with dot-locks.
>
>> OK. You could implement this in Lucene as a custom deletion policy once we get this committed (I think this is 6 proposals now for "deletion policy" for NFS), plus a wrapper around IndexReader.
>
>This was the response I got on the KinoSearch list:
>
>    We do not enable NFS writes, only reads (which is why Slashdot is able to
>    reliably use NFS for its heavy load :-). So I don't think that will work,
>    if I understand you correctly.
>
>Lack of bulletproof support for NFS ain't gonna hold up my next release any longer. What a freakin' nightmare...
>
>> Implement "point in time" searching without relying on filesystem semantics
>> ---------------------------------------------------------------------------
>>
>>                 Key: LUCENE-710
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-710
>>             Project: Lucene - Java
>>          Issue Type: Improvement
>>          Components: Index
>>    Affects Versions: 2.1
>>            Reporter: Michael McCandless
>>         Assigned To: Michael McCandless
>>            Priority: Minor
>>
>> This was touched on in recent discussion on the dev list:
>> http://www.gossamer-threads.com/lists/lucene/java-dev/41700#41700
>> and then more recently on the user list:
>> http://www.gossamer-threads.com/lists/lucene/java-user/42088
>> Lucene's "point in time" searching currently relies on how the underlying storage handles deletion of files that are held open for reading.
>> This is highly variable across filesystems. For example, UNIX-like filesystems usually do "delete on last close", and Windows filesystems typically refuse to delete a file open for reading (so Lucene retries later).
>> But NFS just removes the file out from under the reader, and for that reason "point in time" searching doesn't work on NFS (see LUCENE-673).
>> With the lockless commits changes (LUCENE-701), it's quite simple to re-implement "point in time" searching so as to not rely on filesystem semantics: we can just keep more than the last segments_N file (as well as all files they reference).
>> This is also in keeping with the design goal of "rely on as little as possible from the filesystem". EG with lockless we no longer re-use filenames (don't rely on the filesystem cache being coherent) and we no longer use file renaming (because on Windows it can fail). This would be another step of not relying on the semantics of "deleting open files". The less we require from the filesystem, the more portable Lucene will be!
>> Where it gets interesting is what "policy" we would then use for removing segments_N files. The policy now is "remove all but the last one". I think we would keep this policy as the default. Then you could imagine other policies:
>> * Keep the past N days' worth
>> * Keep the last N
>> * Keep only those in active use by a reader somewhere (note: it's tricky to reliably figure this out when readers have crashed, etc.)
>> * Keep those "marked" as rollback points by some transaction, or marked explicitly as a "snapshot".
>> * Or, roll your own: the "policy" would be an interface or abstract class and you could make your own implementation.
>> I think for this issue we could just create the framework (an interface/abstract class for the "policy", invoked from IndexFileDeleter) and then implement the current policy (delete all but the most recent segments_N) as the default policy.
>> In separate issue(s) we could then create the above more interesting policies.
>> I think there are some important advantages to doing this:
>> * "Point in time" searching would work on NFS (it doesn't now because NFS doesn't do "delete on last close"; see LUCENE-673) and on any other Directory implementations that don't work currently.
>> * Transactional semantics become a possibility: you can set a snapshot, do a bunch of stuff to your index, and then roll back to the snapshot at a later time.
>> * If a reader crashes or the machine gets rebooted, etc., it could choose to re-open the snapshot it had previously been using, whereas now the reader must always switch to the last commit point.
>> * Searchers could search the same snapshot for follow-on actions. Meaning, the user does a search, then next page, drill down (Solr), drill up, etc. These are each separate trips to the server, and if the searcher has been re-opened, the user can get inconsistent results (= lost trust). But with this, one series of search interactions could explicitly stay on the snapshot it had started with.
>
>--
>This message is automatically generated by JIRA.
>-
>You can reply to this email to add a comment to the issue online.
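To make the "roll your own policy" idea in the quoted issue description concrete, here is a rough sketch of what such a pluggable deletion policy could look like. All of the names (CommitPoint, DeletionPolicy, KeepOnlyLastCommit, KeepLastNCommits, onCommit) are hypothetical illustrations, not an existing Lucene API; the shape is just "the deleter hands the policy the list of commit points, and the policy decides which segments_N generations to drop."

```java
import java.util.List;

/**
 * Hypothetical sketch of the pluggable "deletion policy" proposed in
 * LUCENE-710. The index deleter would hand the policy the list of existing
 * commit points (segments_N files plus the files they reference), and the
 * policy would decide which ones may be removed.
 */
interface CommitPoint {
    String getSegmentsFileName();   // e.g. "segments_4"
    long getTimestamp();            // when this commit was written; a time-based policy would use this
    void delete();                  // ask the deleter to remove this commit and the files only it references
}

interface DeletionPolicy {
    /** Called after each commit; commits are ordered oldest first, newest last. */
    void onCommit(List<? extends CommitPoint> commits);
}

/** The current behaviour: keep only the most recent segments_N. */
class KeepOnlyLastCommit implements DeletionPolicy {
    public void onCommit(List<? extends CommitPoint> commits) {
        for (int i = 0; i < commits.size() - 1; i++) {
            commits.get(i).delete();
        }
    }
}

/** One of the "more interesting" policies: keep the last N commit points. */
class KeepLastNCommits implements DeletionPolicy {
    private final int n;
    KeepLastNCommits(int n) { this.n = n; }

    public void onCommit(List<? extends CommitPoint> commits) {
        for (int i = 0; i < commits.size() - n; i++) {
            commits.get(i).delete();
        }
    }
}
```

With something like KeepLastNCommits in place, a reader on NFS could keep searching the commit point it originally opened, so "point in time" searching would no longer depend on delete-on-last-close semantics, which is the whole point of the proposal.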