I think this is the relevant section:

  A8. What is close-to-open cache consistency?

A. Perfect cache coherency among disparate NFS clients is very expensive to achieve, so NFS settles for something weaker that satisfies the requirements of most everyday types of file sharing. Everyday file sharing is most often completely sequential: first client A opens a file, writes something to it, then closes it; then client B opens the same file, and reads the changes.

So, when an application opens a file stored in NFS, the NFS client checks that it still exists on the server and that the opener is permitted to access it, by sending a GETATTR or ACCESS operation. When the application closes the file, the NFS client writes back any pending changes to the file so that the next opener can view the changes. This also gives the NFS client an opportunity to report any server write errors to the application via the return code from close(). This behavior is referred to as close-to-open cache consistency.
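As a minimal sketch of what this means for an application (the NFS path below is made up): pending data is flushed when the stream is closed, so a deferred server-side write error can surface as an exception from close() rather than from write():

    import java.io.FileOutputStream;
    import java.io.IOException;

    public class CloseToOpenWrite {
        public static void main(String[] args) {
            try (FileOutputStream out = new FileOutputStream("/mnt/nfs/shared.dat")) {
                out.write("new contents".getBytes());
                // try-with-resources calls close(), which flushes pending writes to
                // the server; a failure there is the close() error the FAQ refers to
            } catch (IOException e) {
                // errors from write() or from the implicit close() both land here
                System.err.println("write-back to the NFS server failed: " + e);
            }
        }
    }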

Linux implements close-to-open cache consistency by comparing the results of a GETATTR operation done just after the file is closed to the results of a GETATTR operation done when the file is next opened. If the results are the same, the client will assume its data cache is still valid; otherwise, the cache is purged.
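The check itself lives in the kernel NFS client, but the logic amounts to something like the following sketch (Attrs and fetchAttrsFromServer() are hypothetical stand-ins for the GETATTR results):

    import java.util.Objects;

    public class CloseToOpenCheck {
        record Attrs(long size, long mtimeNanos, long ctimeNanos) {}

        private Attrs attrsAtLastClose;

        private Attrs fetchAttrsFromServer() {
            return new Attrs(0L, 0L, 0L);   // stand-in for a GETATTR round trip
        }

        void onClose() {
            attrsAtLastClose = fetchAttrsFromServer();   // GETATTR just after close
        }

        void onOpen() {
            Attrs now = fetchAttrsFromServer();          // GETATTR at the next open
            if (!Objects.equals(now, attrsAtLastClose)) {
                // another client changed the file in between: purge cached data
            }
            // otherwise the locally cached data is assumed to still be valid
        }
    }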

Close-to-open cache consistency was introduced to the Linux NFS client in 2.4.20. If for some reason you have applications that depend on the old behavior, you can disable close-to-open support by using the "nocto" mount option.

There are still opportunities for a client's data cache to contain stale data. The NFS version 3 protocol introduced "weak cache consistency" (also known as WCC) which provides a way of checking a file's attributes before and after an operation to allow a client to identify changes that could have been made by other clients. Unfortunately when a client is using many concurrent operations that update the same file at the same time, it is impossible to tell whether it was that client's updates or some other client's updates that changed the file.
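In sketch form (Attrs and its fields are made-up stand-ins for the pre-op/post-op attributes an NFS version 3 reply can carry), the check and the ambiguity look like this:

    import java.util.Objects;

    public class WccCheck {
        record Attrs(long size, long mtimeNanos) {}

        private Attrs lastSeen;   // attributes this client last observed

        void onReply(Attrs preOp, Attrs postOp) {
            if (Objects.equals(preOp, lastSeen)) {
                // only our own operation changed the file; cached data stays valid
            } else {
                // something else changed the file first, so cached data should be
                // purged -- but with several of our own writes in flight, preOp may
                // differ from lastSeen merely because another of our replies has not
                // been processed yet, which is the ambiguity described above
            }
            lastSeen = postOp;
        }
    }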

For this reason, some versions of the Linux 2.6 NFS client abandon WCC checking entirely, and simply trust their own data cache. On these versions, the client can maintain a cache full of stale file data if a file is opened for write. In this case, using file locking is the best way to ensure that all clients see the latest version of a file's data.

A system administrator can try using the "noac" mount option to achieve attribute cache coherency among multiple clients. Almost every client operation checks file attribute information. Usually the client keeps this information cached for a period of time to reduce network and server load. When "noac" is in effect, a client's file attribute cache is disabled, so each operation that needs to check a file's attributes is forced to go back to the server. This permits a client to see changes to a file very quickly, at the cost of many extra network operations.

Be careful not to confuse "noac" with "no data caching." The "noac" mount option will keep file attributes up-to-date with the server, but there are still races that may result in data incoherency between client and server. If you need absolute cache coherency among clients, applications can use file locking, where a client purges file data when a file is locked, and flushes changes back to the server before unlocking a file; or applications can open their files with the O_DIRECT flag to disable data caching entirely.
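A hedged sketch of that locking discipline (the path is made up, and how aggressively cached data is revalidated and flushed around the lock is up to the NFS client implementation):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileLock;

    public class LockedUpdate {
        public static void main(String[] args) throws IOException {
            try (RandomAccessFile file = new RandomAccessFile("/mnt/nfs/counter.dat", "rw")) {
                try (FileLock lock = file.getChannel().lock()) {
                    // read the current value under the lock, so stale cached data is dropped
                    long value = file.length() >= 8 ? file.readLong() : 0L;
                    file.seek(0);
                    file.writeLong(value + 1);
                    file.getChannel().force(true);   // push the change to the server before unlocking
                }                                    // lock released here
            }
        }
    }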

For a better understanding of the compromises faced in the design of NFS caching, see Callaghan's "NFS Illustrated."

On Jan 9, 2007, at 12:25 PM, Michael McCandless (JIRA) wrote:


[ https://issues.apache.org/jira/browse/LUCENE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463358 ]

Michael McCandless commented on LUCENE-767:
-------------------------------------------


Carrying over from the java-dev list:


Grant Ingersoll wrote:

Can you explain in more detail on this bug why this makes you nervous?

Well ... the only specific example I have is NFS (always my favorite
example!).

As I understand it, the NFS client typically uses a separate cache to
hold the "attributes" of the file, including file length.  This cache
often has weaker or maybe just "different" guarantees than the "data
cache" that holds the file contents.  So basically you can ask what
the file length is and get a wrong (stale) answer.  EG see
http://nfs.sourceforge.net, which describes Linux's NFS client
approach.  The NFS client on Apple's OS X seems to be even worse!

I think very likely Lucene may not trip up on this specifically since
a reader would only ask for this file's length for the first time once
the file is done being written (ie the commit of segments_N has
occurred) and so hopefully it's not in the attribute cache yet?

I think there may very well be cases of other filesystems where
"checking file length" is risky (that we all just don't know about
(yet!)), which is why I favor using explicit values instead of relying
on file system semantics, whenever possible.

Maybe I'm just too paranoid :)

But for all the places / devices Lucene has gone and will go, relying
on the bare minimum set of IO operations I think will maximize our
overall portability.  Every filesystem has its quirks.


maxDoc should be explicitly stored in the index, not derived from file length
------------------------------------------------------------------------------

                Key: LUCENE-767
                URL: https://issues.apache.org/jira/browse/LUCENE-767
            Project: Lucene - Java
         Issue Type: Improvement
   Affects Versions: 1.9, 2.0.0, 2.0.1, 2.1
           Reporter: Michael McCandless
        Assigned To: Michael McCandless
           Priority: Minor

This is a spinoff of LUCENE-140
In general we should rely on "as little as possible" from the file system. Right now, maxDoc is derived by checking the file length of the FieldsReader index file (.fdx) which makes me nervous. I think we should explicitly store it instead. Note that there are no known cases where this is actually causing a problem. There was some speculation in the discussion of LUCENE-140 that it could be one of the possible, but in digging / discussion there were no specifically relevant JVM bugs found (yet!). So this would be a defensive fix at this point.



