Hi,

This looks good. I now understand what you intend to achieve in this project. Thanks for the clarifications.
Shruti

On Sat, Apr 17, 2010 at 7:31 PM, Jeffrey Altman <[email protected]> wrote:

> On 4/17/2010 2:25 AM, shruti jain wrote:
> > Here is what I know about the cache manager and its file server
> > interactions.
> > The Cache Manager resides on the client side in an OpenAFS environment
> > and communicates with the AFS file server on behalf of the application
> > programs running on the client. When an AFS file is needed by any
> > application program running on a client machine, the request is sent to
> > the Cache Manager, which in turn issues RPC calls to the file server
> > storing the requested file.
>
> This is true for any object (file, directory, mount point, symlink, ...)
>
> AFS supports readonly replicas.  The CM is permitted to request copies
> of the data from any of the replicas although at present, the CM only
> reads from a single replica at a time.
>
> > When the Cache Manager receives the
> > requested data from the file server, it stores it in the cache and also
> > delivers it to the application program that initially requested
> > the data. In order to maintain cache consistency, the server issues a
> > callback along with the data. A callback is a promise by a File Server
> > to a Cache Manager to inform it of any change in the data delivered by
> > the File Server to the Cache Manager. If any other client on the network
> > modifies the file then the file server breaks this callback and thus
> > gives an indication to the Cache Manager that its locally cached copy of
> > the file is obsolete and needs to be updated. The callback mechanism
> > ensures that the Cache Manager always requests the most up-to-date
> > version of a file. In this way, the Cache Manager also performs the
> > responsibility of maintaining the cache.
>
> You have the general idea.  Let me provide a few additional details.
> In the original (and currently deployed) implementation of callbacks, a
> callback is a promise that the FS will notify the CM of a change for up
> to S seconds with values for read/write data typically measured in
> minutes and for read-only data typically measured in hours.  The number
> of callback promises (or registrations) that a FS can maintain is
> finite.  Callback registrations can therefore be canceled prematurely
> without there being a change.
>
> The callback notification (or invalidation) is delivered via an
> unauthenticated RPC channel.  As a result, the notification cannot be
> trusted by the CM and must be treated as meaning "a change might have
> occurred, please verify if it matters".
>
> The existing callback notification does not provide any hint as to the
> type of change that might have occurred.  Callback notifications are
> issued for many reasons including:
>
>  . the data changed
>  . the access control list changed
>  . other metadata changed
>  . the locking state changed
>  . the volume in which the data is located is being replicated
>    (aka released)
>  . the object has been deleted
>  . the FS ran out of room in the registration table
>
> Once a notification is issued, the registration is broken and the
> CM will receive no further notifications until it requests updated
> status for the object in question.
>
> The CM determines what has changed by issuing a FetchStatus RPC to
> the FS and comparing the prior and current status fields.
>
> Matt Benjamin has developed and implemented (but it's not shipping yet)
> an extended version of callback notifications that provides the CM with
> additional details regarding the change.  When combined with an
> authenticated callback channel this becomes a very powerful combination.
>
> It is also important to discuss how the FS and CM track object data.
> Each time a change to the data (not the metadata) occurs, a data version
> (DV) number for the object is incremented.
> When the CM issues a StoreData RPC, it is returned updated status info.
> If the DV was incremented by one, then the CM knows that there was no
> race with another CM and all of the data in the cache for that file is
> still current.  If the DV increment was greater than one, then the CM
> knows that the data it just wrote is current, but all other data is
> suspect.
>
> When using the Extended Callback mechanism, the FS can issue a
> notification that a StoreData occurred affecting {FileID, offset,
> length} and the current DV is N without canceling the callback
> registration.  This permits the CM to maintain the cache coherency at a
> lower cost of network traffic when an object is actively being used.
>
> However, when a CM starts or when an object has been idle for more than
> a few minutes, there will be no callback registration.  In that
> situation, a change could have occurred to the file data and the CM will
> be forced to discard all of the cached data if a change did occur.
> Unfortunately, there is no mechanism at present for the CM to ask the FS
> "I need the chunk of data represented by {FileID, offset, length} but I
> currently have data in that range with the following hash value.  Could
> you confirm that my data is current or send me the correct data?"
>
> I have been considering a proposal to implement such an RPC,
> RXAFS_FetchDataWithHash(FID, offset, length, hash).  With such an RPC in
> place, the CM can verify the contents of the cache and avoid large
> amounts of unnecessary traffic.
>
> I am raising this idea here because I believe it is very applicable to
> your project.  The trust model in AFS is between the CM and the FS.
> There is no trust between CMs.  As a result, if a CM obtains data from
> another CM, it needs a low cost mechanism to validate it against the FS.
>
> > So in this project, we need to modify the cache manager to enable
> > interactions with other clients as well.
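
To check my understanding of the data version rule for StoreData described above, here is a minimal sketch in Python. The CachedFile class, the per-chunk validity map and the apply_store_result function are hypothetical names made up for illustration; none of them are OpenAFS structures. The sketch assumes the cache is tracked per chunk and that the StoreData reply carries the new DV.

```python
from dataclasses import dataclass, field


@dataclass
class CachedFile:
    """Hypothetical cache-manager view of one file (not an OpenAFS structure)."""
    data_version: int
    # chunk index -> True while the cached chunk is believed current
    chunks_valid: dict = field(default_factory=dict)


def apply_store_result(cached, written_chunks, new_dv):
    """Update chunk validity after a StoreData reply that reports new_dv.

    Returns True when the DV advanced by exactly one (no race with another CM).
    """
    delta = new_dv - cached.data_version
    if delta < 1:
        raise ValueError("DV must advance after a successful store")
    if delta > 1:
        # Another writer raced with us: everything we hold is suspect...
        for idx in cached.chunks_valid:
            cached.chunks_valid[idx] = False
    # ...except the chunks we just wrote, which are known to be current.
    for idx in written_chunks:
        cached.chunks_valid[idx] = True
    cached.data_version = new_dv
    return delta == 1
```

With a file cached at DV 5, a store that returns DV 6 leaves every chunk valid, while a store that returns DV 9 leaves only the chunks just written marked valid.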
> > In the first part of the project, where the cache manager contacts a
> > fixed set of remote clients, it retrieves the file from any of these
> > clients if their callback for the file is not broken. Since the callback
> > is not broken, it is an indication that the file present on this remote
> > client is the most recent. In case no client has the most recent copy of
> > the file, we can contact the file server to retrieve the data.
>
> That is one approach but not the one I would take.  If the cost of
> reading the data from a local CM is so much cheaper than reading it from
> the FS, the CM can read the data from the other CM (or at least get its
> hash) and then verify it with the file server.
>
> In most file operations, the entire file is not re-written.  Just
> portions of it are and in the case of "append only files" such as log
> files, the data never changes after it is written.  Re-fetching this
> data from the FS every time the DV changes is extremely wasteful.  It is
> much better to obtain it in the cheapest mechanism possible and then
> verify it via a trusted means.
>
> > In the second part of the project, we can allow discovery of peer
> > clients for collaboration. This can be done by modifying the file server
> > to keep access logs of the clients, and if a client requests any data
> > then the corresponding clients in the logs would be returned to the
> > requesting client. In order to maintain cache consistency, the
> > requesting client also establishes a callback guarantee from the file
> > server so that it knows of the modifications in the file irrespective of
> > where it has got the file from.
>
> I would leave the FS out of the peer collaboration and instead permit
> CMs that wish to offer data to do so via Bonjour.
>
> > I have seen the files afs_callback.c, cbqueue.c, dcache.c and server.c
> > and think that these are some of the programs used in cache manager and
> > server-cache manager interactions. Please correct me if I am wrong.
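
The read-from-a-peer-then-verify flow suggested above could look roughly like the following sketch. RXAFS_FetchDataWithHash is only a proposal in the quoted message, so the fetch_data_with_hash method here is a guess at its semantics, the FileServer class is an in-memory stand-in rather than a real interface, and SHA-256 is my assumption since no hash function was named.

```python
import hashlib


class FileServer:
    """In-memory stand-in for the trusted FS; not a real OpenAFS interface."""

    def __init__(self, files):
        self.files = files      # fid -> bytes
        self.bytes_sent = 0     # payload bytes the FS had to ship

    def fetch_data_with_hash(self, fid, offset, length, digest):
        """Guess at the proposed RPC: confirm the hash or send correct data.

        Returns (True, None) if the caller's digest matches the current
        range, otherwise (False, current_data).
        """
        current = self.files[fid][offset:offset + length]
        if hashlib.sha256(current).digest() == digest:
            return True, None
        self.bytes_sent += len(current)
        return False, current


def read_via_peer(fs, peer_cache, fid, offset, length):
    """Take the range from an untrusted peer, then validate it with the FS."""
    # A peer with no copy yields empty bytes, which will fail validation
    # unless the range really is empty on the FS.
    candidate = peer_cache.get((fid, offset, length), b"")
    ok, fresh = fs.fetch_data_with_hash(
        fid, offset, length, hashlib.sha256(candidate).digest())
    return candidate if ok else fresh
```

When the peer copy is current, only a digest crosses the link to the FS; when it is stale or missing, the FS sends the correct range, so the result never depends on trusting the peer.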
>
> In terms of how I would like to see this project structured: before any
> collaboration is implemented I would like to see a generic mechanism
> added to the CM to permit use of a second level cache.  Then once that
> mechanism is in place, a plug-in to that framework can be implemented
> that supports obtaining data from the second level cache which happens
> to be peer CMs.
>
> The benefit of this approach is that the framework for the second level
> cache can be implemented and incorporated into a future OpenAFS release
> without committing us to a particular implementation of the peer to peer
> protocols.  Future research in peer to peer cache sharing can then take
> place at a much lower cost.
>
> Jeffrey Altman
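
As a way of picturing the structure described above, here is a minimal sketch of a generic second level cache hook with a peer plug-in behind it. All names (SecondLevelCache, PeerCache, CacheManager, the fetch_from_fs and verify callables) are hypothetical and chosen only for illustration; the real CM is written in C and its interfaces will differ. Peers are modelled as plain dictionaries, and verification against the FS is passed in as a function so the framework does not commit to any particular protocol.

```python
from abc import ABC, abstractmethod


class SecondLevelCache(ABC):
    """Hypothetical plug-in interface the CM consults on a local miss."""

    @abstractmethod
    def lookup(self, fid, offset, length):
        """Return candidate bytes for the range, or None on a miss."""


class PeerCache(SecondLevelCache):
    """One possible plug-in: ask peer CMs (modelled here as dicts)."""

    def __init__(self, peers):
        self.peers = peers

    def lookup(self, fid, offset, length):
        for peer in self.peers:
            data = peer.get((fid, offset, length))
            if data is not None:
                return data
        return None


class CacheManager:
    def __init__(self, fetch_from_fs, verify, second_level=None):
        self.local = {}                     # (fid, offset, length) -> bytes
        self.fetch_from_fs = fetch_from_fs  # trusted fetch from the FS
        self.verify = verify                # trusted check against the FS
        self.second_level = second_level    # optional plug-in

    def read(self, fid, offset, length):
        """Return (data, source), source being local, second-level or fs."""
        key = (fid, offset, length)
        if key in self.local:
            return self.local[key], "local"
        if self.second_level is not None:
            data = self.second_level.lookup(fid, offset, length)
            # Second level data is untrusted until the FS confirms it.
            if data is not None and self.verify(fid, offset, length, data):
                self.local[key] = data
                return data, "second-level"
        data = self.fetch_from_fs(fid, offset, length)
        self.local[key] = data
        return data, "fs"
```

Because the CM only sees the lookup interface, a different second level cache (for example a shared disk cache) could be swapped in without touching the read path, and a CM built without any plug-in behaves as it does today.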
