On Wed, 25 Jan 2012 14:23:01 -0500 chas williams - CONTRACTOR <[email protected]> wrote:
> > Well, that's the issue. Maybe someone already does rely on this
> > inconsistent behavior; taking it away unconditionally would break them
> > (which is why I wasn't suggesting doing that).
>
> how does one rely on something that is inconsistent?  if the behavior
> is suddenly consistent and their stuff breaks that could be seen as
> already within the same inconsistent behavior.  eventually they are
> going to get bitten.

Well, someone can 'rely' on something working 99% of the time. It's
pretty noticeable if that suddenly drops to 0%. But even not counting
that... it's inconsistent wrt caching behavior, so in certain
environments and workloads it is possible for something to work over
99% of the time or even 100% (at a theoretical level), if you can make
certain guarantees about cache access.

As a contrived example that will generally "always" work: say we have
some batch job processing system that runs something distributed over a
series of machines. Clients A and B read some temporary config file X
in AFS that says what to do. Say that the workload always involves
client A finishing exactly 1 second before client B, and config file X
says to delete the config file when the job is done. Assume that the
workloads for A and B do not involve touching AFS at all except to read
config file X.

So, A will finish the job and unlink X. B will finish the job one
second later, try to read X to see what to do when the job is finished,
and also try to unlink it. It will get an error, but let's say the
software ignores unlink() errors (which is not uncommon).

With the current behavior, this will pretty much always work if the job
doesn't take too long, since X will always be in B's cache, and so B
can read it after the job is done. If we change things so that
callbacks to X are broken when A unlinks X, the read from B will
"always" fail.

I'm not suggesting that such a system is particularly well-designed or
anything... but I can imagine such a thing existing.

> > Isn't this the same as the use case for the POSIX semantics on a
> > local fs? Someone might be still reading the data (data,
> > configuration, ...). Maybe you deleted a dso for a library that a
> > running process is linked to.
>
> i forgot about dso's.  upgrading a dso is about the only valid reason i
> think for this behavior.

In general? I think it's a very useful behavior for temporary file
storage that doesn't linger. If you make a temporary file and just
unlink it when you're done, it'll stick around if the process is killed
halfway through what it's doing. If you open, and then unlink, it goes
away if the process is killed even if you don't clean up properly. (The
OpenAFS salvager was semi-recently made to use this kind of
functionality for its temp storage, iirc.)

For the case in AFS, it's a little weirder to be sharing such files
across machines, but it's possible.

-- 
Andrew Deason
[email protected]
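
P.S. To be concrete about the open-then-unlink pattern I mean: a
minimal sketch, assuming plain POSIX semantics on a local fs (the /tmp
path and names are just illustrative, nothing OpenAFS-specific):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char path[] = "/tmp/scratchXXXXXX";  /* illustrative temp path */
    int fd = mkstemp(path);
    if (fd < 0) {
        perror("mkstemp");
        return 1;
    }

    /* Remove the name right away; the open fd keeps the data alive,
     * so nothing lingers on disk if the process dies later on. */
    if (unlink(path) < 0) {
        perror("unlink");
        close(fd);
        return 1;
    }

    const char msg[] = "scratch data\n";
    if (write(fd, msg, sizeof(msg) - 1) < 0)
        perror("write");

    /* ... do work; a crash or kill here leaves no stale temp file ... */

    /* The data is still readable through the same descriptor. */
    char buf[64];
    if (lseek(fd, 0, SEEK_SET) == 0) {
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n >= 0) {
            buf[n] = '\0';
            printf("still readable after unlink: %s", buf);
        }
    }

    close(fd);  /* the storage is actually reclaimed here (or on exit) */
    return 0;
}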
