Jim,
Can I ask why the filename is important for the cache key? With the
cache delay, the inode/dev + *time + size should do it all.
In fact, I finally understood the difference between mtime and ctime, if
any change is made, it should be the change to ctime.
Why ctime? ctime is unique in that it isn't something that can be set by
user level programs. It changes whenever the content of the file changes
or permissions, owners, or any of the metadata of the file.
So, for instance, if someone replaces a file with an identical file, the
ctime would still change. If you check the ctime, you can also skip
checking the size.
But none of this has to do with the filename. On Unix, filenames are
especially squishy. both stat and open follow symbolic links, and you
could therefore use a symlink to point to different files over time, but
the files could have identical mtime, atime, size, owner, etc. using
stat/open and comparing with what is in cache based upon the filename
would never detect this slight of hand. This is why the inode is most
important on unix. On windows, you can't do this, so filenames are safe.
My recommendation for changes are these:
1. use John's ingenious 1-2 second delay, optionally allow config of the
cache delay.
2. base the age for the above on the ctime, not the mtime. ctime is
always younger or the same age as mtime, and covers changes to metadata,
and is immune from easy modification.
Maybe: remove the fastpath config options from the basic config file, if
it is even there, other example configs could be set to cache = off.
Happy Weekend everyone.
tom jackson
On Fri, 2008-08-22 at 13:18 -0400, Jim Davidson wrote:
Ah -- I (finally) understand... I must have missed the detail re:
serialization in message #30 out of #60 or so
So, this clarifies to me:
-- cache by filename key is correct and good for most cases and should
be on by default
-- the grace period is a clever solution for the rapid-changing,
same filename case you described and deserves to be on by default
-- ns_returnfile shouldn't use the cache but already does -- some
config and/or command flags can be added to toggle the behavior
I'll update the code with the options above.
-Jim
On Aug 21, 2008, at 11:27 PM, John Caruso wrote:
On Thursday 02:34 PM 8/21/2008, Jim Davidson wrote:
To clarify one point: There is no technical solution to creating
temp
files with the same name and avoiding the race condition without
additional synchronization.
To clarify as well: the original code didn't involve a race
condition--it was effectively serialized, as though it were like
this snippet:
foreach object $objects {
eval exec /some/external/program --output-file $tempfile --
object $object
ns_returnfile 200 text/plain $tempfile
}
(As I mentioned to you, this was basically a batch process driven by
a client-side Java applet making sequential HTTP requests to an
AOLserver-driven API web server, one transaction at a time, with the
results being returned by ns_returnfile on the server. Also, the
temp file in question was in a secure directory.)
So the bug can (and did) manifest itself with serialized access.
So, here's what I'd suggest:
-- Cache by filename key should be the default. This is technically
the correct fix to enable temporary, uniquely named files, to be
returned via ns_returnfile.
-- John's grace period code is a clever optimization if fastpath is
being used in this way and could also be an option, default off.
Again, this wouldn't have resolved Arena's initial problem; the
original code would still have hit the bug, and it would have been
just as difficult to detect that that was happening (though slightly
easier to debug). That's why I'd recommend having the mtime
workaround code active with a default of 1--otherwise people running
a default config of AOLserver will still be open to the same issue.
(That's my only stake in this, BTW; Arena is already using the mtime
fix and will continue to do so, but I'd really rather not have
someone else run into this issue in the future.)
In thinking about it today I realized that it's useful to think
about the four scenarios in which the bug can currently occur (which
I believe partition the bug space):
1) Monotonically increasing time with a different filename
2) Monotonically increasing time with the same filename
3) Time travelling with a different filename
4) Time travelling with the same filename
(Time travelling here means mucking with the mtime artifically, a
la rsync, and filename means fully-qualified filename.)
The mtime workaround resolves scenarios 1 and 2, and using the
filename as the cache key resolves scenarios 1 and 3. Nothing
suggested so far resolves scenario 4--and in fact I don't think it's
possible to resolve scenario 4 short of a