John Caruso wrote:

I'd agree that that's the intent, but the caching is hidden within ns_returnfile and it's not clear at all from the user's perspective that this alligator is lurking in the swamp. Using ns_returnfile in this way may not be the best approach in any particular situation, but it's nonetheless a completely valid usage and isn't contraindicated in any AOLserver docs I've seen.

This then is the real fix: mention it in the docs. I put a blurb on the appropriate wiki pages; feel free to suggest something better :)
The docs in the distribution should be updated too.

It happens to be used in ns_returnfile since that is the normal use case. On Unix the fastpath cache is keyed off the dev/inode, probably to keep the hash key shorter. Windows doesn't have device and inode numbers, so it uses the filename as the hash key and wouldn't run into this problem.
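
As a rough illustration of the two keying schemes described above, here is a minimal Python sketch. It is not AOLserver code; cache_key and demo are hypothetical names, and the only thing it shows is how a dev/inode key and a filename key behave differently when a file is renamed.

```python
import os
import tempfile


def cache_key(path, by_name=False):
    """Return a hypothetical cache key for path.

    by_name=False mimics keying on (device, inode);
    by_name=True mimics keying on the file name.
    """
    if by_name:
        return os.path.abspath(path)
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def demo():
    """Rename a file and report whether each kind of key survived."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "a.html")
        new = os.path.join(d, "b.html")
        with open(old, "w") as f:
            f.write("hello")

        inode_key_before = cache_key(old)
        name_key_before = cache_key(old, by_name=True)

        # A rename within one directory keeps the same device and inode.
        os.rename(old, new)

        inode_key_after = cache_key(new)
        name_key_after = cache_key(new, by_name=True)

    return (inode_key_before == inode_key_after,
            name_key_before == name_key_after)
```

With dev/inode keying the entry follows the inode regardless of the name; with filename keying it follows the name regardless of which file currently sits behind it.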

No, it can still easily run into this problem--it's just that the file name needs to be the same in both cases (which actually did apply in my client's case, and caused confusion in the early debugging of the problem, since the assumption was that using the same file name and/or path name was the source of the problem).

The system needs to be free to do some things to improve performance with the understanding that the user needs to be aware of those things or risk bad behaviour. I wouldn't call it an unreasonable assumption that a file with the same name (and same modtime etc) is the same file. You can run into a very similar problem with NFS (i.e., attribute caching causing a modified file to appear not so) and people have learned to deal with that.

- making ns_unlink flush the entry from the fastpath cache

Nope, since the file can be removed via (e.g.) exec rm.

True, but I'd still put this in the "system needs to be able to ..." category above. The system does some things and the developer should be aware of those things.

I don't think your suggestion of waiting for cache entries to age a second or two would work well; it just moves the race condition around and adds a whole lot of disk activity when a busy server is warming up. Static files might be read a few dozen times instead of once.

Nope, not at all. The only files that would get read more than once would be those that were served within one second of being generated--which wouldn't apply to any content that fits the definition of "static".

It would work in your exact case, where the file is always removed immediately after being generated and served. But if not, it would still come up with the wrong answer.

13:50:21 - create file
13:50:21 - serve file (gets cached)
13:50:21 - delete file
13:50:21 - create file again (reuses inode)
... time passes ...
13:55:11 - serve file

In this case the file modtime is more than a few seconds old, but the cached mtime, inode, etc. still match the file on disk, so the stale cache entry would get delivered.
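
The timeline above can be replayed with a toy model. This is a Python sketch under stated assumptions, not the actual fastpath implementation: the Disk and FastpathLikeCache classes are hypothetical, the cache is assumed to key on (dev, inode) and to treat an entry as valid when mtime and size both match, and the toy disk always hands back inode 42 to stand in for inode reuse.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    dev: int
    ino: int
    mtime: int  # whole seconds, like a one-second-resolution timestamp
    size: int


class Disk:
    """Toy filesystem that always reuses the same inode number."""

    def __init__(self):
        self.files = {}

    def create(self, path, data, now):
        st = Stat(dev=1, ino=42, mtime=now, size=len(data))
        self.files[path] = (st, data)

    def delete(self, path):
        del self.files[path]


class FastpathLikeCache:
    """Assumed validation rule: same (dev, ino), same mtime, same size."""

    def __init__(self, disk):
        self.disk = disk
        self.entries = {}  # (dev, ino) -> (mtime, size, data)

    def serve(self, path):
        st, data = self.disk.files[path]
        key = (st.dev, st.ino)
        hit = self.entries.get(key)
        if hit is not None and hit[0] == st.mtime and hit[1] == st.size:
            return hit[2]  # cache hit: the disk content is never read
        self.entries[key] = (st.mtime, st.size, data)
        return data


def replay_timeline():
    t = 13 * 3600 + 50 * 60 + 21           # 13:50:21
    disk = Disk()
    cache = FastpathLikeCache(disk)

    disk.create("/tmp/report", "first", now=t)   # create file
    first = cache.serve("/tmp/report")           # serve file (gets cached)
    disk.delete("/tmp/report")                   # delete file
    disk.create("/tmp/report", "other", now=t)   # create again, same second
    # ... time passes; the serve below stands for 13:55:11 ...
    later = cache.serve("/tmp/report")
    return first, later
```

In this model replay_timeline() hands back the old content on the second serve, because the new file has the same inode, the same one-second mtime, and the same length as the one that was cached.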

There is also at least one clever optimization where "static" content does get served within a second of being created: the 404 page is used to generate something like an image from something like a database and write it to a file, which is subsequently served by fastpath.

So this is actually a fairly non-intrusive fix. The main limitation is that it relies on the file timestamps and the server timestamps being synchronized, which may not always be true. But I can't think of a better solution. Simply put, fastpath caching is inherently broken because it's not possible to guarantee that the file in question really should be served from cache (again, short of a cache-defeating checksum).

The same can be said about nearly any caching system: it is unable to handle changes in the data that happen outside of the cache's control or knowledge. This is just the bargain you make when you use a cache.

But my point here wasn't to ask about potential workarounds but to highlight the issue itself, since I haven't seen it mentioned before.

I think your highlighting it is most of the fix. From there, get the caveat inserted into the documentation and the knowledge into the community so that the next person who runs into this problem will have an easier, or at least less frustrating, time solving it.

-J

