Yep, this is what's happening in the trace Achim provided, too: we store the chunk again for every 4k we write. I'm not sure how that's possible unless something is closing the file a lot, or the cache is full of stuff we can't kick out.


Actually, it's entirely possible. Here's how it all goes wrong...

When the cache is full, every call to write results in us attempting to empty the cache. On Linux the page cache means that we only see a write once for each 4k page. However, our attempts to empty the cache are a little pathetic: we just attempt to store all of the chunks of the file currently being written back to the fileserver. If it's a new file there is only one such chunk - the one that we are currently writing. Because chunks are much larger than pages, and a dirty chunk is flushed to the server in its entirety, we end up sending the same data over and over again. The process goes something like this (there's a small sketch of it after the list):

*) Write page at 0k, dirties first chunk of file.
*) Discover cache is full, flush first chunk (0->1024k) to the fileserver.
*) Write page at 4k, dirties first chunk of file.
*) Cache is still full, flush first chunk to the fileserver.
*) Write page at 8k, dirties first chunk of file.
... and so on.
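
To make the amplification concrete, here's a tiny self-contained C sketch of the pattern above. It isn't OpenAFS code - cache_full, store_chunk() and the sizes are stand-ins for the real cache manager - it just counts how much data a full cache makes us re-send while writing the first 16k of a new file.

    #include <stdio.h>

    #define PAGE_SIZE  (4 * 1024)

    static int  cache_full   = 1;   /* assume the cache is already full     */
    static long bytes_stored = 0;   /* total data shipped to the fileserver */

    /* Pretend to store the whole dirty region of the first chunk (0 -> end). */
    static void store_chunk(long dirty_end)
    {
        bytes_stored += dirty_end;
        printf("store chunk 0 -> %ldk (total sent so far: %ldk)\n",
               dirty_end / 1024, bytes_stored / 1024);
    }

    int main(void)
    {
        long dirty_end = 0;     /* how far into the first chunk we've written */

        /* Write the first 16k of a new file, one 4k page at a time. */
        for (long off = 0; off < 16 * 1024; off += PAGE_SIZE) {
            dirty_end = off + PAGE_SIZE;   /* the page write dirties the chunk */
            if (cache_full)
                store_chunk(dirty_end);    /* "empty" the cache: store the chunk */
        }

        printf("wrote 16k of file data, sent %ldk to the fileserver\n",
               bytes_stored / 1024);
        return 0;
    }

Writing n pages into a single chunk this way stores O(n^2) bytes, which is consistent with the 4k-granularity stores in Achim's trace.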

The problem is that we don't make good decisions about what to flush when the cache is full. However, any change to flush less-active items instead would be a behaviour change - in particular, on a multi-user system it would mean that one user could break write-on-close for other users simply by filling the cache.
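
For what it's worth, that behaviour change would look roughly like the hypothetical victim selection below. None of these names exist in the tree; it only illustrates why flushing "less active" chunks lets one user's cache pressure push another user's still-open, dirty data to the fileserver, i.e. break write-on-close for them.

    #include <stdio.h>

    struct chunk {
        int  owner_uid;   /* user whose file this chunk caches        */
        int  dirty;       /* needs storing back to the fileserver     */
        int  file_open;   /* owning file not yet closed by its writer */
        long last_used;   /* pseudo-LRU timestamp                     */
    };

    /* Prefer the least recently used chunk that isn't the one being written. */
    static struct chunk *pick_victim(struct chunk *c, int n, struct chunk *busy)
    {
        struct chunk *victim = NULL;
        for (int i = 0; i < n; i++) {
            if (&c[i] == busy)
                continue;               /* never evict the chunk we're writing */
            if (!victim || c[i].last_used < victim->last_used)
                victim = &c[i];
        }
        return victim;
    }

    int main(void)
    {
        struct chunk cache[] = {
            { .owner_uid = 1000, .dirty = 1, .file_open = 1, .last_used = 10 },
            { .owner_uid = 1001, .dirty = 0, .file_open = 0, .last_used = 50 },
            { .owner_uid = 1001, .dirty = 1, .file_open = 1, .last_used = 90 },
        };
        /* uid 1001 is filling the cache by writing to the last chunk. */
        struct chunk *victim = pick_victim(cache, 3, &cache[2]);

        if (victim->dirty && victim->file_open)
            printf("flushing uid %d's dirty chunk stores it before close()\n",
                   victim->owner_uid);
        return 0;
    }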

Cheers,

Simon.
