On Mon, 6 Sep 2010, Graham Leggett wrote:

<snip>
For those who have forgotten, that's what we do in our large-file-caching patchset for mod_disk_cache (hidden as an attachment to https://issues.apache.org/bugzilla/show_bug.cgi?id=39380, but I should really get around to uploading an up-to-date version that applies cleanly to the current 2.2 release). Some of the solutions there aren't really applicable to httpd proper (mostly workarounds for missing infrastructure), but some ideas are rather sane (like writing the header files in a single go with an iovec of null-terminated strings instead of CRLF-delimited text that needs to be parsed). Oh, and the design caters for a shared data cache (FTP and rsync access use the same cache), which isn't really a priority for something in httpd proper.

Given that the make-cache-writes-atomic problem requires a change to the data format, it may be useful to look at this now, before v2.4 is baked, which will happen soon.

Indeed.

While we're at it, it might make sense to replace arch-specific data types like int and apr_size_t with fixed-width types such as apr_int32_t. Most people have probably made the 32/64-bit transition already, though, so it might be a non-issue.
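
To make the fixed-width point concrete, here is a minimal sketch of an on-disk structure whose layout doesn't depend on the build architecture. The struct and field names are made up for illustration (they are not the actual mod_disk_cache format), and I use the <stdint.h> types so it compiles without APR; in httpd the apr_uint32_t/apr_uint64_t equivalents would be the natural choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header; names are illustrative only. */
typedef struct {
    uint32_t format_version; /* bumped whenever the layout changes        */
    uint32_t header_len;     /* bytes of header strings after this struct */
    uint64_t body_size;      /* expected size of the matching body file   */
} disk_header_t;

/* Same size on 32-bit and 64-bit builds, unlike int/size_t members. */
static_assert(sizeof(disk_header_t) == 16, "unexpected padding in disk_header_t");
```

This only pins down sizes; byte order is still whatever the host uses, which is fine as long as the cache isn't shared between machines of different endianness.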

Another good thing to have would be the filename of the matching data/body file. httpd's mod_disk_cache hashes this from the URL, but there may be smarter ways to pick it at cache time, which requires the resulting filename to be stored (for example, we use dev/inode on plain files to reduce data duplication when caching DVD images with dozens of known URLs). The size of that file is also good to have: on a mismatch the cache is out of sync or corrupted (unless the file is still being written, but then we know enough to start answering the query from cache).
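
A rough sketch of the dev/inode idea, under my own assumptions: the function name and the "dev-inode.body" naming scheme are hypothetical, not what the patchset actually uses. The point is only that every URL (or hard link) resolving to the same plain file ends up with the same cache body name.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Derive a cache body filename from the device/inode pair of a plain file.
 * Returns 0 on success, -1 if path is not a regular file or out is too small. */
static int body_name_from_inode(const char *path, char *out, size_t outlen)
{
    struct stat st;
    int n;

    assert(path != NULL && out != NULL);
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return -1;

    n = snprintf(out, outlen, "%llx-%llx.body",
                 (unsigned long long)st.st_dev,
                 (unsigned long long)st.st_ino);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Since the name can no longer be recomputed from the URL alone, it has to be stored in the header file, which is the reason for wanting the field in the first place.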

We also save r->filename so we can fill it in when replying to a query (I think to make logging of filenames work).

How much of a performance boost is the use-null-terminated-strings?

As CPU is cheap nowadays, not much in end-to-end performance, but the logic of figuring out whether a header file is correct/complete becomes much easier when you construct the entire .header-file in an iovec, place the total header length in the on-disk structure, and then write it out.
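
Roughly what I mean, as a sketch using plain POSIX writev() rather than the APR calls so it stands alone. The struct, function name and MAX_STRINGS limit are hypothetical; the real patch differs in the details.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_STRINGS 64  /* illustrative limit, well below typical IOV_MAX */

/* Hypothetical fixed-size part of the header file. */
typedef struct {
    uint32_t format_version;
    uint32_t header_len;   /* bytes of string data following this struct */
} hdr_fixed_t;

/* Write the fixed struct plus n NUL-terminated strings with one writev().
 * Returns 0 on success, -1 on error or short write. */
static int write_header_file(const char *path, const char *const *strs, int n)
{
    struct iovec iov[MAX_STRINGS + 1];
    hdr_fixed_t fixed;
    size_t total = 0;
    ssize_t written;
    int fd, i;

    assert(path != NULL && strs != NULL);
    if (n < 0 || n > MAX_STRINGS)
        return -1;

    for (i = 0; i < n; i++) {
        size_t len = strlen(strs[i]) + 1;      /* keep the trailing NUL */
        iov[i + 1].iov_base = (void *)strs[i];
        iov[i + 1].iov_len = len;
        total += len;
    }
    if (total > UINT32_MAX)
        return -1;

    /* The total length goes into the structure before anything is written. */
    fixed.format_version = 1;
    fixed.header_len = (uint32_t)total;
    iov[0].iov_base = &fixed;
    iov[0].iov_len = sizeof(fixed);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    written = writev(fd, iov, n + 1);
    if (close(fd) != 0)
        return -1;
    return written == (ssize_t)(sizeof(fixed) + total) ? 0 : -1;
}
```

Header names and values simply alternate in strs, so no CRLF or colon handling is needed on either side.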

Reading it in then becomes: read the main data structure, then read however many bytes the structure indicates as headers. If you get more or less than the data structure says, something is wrong, and you can either retry (if the header seems to be in the middle of being written and the iovec size is too small so it takes multiple writes, but as the current mod_disk_cache code uses temporary files that's a non-issue) or discard it.
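
The read side could look something like the sketch below. Again the names (hdr_fixed_t, read_header_file, MAX_HEADER_LEN) are invented for the example, and I assume the header file contains nothing but the fixed structure followed by the strings.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_HEADER_LEN (1024u * 1024u)  /* illustrative sanity limit */

/* Hypothetical fixed-size part of the header file. */
typedef struct {
    uint32_t format_version;
    uint32_t header_len;   /* bytes of string data following this struct */
} hdr_fixed_t;

/* Read up to n bytes, stopping early only at EOF. Returns bytes read or -1. */
static ssize_t read_full(int fd, void *buf, size_t n)
{
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, (char *)buf + got, n - got);
        if (r < 0)
            return -1;
        if (r == 0)
            break;
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Returns a malloc()ed buffer of *len_out bytes, or NULL if the file is
 * shorter or longer than the structure claims (caller retries or discards). */
static char *read_header_file(const char *path, uint32_t *len_out)
{
    hdr_fixed_t fixed;
    char *buf = NULL;
    char extra;
    int fd;

    assert(path != NULL && len_out != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    if (read_full(fd, &fixed, sizeof(fixed)) != (ssize_t)sizeof(fixed))
        goto fail;
    if (fixed.header_len == 0 || fixed.header_len > MAX_HEADER_LEN)
        goto fail;
    buf = malloc(fixed.header_len);
    if (buf == NULL)
        goto fail;
    if (read_full(fd, buf, fixed.header_len) != (ssize_t)fixed.header_len)
        goto fail;                          /* less than promised */
    if (buf[fixed.header_len - 1] != '\0')
        goto fail;                          /* last string not terminated */
    if (read_full(fd, &extra, 1) != 0)
        goto fail;                          /* more than promised */

    close(fd);
    *len_out = fixed.header_len;
    return buf;

fail:
    free(buf);
    close(fd);
    return NULL;
}
```

The whole integrity check is three comparisons, with no parsing involved before you know whether the file is usable.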

The current text-ish .header files offer no way of verifying the integrity of the header file, and store_table()/read_table() carry quite a lot of complexity when just handling the null-terminated strings as-is would do nicely.
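
For comparison, walking a buffer of null-terminated strings as key/value pairs takes only a few lines. This is a sketch with invented names (walk_pairs, pair_cb), assuming keys and values simply alternate; it is not a drop-in replacement for read_table().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback invoked once per key/value pair. */
typedef void (*pair_cb)(const char *key, const char *val, void *ctx);

/* Walk a buffer of alternating NUL-terminated keys and values.
 * Returns the number of pairs, or -1 if the buffer is malformed. */
static int walk_pairs(const char *buf, size_t len, pair_cb cb, void *ctx)
{
    size_t off = 0;
    int pairs = 0;

    assert(buf != NULL || len == 0);
    if (len == 0)
        return 0;
    if (buf[len - 1] != '\0')
        return -1;              /* guarantees strlen() stays inside buf */

    while (off < len) {
        const char *key = buf + off;
        const char *val;

        off += strlen(key) + 1;
        if (off >= len)
            return -1;          /* key without a value */
        val = buf + off;
        off += strlen(val) + 1;

        if (cb != NULL)
            cb(key, val, ctx);
        pairs++;
    }
    return pairs;
}
```

In httpd the callback would presumably just add each pair to an apr_table_t, with the strings usable in place.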

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     ni...@acc.umu.se
---------------------------------------------------------------------------
 After three days of intense pain, the snake died. * Riker
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
