Hi -- A couple of months ago a short thread started in relation to the PRs #16593 and #38034 (which also references #42987) on the various problems related to ETags:
http://marc.info/?l=apache-httpd-dev&m=118831732512678&w=2 http://issues.apache.org/bugzilla/show_bug.cgi?id=16593 http://issues.apache.org/bugzilla/show_bug.cgi?id=38034 http://issues.apache.org/bugzilla/show_bug.cgi?id=42987 I'm not about to tackle these all immediately, but I wanted to send out a couple of proposals and find out what problems folks might see with them. 1) Per #38034, it appears that ap_meets_conditions() treats "*" incorrectly. The patch attached to the PR duplicates ap_meets_conditions() for exclusive use by mod_dav. The only changes are to add checks on resource->exists in two places and to use resource->hooks->getetag instead of looking for an ETag header in r->headers_out. In the interest of avoiding code duplication, I wonder if it would be possible to continue to use ap_meets_conditions() by using r->no_local_copy in place of resource->exists, and by setting/unsetting that and an ETag header in r->headers_out in mod_dav as necessary; e.g., prior to calls to ap_meets_conditions(). However, it looks to me as though r->no_local_copy is used in a few places to bypass the ap_meets_conditions() checks entirely (i.e., to never send a 304 response in certain cases). Does anyone have comments on what the intended purpose r->no_local_copy is or was, especially in relation to the language of RFC 2616 sections 14.24 and 14.26 ("if '*' is given and any current entity exists for that resource ...")? Would it be more appropriate and/or required by our backporting rules to add another field to request_rec, like r->exists, to indicate to ap_meets_conditions() that a resource does not exist? Or to use r->notes or conceivably r->finfo for the same purpose? Is it even possible to backport a change to the action of ap_meets_conditions() such that it would start depending on data which third-party callers won't be supplying? Advice is welcome, please! 2) I haven't looked at #16593 in any detail, but I wonder if the change proposed in the patch attached to #38034 where resource->hooks->getetag is used instead of looking for an ETag header in r->headers_out might not be an attempt to deal with this issue. If so, then whatever the eventual solution for #38034 turns out to be will likely resolve this PR as well. 3) Per #42987, it seems as if there's a subtle difference between the requirements for ETags as described in RFC 2616 section 13.3.3 and the way we currently generate them in ap_make_etag(). When a file's mtime is in the past, ap_make_etag() uses the file's inode, length, and/or mtime to create a strong ETag. However, when the mtime is the same as r->request_time, a weak ETag is generated. As the attachment to the PR reveals, though, this isn't really sufficient when certain sequences of requests occur within a single second. The RFC says that a file's mtime "could" be a weak validator, because the file might change more than once in a second. But I think the subtly important issue here is that word "could". If I understand it correctly, the implication is that one has resource for which one wants to generate weak ETags only, because one knows the semantic meaning isn't changing although the content may be. In this particular case, the modification time *might* be appropriate to use as a weak validator (although that seems somewhat unlikely, because non-semantic changes that occurred seconds apart would still cause different weak ETags to be generated, which probably isn't what you'd want in this circumstance). But, in the cases handled by ap_make_etag(), the content and/or the semantic meaning of a resource could always be changing, so weak ETags would seem to be, as the PR says, simply never appropriate. Alas, though, there is probably no way to quickly generate a strong validators like a hash -- which is why we're using inodes, lengths, and mtimes in the first place, of course. So, a modest proposal, which aims to punt the problem to the administrator, and also to resolve the fact that FileETag is provided to configure the actions of ap_make_etag(), while mod_dav_fs's dav_fs_getetag() actions are fixed. The highlights would be: - refactor ap_make_etag() and dav_fs_getetag() to use a common code base as far as possible - make dav_fs_getetag() follow the options set by FileETag (if this is a problem, then introduce a DavFileETag directive, but I'm hoping that could be avoided) - add MD5 and SHA1 as options to FileETag (while obviously much slower, certain applications may require "really strong" validators); for legacy reasons, these would have to be specifically configured and would not be implied by the "All" option - add a nomtime=none|weak|ignore option to FileETag (in the manner of ProxyPass) which specifies what to do when the MTime option is configured but the file's mtime is the same as r->request_time; the default value is "weak" to retain backward compatibility - allow mod_dav_fs to cache ETags in its state directory, either in its properties DBM files or elsewhere, at least when configured to generate "really strong" validators; e.g., lookup in cache whether file's inode, length, and mtime match cached values; if OK and mtime is not current time then return cached hash value; otherwise recalculate hash value and store in cache along with inode, length, and mtime for future comparisons - change ETag generation logic to respect the new hash options (MD5, SHA1); possibly add support for caching, at least in the mod_dav case where the refactored function is called from dav_fs_getetag() - change ETag generation logic (from current ap_make_etag() implementation) such that when the MTime option is not specified, no check against r->request_time is done; currently this is always done and the ETag may be made weak even if MTime is not configured - change ETag generation logic to respect the new nomtime=none|weak|ignore option when MTime is configured; in this case, when the file's mtime equals r->request_time, either generate no ETag (none), generate a weak ETag in the current manner (weak), or generate a strong ETag (ignore) Whew. Thoughts and criticisms? Obviously none of this will get done in the immediate future, but any discussion will, I hope, give me and others some direction on these issues. Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
