Fri 2006-12-08 at 15:03 -0500, [EMAIL PROTECTED] wrote:

> To ONLY ever use ETag as the end-all-be-all for variant
> identification is, itself, a mistake.

Well, this area of the HTTP specs is pretty clear in my eyes, but then
I have read it up and down too many times, unwinding the tangled web
which is found in there.

An entity (including encoding) is identified by request URI +
Content-Location.

A specific version of an "entity" is identified by its unique ETag.

Vary: tells which headers the server used in server-driven negotiation
of which entity to respond with. Accept-Encoding is one input to this.

A strong ETag must be unique among all variants of a given URI, that
is, all different forms of entities that may reside under the URI and
all their past and future versions.

A weak ETag may be shared by two variants/versions if and only if they
can be considered semantically equivalent and mutually exchangeable at
the HTTP level with no semantic loss. Examples are different levels of
compression, or minor changes of negligible or no importance to the
semantics of the resource (the hit counter example in the specs).

> Both pieces of software ( SQUID and Apache ) need just a
> little more code to finally "get it right".

It's correct that the current Squid implementation is not flawless.
Most notably it has very poor handling of cache invalidations at the
moment.

> Don't forget about "Content-Length", either.
> If 2 different responses for the same requested entity come
> back with 2 different Content-Lengths and there is no "Vary:"
> or "ETag" then regardless of any other protocol semantics the
> only SANE thing for any caching software to do is to recognize
> that, assume it is not a mistake, and REPLACE the existing
> entity with the new one.

Caches by nature tend to replace what they have with what they get.

> Yea.. sure... you might get a lot of cache bounce that way but
> at least you are returning a fresh copy.

How would Content-Length changes cause cache bouncing?

> It is not possible for 2 EXACTLY identical representations of the
> same requested entity to have different content lengths.
> If the lengths are different, then SOMETHING is different with
> regards to what you have in your cache.

Yes, but when would this be seen? We only get the ETag from Apache,
not the Content-Length. The specs forbid Apache from sending the
Content-Length or other entity headers in 304 responses, partly to
make sure entities do not get corrupted by errors in the origin
server's implementation of server-driven content negotiation.

> No protocol ( sic: set of rules ) can ever cover all the realities.
> ( Good ) software knows how to make "common sense"
> as well.

Indeed, and that is why we are going slow on implementing the more
advanced features of the specs. But violating MUST level protocol
requirements is not "common sense". And if you actually follow the
specs, these parts do make great sense once you get the picture that
ETags MUST be unique for all entity versions of a given URI.

The only poor part I have seen in this area of the specs is that the
If-None-Match condition is perhaps a bit blunt: it only tells the end
result, the ETag of the valid response entity of a negotiated
resource, not how the server came to that conclusion. This adds a few
more roundtrips to the origin than would otherwise be required, only
to figure out that "Content-Language: en" is OK both for
"Accept-Language: en" and "Accept-Language: en, sv", but that's about
it. (Yes, I intentionally avoided Accept-Encoding here to illustrate
the point; the mechanism is exactly the same, however.)

RFC 2616 3.11 Entity Tags

   A "strong entity tag" MAY be shared by two entities of a resource
   only if they are equivalent by octet equality.

   An entity tag MUST be unique across all versions of all entities
   associated with a particular resource. A given entity tag value MAY

See also 14.26 If-None-Match, and numerous other references to ETag.

I can bombard you with long chains of supporting claims from the RFC
if you like, depending on which parts of the equation you feel are
loosely connected.
Just tell me which part you don't trust and I'll happily help you see
the light.

a) That identity and gzip content-encoding of the same resource
   represent different entities of the same resource.

b) That different entities of the same resource MUST have different
   (strong) ETags.

c) That gzip and identity encoding are not semantically equivalent.

d) That the weak ETag W/"X" is semantically equivalent to the strong
   ETag "X" with the same quoted value.

Regards
Henrik
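
For readers who want points a), b) and d) in concrete form, here is a
minimal Python sketch. It is not taken from Squid or Apache. The
helper names (strong_etag, parse_etag, strong_match, weak_match) are
hypothetical, and hashing the body with SHA-1 is only one assumed way
of producing a tag that changes whenever the octets change. The two
comparison helpers are meant to mirror the strong and weak comparison
functions of RFC 2616 13.3.3.

```python
import gzip
import hashlib


def strong_etag(body: bytes) -> str:
    """Hypothetical strong ETag: a quoted hash of the exact octets sent."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def parse_etag(tag: str):
    """Split an entity tag into (is_weak, quoted_opaque_value)."""
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: identical values and neither tag is weak."""
    weak_a, value_a = parse_etag(a)
    weak_b, value_b = parse_etag(b)
    return not weak_a and not weak_b and value_a == value_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: identical values, the W/ prefix is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]


# Points a) and b): the identity and gzip encodings of one resource are
# different octet sequences, so an octet-based strong ETag differs.
identity_body = b"<html>hello</html>" * 50
gzip_body = gzip.compress(identity_body)

etag_identity = strong_etag(identity_body)
etag_gzip = strong_etag(gzip_body)
print(etag_identity != etag_gzip)        # True

# Point d): W/"X" and "X" match under weak comparison only.
print(weak_match('W/"X"', '"X"'))        # True
print(strong_match('W/"X"', '"X"'))      # False
print(strong_match('"X"', '"X"'))        # True
```

A cache that keys stored variants on the strong tag would therefore
keep the gzip and identity responses apart, while the weak comparison
is the one that treats W/"X" and "X" as naming exchangeable entities.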
signature.asc
Description: This is a digitally signed message part