Re: AW: AW: Client authorization against LDAP using client certificates
On fre, 2008-07-04 at 15:43 +0200, Müller Johannes wrote: To support more than one authentication method at a time we would have to do fallback like AuthType Cert, Basic. Or for that matter AuthType Digest, Basic. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Adding purge/invalidation to mod_cache
On fre, 2008-05-30 at 11:06 +0200, Colm MacCárthaigh wrote: Yep, Squid will delete all variations of an entity when you use Accept: */*, that isn't easy with our current approach, but I'll see what I can do - it would be nice. Squid isn't quite that good on purging variants either.. Regards Henrik
Re: bugs/inappropriate coding practice discovered by interprocedural code analysis for version 2.2.8 of Apache
On tor, 2008-05-15 at 21:00 +0200, Ruediger Pluem wrote: \apache\src\log.c(682):apr_file_puts(errstr, logf); I see nothing reasonable that we can do in this situation but ignoring the error. syslog? Regards Henrik
Re: mod_deflate Vary header tweak
On tis, 2008-04-29 at 09:42 +0200, André Malo wrote: Just to be exact - it *might* vary, depending on how no-gzip is set. But then most likely not based on Accept-Encoding but other headers such as User-Agent or the source IP... In any event I fully agree that it's then the responsibility of whatever set the no-gzip flag to also add a proper Vary attribution to the response. Only if no-gzip is set unconditionally should Vary not be added by the one setting no-gzip. But it's acceptable (even if not 100% correct) to not add Vary when setting no-gzip if one then accepts that the uncompressed variant may get sent to more clients by downstream cache servers. Regards Henrik
Re: Expect: non-100 messages
fre 2008-04-04 klockan 00:01 +0200 skrev Julian Reschke: I think it's clear that a proxy that sees Expect: foobar will have to immediately fail with status 417 if it doesn't know what foobar means. Yes, that's a MUST level requirement in 14.20 Expect.. third paragraph, and further clarified with another MUST level requirement in the fifth paragraph.. But older versions of HTTP/1.1 did not specify Expect and implementations based on those versions will just pass it through as any other extension header, but I somehow doubt the proxy vendors are stuck at that.. Regards Henrik
Re: Pre-release test tarballs of httpd 1.3.40, 2.0.62 and 2.2.7 available
On sön, 2008-01-06 at 01:20 +, Nick Kew wrote: Do you mean as in tcpdump -x? I've uploaded a pair of dumps (one of client-proxy, the other of proxy-server) at the same location. tcpdump -p -i any -s 1600 -w traffic.pcap port 80 Regards Henrik
Re: thoughts on ETags and mod_dav
On sön, 2007-12-30 at 12:54 +0100, Werner Baumann wrote: Is this true? Is there no way for a cache to uniquely identify variants but using the cache validator? Isn't this a flaw in the protocol? The Content-Location also works as a variant identifier, but requires that each variant does have a unique direct URI bypassing negotiation and is therefore not always applicable (i.e. mod_deflate). Regards Henrik
Re: svn commit: r593778 - /httpd/httpd/branches/2.2.x/STATUS
On sön, 2007-11-11 at 12:44 +, Nick Kew wrote: Note incoming c-l much earlier in the request processing cycle, and use that for ap_http_filter? This would make sense for apps that don't require c-l. Except that you would then need to buffer the whole message to compute the length.. Another way to deal with such cases is to respond with 411 before 100 Continue, and let the client compute C-L.. This is what the RFC recommends if it's known the next-hop is HTTP/1.0. Regards Henrik
Re: Content-Type: application/x-www-form-urlencoded and Content-length
On tis, 2007-10-16 at 18:26 +0200, jean-frederic clere wrote: I thought that a POST for a form returning Content-Type: application/x-www-form-urlencoded must have a Content-Length (and no Transfer-Encoding: chunked). But I can't find this in any documentation about it. It's either Content-Length or chunked. One MUST be used. Content-Length is strongly preferred if possible, as many servers, proxies and application gateways can't handle chunked requests, but it's not possible if the POSTer wants to apply gzip compression or another transfer encoding to the request. Regards Henrik
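As a sketch of the two framing options described above (a hand-rolled illustration of RFC 2616 message framing, not any particular client library; the request target, host name and chunk size are made up):

```python
def frame_with_content_length(body: bytes) -> bytes:
    # Preferred framing: announce the exact byte count up front.
    return (b"POST /form HTTP/1.1\r\n"
            b"Host: example.com\r\n"
            b"Content-Type: application/x-www-form-urlencoded\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n" + body)

def frame_with_chunked(body: bytes, chunk_size: int = 8) -> bytes:
    # Fallback framing when the final length is not known up front
    # (e.g. the sender compresses the body on the fly).
    headers = (b"POST /form HTTP/1.1\r\n"
               b"Host: example.com\r\n"
               b"Content-Type: application/x-www-form-urlencoded\r\n"
               b"Transfer-Encoding: chunked\r\n\r\n")
    chunks = b""
    for i in range(0, len(body), chunk_size):
        piece = body[i:i + chunk_size]
        # Each chunk: hex size, CRLF, data, CRLF.
        chunks += format(len(piece), "x").encode() + b"\r\n" + piece + b"\r\n"
    return headers + chunks + b"0\r\n\r\n"  # zero-size chunk terminates the body
```

Either message is self-delimiting; the difference is only whether the total length must be known before the first byte of the body is sent.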
Re: thoughts on ETags and mod_dav
On fre, 2007-10-12 at 00:25 -0400, Chris Darroch wrote: RFC 2616 section 14.24 (and 14.26 is similar) says, If the request would, without the If-Match header field, result in anything other than a 2xx or 412 status, then the If-Match header MUST be ignored. Thus in the typical case, if a resource doesn't exist, 404 should be returned, so ap_meets_conditions() doesn't need to handle this case at all. There is more to HTTP than only GET/HEAD. If-Match: * and If-None-Match: * are quite relevant even only taking 2616 into account. Most notably If-None-Match in combination with PUT, used for creating a new resource if and only if one does not already exist. The first examples of PR #38024 also speak for themselves on If-Match: *. Regards Henrik
Re: ETag and Content-Encoding
On ons, 2007-10-03 at 14:23 +0100, Nick Kew wrote: http://issues.apache.org/bugzilla/show_bug.cgi?id=39727 We have some controversy surrounding this bug, and bugzilla has turned into a technical discussion that belongs here. Fundamental question: Does a weak ETag preclude (negotiated) changes to Content-Encoding? A weak ETag means the response is semantically equivalent both at protocol and content level, and may be exchanged freely. Two resource variants with different Content-Encoding are not semantically equivalent, as the recipient may not be able to understand a variant sent with an incompatible encoding. Sending a weak ETag does not signal that there is negotiation taking place (Vary does that); all it signals is that there may be multiple but fully compatible versions of the entity variant in circulation, or that each request results in a slightly different object where the difference has no practical meaning (i.e. an embedded non-important timestamp or similar). deflates the contents. Rationale: a weak ETag promises equivalent but not byte-by-byte identical contents, and that's exactly what you have with mod_deflate. I disagree. It's two very different entities. Note: If mod_deflate is deterministic and always returns the exact same encoded version then using a strong ETag is correct. What this boils down to in the end is a) HTTP must be able to tell if an already cached variant is valid for a new request by using If-None-Match. This means that each negotiated entity needs to use a different ETag value. Accept-Encoding is no different in this than any of the other inputs to content negotiation. b) If the object undergoes some transformation that is not deterministic then the ETag must be weak to signify that byte-equivalence can not be guaranteed. Note regarding a: The weak/strong property of the ETag has no significance here. If-None-Match uses the weak comparison function where only the value is compared, not the strength.
See 13.3.3, paragraph The weak comparison function. Regards Henrik
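The distinction between the two comparison functions in RFC 2616 13.3.3 can be sketched like this (an illustrative implementation, not Apache's actual code):

```python
def parse_etag(etag: str):
    """Split an ETag into (is_weak, opaque_value)."""
    etag = etag.strip()
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag

def strong_compare(a: str, b: str) -> bool:
    # Strong comparison: opaque values match AND neither tag is weak.
    weak_a, val_a = parse_etag(a)
    weak_b, val_b = parse_etag(b)
    return val_a == val_b and not weak_a and not weak_b

def weak_compare(a: str, b: str) -> bool:
    # Weak comparison: only the opaque values are compared;
    # the weakness indicator is ignored (RFC 2616 13.3.3).
    return parse_etag(a)[1] == parse_etag(b)[1]
```

Under the weak function, `W/"abc"` and `"abc"` match, which is exactly why merely downgrading a strong ETag to weak does not give each variant a distinct identity for If-None-Match purposes.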
Re: ETag and Content-Encoding
On ons, 2007-10-03 at 07:53 -0700, Justin Erenkrantz wrote: As before, I still don't understand why Vary is not sufficient to allow real-world clients to differentiate here. If Squid is ignoring Vary, then it does so at its own peril - regardless of ETags. See RFC2616 13.6 Caching Negotiated Responses and you should understand why returning a unique ETag on each variant is very important. (yes, the gzip and identity content-encoded responses are two different variants of the same resource, see earlier discussions if you don't agree on that). But yes, thinking over this a second time, converting the ETag to a weak ETag is sufficient to plaster over the problem assuming the original ETag is a strong one. Not because it's correct from a protocol perspective, but because Apache does not use the weak compare function when processing If-None-Match, so in Apache's world changing a strong ETag to a weak one is about the same as assigning a new ETag. However, if the original ETag is already weak then the problem remains exactly as it is today.. It's also almost the same as deleting the ETag, as you also destroy If-None-Match processing of filtered responses, which also is why it works.. The problem with trying to invent new ETags is that we'll almost certainly break conditional requests and I find that a total non-starter. Only because your processing of conditional requests is broken. See earlier discussions on the topic of this bug already covering this aspect. To work properly the conditionals need to (logically) be processed when the response entity is known, that is after mod_deflate (or another filter) does its dance to transform the response headers. Doing conditionals before the actual response headers are known is very error-prone and likely to cause false matches, as you don't know this is the response which will be sent to the requestor.
Your suggestion of appending ;gzip leaks information that doesn't belong in the ETag - as it is quite possible for that to appear in a valid ETag from another source - for example, it is trivial to make Subversion generate ETags containing that at the end - this would create nasty false positives and corrupt Subversion's conditional request checks. Then use something stronger, less likely to be seen in the original ETag. Or fix the filter architecture to deal with conditionals properly, making this question (collisions) pretty much a non-issue. Or, until conditionals can be processed correctly in the presence of filters, drop the ETag on filtered responses where the filter does some kind of negotiation. Plus, rewriting every filter to append or delete a 'special' marker in the ETag is bound to make the situation way worse. -- justin I don't see much choice if you want to comply with the RFC requirements. The other choice is to drop the ETag header on such responses, which also is not a nice thing but at least complies with the specifications, making it better than sending out the same ETag on incompatible responses from the same resource. Regards Henrik
Re: ETag and Content-Encoding
On ons, 2007-10-03 at 13:29 -0700, Justin Erenkrantz wrote: The issue here is that mod_dav_svn generates an ETag (based off rev num and path) and that ETag can be later used to check for conditional requests. But, if mod_deflate always strips a 'special' tag from the ETag (per Henrik), That was only a suggestion on how you may work around your somewhat limited conditional processing capabilities wrt filters like mod_deflate, but I think it's probably the cleanest approach considering the requirements of If-Match and modifying methods (PUT, DELETE, PROPPATCH etc). In that construct the tag added to the ETag by mod_deflate (or another entity transforming filter) needs to be sufficiently unique that it is not likely to be seen in the original ETag value. It's not easy to fulfill the needs of all components when doing dynamic entity transformations, especially when there is negotiation involved.. Regards Henrik
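A minimal sketch of the workaround being discussed, under stated assumptions: the marker string (here the made-up `;hvn-gzip`) is chosen to be unlikely to occur in backend ETags, and the filter strips it again before conditional processing. This is not mod_deflate's actual behaviour, only an illustration of the suggestion:

```python
MARKER = "hvn-gzip"  # hypothetical marker, assumed absent from backend ETags

def variant_etag(original: str, marker: str = MARKER) -> str:
    """Derive a distinct ETag for the encoded variant by embedding a
    marker inside the quoted opaque value, so the encoded and
    unencoded variants never share an ETag."""
    original = original.strip()
    weak = original.startswith("W/")
    value = (original[2:] if weak else original).strip('"')
    tagged = '"%s;%s"' % (value, marker)
    return ("W/" + tagged) if weak else tagged

def strip_variant_marker(etag: str, marker: str = MARKER) -> str:
    # Inverse step, applied before If-Match / If-None-Match evaluation
    # so conditionals see the original backend ETag again.
    suffix = ';%s"' % marker
    if etag.endswith(suffix):
        return etag[:-len(suffix)] + '"'
    return etag
```

The weak/strong prefix is preserved either way, so the scheme composes with whatever weakening the filter also has to do.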
Re: ETag and Content-Encoding
On ons, 2007-10-03 at 12:10 -0700, Roy T. Fielding wrote: Two resource variants with different content-encoding are not semantically equivalent, as the recipient may not be able to understand a variant sent with an incompatible encoding. That is not true. The weak etag is for content that has changed but is just as good a response content as would have been received. In other words, protocol equivalence is irrelevant. By protocol semantic equivalence I mean responses being acceptable to requests. Example: Two negotiated responses with different Content-Encoding are not semantically equivalent at the HTTP level, as their negotiation properties are different, and one can not substitute one for the other and expect that HTTP works. But two compressed response entities with different compression level depending on the CPU load are. Note: Ignoring transfer-encoding here as it's transport and pretty much irrelevant to the operation of the protocol other than wire message encoding/decoding. a) HTTP must be able to tell if an already cached variant is valid for a new request by using If-None-Match. This means that each negotiated entity needs to use a different ETag value. Accept-Encoding is no different in this than any of the other inputs to content negotiation. That is not HTTP. Don't confuse the needs of caching with the needs of range requests -- only range requests need strong etags. I am not. I am talking about If-None-Match, not If-Range. And specifically the use of If-None-Match in 13.6 Caching Negotiated Responses. It's a very simple and effective mechanism, but requires servers to properly assign ETags to each (semantically, in case of weak) unique entity of a resource (not the resource as such). Content-Encoding is no different in this than any of the other negotiated properties (Content-Type, Content-Language, whatever). Regards Henrik
Re: Cc: lists (Re: ETag and Content-Encoding)
On ons, 2007-10-03 at 21:44 +0100, Nick Kew wrote: The Cc: list on this and subsequent postings is screwed: (1) It includes me, so I get everything twice. OK, I can live with that, but it's annoying. Use a Message-Id filter? (2) It fails to include Henrik Nordstrom, the principal non-Apache protagonist in this discussion. No problem. I am a dev@ subscriber. Regards Henrik
Re: ETag and Content-Encoding
On ons, 2007-10-03 at 23:52 +0200, Henrik Nordstrom wrote: That is not HTTP. Don't confuse the needs of caching with the needs of range requests -- only range requests need strong etags. I am not. I am talking about If-None-Match, not If-Range. And specifically the use of If-None-Match in 13.6 Caching Negotiated Responses. To clarify, I do not care much about strong/weak etags. This is a property of how the server generates the content, with no significant relevance to caching other than that the ETags as such must be sufficiently unique (there are some cache impacts of weak etags, but not really relevant to this discussion). If anything I said seems to imply that I only want to see strong ETags then that's solely due to the use of poor language on my part and not intentional. All I am trying to say is that the responses [no Content-Encoding] and Content-Encoding: gzip from the same negotiated resource are two different variants in terms of HTTP and must carry different ETag values, if any. End. The rest is just trying to get people to see this. Apache mod_deflate does not do this when doing its dynamic content negotiation driven transformations, and that is a bug (13.11 MUST) with quite nasty implications on caching of negotiated responses (13.6). The fact that responses with different Content-Encoding are meant to result in the same object after decoding is pretty much irrelevant here. It's two incompatible different negotiated variants of the resource, and that is all that matters. I am also saying that the simple change of making mod_deflate transform any existing ETag into a weak one is not sufficient to address this properly, but it's quite likely to plaster over the problem for a while in most uses except when the original response ETag is already weak.
It will however break completely if Apache GET If-None-Match processing is changed to use the weak comparison as mandated by the RFC (13.3.3) (to my best knowledge Apache always uses the strong function, but I may be wrong there..). Negotiation of Content-Encoding is really not any different than negotiation of any of the other content properties such as Content-Language or Content-Type. The same rules apply, and each unique outcome (variant) of the negotiation process needs to be assigned a unique ETag with no overlaps between variants, and for strong ETags each binary version of each variant needs to have a unique ETag with no overlaps. This is ignoring any out-of-band dynamic parameters to the negotiation process, such as server load, which might affect responses to the same request; only talking about negotiation based on request headers. For out-of-band negotiation properties it's important to respect the strong ETag binary equivalence requirements. Note: Changed language to use the more proper term variant instead of entity. Hopefully less confusing. Regards Henrik
Re: Proxying OPTIONS *
On sön, 2007-09-30 at 16:54 -0700, Roy T. Fielding wrote: On Sep 30, 2007, at 4:05 PM, Nick Kew wrote: RFC2616 is clear that: 1. OPTIONS * is allowed. 2. OPTIONS can be proxied. However, it's not clear that OPTIONS * can be proxied, given that there's no natural URL representation of it (* != /*). An absolute http request-URI with no path. In RFC2068 yes, but not RFC2616.. Regards Henrik
Re: FakeBasicAuth changes
On ons, 2007-09-26 at 18:06 +0200, Nick Gearls wrote: In the debug log, I can find: Faking HTTP Basic Auth header: Authorization: Basic L0M9QkUvU1Q9QmVsZ2l1bS9MPUJydXNzZWxzL089QXBwcm9hY2ggQmVsZ2l1bS9PVT1BcGFjaGUgdGVzdCBjZXJ0aWZpY2F0ZS9DTj0xMjcuMC4wLjE6cGFzc3dvcmQ= What is this header contents ? Isn't it supposed to be base64 ? I cannot decode it. It's base64. Decoding it gives /C=BE/ST=Belgium/L=Brussels/O=Approach Belgium/OU=Apache test certificate/CN=127.0.0.1:password Regards Henrik
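The decoding can be reproduced in a few lines, using the token quoted above (FakeBasicAuth builds the fake Basic credential from the client certificate subject DN plus a fixed password, which is why the decoded value is a DN followed by `:password`):

```python
import base64

# The Authorization: Basic token quoted in the message above.
token = ("L0M9QkUvU1Q9QmVsZ2l1bS9MPUJydXNzZWxzL089QXBwcm9hY2ggQmVsZ2l1"
         "bS9PVT1BcGFjaGUgdGVzdCBjZXJ0aWZpY2F0ZS9DTj0xMjcuMC4wLjE6cGFzc3dvcmQ=")

decoded = base64.b64decode(token).decode("ascii")
# Basic credentials are "user:password"; split on the LAST colon since
# the DN itself contains no colon here but may contain other separators.
user, _, password = decoded.rpartition(":")
```

So the header does decode cleanly; a decoder that "cannot decode it" was most likely tripping over line wrapping or a copy/paste artifact.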
Re: Fixing protocol violations in mod_proxy
On tor, 2007-09-27 at 14:08 +0100, Joe Orton wrote: From the name I'd presume these are testing a long chunk-extension, not long chunks. There is no 2616 requirement to handle arbitrarily long chunk-extensions so it's a meaningless test, unless httpd is not failing appropriately. (the chunk-extension is an optional token which can be passed after the chunk-size and is never used in practice) Well, technically there is no bound on the size of the chunk extensions in RFC2616 (same for almost all HTTP stuff, not only chunk extensions), but yes.. Regards Henrik
Re: OpenSSL compression (Windows)
On fre, 2007-09-21 at 11:06 -0400, Tom Donovan wrote: Already-compressed data; like .jpg, .gif, .png, .zip, .tgz, .jar, and any content filtered by mod_deflate are re-compressed. This uses non-trivial CPU cycles for no (or slightly negative) benefit. Both yes and no. Unlike HTTP, SSL compression applies to the whole datastream including request and response headers, not only the object body. So exchanges of small objects over a persistent connection are likely to compress quite well even if the exchanged object as such is already fully compressed. But it may be a problem for large exchanges, or when KeepAlive is off.. Regards Henrik
Re: new webaccel appliance
On tis, 2007-09-18 at 22:41 +0200, Ruediger Pluem wrote: Agreed. Depending on the answers above we may need to have a list of headers (like Accept-Encoding) where we compare the tokens in the field-value. For all other headers we would stay with the plain compare we do today. See also the TODO comments in mod_disk_cache.c::regen_key. Or you implement If-None-Match and forget about this. Except that Apache mod_deflate is still broken and returns the wrong ETag (same as the unencoded entity).. see bug #39727.. The separator is only one of many things which make the Vary:ing headers slightly different. You also have quality parameters, locales, etc etc. Regards Henrik
Re: new webaccel appliance
On tis, 2007-09-18 at 19:40 +0200, Roy T. Fielding wrote: Argued? The space does not change the value of the field (which is a comma-separated list). The question is really up to us as to how much effort we make to compare the values for equality, since the non-match just makes our cache slow and bulky. Given the number of those browsers, we should special-case this comparison. And there is also RFC2616 13.6 Caching Negotiated Responses, which tells you to use If-None-Match to avoid fetching multiple copies only because of slight variations in Vary-indicated request headers.. Regards Henrik
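The token-level comparison being discussed can be sketched as a simple field-value normalizer (illustrative only; a real implementation would also have to normalize q-values, locales and the other variations mentioned in the thread, which is exactly why If-None-Match-based revalidation is the more robust route):

```python
def normalize_field(value: str) -> str:
    """Canonicalize a comma-separated header field-value: split on
    commas, trim whitespace, lowercase tokens, drop empty entries.
    Two requests whose Vary'd headers normalize identically can then
    share a cache entry despite cosmetic differences."""
    tokens = [t.strip().lower() for t in value.split(",")]
    return ",".join(t for t in tokens if t)
```

With this, `gzip,deflate` and `gzip, deflate` (the browser difference Roy mentions) hash to the same cache key instead of producing duplicate entries.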
mod_gzip and incorrect ETag response (Bug #39727)
Just wondering if there are any plans on addressing Bug #39727, incorrect ETag on gzipped content (mod_deflate). It's been pretty silent for a long while now, and the current implementation is a clear violation of RFC2616 and makes a mess of any shared cache trying to cache responses from mod_deflate enabled Apache servers (same problem also in mod_gzip btw..) For details about the problem this is causing see RFC2616 section 13.6; pay specific attention to the section talking about the use of If-None-Match and the implications of this when a server responds with the same ETag for the two different variants of the same resource. There are already a couple of proposed solutions, but no consensus on which is the better or if any of them is the proper way of addressing the issue. The problem touches:
- ETag generation
- Module interface
- Conditionals processing when there are modules altering the content
Squid currently has a kind of workaround in place for the Apache problem, but relies on being able to detect broken Apache servers by the presence of Apache in the Server: header, which isn't foolproof by any means. Regards Henrik
Re: mod_gzip and incorrect ETag response (Bug #39727)
On mån, 2007-08-27 at 13:09 -0400, Akins, Brian wrote: On 8/27/07 12:34 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Hasn't the non-compressed variant become an extreme edge-case by now? I would certainly hope so. Unfortunately not. About 30% of our requests do not advertise gzip support.. And MSIE deserves a special mention here.. if MSIE is using a proxy and NOT configured to Use HTTP/1.1 via proxies then it will not advertise gzip support, and what is worse it will not at all understand a gzipped response should one be given to it.. seen at least in MSIE6, have not tested 7. Regards Henrik
Re: mod_gzip and incorrect ETag response (Bug #39727)
On mån, 2007-08-27 at 22:00 +0200, Ruediger Pluem wrote: But without an adjusted conditional checking this leads to a failure of conditional requests. And I currently do not see how we can adjust ap_meets_conditions. As I understand 13.3.3 of RFC2616 the DEFLATE_OUT filter transforms a possible strong ETag of the response it filters into a weak ETag. So shouldn't we simply transform a strong ETag into a weak one? It can still be a strong one provided the server always ends up with the same gzip encoding (which it should, when using gzip.. at least unless zlib is upgraded..) Regards Henrik
Re: [PATCH]: mod_cache: don't store headers that will never be used
On sön, 2007-07-29 at 20:34 +0200, Graham Leggett wrote: Niklas Edmundsson wrote: The solution is to NOT rewrite the on-disk headers when the following conditions are true: - The body is NOT stale (ie. HTTP_NOT_MODIFIED when revalidating) - The on-disk header hasn't expired. - The request has max-age=0 This is perfectly OK with RFC2616 10.3.5 and does NOT break anything. From 10.3.5: If a cache uses a received 304 response to update a cache entry, the cache MUST update the entry to reflect any new field values given in the response. This sinks this, unless I am misunderstanding something. Is anything about the cache updated when the headers are not rewritten, making any difference in headers or freshness on the next request? It's perfectly fine for the cache to completely ignore 304 responses if you like, except for the small corner case where the 304 for some reason indicates a different object than expected. If the cached object is still very fresh there is not much use in updating the cache only because you can. Regards Henrik
Re: [Issue] External links @ the wiki, aka pagechange wars
ons 2007-05-30 klockan 21:39 +0100 skrev Nick Kew: It then proceeds to list HTTP status codes, and gives an errordocument for each one. Unfortunately a number of them are bogus gibberish. It's the gibberish Apache emits if you shoot yourself in the foot using Redirect. Garbage in, garbage out. Regards Henrik
Re: mod_cache: Don't update when req max-age=0?
tor 2007-05-24 klockan 13:22 +0200 skrev Niklas Edmundsson: c) RFC-wise it seems to me that a not-modified object is a not-modified object. There is no guarantee that the next request will hit the same cache, so nothing can expect a max-age=0 request to force a cache to rewrite its headers and then access it with max-age!=0 and get headers of that age. Yes. RFC wise it's fine to not update the cache with the 304. Updating of cached entries is optional (RFC2616 10.3.5 last paragraph). The only MUST regarding 304 and caches is that you MUST ignore the 304 and retry the request without the conditional if the 304 indicates another object than what is currently cached (i.e. ETag or Last-Modified differs). (same section, the paragraph above) Regards Henrik
Re: mod_cache: Don't update when req max-age=0?
tis 2007-05-22 klockan 11:40 +0200 skrev Niklas Edmundsson: ---8<--- Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired? ---8<--- My understanding: It's fine from an RFC point of view for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache. Regards Henrik
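The merge described above (the cached entry's headers updated with any new end-to-end field values from the 304, per RFC 2616 10.3.5) might look roughly like this; the header names and values in the usage are made up for illustration:

```python
# Hop-by-hop headers are per-connection and never merged into an entry.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "te", "trailers",
              "transfer-encoding", "upgrade"}

def merge_304(cached_headers: dict, headers_304: dict) -> dict:
    """Return the headers to send to the client: the cached response's
    headers, overridden by any new end-to-end field values carried by
    the 304 revalidation response."""
    merged = dict(cached_headers)
    for name, value in headers_304.items():
        if name.lower() not in HOP_BY_HOP:
            merged[name] = value
    return merged
```

Whether the cache also writes the merged headers back to its store is the optional part being debated; the merge for the outgoing response is the part that should happen regardless.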
Re: [PATCH] mod_cache 304 on HEAD (bug 41230)
mån 2007-04-16 klockan 22:58 +0200 skrev Ruediger Pluem: My first question in this situation is: What is the correct thing to do here? Generate the response from the cache (of course with the updated headers from the 304 backend response) and delete the cache entry afterwards? My understanding (regarding no-store and cache updates from 304 responses): The response you send if the client request was an unconditional one SHOULD be the merged response of the old response and entity headers from the 304. But you do not need to delete the already cached response without no-store if you do not want to (a change of CC is not an invalidation criterion), but you MUST NOT store the updated headers from a no-store session (no-store on either response or request). Regards Henrik
Re: mod_ftp named virtual hosts?
ons 2007-04-11 klockan 10:46 -0500 skrev William A. Rowe, Jr.: Firefox is fine with... ftp://[EMAIL PROTECTED]:[EMAIL PROTECTED]/ but it's odd enough I wouldn't trust that to be consistently supported, and you raise a good point with proxy/firewalls. The above isn't a correctly formed URL. It MUST be (RFC wise) ftp://me%40myhost:[EMAIL PROTECTED]/ which resolves the ambiguity, but is perhaps even less intelligible to the user. So if, for example, the admin wanted to define as the alternative separator, ftp://memyhost:[EMAIL PROTECTED]/ would be a little less ambiguous to browser-style schemas. Sounds reasonable. Except that it's quite impractical to use in HTML coding, and very many applications (and users) rendering data into HTML will get it wrong.. Note: how most browser user agents implement the ftp:// URI scheme in general is quite far outside the standards, so it's not easy to know what will happen when trying something other than plain anonymous ftp with non-problematic characters, or even file paths.. Regards Henrik
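The RFC-correct form can be produced mechanically; a sketch using Python's urllib, where the credentials and host are made up (percent-encoding the userinfo means an embedded `@` becomes `%40` and can no longer be confused with the userinfo/host separator):

```python
from urllib.parse import quote

def ftp_url(user: str, password: str, host: str, path: str = "/") -> str:
    """Build an ftp:// URL with a safely percent-encoded userinfo part.
    '@' and ':' inside the username/password are escaped (safe="")
    so the URL parses unambiguously."""
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return "ftp://%s@%s%s" % (userinfo, host, path)
```

Exactly one literal `@` survives in the result, which is what removes the ambiguity the unescaped form suffers from.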
Re: Chunked transfer encoding on responses.
lör 2007-04-07 klockan 04:00 -0500 skrev William A. Rowe, Jr.: Of course this person is entirely wrong if the client doesn't Accept-Encoding: chunked which is exactly the logic we test. So why is there a dependency on keep-alive being enabled? Regards Henrik
Re: Redundant SSL virtual host warnings?
sön 2007-04-08 klockan 18:48 +0100 skrev Jay L. T. Cornwall: So the part I'm leading up to is: how about a way to turn off these warnings? Or perhaps a simple certificate analysis to see if the wildcard matches all the virtual hosts for which it serves? Sounds good to me. Related to this, in current versions of TLS the client MAY advertise which host it desires to connect to, which would also require this if implemented in Apache mod_ssl. (server_name hello extension defined in RFC4366 section 3.1) Regards Henrik
Re: Chunked transfer encoding on responses.
lör 2007-04-07 klockan 09:18 +0200 skrev André Malo: Hmm, you may get something wrong here. The httpd does apply chunked encoding automatically when it needs to. That is in keep-alive situations without a given or determinable Content-Length. Why doesn't it do it in all other cases? My answer is: because it would be useless (as in: not of any use :-). I don't agree fully here. chunked is not useless in the non-keepalive case. What it adds there compared to the HTTP/1.0 method of just closing the connection is error detection. A receiver seeing the connection closed before the final eof chunk knows something went wrong and the response is not complete. If chunked is not used the receiver usually can not tell that there was a problem. Regards Henrik
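The error-detection property can be illustrated with a toy chunked decoder (a simplified sketch: it ignores trailers and does not validate the terminating CRLF after the last chunk):

```python
def decode_chunked(data: bytes):
    """Decode a chunked body. Returns (payload, complete) where
    complete is True only if the final zero-size chunk was seen.
    A connection closed mid-stream therefore yields complete=False,
    which is the error signal a plain HTTP/1.0 close-delimited
    body cannot provide."""
    payload, pos = b"", 0
    while True:
        eol = data.find(b"\r\n", pos)
        if eol < 0:
            return payload, False          # truncated before a chunk-size line
        size_line = data[pos:eol].split(b";")[0]   # drop any chunk-extension
        try:
            size = int(size_line, 16)
        except ValueError:
            return payload, False          # garbage where a size should be
        if size == 0:
            return payload, True           # final chunk: body is complete
        chunk_end = eol + 2 + size
        if chunk_end + 2 > len(data):
            return payload, False          # connection died mid-chunk
        payload += data[eol + 2:chunk_end]
        pos = chunk_end + 2
```

A receiver using close-delimited framing would have accepted the truncated payload as complete; the chunked decoder knows it is not.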
Re: mod_ftp named virtual hosts?
fre 2007-04-06 klockan 21:37 +0100 skrev Nick Kew: What about modifying mod_ftp USER directive to accept username in the format of [EMAIL PROTECTED], and tokenize user as the username, host as the http-ish Host: virtual host name? Sounds fair, provided the protocol doesn't assign some (different) semantics to that. FTP as such doesn't assign any semantic on the syntax of usernames, but very many FTP firewalls/proxies do... The proposed [EMAIL PROTECTED] is in fact the most common FTP proxy method, meaning connect as user on host, and to login using a [EMAIL PROTECTED] style login via such proxy may be a little awkward if it at all works.. Regards Henrik
Re: Reverse proxy mode and DAV protocol
ons 2007-04-04 klockan 13:12 +0200 skrev Julian Reschke: What I meant by reason was the fact that the Destination header (and some aspects of the If header) require absolute URIs, which is problematic when there's a reverse proxy in the transmission path. All the issues around to rewrite or not to rewrite headers go away once these headers use absolute paths (well, as long as the reverse proxy doesn't also rewrite paths, but I would claim that this is nearly impossible to get right with WebDAV). Rewriting is nearly impossible to get right. Even when limited to simple rewrites of just the host component. There are many aspects of HTTP and HTTP applications which depend on the URIs being known on both sides (client and server), and the more rewrites you do the higher the risk that some of these are overlooked and things start to break. Most reverse proxies fail even the simple host:port based rewrites, forgetting or wrongly mapping the internal URIs in some random headers (i.e. Location, Content-Location, Destination, etc) or generated response entities. WebDAV in particular has a lot of URIs embedded in generated XML request/response entities. If you do rewrite then you had better make sure you have a clear view of how the backend URIs should be mapped back to external URIs, and make sure your rewrites are applied to the complete HTTP messages, headers and body, both requests and responses, and not just the request line. Regards Henrik
Re: Limiting response body length
Mon 2007-02-12 at 12:41 +0200, Dziugas Baltrunas wrote: To illustrate, squid for this purpose has reply_body_max_size [1] parameter. Looks like it depends only on the Content-Length response header (if any). It also terminates requests when the amount of data transferred hits the specified limit, if the size is not known ahead from Content-Length. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Limiting response body length
Mon 2007-02-12 at 17:51, Nick Kew wrote: 2. Where there's chunked encoding, the check would best be implemented in the chunking filter. 3. A simple count/abort filter is then a last resort. And it won't be able to tell the client what's happened, because the header has already been sent (unless it buffers the entire response, which is horribly inefficient). Why differentiate between 2 and 3? What's the benefit of doing it in the chunking filter? Just to avoid having yet another filter in the chain, or something besides that? Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Limiting response body length
Mon 2007-02-12 at 21:55, Nick Kew wrote: Because the chunking filter is equipped to discard the chunk that takes it over the limit, and substitute end-of-chunking. If we do that in a new filter, we have to reinvent that wheel. Not sure substituting end-of-chunking is a reasonable thing here. It's an abort condition, not an EOF condition. Imho you'd better abort the flow, that way telling the client that the request failed instead of silently truncating the response. But yes, the earlier you know the limit is going to be hit the better. Just not sure you will find many cases where the chunk size is large enough that it really makes a difference, but I may be wrong.. Regards Henrik signature.asc Description: This is a digitally signed message part
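The count-and-abort approach Henrik argues for can be sketched like this. It is an illustrative Python generator, not Apache filter code; `ResponseTooLarge` and `limit_body` are made-up names. The point is that hitting the limit raises an error (an abort condition) instead of quietly ending the body (an EOF condition), so the client sees a failed transfer rather than a silently truncated response.

```python
class ResponseTooLarge(Exception):
    """Raised when the response body exceeds the configured limit."""

def limit_body(chunks, max_bytes):
    """Pass body chunks through until the limit is hit, then abort the flow."""
    seen = 0
    for chunk in chunks:
        seen += len(chunk)
        if seen > max_bytes:
            # Abort, don't fake end-of-body: the truncation must be visible.
            raise ResponseTooLarge("body exceeded %d bytes" % max_bytes)
        yield chunk
```

In a real filter the abort would translate to closing the connection mid-body (or mid-chunk), which is precisely what signals failure to the client once the header is already on the wire.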
Re: Regarding graceful restart
Thu 2007-02-08 at 17:15 -0800, Devi Krishna wrote: Hi, Resending this mail, just in case anyone would have suggestions/inputs on how to fix this for connections that are in the ESTABLISHED state or FIN state or any other TCP state other than LISTEN. Maybe change the wake up call to just connect briefly without actually sending a full HTTP request? This should be sufficient to wake up any processes sleeping in accept() and will not cause anything to get processed.. But I am not sure I understand the original problem so I may be completely off here.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Regarding graceful restart
Fri 2007-02-09 at 18:34 +0100, Plüm, Rüdiger, VF EITO wrote: Not if BSD accept filters are in place. In this case the kernel waits until it sees a HTTP request before it wakes up the process. And on Linux with TCP_DEFER_ACCEPT enabled you need to send at least one byte of data. So send two blank lines. Should satisfy both.. Regards Henrik signature.asc Description: This is a digitally signed message part
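The wake-up call Henrik suggests could look roughly like the sketch below (illustrative Python, not Apache code; `wake_listener` is a made-up name). Sending two CRLF pairs delivers data, which should satisfy Linux's TCP_DEFER_ACCEPT, and gives a BSD accept filter something to chew on, without forming a request that would actually be processed.

```python
import socket

def wake_listener(host, port, timeout=5.0):
    """Briefly connect to a listening socket and send two blank lines,
    enough to wake a process sleeping in accept() without submitting a
    processable HTTP request."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(b"\r\n\r\n")  # at least one byte of data, no request line
```

Whether two blank lines are really enough to pass an accf_http-style filter is an open question in the thread; the sketch just encodes the proposal as stated.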
Re: Mod_cache expires check
Thu 2007-01-18 at 12:05 +0100, Plüm, Rüdiger, VF EITO wrote: Just curious: Is the Unix epoch an invalid date in the Expires header (as this is in the past it does not really matter for the question whether this document is expired or not, as it would be in both cases)? The RFC does not care about the UNIX epoch. Any valid date which can be represented in the textual form is a valid Expires header. And any Expires header you can not understand, for whatever reason, is already expired in the past. To solve this there is Cache-Control max-age, which is not sensitive to the limits of your internal time representation, and which overrides Expires if present. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Mod_cache expires check
Mon 2007-01-15 at 13:56 +0100, Bart van der Schans wrote: In r463496 the following check was added to mod_cache.c: else if (exp != APR_DATE_BAD && exp < r->request_time) { /* if a Expires header is in the past, don't cache it */ reason = "Expires header already expired, not cacheable"; } This check fails to correctly identify the expires header Thu, 01 Jan 1970 00:00:00 GMT. The apr_date_parse_http(exps) function returns (apr_time_t)0 which is equal to APR_DATE_BAD, but it should recognize it as an already expired header. Is there a way to distinguish between APR_DATE_BAD and the unix epoch? Or is that considered a bad date? Well.. all bad dates should be considered expired.. If there is an Expires header and the value could not be understood properly then the object is by HTTP/1.1 already expired. RFC 2616 14.21. Regards Henrik signature.asc Description: This is a digitally signed message part
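The rule Henrik states, that any unparseable Expires value counts as already expired while the Unix epoch is just a valid date in the past, can be sketched as follows (illustrative Python, not mod_cache or APR code; `is_expired` is a made-up name):

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def is_expired(expires_value, now=None):
    """Treat an Expires header per RFC 2616 14.21: a date we cannot parse
    is already expired; a parseable past date (including the epoch) is
    simply expired the normal way."""
    now = now or datetime.now(timezone.utc)
    try:
        exp = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):  # unparseable => already expired
        return True
    return exp <= now
```

This avoids the mod_cache trap of conflating "parse failed" with "epoch": both results mean expired, so they never need to be distinguished for the freshness decision.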
Re: Add 2.2.4 to bugzilla
Sat 2007-01-13 at 01:06 +0100, Ruediger Pluem wrote: This could be modified to: 1. Fix on trunk = Change state to Resolved, fixed and add a comment with the revision of the fix. 2. Proposed for backport = Leave state as Resolved, fixed and add a comment with the revision of the backport proposal (STATUS file). 3. Backported = Change state to Closed and add a comment with the revision of the backport. Another alternative is ignoring Bugzilla for backport status, only using the STATUS file with references to Bugzilla entries needing to get backported to the release. I.e. something like 1. Fix on trunk - Resolved, Fixed. Reference to revision in Bugzilla (preferably automatic). 2. Audited by a release maintainer for the main release to judge if a backport is needed; added to STATUS file if so. - Closed. 3. Backported - cleared in STATUS file. Reference to revision in Bugzilla (preferably automatic). Another alternative which is more in line with normal release management is using the target milestone feature built in to Bugzilla. 1. Fix on trunk - Resolved, Fixed. Reference to revision in Bugzilla (preferably automatic). No target milestone assigned yet. 2. Audited by release maintainer for the main release. If backport needed, added to STATUS and - New with target milestone of the release. Else Closed. 3. Backported - Resolved, Fixed. 4. Audited by next older release maintainer if any, as in 2. Repeat until all maintained releases have been covered. The beauty of the above is that it's easy to query Bugzilla for the list of bugs in various states, and that it pans out quite well when you keep maintaining older releases. Same process all the way. From my personal point of view I think it is important to add the revision number of the fix / backport to the comment because: 1. People who are interested / have the know-how can easily cross check what has been changed. 2. People who only want a specific fix, either because there is no newer stable version or because they cannot upgrade to a later stable version for whatever reason, can easily find the needed patch. Note: If you properly reference Bugzilla entries in the changelog messages when committing changes then there are automated tools which can both resolve Bugzilla entries and add references.. Could save you some headache but requires a bit of setup.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Wed 2006-12-13 at 08:51 -0500, Brian Akins wrote: However, on an initial request (ie, non-conditional) we do not have an etag from the client, we only have info like Host, URI, Accept-*, etc. So, how would the cache identify which entity to serve in this case? You have the URL and the other cached entities of that URL. It does not matter if the client request was conditional or not. The conditions in the request are on the response, to decide if it should be a 200 or 304, not selectors on what entity to respond with. The selected response entity is always the same for the same request, with or without conditions. Obviously on the very first request for a given URL you have nothing, and that request is forwarded without any added condition. However, after that every Vary cache miss on that URL is an If-None-Match conditional to ask the server if any of the cached entity variants is applicable for the current request. I have read it many times.. In our case - cnn.com, etc. - we have decided to be RFC compliant from the client to the cache server. From the cache to the origin, however, we are not as concerned. And you are free to. A reverse proxy is by definition the origin server. How it finds the content is of no concern to the RFC; it just happens to be HTTP and not plain files, NFS, a database or whatever. In a reverse-proxy-cache, this is not a big deal. However, in a normal forward-proxy-cache, where one does not control both cache and origin, one must be more careful. Indeed. But on the other hand it's actually reverse proxy configurations which have pushed for 13.6 compliance in Squid, as it's a lot easier for processing intensive servers to evaluate If-None-Match than to render the entity again, and when you depend on Accept-Language + Accept-Encoding + User-Agent the number of request combinations becomes quite significant, even if there are maybe only two or three variants under the URL. Regards Henrik signature.asc Description: This is a digitally signed message part
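The revalidation dance described above (every Vary cache miss becomes an If-None-Match listing the cached variants, and the 304's ETag names which stored entity to serve) can be sketched as follows. This is an illustrative Python sketch of the RFC 2616 13.6 machinery, not Squid code; `VaryCache` and its methods are made-up names, and a real cache keys variants per URL.

```python
class VaryCache:
    """Minimal sketch of a cache holding multiple variants of one URL."""

    def __init__(self):
        self.variants = {}  # etag -> stored body (per URL in real life)

    def store(self, etag, body):
        self.variants[etag] = body

    def build_conditional(self):
        """On a variant miss, offer all cached ETags to the origin."""
        return {"If-None-Match": ", ".join(self.variants)}

    def on_304(self, etag):
        """A 304 names, via its ETag, which stored entity is valid
        for the current request."""
        return self.variants[etag]
```

The key point mirrored here is that the cache never evaluates Accept-* itself: the origin's 304, by echoing one of the offered ETags, performs the content negotiation on the cache's behalf.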
Re: Wrong etag sent with mod_deflate
Tue 2006-12-12 at 09:20 -0500, Brian Akins wrote: Only conditional requests from clients, generally, have If-None-Match headers. Correct. It's a conditional. These days you also see them from Squid btw. So the only way for a cache, on an initial request from a client, to determine what object to serve is to use the client supplied information - which doesn't include an ETag, so you have to, usually, rely on URI first, and then the Vary information. Indeed. This is always the case. If-None-Match MUST NOT be used for identification of which response to use. It's a conditional only. But the unique identity of the response entity is defined by request-URI + ETag and/or Content-Location. The cache is not supposed to evaluate Accept-* headers in determining the entity identity, only the origin server is. The identity of the entity is important for - Cache correctness, making sure updates invalidate cached copies where needed. - Avoiding duplicated storage. There may be any number of request header combinations in any Vary dimensions all mapping to the same entity. This logic is not at all unique to Accept-Encoding. The logic on how a cache is supposed to operate applies equally to all Vary indicated headers. The specs do not make any distinction between Accept-Encoding, Accept-Language, User-Agent etc in how caches are supposed to operate. It all boils down to the entity identified by URI + ETag and/or Content-Location as returned in 200 and 304 responses, allowing the cache to map requests to entities. Please see RFC 2616 13.6 Caching Negotiated Responses; it explains in full detail how the RFC intends caches to operate wrt Vary, ETag and Content-Location. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Fri 2006-12-08 at 15:35 -0800, Justin Erenkrantz wrote: As Kevin mentioned, Squid is only using the ETag and is ignoring the Vary header. That's the crux of the broken behavior on their part. If they want to point out minor RFC violations in Apache, then we can play that game as well. (mod_cache deals with this Vary/ETag case just fine, FWIW.) We are not at all ignoring Vary, but we are using If-None-Match to ask the server which one of the N already cached entities belonging to the resource URI is valid for this specific request, indirectly learning the server side content negotiation logic used. The compromise I'd be willing to accept is to have mod_deflate support the 'TE: gzip' request header and add 'gzip' to the Transfer-Encoding bit - and to prefer that over any Accept-Encoding bits that are sent. Would be a great move if you can not make it behave correctly in the content space. But if you make mod_deflate behave according to the RFC then sending Content-Encoding: gzip is fine to me. But TE is a much better fit from the RFC point of view. The ETag can clearly remain the same in that case - even as a strong ETag. Yes. So, Squid can change to send along TE: gzip (if it isn't already). TE: gzip is likely to appear in 3.1. And, everyone else who sends Accept-Encoding gets the result in a way that doesn't pooch their cache if they try to do a later conditional request. As long as mod_deflate continues ignoring the RFC wrt ETag there will be conflicts with various cache implementations. Is that acceptable? -- justin Intentionally not following a MUST level requirement in the RFC is not an acceptable solution in my eyes. For one thing, even if you ignore everyone else, it would make it impossible for Apache + mod_deflate to claim RFC 2616 HTTP/1.1 compliance. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Sat 2006-12-09 at 15:23 +0100, Justin Erenkrantz wrote: See the problem here is that you have to teach ap_meets_conditions() about this. An ETag of 1234-gzip needs to also satisfy a conditional request when the ETag at the time ap_meets_conditions() is run is 1234. In other words, ap_meets_conditions() also needs to strip -gzip if it is present before it does the ETag comparison. But, the issue is that there is no real way for us to implement this without a butt-ugly hack. Be careful there.. Blindly stripping the decoration alone won't work out. Consider for example If-None-Match. Specifically, If-None-Match with the ETag of the gzip variant should only return 304 if the request would cause Apache to send the gzipped variant of the entity. An If-None-Match with a list of ETags returns 304 carrying the single correct ETag if any of the ETags in the directive matches the current response to the current request. Regards Henrik signature.asc Description: This is a digitally signed message part
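Henrik's warning here can be made concrete with a sketch (illustrative Python, not ap_meets_conditions() or Apache code; the function names and the "-gzip" decoration are assumptions based on the thread). A 304 is only correct when the matched ETag is the ETag of the variant content negotiation would select for *this* request, so blindly stripping the decoration before comparing loses exactly that check:

```python
def negotiated_etag(base_etag, client_accepts_gzip):
    """ETag of the variant the server would send for this request,
    using a hypothetical '-gzip' decoration for the encoded variant."""
    return base_etag[:-1] + '-gzip"' if client_accepts_gzip else base_etag

def should_send_304(if_none_match, base_etag, client_accepts_gzip):
    """304 only if one of the offered ETags matches the ETag of the
    response that would actually be sent now."""
    current = negotiated_etag(base_etag, client_accepts_gzip)
    offered = [e.strip() for e in if_none_match.split(",")]
    return current in offered
```

A strip-the-suffix shortcut would wrongly return 304 in the second case below, telling the client its cached gzip variant is still the right answer when the server would now send the identity variant.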
Re: Wrong etag sent with mod_deflate
Sat 2006-12-09 at 19:02 +0100, Justin Erenkrantz wrote: AIUI, many caches do not allow the response to be cached at all if it doesn't have an ETag. Most still cache it, but for example Mozilla has bugs wrt Vary handling if there is no ETag and the conditions change.. In the ideal world, I think a weak ETag would be the 'right' thing I don't have an opinion on whether you return a strong or weak ETag, but it must still be different from the ETag of the identity encoded object, not just the same ETag flagged as weak. Your main decision on whether the ETag on the mod_deflate generated entity should be weak or strong should be: a) If the original entity's ETag is weak, then the mod_deflate generated one MUST be weak as well.. b) If mod_deflate can not be trusted to generate the exact same octet representation on each request then the ETag of the generated entity MUST be weak. Else the ETag SHOULD be strong. however, the current spec doesn't allow conditional GETs to work with weak ETags. Err.. Weak ETags are allowed in If-None-Match for GET/HEAD. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Fri 2006-12-08 at 15:40 -0800, Justin Erenkrantz wrote: I think we all (hopefully) agree that a weak ETag is ideally what mod_deflate should add. Please read RFC 2616 13.6 Caching Negotiated Responses for an in-depth description of how caches should handle Vary. And please stop lying about Squid. If you think something in our cache implementation of Vary/ETag is not right then say what, and back it up with an RFC reference. My base requirement is that you comply with If-None-Match. For this you MUST return a different ETag. It does not matter to me if it's weak or strong, as the main concern for a cache is GET/HEAD requests. Flagging the existing ETag as weak does not make it a different ETag, as If-None-Match on GET/HEAD allows for the weak comparison function where weakness is ignored. 13.3.3 Weak and Strong Validators - The weak comparison function: in order to be considered equal, both validators MUST be identical in every way, but either or both of them MAY be tagged as weak without affecting the result. Regards Henrik signature.asc Description: This is a digitally signed message part
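The weak comparison function quoted above is trivial to sketch, and the sketch shows exactly why merely flagging the same ETag as weak does not make it "different" for If-None-Match on GET/HEAD (illustrative Python; the function names are not from any real implementation):

```python
def opaque(etag):
    """Strip the W/ weakness marker, keeping the quoted opaque value."""
    return etag[2:] if etag.startswith("W/") else etag

def weak_compare(a, b):
    """RFC 2616 13.3.3 weak comparison: validators are equal if identical
    once either side's weakness tag is ignored."""
    return opaque(a) == opaque(b)
```

So W/"1234" and "1234" compare equal under the weak function; a cache revalidating the identity entity with If-None-Match: "1234" would still get a 304 against mod_deflate's W/"1234", which is precisely the collision Henrik objects to.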
Re: Wrong etag sent with mod_deflate
Sat 2006-12-09 at 05:44 -0500, [EMAIL PROTECTED] wrote: It's relevant to the extent that I think there are still some things missing from the RFCs with regards to all this, which is why a piece of software like SQUID might be doing the wrong thing as well. After reading the RFC on this topic many many times I can not agree that it's that incomplete. The scheme set by the RFC is quite complete as long as you stay with strong ETags, allowing for cache correctness, update serialization and many good things. Situations requiring weak ETags also work out pretty well for cache correctness thanks to If-None-Match, but not other operations, as weak ETags are banned from both non-GET/HEAD requests and If-Match conditions. ...and, currently, if the cache has stored both a compressed and a non-compressed version of the same entity received from Apache (sic: mod_deflate) then the same (strong) ETag is returned in the conditional GET for both of the cached variants. Hmmm... begins to look like a problem... but is it really?... It is. See 13.6 Caching Negotiated Responses (all of it). And then skim over 14.26 If-None-Match, and finally read 10.3.5 304 Not Modified. Then piece them together. Also take note that nowhere is there any requirement on the cache to evaluate any server driven content negotiation inputs (Accept-XXX etc). This responsibility lies fully with the origin server and is reflected back via ETag. Caches evaluate Vary in finding the correct response entity. If the server says that any one of the representations, as indicated by the ETag in a 304 response, is okay, okay means fresh. Not only that, it also tells which entity among the N cached ones is valid to send as response to this request. happen to share the same (strong) ETag... if SQUID is delivering stale compressed variants when a 304 response says that the original identity variant is not fresh then that's just a colossal screw-up in the caching code itself.
The 304 says: Send the entity with the ETag XXX, it's still fresh. Nothing more. It does not indicate whether this is the identity or gzip encoded entity, nor the content length, content type or anything else relevant to the actual content besides the ETag and/or Content-Location. Regardless of what the server says... how could you ever get into a situation where you would consider a compressed variant of an entity fresh when the identity version is now stale? As HTTP did not consider dynamic content encoding, it sees the two entities as different objects (i.e. file and file.gz) and does not enforce a strict synchronization between the two. The only requirement set in the RFC is that the origin server SHOULD make sure the two representations on the server are in sync. is seriously confused even if the ETags are the same and the cache is sending back stale compressed variants when the identity variant (strong ETag value) is also stale. I don't know what condition you refer to here. the Squid cache (2.6) only remembers the last seen of the two, as the later response with the same ETag overwrites the first.. There's still something missing from the specs or something. Not that I can tell. When an exact, literal interpretation of a spec tends to defy common sense... my instinct is to suspect the spec itself. In what way? There is something in your reasoning I don't get. DCE (Dynamic Content Encoding) is a valid concept even if it wasn't sufficiently imagined at the time the specs were codified. It works. It works WELL... and it is something that OUGHT to always be possible if the RFCs mean anything at all. And it is possible. Just that you need to pay attention to Content-Location, ETag and Content-MD5, as all of these are affected by dynamically altering the entity via server driven content negotiation with static or dynamic recoding of the entity.
One of the main prime directives for developing Apache 2.0 at all was to finally re-org the IO stream so that schemes like DCE could be done more easily than was already being done in the 1.3.x framework. Mission was accomplished. Filtering was born. It would be a shame to consider abandoning one of the very concepts that gave birth to Apache 2.0 for the sake of a few more lines of code that could take it into the end zone. Agreed. No argument here. Transfer-Encoding is about a DECADE overdue now. And as already indicated it should be a piece of cake to add to mod_deflate, and as HTTP support evolves in clients and caches it is likely to lessen the complexity of dealing with mod_deflate and conditionals considerably. In the case of compressed entities it would still be a good idea to always add a standard header which indicates the original uncompressed content-length (if it's possible to know it). There is no such header in HTTP, but you are free to propose one. But it's worth noting that this information also exists in the gzip encoding. Current specs do not handle
Re: Wrong etag sent with mod_deflate
Sat 2006-12-09 at 20:38 -0500, [EMAIL PROTECTED] wrote: If you are referring to Justin quoting ME let me supply a big fat MEA CULPA here and say right now that I haven't looked at the SQUID Vary/ETag code since the last major release and I DO NOT KNOW FOR SURE what SQUID is doing (or not doing) if/when it sees the same (strong) ETag for both a compressed and an identity version of the same entity. That's not the problem. The problem is that Apache tells us that we should use whatever we got first on all subsequent responses. The chain of events leading to the problem is as follows: 1. We forward request A. Let's say this claims Accept-Encoding: gzip. 2. Apache mod_deflate returns a gzipped entity with ETag 6bf1f7-6-1b6d6340 and Vary: Accept-Encoding. 3. We get another request with a different Accept-Encoding value. This gets forwarded to Apache with an If-None-Match header listing the ETags of the entities we have, i.e. If-None-Match: 6bf1f7-6-1b6d6340. 4. The entity hasn't changed and Apache responds with a 304 with ETag 6bf1f7-6-1b6d6340, telling us that the valid response entity for this request is the previously received response with ETag 6bf1f7-6-1b6d6340, plus any updated HTTP headers for that response. The problem arises in '4'. Period. I DO NOT KNOW FER SURE. Then stop saying that Squid is broken, does not implement X, or broken clients such as Squid. All I ask. Fine to say that you do not understand why it is a problem for Squid. In my other posts, I was suggesting, however, that even if an upstream content server (Apache) is not sending separate unique ETags I am still having a hard time understanding why that would cause SQUID to deliver the wrong varied response back to the user. Simply because Apache explicitly tells it to do exactly that in its 304 response. A compressed version of an entity IS the same entity... Nope. It's a different representation of the same resource, but not the same entity in terms of HTTP.
This is the key difference between Content-Encoding and Transfer-Encoding. Content-Encoding is a property of the entity. Transfer-Encoding is a property of how the message is sent, just like chunked, with no implications on the entity. The problem arises from trying to use Content-Encoding as if it was Transfer-Encoding. Many years ago we had the same discussion about Vary, and when the dust settled all understood the problem about not sending correct Vary in the responses. Now as the cache implementations are evolving we are hitting the exact same problem again in a different form, this time due to ETag collisions. I am sorry that we did not realize the full extent of the brokenness of these responses the first time, when Vary was discussed. for all intents and purposes... it just has compression applied. One cannot possibly become stale without the other also being stale at the same exact moment in time. HTTP does not make this strict freshness relation between entities of the same URI, but that's a different question and generally not a big problem. At the moment... yes... I do... but if you read my other posts I also have a feeling the reason I can't quote you Verse and Chapter from an RFC is because I have a sneaking suspicion that there is something missing from the ETag/Vary scheme that can lead to problems like this... and it's NOT IN ANY RFC YET. And what I am saying is that Apache mod_deflate is violating a MUST level requirement on ETag in the RFC, thereby making the caching section of the same RFC break down. In other words... you may be doing exactly what hours and hours of reading an RFC seems to be telling you you SHOULD do... but there still might be something else that OUGHT to be done. And I am telling you that this part of the RFC is complete, save for the small detail that the server can not signal that both the compressed and identity encodings become stale when one changes, only one at a time.
There will always be the chance that some upstream server will (mistakenly?) keep the same (strong) ETag on a compressed variant. True, there will always be non-compliant implementations out there in various forms, and they will continue causing problems at least for as long as it's about MUST level violations. In many cases (this one included) workarounds can be found, but that does not justify the ones being non-compliant continuing to be intentionally non-compliant when informed about the problem. People are not perfect and they make mistakes. I still think that even when that happens any caching software should follow the be lenient in what you accept and strict in what you send rule and still use the other information available to it Which in this case is none. The only information we ever get from Apache is the ETag of the supposedly valid to use response, and possibly new freshness details about the same. (sic: What the client really asked for and expects) and do the right thing. Only the cache knows
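The four-step chain of events Henrik lays out can be condensed into a tiny simulation (illustrative Python, not Squid code; the ETag is taken from the message, the stored body is a placeholder). With the gzip and identity variants sharing one strong ETag, the 304 in step 4 can only name the entity the cache stored first:

```python
# etag -> (content_encoding, body); a real cache keys this per URL.
cache = {}

# Steps 1-2: first client accepts gzip; mod_deflate answers with the
# unchanged ETag and Vary: Accept-Encoding, so the gzip entity is stored
# under the identity entity's ETag.
cache['"6bf1f7-6-1b6d6340"'] = ("gzip", b"<compressed bytes>")

# Steps 3-4: a client without gzip support causes a variant miss; the
# cache revalidates with If-None-Match: "6bf1f7-6-1b6d6340" and the 304
# echoes that same ETag, naming the stored gzip entity as the valid
# response for this request.
encoding, body = cache['"6bf1f7-6-1b6d6340"']
# encoding is "gzip": the cache is told to serve a gzipped body to a
# client that never asked for gzip.
```

The simulation has no bug on the cache side; the wrong answer falls directly out of the colliding ETag, which is the point of the message above.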
Re: Wrong etag sent with mod_deflate
Fri 2006-12-08 at 14:47 +0100, Justin Erenkrantz wrote: mod_deflate is certainly not creating a new resource It is creating a new HTTP entity. Not a new object on your server, but still a new unique HTTP entity with different characteristics from the identity encoding. If we were talking about transfer-encoding then you would be correct, as it only alters the encoding for transfer purposes and not the HTTP entity as such, but this is content-encoding. Content encoding is a property of the response entity. The main reason why things get blurred is because the creation of this entity is done on the fly instead of creating a new resource on the server like HTTP expects. As a result you need to be very careful with the ETag and Content-Location headers. Not modifying the ETag (including just making it weak) says that the identity and gzip encodings are semantically equivalent and can be exchanged freely. In other words it says it's fine to send the gzip encoding to all clients (which we all know is not the case). Not modifying/removing Content-Location is less harmful but will cause cache bouncing, as each time the cache sees a new response entity for a given URI any older ones with the same Content-Location will get removed from the cache. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Fri 2006-12-08 at 14:40 +0100, Justin Erenkrantz wrote: Uh, no, they *are* semantically equivalent - but, yes, not syntactically (bit-for-bit) equivalent. You inflate the response and you get exactly what the ETag originally represented. Two entities are only semantically equivalent if they can be interchanged freely at the HTTP level with no semantic difference in the end-user result. The identity and gzip encodings can not be said to bidirectionally have the same semantic meaning, as a gzip encoded entity is pure rubbish to a recipient not understanding gzip. No more than a Swedish translation of a document could be said to be semantically equivalent to a Greek translation of the same document. Content-Encoding is a case of unidirectional semantic equivalence, where the identity encoding can be substituted for the gzip encoding with kept semantics, but for ETag bidirectional semantic equivalence is required, which is not fulfilled as the gzip encoding can not be substituted for the identity encoding without risking a significant semantic difference to the recipient. The only real difference of a weak ETag compared to a strong one is that the weak one does not guarantee octet equality. All other restrictions apply. Plus a bunch of protocol restrictions where weak ETags are not allowed to be used. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
Thu 2006-12-07 at 02:42 +0100, Justin Erenkrantz wrote: -1 on adding semantic junk to the existing ETag (and keeping it strong); that's blatantly uncool. Any generated ETag from mod_deflate should either be the original strong version or a weak version of any previous etag. mod_deflate by *definition* is just creating a weak version of the prior entity. You basically only have two choices: a) Make mod_deflate not send an ETag on modified responses. b) Modify the value (within the quotes) of the ETag somehow. And if mod_deflate can not be trusted to always return the same octet representation, make sure to use a weak ETag, unless the ETag generation is also tightly coupled to the octet representation, guaranteeing a different ETag should mod_deflate encode slightly differently. And to be fully compliant you also need to pay attention to the Content-Location header. Here I don't see much choice but to not send Content-Location in mod_deflate mangled responses (but it can be kept on the original response, no problem there). RFC 2616 13.6 Caching Negotiated Responses, last paragraph. mod_deflate does properly stick in the Vary header, so caches already have enough knowledge to know what's going on anyway even without a fix. (This is probably why mod_cache doesn't flag it as an error.) My opinion is to fix the protocol and move on... -- justin The protocol is quite fine as it is, and not easy to change. As it is now it's mainly a matter of understanding that mod_deflate does create a completely new entity from the original one. To the protocol it's exactly the same as when using mod_negotiation and having both the identity and gzip encoded entities on disk. The fact that you do this encoding on the fly is of no concern to HTTP. Another option is to explore the use of gzip transfer encoding instead of content encoding.
In transfer encoding none of these problems apply, as it's done on the transport level and not the entity level, but it's not that well supported in clients unfortunately.. Regards Henrik signature.asc Description: This is a digitally signed message part
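Option (b) above, modifying the value within the quotes, can be sketched as follows (illustrative Python, not mod_deflate code; the "-gzip" suffix and function name are assumptions, chosen to match the decoration discussed earlier in the thread). The sketch also folds in Henrik's weakness rules: a weak original stays weak, and the result can be forced weak when the encoder's output is not byte-reproducible:

```python
def decorate_etag(etag, suffix="-gzip", force_weak=False):
    """Derive a distinct ETag for an encoded variant by altering the
    opaque value inside the quotes, preserving any W/ weakness marker."""
    weak = etag.startswith("W/")
    value = etag[2:] if weak else etag
    if not (value.startswith('"') and value.endswith('"')):
        raise ValueError("expected a quoted ETag value")
    value = value[:-1] + suffix + '"'
    # Weak original => weak result; force_weak covers encoders that may
    # not reproduce the exact same octets on every request.
    return ("W/" if weak or force_weak else "") + value
```

Because the decorated value differs inside the quotes, it stays distinct under both the strong and the weak comparison functions, which is what If-None-Match handling requires.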
Re: Wrong etag sent with mod_deflate
On Fri, 2006-12-08 at 15:03 -0500, [EMAIL PROTECTED] wrote: To ONLY ever use ETag as the end-all-be-all for variant identification is, itself, a mistake. Well, this area of the HTTP specs is pretty clear in my eyes, but then I have read it up and down too many times, unwinding the tangled web found in there. An entity (including encoding) is identified by the request URI + Content-Location. A specific version of an entity is identified by its unique ETag. Vary: tells which headers the server used in server-driven negotiation of which entity to respond with; Accept-Encoding is one input to this. A strong ETag must be unique among all variants of a given URI, that is, all different forms of entities that may reside under the URI and all their past and future versions. A weak ETag may be shared by two variants/versions if and only if they can be considered semantically equivalent and mutually exchangeable at the HTTP level with no semantic loss. For example, different levels of compression, or minor changes of negligible or no importance to the semantics of the resource (the hit counter example in the specs). Both pieces of software ( SQUID and Apache ) need just a little more code to finally get it right. It's correct that the current Squid implementation is not flawless. Most notably it has very poor handling of cache invalidations at the moment. Don't forget about Content-Length, either. If 2 different responses for the same requested entity come back with 2 different Content-Lengths and there is no Vary: or ETag then regardless of any other protocol semantics the only SANE thing for any caching software to do is to recognize that, assume it is not a mistake, and REPLACE the existing entity with the new one. Caches by nature tend to replace what they have with what they get. Yea.. sure... you might get a lot of cache bounce that way but at least you are returning a fresh copy. How would Content-Length changes cause cache bouncing?
It is not possible for 2 EXACTLY identical representations of the same requested entity to have different content lengths. If the lengths are different, then SOMETHING is different with regards to what you have in your cache. Yes, but when would this be seen? We only get the ETag from Apache, not the Content-Length. The specs forbid Apache from sending the Content-Length or other entity headers in 304 responses, partly to make sure entities do not get corrupted by errors in the origin server side implementation of server-driven content negotiation. No protocol ( sic: set of rules ) can ever cover all the realities. ( Good ) software knows how to make common sense as well. Indeed, and that is why we are going slow on implementing the more advanced features of the specs. But violating MUST level protocol requirements is not common sense. And if you actually follow the specs, these parts do make great sense once you get the picture that ETags MUST be unique across all entity versions of a given URI. The only poor part I have seen in this area of the specs is that the If-None-Match condition is perhaps a bit blunt, only telling the end result, the ETag of the valid response entity of a negotiated resource, not how the server came to that conclusion. This adds a few more round trips to the origin than would be required only to figure out that Content-Language: en is OK both for Accept-Language: en and Accept-Language: en, sv, but that's about it. (Yes, I intentionally avoided Accept-Encoding here to illustrate the point; the mechanism is exactly the same however.) RFC 2616 3.11 Entity Tags: A strong entity tag MAY be shared by two entities of a resource only if they are equivalent by octet equality. An entity tag MUST be unique across all versions of all entities associated with a particular resource. A given entity tag value MAY be used for entities obtained by requests on different URIs. See also 14.26 If-None-Match, and numerous other references to ETag.
I can bombard you with long chains of supporting claims from the RFC if you like, depending on which parts of the equation you feel are loosely connected. Just tell me which part you don't trust and I'll happily help you see the light. a) That identity and gzip content-encoding of the same resource represent different entities of the same resource. b) That different entities of the same resource MUST have different (strong) ETags. c) That gzip and identity encoding are not semantically equivalent. d) That the weak ETag W/X is semantically equivalent to the strong ETag X with the same quoted value. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
On Fri, 2006-12-08 at 11:44 -0800, Roy T. Fielding wrote: In other words, Henrik has it right. It is our responsibility to assign different etags to different variants because doing otherwise may result in errors on shared caches that use the etag as a variant identifier. Thanks ;-) Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
On Fri, 2006-12-08 at 22:28 +0100, Henrik Nordstrom wrote: A strong ETag must be unique among all variants of a given URI, that is all different forms of entities that may reside under the URI and all their past and future versions. I forgot the last piece there, which clears many doubts: entities from different URIs may share the same ETag (or even Content-Location) with no implications on any form of equivalence between the two. Also, I am sorry that my use of terms is a bit messed up wrt entity vs variant vs version, but so are the specs.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: 2.2.x as a transparent reverse proxy
On Fri, 2006-12-08 at 22:04, Nick Kew wrote: How does a transparent reverse proxy differ from a reverse proxy as we know and document it? The Linux cttproxy patch allows proxies to be fully transparent, masquerading using the original client's source address on the connections to the backend. It has some concerns at the TCP/IP layer and a lot of restrictions on how it can be deployed, but it eases deployment in some cases as the origin server logs are not so disturbed.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Wrong etag sent with mod_deflate
On Thu, 2006-12-07 at 02:31 +0100, Justin Erenkrantz wrote: mod_deflate should just add the W/ prefix if it's not already there. -- justin No, that won't work. You would still be just as non-conforming by doing that. But if mod_deflate may produce different octet-level results on different requests for the same original entity, then it must do this in addition to other transforms of the ETag. The identity and gzip encodings are not bidirectionally semantically equivalent, and additionally the normal (weak) conditional comparison of W/X to X is true. See RFC 2616 13.3.3 Weak and Strong Validators. You must make the value of the ETag differ between the two entities. Regards Henrik signature.asc Description: This is a digitally signed message part
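The weak and strong comparison functions of RFC 2616 13.3.3 can be sketched roughly as follows (an illustrative sketch under my own names, not code from any server). Note how the weak comparison treats W/"x" and "x" as matching, which is exactly why adding the W/ prefix alone is not enough:

```python
def parse_etag(tag):
    """Split an entity tag into (is_weak, opaque_value)."""
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag

def strong_compare(a, b):
    # Strong comparison: both tags must be strong and byte-identical.
    a_weak, a_val = parse_etag(a)
    b_weak, b_val = parse_etag(b)
    return not a_weak and not b_weak and a_val == b_val

def weak_compare(a, b):
    # Weak comparison: opaque values compared, weakness ignored.
    return parse_etag(a)[1] == parse_etag(b)[1]
```

So a cache doing a weak conditional match would still consider W/"x" a valid match for "x"; only a different opaque value keeps the two variants apart.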
Re: mod_disk_cache summarization
On Fri, 2006-10-27 at 23:33 +0200, Graham Leggett wrote: A second approach could involve the use of the Etags associated with file responses, which in the case of files served off disk (as I understand it) are generated based on inode number and various other uniquely file-specific information. How ETags are generated is extremely server dependent, and they are not guaranteed to be unique across different URLs. You cannot at all count on two files having the same ETag but different URLs being the same file, unless you are also responsible for the server providing all the URLs in question and know that the server guarantees this behavior of ETag beyond what the HTTP specification says. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: mod_disk_cache summarization
On Sat, 2006-10-28 at 00:21 +0200, Henrik Nordstrom wrote: On Fri, 2006-10-27 at 23:33 +0200, Graham Leggett wrote: A second approach could involve the use of the Etags associated with file responses, which in the case of files served off disk (as I understand it) are generated based on inode number and various other uniquely file-specific information. How ETags are generated is extremely server dependent, and they are not guaranteed to be unique across different URLs. You cannot at all count on two files having the same ETag but different URLs being the same file, unless you are also responsible for the server providing all the URLs in question and know that the server guarantees this behavior of ETag beyond what the HTTP specification says. Content-MD5 might be possible to use for this purpose of identifying the same file from different URLs, if it wasn't for the stupid facts that a) Few if any servers send Content-MD5. b) The HTTP standard is a bit ambiguous on the meaning of Content-MD5, and it can mean different things on 204 responses depending on who reads the spec.. c) There is no conditional to ask for a file only if the Content-MD5 differs. The only way to get the Content-MD5 without the actual content, if it's the same, is to use a HEAD request and manually compare the header. And due to the ambiguity mentioned above I would not count on Content-MD5 being correct in HEAD responses.. d) And even if the Content-MD5 is the same, it says nothing about the entity headers (Content-Type etc). Two responses with different entity headers are different responses even if their body is the same. If you do use Content-MD5 or a similar checksum, you had better verify that the checksum matches the content before migrating it to another URL. If not, you could open yourself up to cache pollution attacks. Regards Henrik signature.asc Description: This is a digitally signed message part
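Verifying a Content-MD5 header against the actual body (RFC 1864: the header carries the base64 encoding of the 128-bit MD5 digest of the entity body) might look like this sketch; the function name is my own:

```python
import base64
import hashlib

def content_md5_matches(body, header_value):
    """Return True if the Content-MD5 header value matches the entity
    body. Per RFC 1864 the header is the base64-encoded MD5 digest of
    the (decoded) entity body."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode("ascii") == header_value.strip()
```

A cache would run this check before migrating the content to another URL, dropping the response on mismatch rather than risking cache pollution.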
Re: Issue with persistent http proxy backend connection
On Thu, 2006-10-12 at 13:19 +0200, Ruediger Pluem wrote: I do not think that this matters all too much, because the backend closes the connection *immediately* after sending out the response. To help this, perhaps there should be a check just before sending the response as well, sending Connection: close if it's likely this thread will terminate after this response? MaxRequestsPerChild certainly can be evaluated before the response is sent. Regards Henrik signature.asc Description: This is a digitally signed message part
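The pre-send check suggested here could look roughly like this (a sketch with assumed names, not Apache's actual MPM code):

```python
def should_advertise_close(requests_served, max_requests_per_child):
    """True if the response about to be sent is the last one this child
    will handle (MaxRequestsPerChild reached), so 'Connection: close'
    should be advertised up front instead of closing abruptly right
    after the response."""
    if max_requests_per_child <= 0:   # 0 means unlimited in Apache
        return False
    return requests_served + 1 >= max_requests_per_child
```

The point is simply that the limit is known before the response goes out, so the header can be set in time for the peer to stop reusing the connection.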
Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities
On Thu, 2006-09-21 at 12:18 +0200, Plüm, Rüdiger, VF EITO wrote: IMHO this waits for a DoS to happen if the requestor can trick the backend to get stuck with the request. So one request of this type would be sufficient to DoS the whole server if the timeout is not very short. How would this be more of a DoS than just flooding the proxy with connections to a non-existing server? The delay is per URL, not the whole requested site. Sure, an attacker can use this to make it look like a site with this problem is non-responsive for users via the cache, but it's not that difficult to handle. Maybe you already do what we do in Squid: ignore the cache on reload requests. That solves the problem quite nicely. However, in Squid we do start transmitting what is available immediately, but our design is somewhat different. To avoid DoS all you need to do is keep monitoring the client connection, and abort if the client aborts while waiting for the entity to become available. Regards Henrik signature.asc Description: This is a digitally signed message part
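The "ignore the cache on reload requests" behavior boils down to checking the client's request for the no-cache signals browsers send on reload. A simplified sketch (Squid's real logic is considerably more involved; the function name is mine):

```python
def is_client_reload(request_headers):
    """True if the request looks like a browser reload, in which case a
    cache can bypass any pending or stale entry and go straight to the
    origin rather than making the client wait on it."""
    cache_control = request_headers.get("Cache-Control", "").lower()
    pragma = request_headers.get("Pragma", "").lower()
    return ("no-cache" in cache_control
            or "max-age=0" in cache_control
            or "no-cache" in pragma)
```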
Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities
On Thu, 2006-09-21 at 00:19 +0300, Issac Goldstand wrote: The only really relevant line I saw (in a quick 15 minute review) is RFC 2616 3.6 (regarding transfer-encodings): Transfer-coding values are used to indicate an encoding transformation that has been, can be, or may need to be applied to an entity-body in order to ensure safe transport through the network. This differs from a content coding in that the transfer-coding is a property of the message, not of the original entity. Based on that, it seems to be ok. However, we'd have to remove strong ETags as a side-effect if it was done (since strong ETags change when entity headers change). Hmm... transfer-encoding is a function of the transport alone, not the entity. Don't mix these up. The entity is unaltered by transfer-encoding; it's only how it's transferred over the transport (i.e. TCP) which is altered. This also means that transfer-encoding is hop-by-hop. In applications layered along the intentions of the RFC, a cache (at any level, browser or proxy) would never see any transfer encoding, as this should have been decoded by the receiving protocol handler; only identity encoding should be seen. This is different from Content-Encoding, which does alter the entity as such. Modifications of the Content-Encoding must also account for ETags, as no two entity variants of the same URL may carry the same strong ETag. And move trailers into headers (another reason to rewrite the headers file at the end). And probably other things which I'm not thinking of... That's always ok. The division of main and trailer headers is also mainly a transport thing, only available with chunked encoding btw, as it's the only transfer mechanism which allows for a trailer. The specs allow you to drop any trailer headers if they are hard to deal with, or to merge them with the main header if you can. In a direct chunked-chunked proxy transfer you should proxy the trailer as well. In chunked-identity transfer (i.e. HTTP/1.1 response to an HTTP/1.0 client) the trailer is silently dropped, as there is no means to transfer the trailer in HTTP/1.0, and you can't rewind a TCP stream to add data earlier... Regards Henrik signature.asc Description: This is a digitally signed message part
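Merging trailer headers into the main header set while decoding the chunked transfer coding, as the spec permits, can be sketched like this. A minimal decoder for illustration only: it assumes a complete, well-formed chunked body already in memory and does no error handling:

```python
def decode_chunked(raw):
    """Decode a chunked-encoded body held in `raw` (bytes) and return
    (body, trailer_headers), with any trailer fields parsed into a
    dict ready to be merged into the main response headers."""
    body = b""
    trailer_headers = {}
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # the chunk-size line may carry extensions after ';'
        size = int(raw[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:                 # last-chunk: trailer follows
            break
        body += raw[pos:pos + size]
        pos += size + 2               # skip chunk data and its CRLF
    trailer = raw[pos:].rstrip(b"\r\n")
    if trailer:
        for line in trailer.split(b"\r\n"):
            name, _, value = line.partition(b":")
            trailer_headers[name.strip().decode()] = value.strip().decode()
    return body, trailer_headers
```

A proxy sending the response on to an HTTP/1.0 client would then either fold `trailer_headers` into the header block it writes first, or drop them, exactly the two options described above.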
Re: apache 2.2 crashes at the start time in mod_dbd.c then preparing AuthDBDUserPWQuery
On Sun, 2006-07-23 at 00:10 +0100, Nick Kew wrote: But if you look at the full traceback and cross-reference it to the source, I think that looks improbable. Do you have sufficient gcc/gdb expertise to shed real light on this? Not really, only experience.. From what I have seen, the causes of significantly garbled/nonsense arguments in stack traces are: 7 of 10, -O2 somehow messing with the arguments or otherwise making gdb confused; 2 of 10, smashed stacks in the called function; 1 of 10, bugs in the calling function or earlier actually passing nonsense data to the function. The first and last can be identified by going up to the calling function and inspecting what the arguments should have been; the middle by hexdumping the stack contents and looking at matching usage of nearby local arrays. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: apache 2.2 crashes at the start time in mod_dbd.c then preparing AuthDBDUserPWQuery
On Sat, 2006-07-22 at 18:00 +0100, Nick Kew wrote: #3 0x08081d67 in ap_dbd_prepare (s=0x8daf5a0, query=0x Address 0x out of bounds, label=0x Address 0x out of bounds) at mod_dbd.c:150 Note: this could be -O2 or higher optimizing away the variables when they are no longer needed. I have seen such things happen very often on many platforms. It does not necessarily indicate a bug or even a problem.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: [Patch]: Do not compress bodies of header only requests in mod_deflate
On Tue, 2006-07-18 at 00:47 +0200, Ruediger Pluem wrote: And this is exactly the question: Is it ok for the HEAD response to differ from the GET response with respect to T-E and C-L headers? It's not, in the case of C-L. For a start, HEAD is used by quite many robots with simplistic caches to verify that the copy they have is current and correct. The RFC is quite strict that the entity headers of a HEAD response SHOULD match those of an identical GET request, so a difference in C-L is not acceptable by the RFC. (T-E is transport, and may differ.) It's a pity T-E gzip isn't deployable. It would eliminate this whole question, as it's not an entity transform.. Regards Henrik signature.asc Description: This is a digitally signed message part
Re: Compiling a C++ module with g++ on Solaris
On Sun, 2006-06-11 at 18:17 +0100, Phil Endecott wrote: Is it possible that there is some libstdc++ initialisation that hasn't happened? I could imagine that this would require special support from the linker or the dlopen stuff, and that that behaves differently with Sun's libc and linker than on Linux. Not too unlikely. A simple test is to see whether it makes any difference to link the Apache binary with g++ instead of gcc. Regards Henrik signature.asc Description: This is a digitally signed message part