Re: AW: AW: Client authorization against LDAP using client certificates

2008-07-04 Thread Henrik Nordstrom
On fre, 2008-07-04 at 15:43 +0200, Müller Johannes wrote:

 To support more than one authentication method at a time we would have to do 
 fallback like AuthType Cert, Basic.

Or for that matter AuthType Digest, Basic.

Regards
Henrik


signature.asc
Description: This is a digitally signed message part


Re: Adding purge/invalidation to mod_cache

2008-05-30 Thread Henrik Nordstrom
On fre, 2008-05-30 at 11:06 +0200, Colm MacCárthaigh wrote:

 Yep, Squid will delete all variations of an entity when you use
 Accept: */*, that isn't easy with our current approach, but I'll see
 what I can do - it would be nice.

Squid isn't quite that good on purging variants either..

Regards
Henrik




Re: bugs/inappropriate coding practice discovered by interprocedural code analysis for version 2.2.8 of Apache

2008-05-15 Thread Henrik Nordstrom
On tor, 2008-05-15 at 21:00 +0200, Ruediger Pluem wrote:
   \apache\src\log.c(682):apr_file_puts(errstr, logf);
 
 I see nothing reasonable that we can do in this situation but ignoring the 
 error.

syslog?

Regards
Henrik



Re: mod_deflate Vary header tweak

2008-04-29 Thread Henrik Nordstrom
On tis, 2008-04-29 at 09:42 +0200, André Malo wrote:

 Just to be exact - it *might* vary, depending on how no-gzip is set. 

But then most likely not based on Accept-Encoding but on other headers such
as User-Agent or the source IP...

In any event I fully agree that it's then the responsibility of whatever
set the no-gzip flag to also add a proper Vary attribution to the
response.

Only if no-gzip is set unconditionally should Vary not be added by the
one setting no-gzip. But it's acceptable (even if not 100% correct) to
not add Vary when setting no-gzip, if one then accepts that the
uncompressed variant may get sent to more clients by downstream cache
servers.

Regards
Henrik



Re: Expect: non-100 messages

2008-04-05 Thread Henrik Nordstrom
fre 2008-04-04 klockan 00:01 +0200 skrev Julian Reschke:

 I think it's clear that a proxy that sees Expect: foobar will have to 
 immediately fail with status 417 if it doesn't know what foobar means.

Yes, that's a MUST level requirement in 14.20 Expect, third paragraph,
and further clarified with another MUST level requirement in the fifth
paragraph..

But older versions of HTTP/1.1 did not specify Expect, and
implementations based on those versions will just pass it through like
any other extension header. But I somehow doubt the proxy vendors are
stuck at that..

Regards
Henrik



Re: Pre-release test tarballs of httpd 1.3.40, 2.0.62 and 2.2.7 available

2008-01-05 Thread Henrik Nordstrom
On sön, 2008-01-06 at 01:20 +, Nick Kew wrote:

 Do you mean as in tcpdump -x?  I've uploaded a pair of dumps
 (one of client-proxy, the other of proxy-server) at the same
 location.

tcpdump -p -i any -s 1600 -w traffic.pcap port 80

Regards
Henrik



Re: thoughts on ETags and mod_dav

2007-12-30 Thread Henrik Nordstrom
On sön, 2007-12-30 at 12:54 +0100, Werner Baumann wrote:

 Is this true. Is there no way for a cache to uniquely identify variants, 
 but using the cache validator? Isn't this a flaw in the protocol?

The Content-Location also works as a variant identifier, but requires
that each variant have a unique direct URI bypassing negotiation, and it
is therefore not always applicable (e.g. mod_deflate).

Regards
Henrik



Re: svn commit: r593778 - /httpd/httpd/branches/2.2.x/STATUS

2007-11-11 Thread Henrik Nordstrom
On sön, 2007-11-11 at 12:44 +, Nick Kew wrote:

 Note incoming c-l much earlier in the request processing cycle,
 and use that for ap_http_filter?  This would make sense for apps
 that don't require c-l.

Except that you would then need to buffer the whole message to compute
the length..

Another way to deal with such cases is to respond with 411 before 100
Continue, and let the client compute C-L.. This is what the RFC
recommends if it's known the next-hop is HTTP/1.0.

Regards
Henrik




Re: Content-Type: application/x-www-form-urlencoded and Content-length

2007-10-16 Thread Henrik Nordstrom
On tis, 2007-10-16 at 18:26 +0200, jean-frederic clere wrote:

 I though that a POST for a form returning Content-Type:
 application/x-www-form-urlencoded must have a Content-length (and no
 Transfer-Encoding: chunked). But I can't find this in any documentation
 about it.

It's either Content-Length or chunked. One MUST be used. Content-Length
is strongly preferred where possible, as many servers, proxies and
application gateways can't handle chunked requests, but it is not
possible if the sender wants to apply gzip compression or another
transfer encoding to the request.
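A minimal sketch (not Apache code; the function names are invented) of the two framing options for a request body:

```python
def frame_content_length(body: bytes) -> bytes:
    # Content-Length framing: the preferred option when the size is known up front.
    return b"Content-Length: %d\r\n\r\n%s" % (len(body), body)

def frame_chunked(body: bytes) -> bytes:
    # Chunked framing: hex chunk-size line, chunk data, then the zero-sized last-chunk.
    chunk = b"%x\r\n%s\r\n" % (len(body), body)  # single chunk for simplicity
    return b"Transfer-Encoding: chunked\r\n\r\n" + chunk + b"0\r\n\r\n"
```

With chunked framing the sender can start transmitting before the total size is known, which is exactly what a compressing sender needs.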

Regards
Henrik




Re: thoughts on ETags and mod_dav

2007-10-12 Thread Henrik Nordstrom
On fre, 2007-10-12 at 00:25 -0400, Chris Darroch wrote:

RFC 2616 section 14.24 (and 14.26 is similar) says, If the request
 would, without the If-Match header field, result in anything other than a
 2xx or 412 status, then the If-Match header MUST be ignored.  Thus in
 the typical case, if a resource doesn't exist, 404 should be returned,
 so ap_meets_conditions() doesn't need to handle this case at all.

There is more to HTTP than only GET/HEAD.

If-Match: *
and
If-None-Match: *

are quite relevant even taking only 2616 into account.

Most notably If-None-Match in combination with PUT, used for creating a
new resource only if one does not already exist.

The first examples in PR #38024 also speak for themselves regarding
If-Match: *.
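A hedged sketch of the create-if-absent semantics described above, with a hypothetical in-memory store standing in for the server:

```python
def handle_put(store, uri, body, if_none_match):
    # If-None-Match: * on PUT means "create only if no variant exists" (RFC 2616 14.26).
    if if_none_match == "*" and uri in store:
        return 412  # Precondition Failed: the resource already exists
    created = uri not in store
    store[uri] = body
    return 201 if created else 204  # Created vs. No Content (updated)
```

Two clients racing to create the same resource see 201 for the winner and 412 for the loser, instead of a silent overwrite.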

Regards
Henrik




Re: ETag and Content-Encoding

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 14:23 +0100, Nick Kew wrote:
 http://issues.apache.org/bugzilla/show_bug.cgi?id=39727
 
 We have some controversy surrounding this bug, and bugzilla
 has turned into a technical discussion that belongs here.
 
 Fundamental question:  Does a weak ETag preclude (negotiated) 
 changes to Content-Encoding?

A weak ETag means the response is semantically equivalent both at
protocol and content level, and may be exchanged freely.

Two resource variants with different content-encoding are not
semantically equivalent, as the recipient may not be able to understand
a variant sent with an incompatible encoding.

Sending a weak ETag does not signal that there is negotiation taking
place (Vary does that); all it signals is that there may be multiple but
fully compatible versions of the entity variant in circulation, or that
each request results in a slightly different object where the difference
has no practical meaning (e.g. an embedded unimportant timestamp or
similar).

 deflates the contents.  Rationale: a weak ETag promises
 equivalent but not byte-by-byte identical contents, and
 that's exactly what you have with mod_deflate.

I disagree. They are two very different entities.

Note: If mod_deflate is deterministic and always returns the exact
same encoded version then using a strong ETag is correct.


What this boils down to in the end is:

a) HTTP must be able to tell if an already cached variant is valid for a
new request by using If-None-Match. This means that each negotiated
entity needs to use a different ETag value. Accept-Encoding is no
different in this than any of the other inputs to content negotiation.

b) If the object undergoes some transformation that is not deterministic
then the ETag must be weak, to signify that byte-equivalence cannot be
guaranteed.

Note regarding a: The weak/strong property of the ETag has no
significance here. If-None-Match uses the weak comparison function,
where only the value is compared, not the strength. See 13.3.3, the
paragraph on the weak comparison function.
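The two comparison functions of 13.3.3 can be sketched as follows (a simplified model that ignores entity-tag quoting subtleties):

```python
def weak_compare(a, b):
    # RFC 2616 13.3.3 weak comparison: drop any W/ prefix, compare opaque values.
    strip = lambda tag: tag[2:] if tag.startswith("W/") else tag
    return strip(a) == strip(b)

def strong_compare(a, b):
    # Strong comparison: both tags must be strong (no W/) and byte-identical.
    return a == b and not a.startswith("W/")
```

Under the weak function, `W/"v1"` and `"v1"` match; under the strong function they do not, which is why the weak/strong property is irrelevant to If-None-Match but decisive for range requests.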

Regards
Henrik




Re: ETag and Content-Encoding

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 07:53 -0700, Justin Erenkrantz wrote:

 As before, I still don't understand why Vary is not sufficient to
 allow real-world clients to differentiate here.  If Squid is ignoring
 Vary, then it does so at its own peril - regardless of ETags.

See RFC2616 13.6 Caching Negotiated Responses and you should understand
why returning a unique ETag on each variant is very important. (Yes, the
gzip and identity content-encoded responses are two different variants
of the same resource; see earlier discussions if you don't agree on
that.)
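A toy illustration of the 13.6 mechanism (names invented): the cache lists the ETags of all stored variants in If-None-Match so the server can 304-select the right one, which only works if every variant's ETag is unique:

```python
def revalidate_header(variants):
    # variants: list of (etag, cached_body) pairs the cache holds for one URI.
    # The server replies 304 naming the ETag whose variant matches the new
    # request's negotiation headers; a duplicated ETag would be ambiguous.
    return "If-None-Match: " + ", ".join(etag for etag, _ in variants)
```

If the gzip and identity variants carried the same ETag, the server's 304 could cause the cache to serve the wrong stored body.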

But yes, thinking this over a second time, converting the ETag to a weak
ETag is sufficient to plaster over the problem, assuming the original
ETag is a strong one. Not because it's correct from a protocol
perspective, but because Apache does not use the weak compare function
when processing If-None-Match, so in Apache's world changing a strong
ETag to a weak one is about the same as assigning a new ETag.

However, if the original ETag is already weak then the problem remains
exactly as it is today..

It's also almost the same as deleting the ETag, as you also destroy
If-None-Match processing of filtered responses, which is also why it
works..

 The problem with trying to invent new ETags is that we'll almost
 certainly break conditional requests and I find that a total
 non-starter.

Only because your processing of conditional requests is broken. See
earlier discussions on the topic of this bug already covering this
aspect.

To work properly, the conditionals need to (logically) be processed once
the response entity is known, i.e. after mod_deflate (or another
filter) does its dance to transform the response headers. Doing
conditionals before the actual response headers are known is very
error-prone and likely to cause false matches, as you don't know this is
the response which will be sent to the requestor.

 Your suggestion of appending ;gzip leaks information
 that doesn't belong in the ETag - as it is quite possible for that to
 appear in a valid ETag from another source - for example, it is
 trivial to make Subversion generate ETags containing that at the end -
 this would create nasty false positives and corrupt Subversion's
 conditional request checks.

Then use something stronger, less likely to be seen in the original
ETag. Or fix the filter architecture to deal with conditionals properly,
making this question (collisions) pretty much a non-issue.

Or, until conditionals can be processed correctly in the presence of
filters, drop the ETag on filtered responses where the filter does some
kind of negotiation.

 Plus, rewriting every filter to append or
 delete a 'special' marker in the ETag is bound to make the situation
 way worse.  -- justin

I don't see much choice if you want to comply with the RFC requirements.
The other choice is to drop the ETag header on such responses, which is
also not a nice thing, but it at least complies with the specifications,
making it better than sending out the same ETag on incompatible
responses from the same resource.

Regards
Henrik




Re: ETag and Content-Encoding

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 13:29 -0700, Justin Erenkrantz wrote:

 The issue here is that mod_dav_svn generates an ETag (based off rev
 num and path) and that ETag can be later used to check for conditional
 requests.  But, if mod_deflate always strips a 'special' tag from the
 ETag (per Henrik),

That was only a suggestion on how you may work around your somewhat
limited conditional processing capabilities wrt filters like
mod_deflate, but I think it's probably the cleanest approach considering
the requirements of If-Match and modifying methods (PUT, DELETE,
PROPPATCH etc). In that construct the tag added to the ETag by
mod_deflate (or another entity-transforming filter) needs to be
sufficiently unique that it is not likely to be seen in the original
ETag value.

It's not easy to fulfill the needs of all components when doing dynamic
entity transformations, especially when there is negotiation involved..

Regards
Henrik




Re: ETag and Content-Encoding

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 12:10 -0700, Roy T. Fielding wrote:

  Two resource variants with different content-encoding is not
  semantically equivalent as the recipient may not be able to understand
  an variant sent with an incompatible encoding.
 
 That is not true.  The weak etag is for content that has changed but
 is just as good a response content as would have been received.
 In other words, protocol equivalence is irrelevant.

By protocol semantic equivalence I mean responses being acceptable to
requests.

Example: Two negotiated responses with different Content-Encoding are
not semantically equivalent at the HTTP level, as their negotiation
properties differ, and one cannot substitute one for the other and
expect that HTTP works.

But two compressed response entities whose compression level varies with
the CPU load are.

Note: I am ignoring transfer-encoding here, as it's transport and pretty
much irrelevant to the operation of the protocol other than wire message
encoding/decoding.

  a) HTTP must be able to tell if an already cached variant is valid  
  for a
  new request by using If-None-Match. This means that each negotiated
  entity needs to use a different ETag value. Accept-Encoding is no
  different in this any of the other inputs to content negotiation.
 
 That is not HTTP.  Don't confuse the needs of caching with the needs
 of range requests -- only range requests need strong etags.

I am not. I am talking about If-None-Match, not If-Range. And
specifically the use of If-None-Match in 13.6 Caching Negotiated
Responses.

It's a very simple and effective mechanism, but requires servers to
properly assign ETags to each (semantically in case of weak) unique
entity of a resource (not the resource as such).

Content-Encoding is no different in this than any of the other
negotiated properties (Content-Type, Content-Language, whatever).

Regards
Henrik




Re: Cc: lists (Re: ETag and Content-Encoding)

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 21:44 +0100, Nick Kew wrote:

 The Cc: list on this and subsequent postings is screwed:
 
   (1) It includes me, so I get everything twice.
   OK, I can live with that, but it's annoying.

Use a Message-Id filter?

   (2) It fails to include Henrik Nordstrom, the principal 
   non-Apache protagonist in this discussion.

No problem. I am a dev@ subscriber

Regards
Henrik




Re: ETag and Content-Encoding

2007-10-03 Thread Henrik Nordstrom
On ons, 2007-10-03 at 23:52 +0200, Henrik Nordstrom wrote:
  That is not HTTP.  Don't confuse the needs of caching with the needs
  of range requests -- only range requests need strong etags.
 
 I am not. I am talking about If-None-Match, not If-Range. And
 specifically the use of If-None-Match in 13.6 Caching Negotiated
 Responses.

To clarify, I do not care much about strong/weak ETags. This is a
property of how the server generates the content, with no significant
relevance to caching other than that the ETags as such must be
sufficiently unique (there are some cache impacts of weak ETags, but not
really relevant to this discussion).

If anything I said seems to imply that I only want to see strong ETags
then that's solely due to the use of poor language on my part and not
intentional.

All I am trying to say is that the responses

[no Content-Encoding]
and
Content-Encoding: gzip

from the same negotiated resource are two different variants in terms of
HTTP and must carry different ETag values, if any.

End.

The rest is just trying to get people to see this.

Apache mod_deflate does not do this when doing its dynamic content
negotiation driven transformations, and that is a bug (13.11 MUST) with
quite nasty implications for caching of negotiated responses (13.6).

The fact that responses with different Content-Encoding are meant to
result in the same object after decoding is pretty much irrelevant here.
They are two different, incompatible negotiated variants of the
resource, and that is all that matters.

I am also saying that the simple change of making mod_deflate transform
any existing ETag into a weak one is not sufficient to address this
properly, but it's quite likely to plaster over the problem for a while
in most uses, except when the original response ETag is already weak. It
will however break completely if Apache's GET If-None-Match processing
is changed to use the weak comparison as mandated by the RFC (13.3.3)
(to my best knowledge Apache always uses the strong function, but I may
be wrong there..).

Negotiation of Content-Encoding is really not any different from
negotiation of any of the other content properties such as
Content-Language or Content-Type. The same rules apply: each unique
outcome (variant) of the negotiation process needs to be assigned a
unique ETag with no overlaps between variants, and for strong ETags
each binary version of each variant needs to have a unique ETag with no
overlaps.
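One hypothetical way a transforming filter could derive per-variant ETags along these lines. The suffix scheme and its collision risk were debated earlier in this thread, so this is an illustration, not Apache's implementation:

```python
def variant_etag(original, encoding, deterministic=True):
    # Append an encoding marker inside the quoted value, e.g. "abc" -> "abc-gzip",
    # and downgrade to weak when the transform is not byte-deterministic.
    weak = original.startswith("W/")
    value = original[2:] if weak else original
    tagged = '%s-%s"' % (value.rstrip('"'), encoding)
    if weak or not deterministic:
        return "W/" + tagged
    return tagged
```

Each negotiation outcome gets a distinct value, and non-deterministic transforms lose the strong (byte-equivalence) promise.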

This ignores any out-of-band dynamic parameters to the negotiation
process, such as server load, which might affect responses to the same
request; I am only talking about negotiation based on request headers.
For out-of-band negotiation properties it's important to respect the
strong-ETag binary-equivalence requirements.


Note: Changed language to use the more proper term variant instead of
entity. Hopefully less confusing.

Regards
Henrik




Re: Proxying OPTIONS *

2007-10-01 Thread Henrik Nordstrom
On sön, 2007-09-30 at 16:54 -0700, Roy T. Fielding wrote:
 On Sep 30, 2007, at 4:05 PM, Nick Kew wrote:
 
  RFC2616 is clear that:
1.  OPTIONS * is allowed.
2.  OPTIONS can be proxied.
 
  However, it's not clear that OPTIONS * can be proxied,
  given that there's no natural URL representation of it (* != /*).
 
 An absolute http request-URI with no path.

In RFC2068 yes, but not RFC2616..

Regards
Henrik




Re: FakeBasicAuth changes

2007-09-27 Thread Henrik Nordstrom
On ons, 2007-09-26 at 18:06 +0200, Nick Gearls wrote:

 In the debug log, I can find:
Faking HTTP Basic Auth header: Authorization: Basic 
 L0M9QkUvU1Q9QmVsZ2l1bS9MPUJydXNzZWxzL089QXBwcm9hY2ggQmVsZ2l1bS9PVT1BcGFjaGUgdGVzdCBjZXJ0aWZpY2F0ZS9DTj0xMjcuMC4wLjE6cGFzc3dvcmQ=
 
 What is this header contents ? Isn't it supposed to be base64 ? I cannot 
 decode it.

It's base64. Decoding it gives

/C=BE/ST=Belgium/L=Brussels/O=Approach Belgium/OU=Apache test 
certificate/CN=127.0.0.1:password
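The decoding can be reproduced with Python's standard base64 module:

```python
import base64

# The token from the debug log quoted above.
token = "L0M9QkUvU1Q9QmVsZ2l1bS9MPUJydXNzZWxzL089QXBwcm9hY2ggQmVsZ2l1bS9PVT1BcGFjaGUgdGVzdCBjZXJ0aWZpY2F0ZS9DTj0xMjcuMC4wLjE6cGFzc3dvcmQ="
decoded = base64.b64decode(token).decode("ascii")
print(decoded)  # the certificate DN, ending in ":password"
```

The value decodes cleanly; it just isn't a `user:password` pair in the usual Basic-auth shape, but the full certificate DN followed by `:password`.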

Regards
Henrik




Re: Fixing protocol violations in mod_proxy

2007-09-27 Thread Henrik Nordstrom
On tor, 2007-09-27 at 14:08 +0100, Joe Orton wrote:

 From the name I'd presume these are testing a long chunk-extension, not 
 long chunks.  There is no 2616 requirement to handle arbitrarily long 
 chunk-extensions so it's a meaningless test, unless httpd is not failing 
 appropriately.  (the chunk-extension is an optional token which can be 
 passed after the chunk-size and is never used in practice)

Well, technically there is no bound on the size of the chunk extensions
in RFC2616 (same for almost all HTTP stuff, not only chunk extensions),
but yes..

Regards
Henrik




Re: OpenSSL compression (Windows)

2007-09-21 Thread Henrik Nordstrom
On fre, 2007-09-21 at 11:06 -0400, Tom Donovan wrote:

 Already-compressed data; like .jpg, .gif, .png, .zip, .tgz, .jar, and 
 any content filtered by mod_deflate are re-compressed.  This uses 
 non-trivial CPU cycles for no (or slightly negative) benefit.

Both yes and no. Unlike HTTP, SSL compression applies to the whole
datastream, including request and response headers, not only the object
body. So exchanges of small objects over a persistent connection are
likely to compress quite well even if the exchanged object as such is
already fully compressed.

But it may be a problem for large exchanges, or when KeepAlive is off..

Regards
Henrik




Re: new webaccel appliance

2007-09-19 Thread Henrik Nordstrom
On tis, 2007-09-18 at 22:41 +0200, Ruediger Pluem wrote:

 Agreed. Depending on the answers above we may need to have a list of headers
 (like Accept-Encoding) where we compare the tokens in the field-value.
 For all other headers we would stay with the plain compare we do today.
 See also the TODO comments in mod_disk_cache.c::regen_key.

Or you implement If-None-Match and forget about this. Except that Apache
mod_deflate is still broken and returns the wrong ETag (the same as for
the unencoded entity).. see bug #39727..

The separator is only one of many things which make the Vary-indicated
headers slightly different. You also have quality parameters, locales,
etc etc.

Regards
Henrik




Re: new webaccel appliance

2007-09-18 Thread Henrik Nordstrom
On tis, 2007-09-18 at 19:40 +0200, Roy T.Fielding wrote:

 Argued?  The space does not change the value of the field (which is
 a comma-separated list).  The question is really up to us as to how
 much effort we make to compare the values for equality, since the
 non-match just makes our cache slow and bulky.  Given the number
 of those browsers, we should special-case this comparison.

And there is also RFC2616 13.6 Caching Negotiated Responses which tells
you to use If-None-Match to avoid fetching multiple copies only because
of slight variations in Vary indicated request headers..

Regards
Henrik




mod_gzip and incorrect ETag response (Bug #39727)

2007-08-27 Thread Henrik Nordstrom
Just wondering if there are any plans on addressing Bug #39727, the
incorrect ETag on gzipped content (mod_deflate).

It's been pretty silent for a long while now, and the current
implementation is a clear violation of RFC2616 and makes a mess of any
shared cache trying to cache responses from mod_deflate-enabled Apache
servers (the same problem also exists in mod_gzip btw..)

For details about the problem this is causing see RFC2616 section 13.6;
pay specific attention to the part talking about the use of
If-None-Match and the implications when a server responds with the same
ETag for two different variants of the same resource.

There are already a couple of proposed solutions, but no consensus on
which is the better, or whether any of them is the proper way of
addressing the issue.

The problem touches
- ETag generation
- Module interface
- Conditionals processing when there are modules altering the content

Squid currently has a kind of workaround in place for the Apache
problem, but it relies on being able to detect broken Apache servers by
the presence of Apache in the Server: header, which isn't foolproof by
any means.

Regards
Henrik




Re: mod_gzip and incorrect ETag response (Bug #39727)

2007-08-27 Thread Henrik Nordstrom
On mån, 2007-08-27 at 13:09 -0400, Akins, Brian wrote:
 On 8/27/07 12:34 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 
  Hasn't the non-compressed variant become an extreme edge-case
  by now? I would certainly hope so.
   
 
 Unfortunately not.  About 30% of our requests do not advertise gzip
 support..

And MSIE deserves a special mention here.. if MSIE is using a proxy and
NOT configured to Use HTTP/1.1 via proxies then it will not advertise
gzip support, and what is worse, it will not understand a gzipped
response at all should one be given to it.. seen at least in MSIE6; I
have not tested 7.

Regards
Henrik




Re: mod_gzip and incorrect ETag response (Bug #39727)

2007-08-27 Thread Henrik Nordstrom
On mån, 2007-08-27 at 22:00 +0200, Ruediger Pluem wrote:

 But without an adjusted conditional checking this leads to a failure
 of conditional requests. And I currently do not see how we can adjust
 ap_meets_conditions. As I understand 13.3.3 of RFC2616 the DEFLATE_OUT
 filter transforms a possible strong ETag of the response it filters
 into a weak ETag. So shouldn't we simply transform a strong ETag into
 a weak one?

It can still be a strong one provided the server always ends up with the
same gzip encoding (which it should, when using gzip.. at least unless
zlib is upgraded..)

Regards
Henrik




Re: [PATCH]: mod_cache: don't store headers that will never be used

2007-07-29 Thread Henrik Nordstrom
On sön, 2007-07-29 at 20:34 +0200, Graham Leggett wrote:
 Niklas Edmundsson wrote:
 
  The solution is to NOT rewrite the on-disk headers when the following 
  conditions are true:
  - The body is NOT stale (ie. HTTP_NOT_MODIFIED when revalidating)
  - The on-disk header hasn't expired.
  - The request has max-age=0
  
  This is perfectly OK with RFC2616 10.3.5 and does NOT break anything.
 
  From 10.3.5: If a cache uses a received 304 response to update a cache 
 entry, the cache MUST update the entry to reflect any new field values 
 given in the response. This sinks this, unless I am misunderstanding 
 something.

Is anything about the cache updated when the headers are not rewritten,
making any difference in headers or freshness on the next request?

It's perfectly fine for the cache to completely ignore 304 responses if
you like, except for the small corner case where the 304 for some reason
indicates a different object than expected. If the cached object is
still very fresh there is not much use in updating the cache only
because you can.

Regards
Henrik



Re: [Issue] External links @ the wiki, aka pagechange wars

2007-05-30 Thread Henrik Nordstrom
ons 2007-05-30 klockan 21:39 +0100 skrev Nick Kew:

 It then proceeds to list HTTP status codes, and gives an errordocument
 for each one.  Unfortunately a number of them are bogus gibberish.

It's the gibberish Apache emits if you shoot yourself in the foot using
Redirect. Garbage in, garbage out.

Regards
Henrik




Re: mod_cache: Don't update when req max-age=0?

2007-05-24 Thread Henrik Nordstrom
tor 2007-05-24 klockan 13:22 +0200 skrev Niklas Edmundsson:

 c) RFC-wise it seems to me that a not-modified object is a
 not-modified object. There is no guarantee that next request will
 hit the same cache, so nothing can expect a max-age=0 request to
 force a cache to rewrite its headers and then access it with
 max-age!=0 and get headers of that age.

Yes. RFC-wise it's fine to not update the cache with the 304. Updating
of cached entries is optional (RFC2616 10.3.5, last paragraph).

The only MUST regarding 304 and caches is that you MUST ignore the 304
and retry the request without the conditional if the 304 indicates
another object than what is currently cached (i.e. the ETag or
Last-Modified differs). (Same section, the paragraph above.)

Regards
Henrik




Re: mod_cache: Don't update when req max-age=0?

2007-05-22 Thread Henrik Nordstrom
tis 2007-05-22 klockan 11:40 +0200 skrev Niklas Edmundsson:

 -8---
 Does anybody see a problem with changing mod_cache to not update the 
 stored headers when the request has max-age=0, the body turns out not 
 to be stale and the on-disk header hasn't expired?
 -8---

My understanding:

It's fine from an RFC point of view for the cache to completely ignore a
304 and not update the stored entity at all. But the response to this
request should be the merge of the two responses, assuming the
conditional was added by the cache.

Regards
Henrik




Re: [PATCH] mod_cache 304 on HEAD (bug 41230)

2007-04-16 Thread Henrik Nordstrom
mån 2007-04-16 klockan 22:58 +0200 skrev Ruediger Pluem:

 My first question in this situation is: What is the correct thing to do here?
 Generate the response from the cache (of course with the updated headers from 
 the 304
 backend response) and delete the cache entry afterwards?

My understanding (regarding no-store and cache updates from 304
responses):

The response you send, if the client request was unconditional, SHOULD
be the merged response of the old response and the entity headers from
the 304.

But you do not need to delete the already cached response in the absence
of no-store if you do not want to (a change of CC is not an invalidation
criterion), but you MUST NOT store the updated headers from a no-store
exchange (no-store on either the response or the request).
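A minimal sketch of the "merged response" idea, with plain dicts standing in for header tables (hop-by-hop header stripping omitted):

```python
def merge_304(cached_headers, headers_from_304):
    # New field values from the 304 win; everything else carries over from
    # the stored response (RFC 2616 10.3.5).
    merged = dict(cached_headers)
    merged.update(headers_from_304)
    return merged
```

Whether the merged headers may then be written back to the cache is a separate question, governed by no-store as described above.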

Regards
Henrik




Re: mod_ftp named virtual hosts?

2007-04-11 Thread Henrik Nordstrom
ons 2007-04-11 klockan 10:46 -0500 skrev William A. Rowe, Jr.:

 Firefox is fine with...
 
 ftp://[EMAIL PROTECTED]:[EMAIL PROTECTED]/
 
 but it's odd enough I wouldn't trust that to be consistently supported,
 and you raise a good point with proxy/firewalls.

The above isn't a correctly formed URL. RFC-wise it MUST be

ftp://me%40myhost:[EMAIL PROTECTED]/

which resolves the ambiguity, but is perhaps even less intelligible to
the user.
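Percent-encoding the userinfo as required can be done with the standard library (the host and password here are placeholders):

```python
from urllib.parse import quote

user = "me@myhost"               # username with an embedded "@"
userinfo = quote(user, safe="")  # "@" becomes "%40"
url = "ftp://%s:PASSWORD@ftp.example.com/" % userinfo
```

After encoding, the only literal "@" left separates the userinfo from the host, so the URL parses unambiguously.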

 So if, for example, the admin wanted to define  as the alternative
 separator, ftp://memyhost:[EMAIL PROTECTED]/ would be a little less ambiguous
 to browser-style schemas.

Sounds reasonable. Except that it's quite impractical to use in HTML
coding, and very many applications (and users) rendering data into HTML
will get it wrong..

Note: how most browser user agents implement the ftp:// URI scheme in
general is quite far outside the standards, so it's not easy to know
what will happen when trying something other than plain anonymous ftp
with non-problematic characters, or even file paths..

Regards
Henrik




Re: Chunked transfer encoding on responses.

2007-04-08 Thread Henrik Nordstrom
lör 2007-04-07 klockan 04:00 -0500 skrev William A. Rowe, Jr.:

 Of course this person is entirely wrong if the client doesn't
 Accept-Encoding: chunked
 
 which is exactly the logic we test.

So why is there a dependency on keep-alive being enabled?

Regards
Henrik




Re: Redundant SSL virtual host warnings?

2007-04-08 Thread Henrik Nordstrom
sön 2007-04-08 klockan 18:48 +0100 skrev Jay L. T. Cornwall:

 So the part I'm leading up to is: how about a way to turn off these
 warnings? Or perhaps a simple certificate analysis to see if the
 wildcard matches all the virtual hosts for which it serves?

Sounds good to me. 

Related to this, in current versions of TLS the client MAY advertise
which host it desires to be connected to, which would also require
this if implemented in Apache mod_ssl. (The server_name hello extension
defined in RFC4366 section 3.1.)

Regards
Henrik




Re: Chunked transfer encoding on responses.

2007-04-07 Thread Henrik Nordstrom
lör 2007-04-07 klockan 09:18 +0200 skrev André Malo:

 Hmm, you may get something wrong here. The httpd does apply chunked encoding 
 automatically when it needs to. That is in keep-alive situations without 
 given or determineable Content-Length.
 
 Why doesn't it do it in all other cases? My answer is: because it would be 
 useless (as in: not of any use :-).

I don't agree fully here. chunked is not useless in the non-keepalive
case. What it adds there compared to the HTTP/1.0 method of just closing
the connection is error detection. A receiver seeing the connection
closed before the final EOF chunk knows something went wrong and the
response is not complete. If chunked is not used the receiver usually
cannot tell that there was a problem.
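The truncation detection described here can be sketched in a few lines. This is an illustrative decoder (not Apache code, trailers ignored) that treats a stream ending before the final 0-chunk as an error:

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (trailers ignored in this sketch).

    Raises ValueError when the stream ends before the final 0-chunk,
    which is exactly the truncation a plain HTTP/1.0 close-delimited
    response cannot signal."""
    body = b""
    pos = 0
    while True:
        eol = data.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("truncated: chunk-size line missing")
        size = int(data[pos:eol].split(b";")[0], 16)  # chunk-size [;ext]
        pos = eol + 2
        if size == 0:
            return body  # final EOF chunk seen: response is complete
        chunk = data[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated: connection closed mid-chunk")
        body += chunk
        pos += size + 2  # skip the CRLF terminating the chunk data
```

A complete body such as `b"5\r\nhello\r\n0\r\n\r\n"` decodes cleanly, while a connection closed mid-chunk raises instead of silently returning a partial body.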

Regards
Henrik




Re: mod_ftp named virtual hosts?

2007-04-06 Thread Henrik Nordstrom
On Fri, 2007-04-06 at 21:37 +0100, Nick Kew wrote:

  What about modifying mod_ftp USER directive to accept username in the
  format of [EMAIL PROTECTED], and tokenize user as the username, host as the
  http-ish Host: virtual host name?
 
 Sounds fair, provided the protocol doesn't assign some (different)
 semantics to that.

FTP as such doesn't assign any semantics to the syntax of usernames, but
very many FTP firewalls/proxies do...

The proposed [EMAIL PROTECTED] is in fact the most common FTP proxy method,
meaning "connect as user on host", so logging in with a [EMAIL PROTECTED]
style login via such a proxy may be a little awkward, if it works at all..

Regards
Henrik




Re: Reverse proxy mode and DAV protocol

2007-04-04 Thread Henrik Nordstrom
On Wed, 2007-04-04 at 13:12 +0200, Julian Reschke wrote:

 What I meant by reason was the fact that the Destination header (and 
 some aspects of the If header) require absolute URIs, which is 
 problematic when there's a reverse proxy in the transmission path. All 
 the issues around to rewrite or not to rewrite headers go away once 
 these headers use absolute paths (well, as long as the reverse proxy 
 doesn't also rewrite paths, but I would claim that this is nearly 
 impossible to get right with WebDAV).

Rewriting is nearly impossible to get right. Even when limited to
simple rewrites of just the host component.

There are many aspects of HTTP and HTTP applications which depend on the
URIs being known on both sides (client and server), and the more
rewrites you do the higher the risk that some of these are overlooked
and things start to break.

Most reverse proxies fail even the simple host:port based rewrites,
forgetting or wrongly mapping the internal URIs in some random headers
(i.e. Location, Content-Location, Destination, etc) or generated
response entities. 

WebDAV in particular has a lot of URIs embedded in generated XML
request/response entities.



If you do rewrite then you better make sure you have a clear view of how
the backend URIs should be mapped back to external URIs, and make sure
your rewrites are applied to the complete HTTP messages, headers and
body, both requests and responses, and not just the request line.
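As a rough illustration of the kind of mapping that has to be applied consistently, here is a minimal sketch; the backend and public URLs are made up, and a real proxy would also have to rewrite request headers and URIs embedded in entity bodies (e.g. WebDAV XML):

```python
# Hypothetical mapping for one reverse-proxied backend (both URLs made up).
INTERNAL = "http://backend.local:8080/app/"
EXTERNAL = "https://www.example.com/"

# URL-bearing headers often forgotten, as mentioned above.
URL_HEADERS = ("Location", "Content-Location", "Destination")

def rewrite_response_headers(headers: dict) -> dict:
    """Map backend URLs in URL-bearing headers back to the published URLs.
    This is only the easy part; bodies and request headers need the same
    treatment for the mapping to be complete."""
    out = {}
    for name, value in headers.items():
        if name in URL_HEADERS and value.startswith(INTERNAL):
            value = EXTERNAL + value[len(INTERNAL):]
        out[name] = value
    return out

rewritten = rewrite_response_headers({
    "Location": "http://backend.local:8080/app/login",
    "Content-Type": "text/html",
})
```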

Regards
Henrik




Re: Limiting response body length

2007-02-12 Thread Henrik Nordstrom
On Mon, 2007-02-12 at 12:41 +0200, Dziugas Baltrunas wrote:

 To illustrate, squid for this purpose has reply_body_max_size [1]
 parameter. Looks like it is only Content-Length response header (if
 any) dependent,

It also terminates requests when the amount of data transferred hits the
specified limit if not known ahead by content-length.

Regards
Henrik




Re: Limiting response body length

2007-02-12 Thread Henrik Nordstrom
On Mon, 2007-02-12 at 17:51 +, Nick Kew wrote:

 2. Where there's chunked encoding, the check would best be
 implemented in the chunking filter.
 
 3. A simple count/abort filter is then a last resort.
 And it won't be able to tell the client what's happened,
 because the header has already been sent (unless it
 buffers the entire response, which is horribly inefficient).

Why differentiate between 2 and 3? What's the benefit of doing it in the
chunking filter? Just to avoid having yet another filter in the chain,
or something besides that?

Regards
Henrik




Re: Limiting response body length

2007-02-12 Thread Henrik Nordstrom
On Mon, 2007-02-12 at 21:55 +, Nick Kew wrote:

 Because the chunking filter is equipped to discard the chunk that
 takes it over the limit, and substitute end-of-chunking.
 If we do that in a new filter, we have to reinvent that wheel.

Not sure substituting end-of-chunking is a reasonable thing here. It's an
abort condition, not an EOF condition. Imho you'd better abort the flow,
that way telling the client that the request failed instead of silently
truncating the response.

But yes, the earlier you know the limit is going to be hit the better.
Just not sure you will find many cases where the chunk size is so large
that it really makes a difference, but I may be wrong..

Regards
Henrik




Re: Regarding graceful restart

2007-02-09 Thread Henrik Nordstrom
On Thu, 2007-02-08 at 17:15 -0800, Devi Krishna wrote:
 Hi, 
 
  Resending this mail, just in case anyone would have
 suggestions/inputs as how to fix this for connections that are in the
 ESTABLISHED state or FIN state or any other TCP state other than
 LISTEN

Maybe change the wake up call to just connect briefly without actually
sending a full HTTP request? This should be sufficient to wake up any
processes sleeping in accept() and will not cause anything to get
processed..

But I am not sure I understand the original problem so I may be
completely off here..

Regards
Henrik




Re: Regarding graceful restart

2007-02-09 Thread Henrik Nordstrom
On Fri, 2007-02-09 at 18:34 +0100, Plüm, Rüdiger, VF EITO wrote:

 Not if BSD accept filters are in place. In this case the kernel waits until it
 sees a HTTP request until it wakes up the process.
 And on Linux with TCP_DEFER_ACCEPT enabled you need to sent a least one byte 
 of data.

So send two blank lines. Should satisfy both..
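The suggested wake-up could look something like this sketch (host and port are placeholders; whether two bare CRLFs satisfy a given BSD accept filter depends on the filter in use, so this is an assumption, not a tested httpd recipe):

```python
import socket

def wake_listener(host: str, port: int) -> None:
    """Connect, send two blank lines, and close.  The bytes are enough
    for TCP_DEFER_ACCEPT (which only waits for some data); httpd skips
    empty leading CRLFs and then sees EOF before any request line, so
    nothing actually gets processed."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"\r\n\r\n")
```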

Regards
Henrik




Re: Mod_cache expires check

2007-01-18 Thread Henrik Nordstrom
On Thu, 2007-01-18 at 12:05 +0100, Plüm, Rüdiger, VF EITO wrote:

 Just curious: Is the Unix epoch an invalid date in the Expires header
 (as this is in the past it does not really matter for the question whether
 this document is expired or not as it would be in both cases)?

The RFC does not care for the UNIX epoch.

Any valid date which can be represented in the textual form is a valid
Expires header. And any Expires header you can not understand for
whatever reason is already expired in the past.

To solve this there is Cache-Control max-age which is not sensitive to
the limits of your internal time representation, and which overrides
Expires if present.
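That precedence can be sketched as follows; `parse_http_date` is a local helper defined here (not an APR function), and times are Unix seconds:

```python
from email.utils import parsedate_to_datetime

def parse_http_date(value):
    """Local helper: parse an HTTP date, returning Unix seconds or None."""
    try:
        return int(parsedate_to_datetime(value).timestamp())
    except (TypeError, ValueError):
        return None

def freshness_lifetime(headers, now):
    """Remaining freshness in seconds: Cache-Control max-age overrides
    Expires, and an unparseable Expires means 'already expired'
    (RFC 2616 14.21)."""
    for part in headers.get("Cache-Control", "").split(","):
        part = part.strip()
        if part.startswith("max-age="):
            return int(part[len("max-age="):])
    if "Expires" in headers:
        expires = parse_http_date(headers["Expires"])
        if expires is None:
            return 0  # invalid Expires: treat as already expired
        return max(0, expires - now)
    return 0
```

Note how max-age wins even when Expires is present and parseable, and how a garbage Expires value simply yields zero freshness.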

Regards
Henrik




Re: Mod_cache expires check

2007-01-17 Thread Henrik Nordstrom
On Mon, 2007-01-15 at 13:56 +0100, Bart van der Schans wrote:
 In r463496 the following check was added to mod_cache.c :
 
 else if (exp != APR_DATE_BAD && exp < r->request_time)
 {
 /* if a Expires header is in the past, don't cache it */
 reason = "Expires header already expired, not cacheable";
 }
 
 This check fails to correctly identify the expires header Thu, 01 Jan
 1970 00:00:00 GMT. The apr_date_parse_http(exps) function returns
 (apr_time_t)0 which is equal to APR_DATE_BAD, but it should recognize it
 as an already expired header. Is there a way to distinct between
 APR_DATE_BAD and the unix epoch? Or is that considered a bad date?

Well.. all bad dates should be considered expired.. If there is an
Expires header and the value could not be understood properly then the
object is by HTTP/1.1 already expired. RFC 2616 14.21.

Regards
Henrik




Re: Add 2.2.4 to bugzilla

2007-01-12 Thread Henrik Nordstrom
On Sat, 2007-01-13 at 01:06 +0100, Ruediger Pluem wrote:

 This could be modified to:
 
 1. Fix on trunk = Change state in Resolved, fixed and add a comment with 
 revision
of fix.
 2. Proposed for backport = Leave state in Resolved, fixed and add a 
 comment with
revision of backport proposal (STATUS file)
 3. Backported = Change state to Closed and add a comment with revision of
backport.

Another alternative is ignoring Bugzilla for backport status, only using
the STATUS file with references to Bugzilla entries needing to get
backported to the release.

I.e. something like

1. Fix on trunk - Resolved, Fixed. Reference to revision in bugzilla
(preferably automatic)

2. Audited by a release maintainer for the main release to judge if
backport needed, added to STATUS file if backport needed. - Closed.

3. Backported - cleared in STATUS file. Reference to revision in
Bugzilla (preferably automatic).


Another alternative which is more in line with normal release management
is using the target milestone feature builtin to Bugzilla.

1. Fix on trunk - Resolved, Fixed. Reference to revision in bugzilla
(preferably automatic). No target milestone assigned yet.

2. Audited by release maintainer for the main release. If backport
needed added to STATUS and - New with target milestone of the release.
Else closed.

3. Backported. - Resolved, Fixed.

4. Audited by the next older release maintainer if any, as in 2. Repeat
until all maintained releases have been covered.

The beauty of the above is that it's easy to query Bugzilla for the list
of bugs in various states, and that it pans out quite well when you keep
maintaining older releases. Same process all the way.


 From my personal point of view I think it is important to add the revision 
 number
 of the fix / backport to the comment because:
 
 1. People who are interested / have the know how can easily cross check what 
 has been
changed.
 2. People who only want a specific fix either because there is no newer 
 stable version,
or because they cannot do an upgrade to a later stable version for 
 whatever reason
can easily find the needed patch.

Note: If you properly reference Bugzilla entries in the changelog
messages when committing changes then there are automated tools which can
both resolve Bugzilla entries and add references.. Could save you some
headache but requires a bit of setup..

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-13 Thread Henrik Nordstrom
On Wed, 2006-12-13 at 08:51 -0500, Brian Akins wrote:

 However, on an initial request (ie, non-conditional) we do not have an etag 
 from 
 the client, we only have info like Host, URI, Accept-*, etc.  So, how would 
 the 
 cache identify which entity to serve in this case?

You have the URL and the other cached entities of that URL. It does
not matter if the client request was conditional or not. The
conditions in the request apply to the response, deciding if it should
be a 200 or 304; they are not selectors for which entity to respond
with. The selected response entity is always the same for the same
request, with or without conditions.

Obviously on the very first request for a given URL you have nothing,
and that request is forwarded without any added condition. However,
after that every Vary cache miss on that URL is an If-None-Match
conditional asking the server if any of the cached entity variants is
applicable for the current request.

 I have read it many times.. In our case - cnn.com, etc. - we have to decided 
 to 
 be RFC compliant from the client to the cache server.  From the cache to 
 the 
 origin, however, we are not as concerned.

And you are free to. A reverse proxy is by definition the origin server.
How it finds the content is of no concern to the RFC, just happens to be
HTTP and not plain files, NFS, database or whatever.

 In a reverse-proxy-cache, this is not 
 a big deal. However, in a normal forward-proxy-cache, where one does not 
 control both cache and origin, one must be more careful.

Indeed.

But on the other hand it's actually reverse proxy configurations which
have pushed for 13.6 compliance in Squid, as it's a lot easier for
processing intensive servers to evaluate If-None-Match than to render
the entity again, and when you depend on Accept-Language +
Accept-Encoding + User-Agent the number of request combinations becomes
quite significant, especially if there are maybe only two or three
variants under the URL.

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-12 Thread Henrik Nordstrom
On Tue, 2006-12-12 at 09:20 -0500, Brian Akins wrote:

 Only conditional requests from clients, generally, have If-None-Match 
 headers. 

Correct. It's a conditional. These days you also see them from Squid
btw.

 So the only way for a cache, on an initial request from a client, to 
 determine 
 what object to serve is to use the Client supplied information - which 
 doesn't 
 include an Etag, so you have to, usually, rely on URI first, and then the 
 Vary 
 information.

Indeed. This is always the case. If-None-Match MUST NOT be used for
identification of which response to use. It's a conditional only.

But the unique identity of the response entity is defined by request-URI
+ ETag and/or Content-Location. The cache is not supposed to evaluate
Accept-* headers in determining the entity identity, only the origin
server.

The identity of the entity is important for

- Cache correctness, making sure updates invalidate cached copies where
needed.

- Avoiding duplicated storage

There may be any number of request header combinations in any Vary
dimensions all mapping to the same entity.

This logic is not at all unique to Accept-Encoding. The logic for how
a cache is supposed to operate applies equally to all Vary indicated
headers. The specs do not make any distinction between
Accept-Encoding, Accept-Language, User-Agent etc in how caches are
supposed to operate. It all boils down to the entity identified by URI +
ETag and/or Content-Location as returned in 200 and 304 responses
allowing the cache to map requests to entities.

Please see RFC2616 13.6 Caching Negotiated Responses, which explains in
full detail how the RFC intends caches to operate wrt Vary, ETag and
Content-Location.
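The 13.6 machinery referred to here can be sketched as a toy variant store: variants are kept per URL keyed by ETag, revalidation offers every known ETag, and the 304's ETag names the variant (illustrative only, not Squid or mod_cache code):

```python
class VariantCache:
    """Toy RFC 2616 13.6 cache: variants of one URL stored by ETag; the
    origin's 304 tells us which stored entity answers the request."""

    def __init__(self):
        self.variants = {}  # url -> {etag: body}

    def store(self, url, etag, body):
        self.variants.setdefault(url, {})[etag] = body

    def revalidation_headers(self, url):
        """On a Vary miss, offer every cached variant via If-None-Match."""
        etags = self.variants.get(url, {})
        return {"If-None-Match": ", ".join(etags)} if etags else {}

    def on_304(self, url, etag):
        """The 304's ETag names the variant valid for this request."""
        return self.variants[url][etag]

cache = VariantCache()
cache.store("/page", '"abc-gzip"', b"<gzipped bytes>")
cache.store("/page", '"abc"', b"<identity bytes>")
```

The cache never evaluates Accept-* itself; it only relays the stored ETags and lets the origin's negotiation pick one.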

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 15:35 -0800, Justin Erenkrantz wrote:

 As Kevin mentioned, Squid is only using the ETag and is ignoring the
 Vary header.  That's the crux of the broken behavior on their part.
 If they want to point out minor RFC violations in Apache, then we can
 play that game as well.  (mod_cache deals with this Vary/ETag case
 just fine, FWIW.)

We are not at all ignoring Vary, but we are using If-None-Match to ask
the server which one of the N already cached entities belonging to the
resource URI is valid for this specific request, indirectly learning the
server side content negotiation logic used.

 The compromise I'd be willing to accept is to have mod_deflate support
 the 'TE: gzip' request header and add 'gzip' to the Transfer-Encoding
 bit - and to prefer that over any Accept-Encoding bits that are sent.

Would be a great move if you can not make it behave correctly in the
content space.

But if you make mod_deflate behave according to the RFC then sending
Content-Encoding: gzip is fine to me. But TE is a much better fit from
the RFC point of view.

 The ETag can clearly remain the same in that case - even as a strong
 ETag.

Yes.

 So, Squid can change to send along TE: gzip (if it isn't
 already).

TE: gzip is likely to appear in 3.1.

 And, everyone else who sends Accept-Encoding gets the
 result in a way that doesn't pooch their cache if they try to do a
 later conditional request.

As long as mod_deflate continues ignoring the RFC wrt ETag there will
be conflicts with various cache implementations.

 Is that acceptable?  -- justin

Intentionally not following a MUST level requirement in the RFC is not
an acceptable solution in my eyes. For one thing even if you ignore
everyone else it would make it impossible for Apache + mod_deflate to
claim RFC 2616 HTTP/1.1 compliance.

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Sat, 2006-12-09 at 15:23 +0100, Justin Erenkrantz wrote:

 See the problem here is that you have to teach ap_meets_conditions()
 about this.  An ETag of 1234-gzip needs to also satisfy a
 conditional request when the ETag when ap_meets_conditions() is run is
 1234.  In other words, ap_meets_conditions() also needs to strip
 -gzip if it is present before it does the ETag comparison.  But, the
 issue is that there is no real way for us to implement this without a
 butt-ugly hack.

Be careful there.. Blindly stripping the decoration alone won't work
out. Consider for example If-None-Match. Specifically, If-None-Match
with the ETag of the gzip variant should only return 304 if the request
would cause Apache to send the gzipped variant of the entity.

If-None-Match: list of etags

returns 304 with the single correct ETag if any of the ETags in the
directive matches the current response to the current request.
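A sketch of the careful matching being asked for, assuming the hypothetical "-gzip" decoration: the comparison is done against the ETag of the entity that *would actually be sent* for this request, so the decoration is never stripped blindly:

```python
def strip_weak(etag: str) -> str:
    return etag[2:] if etag.startswith("W/") else etag

def matches_if_none_match(header_value: str, response_etag: str) -> bool:
    """Match If-None-Match against the ETag of the entity that WOULD be
    sent for this request.  Passing '"1234-gzip"' here only when the
    gzipped variant would actually be chosen is what keeps a client
    holding the gzip ETag from getting a 304 for an identity response."""
    targets = [strip_weak(t.strip()) for t in header_value.split(",")]
    return "*" in targets or strip_weak(response_etag) in targets

# A gzip-capable request whose response would carry the decorated ETag:
would_304 = matches_if_none_match('"1234-gzip"', '"1234-gzip"')
# Same stored ETag, but the identity variant would be sent -> no 304:
no_304 = matches_if_none_match('"1234-gzip"', '"1234"')
```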


Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Sat, 2006-12-09 at 19:02 +0100, Justin Erenkrantz wrote:

 AIUI, many caches do not allow the response to be cached at all if it
 doesn't have an ETag.

Most still cache it, but for example Mozilla has bugs wrt Vary handling
if there is no ETag and the conditions change..

 In the ideal world, I think a weak ETag would be the 'right' thing

I don't have an opinion if you return a strong or weak ETag, but it must
still be different than the ETag of the identity encoded object, not
just the same ETag flagged as weak.

Your main decision if the ETag on the mod_deflate generated entity
should be weak or strong should be

a) If the original entity's ETag is weak, then the mod_deflate generated
one MUST be weak as well..

b) If mod_deflate can not be trusted to generate the exact same octet
representation on each request then the ETag of the generated entity
MUST be weak.

Else the ETag SHOULD be strong.
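Rules a) and b) could be condensed into something like this sketch; the "-gzip" value decoration is one illustrative choice, not mandated by anything:

```python
def deflate_etag(original_etag: str, deterministic_output: bool) -> str:
    """Derive an ETag for the gzip-encoded entity per the rules above:
    decorate the value so it differs from the identity ETag, and keep it
    strong only if the original was strong AND the compressed octets are
    reproducible on every request."""
    weak = original_etag.startswith("W/")
    value = original_etag[2:] if weak else original_etag   # e.g. '"abc"'
    value = value[:-1] + '-gzip"'                          # e.g. '"abc-gzip"'
    if weak or not deterministic_output:
        return "W/" + value
    return value
```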

 however, the current spec doesn't allow conditional GETs to work with
 weak ETags.

Err.. Weak ETags are allowed in If-None-Match for GET/HEAD.

Regards
Henrik






Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 15:40 -0800, Justin Erenkrantz wrote:

 I think we all (hopefully) agree that a weak ETag is ideally what
 mod_deflate should add.

Please read RFC2616 13.6 Caching Negotiated Responses for an in-depth
description of how caches should handle Vary. And please stop lying
about Squid. If you think something in our cache implementation of
Vary/ETag is not right then say what and back it up with an RFC reference.

My base requirement is that you comply with If-None-Match. For this you
MUST return a different ETag. It does not matter to me if it's weak or
strong as the main concerns for a cache is GET/HEAD requests. Flagging
the existing ETag as weak does not make it a different ETag as
If-None-Match on GET/HEAD allows for the weak comparison function where
weakness is ignored.

13.3.3 Weak and Strong Validators

  - The weak comparison function: in order to be considered equal,
both validators MUST be identical in every way, but either or
both of them MAY be tagged as weak without affecting the
result.
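The two comparison functions from 13.3.3 are small enough to write out as a sketch:

```python
def strip_weak(etag: str) -> str:
    return etag[2:] if etag.startswith("W/") else etag

def strong_compare(a: str, b: str) -> bool:
    """Strong comparison: identical, and neither validator tagged weak."""
    return a == b and not a.startswith("W/") and not b.startswith("W/")

def weak_compare(a: str, b: str) -> bool:
    """Weak comparison: identical once any W/ prefix is ignored."""
    return strip_weak(a) == strip_weak(b)
```

This is why merely flagging the same ETag value as weak does not make it a different validator under the weak comparison used for GET/HEAD.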


Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Sat, 2006-12-09 at 05:44 -0500, [EMAIL PROTECTED] wrote:

 It's relevant to the extent that I think there are still some things
 missing from the RFCs with regards to all this which is why a piece
 of software like SQUID might be doing the wrong thing as well.

After reading the RFC on this topic many many times I can not agree that
it's that incomplete.

The scheme set by the RFC is quite complete as long as you stay with
strong ETags, allowing for cache correctness, update serialization and
many good things.

Situations requiring weak ETags also work out pretty well for cache
correctness thanks to If-None-Match, but not other operations, as weak
ETags are banned from both non-GET/HEAD requests and If-Match conditions.
  
 ...and, currently, if the cache has stored both a compressed and
 and non-compressed version of the same entity received from Apache
 ( sic: mod_deflate ) then the same ( strong ) ETag is returned
 in the conditional GET for both of the cached variants.
  
 Hmmm... begins to look like a problem... but is it really?... 

It is.

See 13.6 Caching Negotiated Responses (all of it). And then skim over
14.26 If-None-Match, and finally read 10.3.5 304 Not Modified. Then
piece them together.

Also take note that nowhere is there any requirement on the cache to
evaluate any server driven content negotiation inputs (Accept-XXX etc).
This responsibility is fully at the origin server and reflected back via
ETag.

Caches evaluate Vary in finding the correct response entity.

  If the server says that any one of the representations,
  as indicated by the ETag in a 304 response, is okay, 
  
 okay means fresh.

Not only that, it also tells which entity among the N cached ones is
valid to send as response to this request.

 happen to share the same (strong) ETag... if SQUID is delivering
 stale compressed variants when a 304 response says that the
 original identity variant is not fresh then that's just
 a colossal screw-up in the caching code itself.

The 304 says

Send the entity with the ETag XXX, it's still fresh. Nothing more. It
does not indicate if this is identity or gzip encoded, nor the
content length, content type or anything else relevant to the actual
content besides the ETag and/or Content-Location.
 
 Regardless of what the server says... how could you ever get
 into a situation where you would consider a compressed variant
 of an entity fresh when the identity version is now stale? 

As HTTP did not consider dynamic content encoding it sees the two
entities as different objects (i.e. file and file.gz) and does not
enforce a strict synchronization between the two. The only requirement
set in the RFC is that the origin server SHOULD make sure the two
representations on the server are in sync.

 is seriously confused even if the ETags are the same and the
 cache is sending back stale compressed variants when the
 identity variant ( strong ETag value ) is also stale. 

I don't know what condition you refer to here. The Squid cache (2.6)
only remembers the last seen of the two, as the later response with the
same ETag overwrites the first..

 There's still something missing from the specs or something.

Not that I can tell.
 
 When an exact, literal interpretation of a spec tends to 
 defy common sense... my instinct is to suspect the spec itself.

In what way? There is something in your reasoning I don't get.
  
 DCE ( Dynamic Content Encoding ) is a valid concept even if it
 wasn't sufficiently imagined at the time the specs were
 codified. It works. It works WELL... and it is something that
 OUGHT to always be possible if the RFCs mean anything at all.

And it is possible. Just that you need to pay attention to

  Content-Location
  ETag
  Content-MD5

as all of these are affected by dynamically altering the entity by server
driven content negotiation with static or dynamic recoding of the
entity.

 One of the main prime directives for developing Apache 2.0
 at all was to finally re-org the IO stream so that schemes
 like DCE could be done more easily than were already being
 done in the 1.3.x framework. Mission was accomplished.
 Filtering was born. It would be a shame to consider abandoning
 one of the very concepts that gave birth to Apache 2.0 for 
 the sake of a few more lines of code that could take it
 into the end zone.

Agreed.
 
 No argument here. Transfer-encoding is about a DECADE overdue now.

And as already indicated it should be a piece of cake to add to mod_deflate,
and as HTTP support evolves in clients and caches it is likely to lessen
the complexity of dealing with mod_deflate and conditionals
considerably.
 
 In the case of compressed entities it would still be a good idea
 to always add a standard header which indicates the original
 uncompressed content-length ( if it's possible to know it ).

There is no such header in HTTP, but you are free to propose one. But
it's worth noting that this information also exists in the gzip
encoding.
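For what it's worth, the gzip format (RFC 1952) stores the uncompressed length in the ISIZE trailer field: the last four bytes of the stream, little-endian, modulo 2**32:

```python
import gzip
import struct

def gzip_uncompressed_size(stream: bytes) -> int:
    """ISIZE: last 4 bytes of a gzip stream, little-endian, holding the
    uncompressed length modulo 2**32 (RFC 1952)."""
    return struct.unpack("<I", stream[-4:])[0]

size = gzip_uncompressed_size(gzip.compress(b"x" * 1000))  # 1000
```

So a recipient of a gzip Content-Encoding can recover the original length from the stream itself, without any extra header.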

Current specs does not handle 

Re: Wrong etag sent with mod_deflate

2006-12-09 Thread Henrik Nordstrom
On Sat, 2006-12-09 at 20:38 -0500, [EMAIL PROTECTED] wrote:

 If you are referring to Justin quoting ME let me supply a big
 fat MEA CULPA here and say right now that I haven't looked
 at the SQUID Vary/ETag code since the last major release
 and I DO NOT KNOW FOR SURE what SQUID is doing ( or
 not doing ) if/when it sees the same (strong) ETag for both
 a compressed and an identity version of the same entity.

That's not the problem. The problem is that Apache tells us that we
should use whatever we got first on all subsequent responses.

The chain of events leading to the problem is as follows:

1. We forward request A. Let's say this claims Accept-Encoding: gzip.

2. Apache mod_deflate returns a gzipped entity with ETag
6bf1f7-6-1b6d6340 and Vary: Accept-Encoding.

3. We get another request with a different Accept-Encoding value. This
gets forwarded to Apache with an If-None-Match header telling the ETags
of the entities we have, i.e. If-None-Match 6bf1f7-6-1b6d6340.

4. The entity hasn't changed and Apache responds with a 304 ETag
6bf1f7-6-1b6d6340 telling us that the valid response entity for this
request is the previous received response with ETag 6bf1f7-6-1b6d6340,
and any updated HTTP headers for that response.

The problem arises in '4'.
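The four steps can be traced with a toy model to see where it goes wrong (the cache code here is illustrative, not Squid's):

```python
# Toy trace of the failure mode described above: mod_deflate reuses the
# identity ETag for the gzipped entity, so the 304 in step 4 points the
# cache at the wrong variant.
cached = {}  # etag -> (content_encoding, body)

# Steps 1-2: first client accepts gzip; the gzipped entity is stored
# under ETag E (the only ETag Apache gave us).
E = '"6bf1f7-6-1b6d6340"'
cached[E] = ("gzip", b"<gzipped body>")

# Step 3: a client that does NOT accept gzip; revalidate, offering the
# ETags of everything we hold -- just E.
if_none_match = ", ".join(cached)

# Step 4: Apache answers 304 with ETag E, i.e. "send the entity you have
# stored under E" -- which is the gzipped one, rubbish to this client.
encoding, body = cached[E]
```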

 Period. I DO NOT KNOW FER SURE.

Then stop saying that Squid is broken, does not implement X, or
broken clients such as Squid. That's all I ask. It's fine to say that
you do not understand why it is a problem for Squid.

 In my other posts, I was suggesting, however, that even if
 an upstream content server ( Apache ) is not sending separate
 unique ETags I am still having a hard time understanding why
 that would cause SQUID to deliver the wrong Varied response
 back to the user.

Simply because Apache explicitly tells it to do exactly that in its 304
response.

 A compressed version of an entity IS the same entity...

Nope. It's a different representation of the same resource, but not
the same entity in terms of HTTP. This is the key difference between
Content-Encoding and Transfer-Encoding.

Content-Encoding is a property of the entity.

Transfer-Encoding is a property of how the message is sent, just like
chunked, with no implications on the entity.

The problem arises from trying to use Content-Encoding as if it was
Transfer-Encoding.

Many years ago we had the same discussion about Vary, and when dust
settled all understood the problem about not sending correct Vary in the
responses. Now as the cache implementation is evolving we are hitting
the exact same problem again in a different form this time due to ETag
collisions. I am sorry that we did not realize the full extent of the
brokenness of these responses the first time when Vary was discussed.

 for
 all intents and purposes... it just has compression 
 applied. One cannot possibly become stale without the
 other also being stale at the same exact moment in time.

HTTP does not make this strict freshness relation between entities of
the same URI, but thats a different question and generally not a big
problem.

 At the moment... yes... I do... but if you read my other posts I
 also have a feeling the reason I can't quote you Verse and Chapter
 from an RFC is because I have a sneaking suspicion that there
 is something missing from the ETag/Vary scheme that can 
 lead to problems like this... and it's NOT IN ANY RFC YET.

And what I am saying is that Apache mod_deflate is violating a MUST
level requirement on ETag in the RFC, thereby making the caching section
of the same RFC break down.

 In other words... you may be doing exactly what hours and hours
 of reading an RFC seems to be telling you you SHOULD do... but
 there still might be something else that OUGHT to be done.

And I am telling you that this part of the RFC is complete, save for the
small detail that the server can not signal that both the compressed and
identity encodings become stale when one changes, only one at a time.

 There will always be the chance that some upstream server will
 ( mistakenly? ) keep the same (strong) ETag on a compressed
 variant.

True, there will always be non-compliant implementations out there in
various forms, and they will continue causing problems at least for as
long as it's about MUST level violations. In many cases (this one
included) workarounds can be found, but that does not justify the
non-compliant ones continuing to be intentionally non-compliant
when informed about the problem.

 People are not perfect and they make mistakes. I still
 think that even when that happens any caching software should
 follow the be lenient in what you accpet and strict in what you
 send rule and still use the other information available to it

Which in this case is none. The only information we ever get from Apache
is the ETag of the response that is supposedly valid to use, and possibly new
freshness details about the same.

 ( sic: What the client really asked for and expects ) and 
 do the right thing. Only the cache knows 

Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 14:47 +0100, Justin Erenkrantz wrote:

 mod_deflate is certainly not creating a new resource

It is creating a new HTTP entity. Not a new object on your server, but
still a new unique HTTP entity with different characteristics from the
identity encoding.

If we were talking about transfer-encoding then you would be correct, as
it only alters the encoding for transfer purposes and not the HTTP entity
as such, but this is content-encoding. Content encoding is a property of
the response entity.

The main reason why things get blurred is because the creation of this
entity is done on the fly instead of creating a new resource on the
server like HTTP expects. As result you need to be very careful with the
ETag and Content-Location headers.

Not modifying ETag (including just making it weak) says that the
identity and gzip encodings are semantically equivalent, and can be
exchanged freely. In other words it says it's fine to send gzip encoding
to all clients (which we all know it's not).

Not modifying/removing Content-Location is less harmful but will cause
cache bouncing, as each time the cache sees a new response entity for a
given URI any older ones with the same Content-Location will get removed
from the cache.
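Put together, the header handling argued for here might look like this sketch (the "-gzip" ETag decoration is one possible choice, not a standardized one):

```python
def fix_deflated_headers(headers: dict) -> dict:
    """Sketch of RFC-friendly header handling when gzipping a response on
    the fly: the new entity gets its own ETag value, and Content-Location
    is dropped rather than left colliding with the identity variant."""
    out = dict(headers)
    out["Content-Encoding"] = "gzip"
    etag = out.get("ETag")
    if etag and etag.endswith('"'):
        out["ETag"] = etag[:-1] + '-gzip"'   # decorate inside the quotes
    out.pop("Content-Location", None)
    return out
```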

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 14:40 +0100, Justin Erenkrantz wrote:

 Uh, no, they *are* semantically equivalent - but, yes, not
 syntactically (bit-for-bit) equivalent.  You inflate the response and
 you get exactly what the ETag originally represented.

Two entities are only semantically equivalent if they can be interchanged
freely at the HTTP level with no semantic difference in the end-user
result.

Identity and gzip encoding can not be said to bidirectionally have the
same semantic meaning, as a gzip encoded entity is pure rubbish to a
recipient not understanding gzip. No more than a Swedish translation of
a document could be said to be semantically equivalent to a Greek
translation of the same document.

Content-Encoding is a case of unidirectional semantic equivalence, where
the identity encoding can be substituted for the gzip encoding with kept
semantics. But for ETag, bidirectional semantic equivalence is required,
which is not fulfilled, as gzip encoding can not be substituted for
identity encoding without risking a significant semantic difference to
the recipient.

The only real difference of a weak ETag compared to a strong one is that
the weak one does not guarantee octet equality. All other restrictions
apply. Plus a bunch of protocol restrictions where weak ETags are not
allowed to be used.
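The weak/strong comparison rules referred to here (RFC 2616 13.3.3) can be sketched as:

```python
def parse_etag(tag):
    """Split an entity tag into (is_weak, opaque_value)."""
    tag = tag.strip()
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag

def strong_compare(a, b):
    """Strong comparison: equal only if both tags are strong and their
    opaque values are identical (implies octet equality)."""
    weak_a, val_a = parse_etag(a)
    weak_b, val_b = parse_etag(b)
    return not weak_a and not weak_b and val_a == val_b

def weak_compare(a, b):
    """Weak comparison: opaque values equal, weakness flags ignored, so
    W/"X" matches "X" -- which is why weakening alone is not enough."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```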

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Thu, 2006-12-07 at 02:42 +0100, Justin Erenkrantz wrote:

 -1 on adding semantic junk to the existing ETag (and keeping it
 strong); that's blatantly uncool.  Any generated ETag from mod_deflate
 should either be the original strong version or a weak version of any
 previous etag.  mod_deflate by *definition* is just creating a weak
 version of the prior entity.

You basically only have two choices:

a) Make mod_deflate not send an ETag on modified responses.

b) Modify the value (within the quotes) of the ETag somehow. And if
mod_deflate can not be trusted to always return the same octet
representation, make sure to use a weak ETag, unless the ETag generation
is also tightly coupled to the octet representation, guaranteeing a
different ETag should mod_deflate encode slightly differently.

And to be fully compliant you also need to pay attention to the
Content-Location header. Here I don't see much choice but to not send
Content-Location in mod_deflate mangled responses (but can be kept on
the original response, no problem there).

RFC 2616 13.6 Caching Negotiated Responses, last paragraph.

 mod_deflate does properly stick in the Vary header, so caches already
 have enough knowledge to know what's going on anyway even without a
 fix.  (This is probably why mod_cache doesn't flag it as an error.)

 My opinion is to fix the protocol and move on...  -- justin

The protocol is quite fine as it is, and not easy to change. As it is
now it's mainly a matter of understanding that mod_deflate does create a
completely new entity from the original one. To the protocol it's
exactly the same as when using mod_negotiate and having both the
identity and gzip encoded entities on disk. The fact that you do this
encoding on the fly is of no concern to HTTP.

Another option is to explore the use of gzip transfer-encoding instead of
content-encoding. With transfer-encoding none of these problems apply, as
it's done on the transport level and not the entity level, but
unfortunately it's not that well supported in clients..

Regards
Henrik





Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 15:03 -0500, [EMAIL PROTECTED] wrote:

 To ONLY ever use ETag as the end-all-be-all for variant 
 identification is, itself, a mistake.

Well, this area of the HTTP specs is pretty clear in my eyes, but then
I have read it up and down too many times unwinding the tangled web
found in there.

An entity (including encoding) is identified by request URI +
Content-Location.

A specific version of an entity is identified by its unique ETag.

Vary: tells which headers the server used in server driven negotiation
of which entity to respond with. Accept-Encoding is one input to this.

A strong ETag must be unique among all variants of a given URI, that is
all different forms of entities that may reside under the URI and all
their past and future versions.

A weak ETag may be shared by two variants/versions if and only if they
can be considered semantically equivalent and mutually exchangeable at
the HTTP level with no semantic loss. For example different levels of
compression, or minor changes of negligible or no importance to the
semantics of the resource (hit counter example in the specs).
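The server-driven negotiation model described above can be sketched as a cache that stores one response per variant together with a snapshot of the selecting headers named by Vary (simplified: exact string matching, no header value normalisation; not Squid's or mod_cache's actual code):

```python
def select_variant(stored_variants, request_headers):
    """Pick a stored variant whose remembered selecting request headers
    match the new request.  Each stored variant is a pair
    (selecting_headers_snapshot, response).  A '*' in Vary never matches,
    forcing revalidation with the origin.  Illustrative sketch only."""
    for selecting, response in stored_variants:
        if "*" in selecting:
            continue  # Vary: * -- never reusable without asking the origin
        if all(request_headers.get(h) == v for h, v in selecting.items()):
            return response
    return None  # cache miss: forward to the origin server
```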
 
 Both pieces of software ( SQUID and Apache ) need just a 
 little more code to finally get it right.

It's correct that the current Squid implementation is not flawless. Most
notably it has very poor handling of cache invalidations at the moment.
 
 Don't forget about Content-Length, either. 
 If 2 different responses for the same requested entity come
 back with 2 different Content-Lengths and there is no Vary:
 or ETag then regardless of any other protocol semantics the 
 only SANE thing for any caching software to do is to recognize 
 that, assume it is not a mistake, and REPLACE the existing 
 entity with the new one.

Caches tend by nature to replace what they have with what they get.

 Yea.. sure... you might get a lot of cache bounce that way but
 at least you are returning a fresh copy.

How would Content-Length changes cause cache bouncing?

 It is not possible for 2 EXACTLY identical representations of the
 same requested entity to have different content lengths.
 If the lengths are different, then SOMETHING is different with
 regards to what you have in your cache.

Yes, but when would this be seen?

We only get the ETag from Apache, not the Content-Length. The specs
forbid Apache from sending the Content-Length or other entity headers in
304 responses, partly to make sure entities do not get corrupted by
errors in the origin server side implementation of server-driven content
negotiation.
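A sketch of that rule from the responding side: when building a 304, entity headers are withheld (the header list here is an illustrative subset, not the full RFC 2616 enumeration):

```python
# Entity headers a 304 (Not Modified) should not carry; an illustrative
# subset -- ETag, Date and cache-control headers pass through untouched.
ENTITY_HEADERS = {"content-length", "content-encoding", "content-language",
                  "content-md5", "content-range", "content-type"}

def make_304_headers(full_headers):
    """Copy response headers for a 304, dropping entity headers that must
    not be re-sent, so a broken negotiation implementation can not corrupt
    the cached entity."""
    return {k: v for k, v in full_headers.items()
            if k.lower() not in ENTITY_HEADERS}
```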

 No protocol ( sic: set of rules ) can ever cover all the realities.
 ( Good ) software knows how to make common sense
 as well.

Indeed, and that is why we are going slow on implementing the more
advanced features of the specs. But violating MUST level protocol
requirements is not common sense. And if you actually follow the specs,
these parts do make great sense once you get the picture that ETags MUST
be unique for all entity versions of a given URI. The only poor part I
have seen in this area of the specs is that the If-None-Match condition
is perhaps a bit blunt, only telling the end result, the ETag of the
valid response entity of a negotiated resource, not how the server came
to that conclusion. This adds a bit more round trips to the origin than
would be required only to figure out that Content-Language: en is OK
both for Accept-Language: en and Accept-Language: en, sv, but that's
about it. (Yes, I intentionally avoided Accept-Encoding here to
illustrate the point; the mechanism is the exact same however.)

RFC 2616 3.11 Entity Tags

   A strong entity tag MAY be shared by two entities of a resource
   only if they are equivalent by octet equality.

   An entity tag MUST be unique across all versions of all entities
   associated with a particular resource. A given entity tag value MAY
   be used for entities obtained by requests on different URIs.


See also 14.26 If-None-Match, and numerous other references to ETag.

I can bombard you with long chains of supporting claims from the RFC if
you like, depending on which parts of the equation you feel are loosely
connected. Just tell me which part you don't trust and I'll happily help
you see the light.

a) That identity and gzip content-encoding of the same resource
represents different entities of the same resource

b) That different entities of the same resource MUST have different
(strong) ETags.

c) That gzip and identity encoding are not semantically equivalent.

d) That the weak ETag W/"X" is semantically equivalent to the strong
ETag "X" with the same quoted value.

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 11:44 -0800, Roy T. Fielding wrote:

 In other words, Henrik has it right.  It is our responsibility to
 assign different etags to different variants because doing otherwise
 may result in errors on shared caches that use the etag as a variant
 identifier.

Thanks ;-)

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 22:28 +0100, Henrik Nordstrom wrote:

 A strong ETag must be unique among all variants of a given URI, that is
 all different forms of entities that may reside under the URI and all
 their past and future versions.

Forgot the last piece there which clears many doubts:

Entities from different URIs may share the same ETag (or even
Content-Location) with no implications on any form of equivalence
between the two.

Also I am sorry that my use of terms is a bit messed up wrt entity vs
variant vs version, but so are the specs..

Regards
Henrik




Re: 2.2.x as a transparent reverse proxy

2006-12-08 Thread Henrik Nordstrom
On Fri, 2006-12-08 at 22:04 +0000, Nick Kew wrote:

 How does a transparent reverse proxy differ from a reverse
 proxy as we know and document it?

The Linux cttproxy patch allows proxies to be fully transparent,
masquerading using the original client's source address on the
connections to the backend.

It has some concerns at the TCP/IP layer and a lot of restrictions on how
it can be deployed, but eases deployment in some cases as the origin
server logs are not so disturbed..

Regards
Henrik




Re: Wrong etag sent with mod_deflate

2006-12-07 Thread Henrik Nordstrom
On Thu, 2006-12-07 at 02:31 +0100, Justin Erenkrantz wrote:

 mod_deflate should just add the W/ prefix if it's not already there.  -- 
 justin

No, that won't work. You would still be just as non-conforming by doing
that. But if mod_deflate may produce different octet-level results on
different requests for the same original entity, then it must do this in
addition to other transforms of the ETag.

The identity and gzip encodings are not bidirectionally semantically
equivalent, and additionally a normal conditional comparing W/"X" to "X"
is true.

See RFC 2616 13.3.3 Weak and Strong Validators

You must make the value of the ETag differ between the two entities.

Regards
Henrik





Re: mod_disk_cache summarization

2006-10-27 Thread Henrik Nordstrom
On Fri, 2006-10-27 at 23:33 +0200, Graham Leggett wrote:

 A second approach could involve the use of the Etags associated with 
 file responses, which in the case of files served off disk (as I 
 understand it) are generated based on inode number and various other 
 uniquely file specific information.

How ETags are generated is extremely server-dependent, and not
guaranteed to be unique across different URLs. You can not at all count
on two files having the same ETag but different URLs being the same
file, unless you are also responsible for the server providing all the
URLs in question and know that the server guarantees this behavior of
ETags beyond what the HTTP specification says.
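For illustration, a rough sketch of the inode-based scheme being discussed, in the spirit of httpd's default FileETag INode MTime Size components (the exact formatting here is invented, not httpd's):

```python
import os

def file_etag(path, use_inode=True):
    """Build a strong ETag from file metadata, roughly in the spirit of
    Apache's FileETag INode MTime Size default.  The hex/dash format is
    illustrative only; real httpd has its own formatting and the
    components are configurable."""
    st = os.stat(path)
    parts = []
    if use_inode:
        parts.append("%x" % st.st_ino)
    parts.append("%x" % int(st.st_mtime))
    parts.append("%x" % st.st_size)
    return '"%s"' % "-".join(parts)
```

Note that nothing in this construction is unique across URLs or servers, which is exactly why the tag can not be used to identify "the same file" elsewhere.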

Regards
Henrik




Re: mod_disk_cache summarization

2006-10-27 Thread Henrik Nordstrom
On Sat, 2006-10-28 at 00:21 +0200, Henrik Nordstrom wrote:
 On Fri, 2006-10-27 at 23:33 +0200, Graham Leggett wrote:
 
  A second approach could involve the use of the Etags associated with 
  file responses, which in the case of files served off disk (as I 
  understand it) are generated based on inode number and various other 
  uniquely file specific information.
 
 How ETags are generated is extremely server-dependent, and not
 guaranteed to be unique across different URLs. You can not at all count
 on two files having the same ETag but different URLs being the same
 file, unless you are also responsible for the server providing all the
 URLs in question and know that the server guarantees this behavior of
 ETags beyond what the HTTP specification says.

Content-MD5 may be possible to use for this purpose of identifying the
same file from different URLs, if it wasn't for the stupid facts that

a) Few if any servers send Content-MD5

b) The HTTP standard is a bit ambiguous on the meaning of Content-MD5
and it can mean different things on 206 responses depending on who reads
the spec..

c) There is no conditional to ask for a file only if the Content-MD5
differs. The only way to get the Content-MD5 without the actual content,
if it's the same, is to use a HEAD request and manually compare the
header. And due to the ambiguity mentioned above I would not count on
Content-MD5 being correct in HEAD responses..

d) And even if the Content-MD5 is the same it says nothing about the
entity headers (content-type etc). Two responses with different entity
headers are different responses even if their body is the same.


If you do use Content-MD5 or a similar checksum you better verify the
checksum to match the content before migrating it to another URL. If not
you could open yourself up to cache pollution attacks.
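A sketch of such a verification, using the usual Content-MD5 definition (base64 of the MD5 digest of the entity body, per RFC 1864):

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Content-MD5 value for an entity body: base64 of its MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def verify_content_md5(body: bytes, header_value: str) -> bool:
    """Check a received body against its Content-MD5 header before, e.g.,
    reusing a cached object under another URL -- guarding against the
    cache pollution attack mentioned above."""
    return content_md5(body) == header_value.strip()
```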

Regards
Henrik




Re: Issue with persistent http proxy backend connection

2006-10-12 Thread Henrik Nordstrom
On Thu, 2006-10-12 at 13:19 +0200, Ruediger Pluem wrote:

 I do not think that this matters all too much, because the backend closes
 the connection *immediately* after sending out the response.

To help with this, perhaps there should be a check just before sending
the response as well, sending Connection: close if it's likely this
thread will terminate after this response?

MaxRequestsPerChild certainly can be evaluated before the response is
sent.

Regards
Henrik




Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-21 Thread Henrik Nordstrom
On Thu, 2006-09-21 at 12:18 +0200, Plüm, Rüdiger, VF EITO wrote:

 IMHO this waits for a DoS to happen if the requestor can trick the backend
 to get stuck with the request. So one request of this type would be sufficient
 to DoS the whole server if the timeout is not very short.

How would this be more of a DoS than just flooding the proxy with
connections to a non-existing server? The delay is per URL, not for the
whole requested site.

Sure, an attacker can use this to make it look like a site with this
problem is non-responsive for users behind the cache, but it's not that
difficult to handle. Maybe you already do what we do in Squid: ignore
the cache on reload requests. That solves the problem quite nicely.
However, in Squid we do start transmitting what is available immediately,
but our design is somewhat different.

To avoid DoS all you need to do is keep monitoring the client
connection, and abort if the client aborts while waiting for the entity
to become available.

Regards
Henrik




Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-20 Thread Henrik Nordstrom
On Thu, 2006-09-21 at 00:19 +0300, Issac Goldstand wrote:

 The only really relevant line I saw (in a quick 15 minute review) is RFC
 2616-3.6 (regarding transfer-encodings):
Transfer-coding values are used to indicate an encoding
transformation that has been, can be, or may need to be applied to an
entity-body in order to ensure safe transport through the network.
This differs from a content coding in that the transfer-coding is a
property of the message, not of the original entity.
 
 Based on that, it seems to be ok.  However, we'd have to remove strong
 ETags as a side-effect if it was done (since strong ETags change when
 entity headers change).

Hmm... transfer-encoding is a function of the transport alone, not the
entity. Don't mix these up. The entity is unaltered by transfer-encoding;
only how it's transferred over the transport (i.e. TCP) is altered. This
also means that transfer-encoding is hop-by-hop. In applications layered
along the intentions of the RFC, a cache (at any level, browser or proxy)
would never see any transfer encoding, as this should have been decoded
by the receiving protocol handler; only identity encoding should be seen.

This is different from Content-Encoding, which does alter the entity as
such. Modifications of the Content-Encoding must also account for ETags,
as no two entity variants of the same URL may carry the same strong
ETag.
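The hop-by-hop vs end-to-end distinction can be sketched as the header filtering a proxy or cache would apply (hop-by-hop set per RFC 2616 13.5.1; simplified handling of the Connection token list):

```python
# Hop-by-hop headers listed in RFC 2616 13.5.1: they describe one
# connection, not the entity, and must not be forwarded or cached.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "te", "trailers",
              "transfer-encoding", "upgrade"}

def end_to_end_headers(headers):
    """Drop hop-by-hop headers (including any named by Connection) before
    caching or forwarding.  Content-Encoding, being an entity property,
    survives; Transfer-Encoding does not."""
    named = {t.strip().lower()
             for t in headers.get("Connection", "").split(",") if t.strip()}
    return {k: v for k, v in headers.items()
            if k.lower() not in HOP_BY_HOP and k.lower() not in named}
```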

 And move trailers into headers (another reason
 to rewrite the headers file at the end).  And probably other things
 which I'm not think of...

That's always OK. The division of main and trailer headers is also mainly
a transport thing. It's only available with chunked encoding btw, as it's
the only transfer mechanism which allows for a trailer. The specs allow
you to drop any trailer headers if they are hard to deal with, or to
merge them with the main header if you can.

In direct chunked-to-chunked proxy transfer you should proxy the trailer
as well. In chunked-to-identity transfer (i.e. HTTP/1.1 response ->
HTTP/1.0 client) the trailer is silently dropped, as there is no means to
transfer the trailer in HTTP/1.0, and you can't rewind a TCP stream to
add data earlier...
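A minimal sketch of a chunked decoder that collects the trailer so it can then be merged into the main header block or dropped, as described above (no chunk extensions, CRLF line endings assumed; not httpd's parser):

```python
def decode_chunked(raw: bytes):
    """Decode a chunked transfer-encoded body, returning (body, trailers).
    A gateway speaking identity encoding downstream can then merge the
    trailer headers into the main header block, or drop them."""
    body, trailers, pos = b"", {}, 0
    while True:
        eol = raw.index(b"\r\n", pos)
        size = int(raw[pos:eol], 16)        # chunk-size line, hex
        pos = eol + 2
        if size == 0:
            break                           # last-chunk: trailer follows
        body += raw[pos:pos + size]
        pos += size + 2                     # skip chunk data and its CRLF
    for line in raw[pos:].split(b"\r\n"):   # trailer headers until blank line
        if not line:
            break
        name, _, value = line.partition(b":")
        trailers[name.decode().strip()] = value.decode().strip()
    return body, trailers
```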

Regards
Henrik




Re: apache 2.2 crashes at the start time in mod_dbd.c then preparing AuthDBDUserPWQuery

2006-07-23 Thread Henrik Nordstrom
On Sun, 2006-07-23 at 00:10 +0100, Nick Kew wrote:

 But if you look at the full traceback and crossreference it to the
 source, I think that looks improbable.  Do you have sufficient gcc/gdb
 expertise to shed real light on this?

Not really, only experience..

From what I have seen, the causes of significantly garbled/nonsense
arguments in stack traces are:

  7 of 10: -O2 somehow messing with the arguments or otherwise making
  gdb confused
  2 of 10: smashed stacks in the called function
  1 of 10: bugs in the calling function or earlier, actually passing
  nonsense data to the function.

The first and last can be identified by going up to the calling function
and inspecting what the arguments should have been.

The middle by hexdumping the stack contents and looking at matching
usage of nearby local arrays.

Regards
Henrik




Re: apache 2.2 crashes at the start time in mod_dbd.c then preparing AuthDBDUserPWQuery

2006-07-22 Thread Henrik Nordstrom
On Sat, 2006-07-22 at 18:00 +0100, Nick Kew wrote:

 #3  0x08081d67 in ap_dbd_prepare (s=0x8daf5a0, query=0x Address 
 0x out of bounds, label=0x Address 0x out of 
 bounds)
 at mod_dbd.c:150

Note: Could maybe be -O2 or higher optimizing away the variables when
they are no longer needed. Seen such things happen very often on many
platforms. Does not need to indicate a bug or even a problem..

Regards
Henrik




Re: [Patch]: Do not compress bodies of header only requests in mod_deflate

2006-07-17 Thread Henrik Nordstrom
On Tue, 2006-07-18 at 00:47 +0200, Ruediger Pluem wrote:

 And this is exactly the question: Is it ok for
 the HEAD response to differ from the GET response with respect to T-E
 and C-L headers

It's not in the case of C-L. For a start, HEAD is used by quite a few
robots with simplistic caches to verify that the copy they have is
current and correct.

The RFC is quite strict that entity headers of a HEAD response SHOULD
match those of an identical GET request, so a difference in C-L is not
acceptable by the RFC. (T-E is transport, and may differ.)
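That SHOULD can be checked mechanically; a sketch with an illustrative subset of entity headers:

```python
def entity_header_diff(head_headers, get_headers,
                       entity=("Content-Length", "Content-Encoding",
                               "Content-Type", "ETag")):
    """Return the entity headers whose values differ between a HEAD
    response and a GET response for the same URI.  An empty list means the
    pair is consistent in the sense this thread argues for; the header
    subset is illustrative."""
    return [h for h in entity
            if head_headers.get(h) != get_headers.get(h)]
```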

It's a pity T-E gzip isn't deployable. It would eliminate this whole
question, as it's not an entity transform..

Regards
Henrik




Re: Compiling a C++ module with g++ on Solaris

2006-06-12 Thread Henrik Nordstrom
On Sun, 2006-06-11 at 18:17 +0100, Phil Endecott wrote:

 Is it possible that there is some libstdc++ initialisation that hasn't 
 happened?  I could imagine that this would require special support from 
 the linker or the dlopen stuff, and that that behaves differently with 
 Sun's libc and linker than on Linux.

Not too unlikely.

A simple test is to try if it makes any difference if you have the
Apache binary linked by g++ instead of gcc.

Regards
Henrik

