Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Joshua Slive



Justin Erenkrantz wrote:

On Tue, Nov 08, 2005 at 07:48:07AM +0100, Ruediger Pluem wrote:

So do you think that there is a todo for mod_authz_host to add such things
or should this be left to the administrator who can of course use
mod_headers in the first case to add Cache-Control: private?


It'd be nice if mod_authz_host could figure out when to stick in
Cache-Control: private on its own.

A possible candidate looks to be in the else block near
mod_authz_host.c:279

 else if (a->order[method] == DENY_THEN_ALLOW) {

Placing a config-overridable

 apr_table_set(r->headers_out, "Cache-Control", "private");

line in that else block would likely work, I guess.  (should that be
apr_table_merge instead?)

Completely untested and clearly not thought through.  =)  -- justin


Although the idea of setting cache-control based on Order seems nice at 
first glance, I think we need to remember that users assume the 
following three configs are interchangeable.  And I don't see anything 
inherently wrong with the assumption, given that they have the exact 
same effect:


1. Order Allow,Deny
   Allow from all

2. Order Deny,Allow
   Allow from all

3. Order Deny,Allow

The difference between the three only becomes important if you add more 
Allow/Deny directives.


Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Colm MacCarthaigh
On Tue, Nov 08, 2005 at 12:54:18PM -0500, Joshua Slive wrote:
> 1. Order Allow,Deny
>    Allow from all
>
> 2. Order Deny,Allow
>    Allow from all
>
> 3. Order Deny,Allow
>
> The difference between the three only becomes important if you add more
> Allow/Deny directives.

o.k., is the following reasonable?

If DENY_THEN_ALLOW:
    if no-rules:
        no-header;
    else if single-allow-from-all && no-deny-rules:
        no-header;
    else
        header;

else if ALLOW_THEN_DENY:
    if no-rules:
        who-cares;
    else if single-allow-from-all && no-deny-rules:
        no-header;
    else
        header;

Which are reducible to:

    if single-allow-from-all && no-deny-rules:
        no-header;
    else
        header;

right?

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Joshua Slive


Colm MacCarthaigh wrote:


if single-allow-from-all && no-deny-rules:
    no-header;
else
    header;


I think that is probably reasonable and would catch 99.5% of real 
configs.  There is a silly case that I didn't mention:


Order deny,allow
Deny from all
Allow from all

So really, the optimal algorithm would be

if deny,allow && (no-deny-rules || allow-all-is-last)
   no-header;
else if allow,deny && no-deny-rules
   no-header;
else
   header;
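
[Editorial sketch] The decision rule above can be written as a small, self-contained C function. This is only a model of the pseudocode, not code from mod_authz_host: the struct, its field names and the function name are all hypothetical, and it assumes the module could summarise a container's directives as a Deny count plus a flag saying whether "Allow from all" is the last directive.

```c
#include <stdbool.h>

/* Hypothetical summary of one container's Order/Allow/Deny settings. */
enum order { ORDER_DENY_ALLOW, ORDER_ALLOW_DENY };

struct access_rules {
    enum order order;        /* "Order deny,allow" or "Order allow,deny" */
    int deny_rules;          /* number of Deny directives present */
    bool allow_all_is_last;  /* "Allow from all" is the final directive */
};

/* Returns true when "Cache-Control: private" would be added. */
static bool wants_private_header(const struct access_rules *a)
{
    if (a->order == ORDER_DENY_ALLOW
        && (a->deny_rules == 0 || a->allow_all_is_last))
        return false;   /* nothing is effectively restricted */
    if (a->order == ORDER_ALLOW_DENY && a->deny_rules == 0)
        return false;   /* no Deny rule can lock anyone out */
    return true;        /* some client may be denied: mark private */
}
```

With this model, the "silly case" (Order deny,allow, Deny from all, Allow from all) yields no header, while any allow,deny configuration containing a Deny rule yields the header.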

Another thing to consider, however, is:

BrowserMatch email-grabber bad-robot
Order allow,deny
Allow from all
Deny from env=bad-robot

Do env= directives get excluded from the algorithm?  Otherwise, 
apache.org (and many other sites) suddenly become completely uncacheable.


To be 100% correct, any use of BrowserMatch (or SetEnvIf User-Agent) 
should set Vary: User-Agent, but this is not what is desired most of the 
time.


My personal opinion is that you are going to surprise many more people 
by trying to infer the correct cache headers than you will by leaving 
them out.  As an example of us trying to be too smart, consider 
mod_rewrite, which sends Vary: Host when doing rewriting based on the 
Host: header.  This header is redundant as far as I can tell (we don't 
set it for ordinary name-based vhosting), but makes sites uncacheable in 
some browsers and proxies.


Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Ruediger Pluem


On 11/08/2005 01:36 AM, Roy T. Fielding wrote:
> On Nov 7, 2005, at 3:03 PM, Ruediger Pluem wrote:
>

[..cut..]

>> but the next request for this (fresh) resource will not check the
>> access control and
>> deliver it to any client, regardless of the IP. Correct?

Many thanks for sorting my confused thoughts.

> The forward proxy would deliver it to any client that had the
> ability to GET from that proxy.


And this actually depends on whether the resource requested by the client has
already been cached or not. If it has not been cached, things like

<Proxy *>
    order allow,deny
    allow from 192.168.1.10
</Proxy>

work as expected (access to the proxied resource is only granted to
192.168.1.10).

But once the resource has been cached by mod_cache, access to it is granted to
*every* client, because the access checker has not been run when the quick
handler decides to deliver the (fresh) content by inserting the CACHE_OUT
filter and kicking the filter stack.
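
[Editorial sketch] The ordering problem described above can be modelled in a few lines of plain C. This is not httpd code: the function names, the enum and the hard-coded address check are hypothetical stand-ins for the quick handler, the access checker and the allow rule in the example config.

```c
#include <stdbool.h>
#include <string.h>

enum outcome { SERVED_FROM_CACHE, SERVED_FROM_BACKEND, FORBIDDEN };

/* Stand-in for "allow from 192.168.1.10" in the example above. */
static bool ip_allowed(const char *ip)
{
    return strcmp(ip, "192.168.1.10") == 0;
}

/* Models one request. When the cache runs as a quick handler, a fresh
 * cache hit is answered before the access checker is ever consulted. */
static enum outcome handle_request(const char *client_ip,
                                   bool fresh_in_cache,
                                   bool cache_is_quick_handler)
{
    if (cache_is_quick_handler && fresh_in_cache)
        return SERVED_FROM_CACHE;       /* access check skipped */
    if (!ip_allowed(client_ip))
        return FORBIDDEN;               /* normal access check */
    return fresh_in_cache ? SERVED_FROM_CACHE : SERVED_FROM_BACKEND;
}
```

In this model a client at 10.0.0.5 is refused while the resource is uncached, but receives it once it is fresh in the cache; with the cache run as a normal handler the same client is refused in both cases.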

Although this is not a regression relative to 2.0.x (is it one relative to
1.3.x???), it is weird behaviour from the user's perspective. Even more so as

http://httpd.apache.org/docs/2.0/mod/mod_proxy.html#access

suggests securing a forward proxy by using mod_authz_host. Currently the
advice should be the opposite: Yes, secure your forward proxy, but do *not* do
this with mod_authz_host, as it does not work as expected.

Nevertheless I regard this discussion as very useful with respect to caching
reverse proxies or other cached local resources that are under access control.

That said, I come back to the starting point of this discussion:

I think Paul's patch to make it configurable where to run the cache handler is
currently the best proposal on the table, provided that it is configurable in
a way that expresses what it does from the user's perspective. So I would
regard something like CacheRunQuickHandler as a bad idea. I would have
something in mind like CacheDisableAccessControl (ok, the flaming can start :-).


Regards

Rüdiger


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Joshua Slive



Ruediger Pluem wrote:


http://httpd.apache.org/docs/2.0/mod/mod_proxy.html#access

suggests to secure a forward proxy by using mod_authz_host. Currently the
advice should be the opposite: Yes, secure your forward proxy, but do *not* do
this with mod_authz_host as it does not work as expected.


Has anyone actually tested this?  Is it true that there is no way to run 
a host-restricted cached proxy?  That would be really lame.


Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-08 Thread Ruediger Pluem


On 11/08/2005 10:15 PM, Joshua Slive wrote:
 
 
> Ruediger Pluem wrote:

[..cut..]

> Has anyone actually tested this?  Is it true that there is no way to run
> a host-restricted cached proxy?  That would be really lame.

I tested only with 2.0.55 today. But given the fact that this part of the cache
architecture has not changed between 2.0.x and 2.2.x/trunk I am pretty sure that
we have the same thing there.

Regards

Rüdiger


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Nick Kew
On Monday 07 November 2005 03:26, Paul Querna wrote:

>> Cache-control: private
>>
>> is what should be added for any resource under access control.

I'd prefer something a little less drastic, like 'faking' a header out
of remote-ip.

> I still like making it admin configurable. Allowing the admin to
> configure mod_cache to run as a quick-handler, or a normal-handler.  It
> puts the burden of breaking standards onto the Admin.

+1

-- 
Nick Kew


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Roy T. Fielding

On Nov 7, 2005, at 2:06 AM, Nick Kew wrote:


On Monday 07 November 2005 03:26, Paul Querna wrote:


Cache-control: private

is what should be added for any resource under access control.


I'd prefer something a little less drastic, like 'faking' a header out
of remote-ip.


Why?  All you accomplish is to cause this problem of a downstream
cache not knowing that there is access control.  If you don't want
access control, then don't use access control.

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Ruediger Pluem


On 11/07/2005 07:27 PM, Roy T. Fielding wrote:
> On Nov 7, 2005, at 2:06 AM, Nick Kew wrote:
>
>> On Monday 07 November 2005 03:26, Paul Querna wrote:
>>
>>> Cache-control: private
>>>
>>> is what should be added for any resource under access control.
>>
>> I'd prefer something a little less drastic, like 'faking' a header out
>> of remote-ip.
>
> Why?  All you accomplish is to cause this problem of a downstream
> cache not knowing that there is access control.  If you don't want
> access control, then don't use access control.

I agree that there are many situations where it does not make sense to cache
things under access control, but there are ones where it makes sense.

E.g. if you create a forward proxy with httpd that should use caching and that
only a limited number of clients on your LAN should be able to use.

So I agree with Paul that it should be configurable.

Regards

Rüdiger



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Graham Leggett

Ruediger Pluem wrote:


I agree that there are many situations where it does not make sense to cache
things under access control, but there are ones where it makes sense.

E.g. if you create a forward proxy with httpd that should use caching and that
only a limited number of clients on your LAN should be able to use.


Forward proxies using access control use the Proxy-Authenticate header, 
which is entirely different access control to the WWW-Authenticate 
header used in normal access control. The Cache-Control: private header 
would not apply in this case.



So I agree with Paul that it should be configurable.


Thinking about this for a bit, I don't think it should be configurable. 
Adding Cache-Control: private to access controlled resources is part 
of RFC2616, and this spec shouldn't be overridden lightly.


If there is a compelling reason to support not adding Cache-Control: 
private to authenticated requests, then it's definitely an option, but I 
think we should default to the safe option for now.


Regards,
Graham
--




Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Paul Querna
Graham Leggett wrote:
> Ruediger Pluem wrote:
>
>> I agree that there are many situations where it does not make sense to
>> cache things under access control, but there are ones where it makes sense.
>>
>> E.g. if you create a forward proxy with httpd that should use caching
>> and that only a limited number of clients on your LAN should be able to use.
>
> Forward proxies using access control use the Proxy-Authenticate header,
> which is entirely different access control to the WWW-Authenticate
> header used in normal access control. The Cache-Control: private header
> would not apply in this case.
>
>> So I agree with Paul that it should be configurable.
>
> Thinking about this for a bit, I don't think it should be configurable.
> Adding Cache-Control: private to access controlled resources is part
> of RFC2616, and this spec shouldn't be overridden lightly.
>
> If there is a compelling reason to support not adding Cache-Control:
> private to authenticated requests, then it's definitely an option, but I
> think we should default to the safe option for now.

The compelling reason is that this implies that even for the DEFAULT
configuration of apache, we should be sending cache-control private, for
EVERY page served.

That is bad. bad bad bad bad bad bad bad bad bad bad bad.  Did I mention
that is bad?

We need a better solution.

This also implies that if you use mod_rewrite based on any
non-Varied-Header information, you should be setting Cache-Control:
Private too.


-Paul


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Roy T. Fielding

On Nov 7, 2005, at 1:01 PM, Paul Querna wrote:

If there is a compelling reason to support not adding Cache-Control:
private to authenticated requests, then it's definitely an option, but I
think we should default to the safe option for now.


The compelling reason is that this implies that even for the DEFAULT
configuration of apache, we should be sending cache-control private, for
EVERY page served.


Why?


This also implies that if you use mod_rewrite based on any
non-Varied-Header information, you should be setting Cache-Control:
Private too.


No, you should be setting Vary: * if the content varies.  That is
also required by HTTP.

The default in all cases should be HTTP-compliant.  You can define
additional directives for overriding compliance by consent of
the owner, but we shouldn't ship a server that doesn't work
correctly by default.

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Ruediger Pluem


On 11/07/2005 09:48 PM, Graham Leggett wrote:
> Ruediger Pluem wrote:
>
>> I agree that there are many situations where it does not make sense to
>> cache things under access control, but there are ones where it makes sense.
>>
>> E.g. if you create a forward proxy with httpd that should use caching
>> and that only a limited number of clients on your LAN should be able to use.
>
> Forward proxies using access control use the Proxy-Authenticate header,
> which is entirely different access control to the WWW-Authenticate
> header used in normal access control. The Cache-Control: private header
> would not apply in this case.

This is often done via IP addresses and not via username/password.
And this is what I think is the real pain and complaint: it does not work
with IP based access controls. Setting Cache-Control: private is just not
what you want here, because this would prevent caching in this case.
BTW: RFC2616 says in 14.9.1:

private
  Indicates that all or part of the response message is intended for
  a single user and MUST NOT be cached by a shared cache. This
  allows an origin server to state that the specified parts of the
  response are intended for only one user and are not a valid
  response for requests by other users. A private (non-shared) cache
  MAY cache the response.

It talks about *single* users. The problems we are facing here are *groups* of
users. So the cache is a shared cache for this group of users in this case.

Regards

Rüdiger

[..cut..]



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Nick Kew
On Monday 07 November 2005 21:10, Roy T. Fielding wrote:
> On Nov 7, 2005, at 1:01 PM, Paul Querna wrote:
>>> If there is a compelling reason to support not adding Cache-Control:
>>> private to authenticated requests, then it's definitely an option,
>>> but I think we should default to the safe option for now.
>>
>> The compelling reason is that this implies that even for the DEFAULT
>> configuration of apache, we should be sending cache-control private,
>> for EVERY page served.
>
> Why?
>
>> This also implies that if you use mod_rewrite based on any
>> non-Varied-Header information, you should be setting Cache-Control:
>> Private too.
>
> No, you should be setting Vary: * if the content varies.  That is
> also required by HTTP.

That applies if it varies by some request header.

The whole problem here is that Remote-IP is not a request header.
It is not accessible through HTTP.  And it would be hard to incorporate,
because either we trust it and it's trivial to forge, or we enforce it and
exclude any client behind NAT.

> The default in all cases should be HTTP-compliant.  You can define
> additional directives for overriding compliance by consent of
> the owner, but we shouldn't ship a server that doesn't work
> correctly by default.

If that was the only issue, there wouldn't be a problem.

-- 
Nick Kew


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Colm MacCarthaigh
On Mon, Nov 07, 2005 at 09:28:54PM +, Nick Kew wrote:
>> No, you should be setting Vary: * if the content varies.  That is
>> also required by HTTP.
>
> That applies if it varies by some request header.

Vary: * means that how the content varies is unspecified, and section
12.1 of RFC2616 explicitly mentions the network address of the client as
an example of server-driven negotiation, and that the Vary header can be
used for such things :)

> The whole problem here is that Remote-IP is not a request header.
> It is not accessible through HTTP.  And it would be hard to incorporate,
> because either we trust it and it's trivial to forge, or we enforce it and
> exclude any client behind NAT.

Content that is variable by IP address should have Vary: * imo, and
content that is allowed/denied on a per-IP address basis should
probably have Cache-Control: private.

The first is really a problem for server administrators, but the second
can be handled by httpd. Would it be reasonable to set the header unless
there are either no Allow/Deny rules at all, or there is one Allow from
all rule and no Deny rules?

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Ruediger Pluem


On 11/07/2005 10:31 PM, Justin Erenkrantz wrote:
> --On November 7, 2005 10:16:34 PM +0100 Ruediger Pluem
> [EMAIL PROTECTED] wrote:

[..cut..]

> The problem is that without Cache-Control: private, any downstream cache
> would have the exact same problem.  There's no way for it to know that
> the response differs based on IPs unless the Origin says so.  -- justin

This is true. But in the case of a forward proxy that is used to give office
users access to the internet in general based on their IP this is no problem.
I do not argue that this behaviour should be the default behaviour of httpd.
I completely agree with Roy that httpd by default must be HTTP compliant, but
there should be possibilities (and there already are) to break this compliance
with explicit configuration options to get some things working.

Regards

Rüdiger



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Justin Erenkrantz
--On November 7, 2005 11:09:05 PM +0100 Ruediger Pluem [EMAIL PROTECTED] wrote:

must be HTTP compliant, but there should be possibilities (and there
already are) to break this compliance with explicit configuration options
to get some things working.


Yes, CacheStorePrivate will do this.  -- justin


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Roy T. Fielding

On Nov 7, 2005, at 2:09 PM, Ruediger Pluem wrote:

The problem is that without Cache-Control: private, any downstream cache
would have the exact same problem.  There's no way for it to know that
the response differs based on IPs unless the Origin says so.  -- justin


This is true. But in the case of a forward proxy that is used to give
office users access to the internet in general based on their IP this
is no problem.


Then either the forward proxy has an external agreement with the
source (and can override the cache-control) or it has no clue
about the source and cannot safely cache the content.  In any
case, the messages that we send must be correctly marked as
private because that is our configuration.

If the forward proxy sends a request that has no Cookie and
no Authorization, then it is likely that a cache-control private
response is indicative of IP-based access control.  If you want
to code up a forward ProxyCache override based on that, go wild.

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Ruediger Pluem


On 11/07/2005 11:30 PM, Roy T. Fielding wrote:
> On Nov 7, 2005, at 2:09 PM, Ruediger Pluem wrote:
>
>>> The problem is that without Cache-Control: private, any downstream cache
>>> would have the exact same problem.  There's no way for it to know that
>>> the response differs based on IPs unless the Origin says so.  -- justin
>>
>> This is true. But in the case of a forward proxy that is used to give
>> office users access to the internet in general based on their IP this
>> is no problem.
>
> Then either the forward proxy has an external agreement with the
> source (and can override the cache-control) or it has no clue
> about the source and cannot safely cache the content.  In any
> case, the messages that we send must be correctly marked as
> private because that is our configuration.

Just checking if I understood things correctly:

If I have a forward proxy to which I limit access via IP based access control
I should add Cache-Control: private to any response I get back from the backend
(either a Remote Proxy or the origin server).

This response would not be cached by mod_cache unless I overwrite it with
CacheStorePrivate on.

If I set CacheStorePrivate to on, the response gets cached by mod_cache, but the
next request for this (fresh) resource will not check the access control and
deliver it to any client, regardless of the IP. Correct?


Regards

Rüdiger



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Ruediger Pluem


On 11/07/2005 10:10 PM, Roy T. Fielding wrote:
> On Nov 7, 2005, at 1:01 PM, Paul Querna wrote:
>
>>> If there is a compelling reason to support not adding Cache-Control:
>>> private to authenticated requests, then it's definitely an option, but I
>>> think we should default to the safe option for now.
>>
>> The compelling reason is that this implies that even for the DEFAULT
>> configuration of apache, we should be sending cache-control private, for
>> EVERY page served.
>
> Why?

Not for every page, but if I get it right, once you lock out one bad boy via

deny ipaddress

then it should be sent. AFAIK this is not done automatically currently once you
add a deny directive somewhere. Does this need to be changed?


Regards

Rüdiger


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Graham Leggett

Ruediger Pluem wrote:


If I have a forward proxy to which I limit access via IP based access control
I should add Cache-Control: private to any response I get back from the backend
(either a Remote Proxy or the origin server).


A very important distinction: forward and reverse proxy authentication 
works completely differently from each other.


In a forward proxy configuration, you authenticate access to the proxy 
using Proxy-Authenticate. Once authenticated you can view cached content.


In a reverse proxy (or any normal content) configuration, you 
authenticate access to content using WWW-Authenticate, and here the 
Cache-Control: private must be used to make sure that content generated 
for you is not inadvertently delivered to someone else.


Cache-Control: private is not necessary in the forward proxy config. It 
is necessary in the reverse proxy / normal config case.


Regards,
Graham
--




Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Roy T. Fielding

On Nov 7, 2005, at 3:03 PM, Ruediger Pluem wrote:

Just checking if I understood things correctly:

If I have a forward proxy to which I limit access via IP based access control
I should add Cache-Control: private to any response I get back from the backend
(either a Remote Proxy or the origin server).


No, access control on a forward proxy has nothing to do with
cache-control.  Cache-control is defined by the origin server.

This response would not be cached by mod_cache unless I overwrite it with
CacheStorePrivate on.

If I set CacheStorePrivate to on the response gets cached by mod_cache,


Yes, though I would hope that you would set that within a
Location directive specific to a given set of URIs.


but the next request for this (fresh) resource will not check the
access control and
deliver it to any client, regardless of the IP. Correct?


The forward proxy would deliver it to any client that had the
ability to GET from that proxy.

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Roy T. Fielding

On Nov 7, 2005, at 3:10 PM, Ruediger Pluem wrote:
Not for every page, but if I get it right, once you lock out one bad boy via

deny ipaddress

then it should be sent. AFAIK this is not done automatically currently
once you add a deny directive somewhere. Does this need to be changed?


I can't remember which directive applies where, but if the
access control is set to deny all and allow some, where some
is a locally restricted subset of all, then cache-control
private is required on non-error responses unless the request
included Authorization (in which case cache-control private
is optional because it is already implied with Auth).

If the directive is set to allow all and deny some, then
it is reasonable to assume that the access control is for
service reasons, not authentication, and thus anyone who
receives the message should be allowed to cache it for others.

It would be wise to make both configurable.
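
[Editorial sketch] The two cases above, each with a configurable override, might be modelled as follows. This is a plain-C illustration, not httpd code: the enum, both structs and the override flags are hypothetical names, and treating any status of 400 or above as an "error response" is an assumption of this sketch.

```c
#include <stdbool.h>

enum policy { DENY_ALL_ALLOW_SOME, ALLOW_ALL_DENY_SOME };

struct request_info {
    enum policy policy;      /* which style of access control applies */
    int status;              /* HTTP status of the response */
    bool has_authorization;  /* request carried an Authorization header */
};

/* Hypothetical admin overrides, since both cases should be configurable. */
struct overrides {
    bool private_for_allow_some;  /* suggested default: true  */
    bool private_for_deny_some;   /* suggested default: false */
};

/* Returns true when the response should carry Cache-Control: private. */
static bool must_mark_private(const struct request_info *r,
                              const struct overrides *o)
{
    if (r->status >= 400)
        return false;            /* only non-error responses (assumption) */
    if (r->has_authorization)
        return false;            /* optional: already implied by Auth */
    if (r->policy == DENY_ALL_ALLOW_SOME)
        return o->private_for_allow_some;
    return o->private_for_deny_some;
}
```

With the suggested defaults, a 200 response under "deny all, allow some" is marked private, while the same response under "allow all, deny some" is left cacheable.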

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-07 Thread Justin Erenkrantz
On Tue, Nov 08, 2005 at 07:48:07AM +0100, Ruediger Pluem wrote:
> So do you think that there is a todo for mod_authz_host to add such things
> or should this be left to the administrator who can of course use
> mod_headers in the first case to add Cache-Control: private?

It'd be nice if mod_authz_host could figure out when to stick in
Cache-Control: private on its own.

A possible candidate looks to be in the else block near
mod_authz_host.c:279

 else if (a->order[method] == DENY_THEN_ALLOW) {

Placing a config-overridable

 apr_table_set(r->headers_out, "Cache-Control", "private");

line in that else block would likely work, I guess.  (should that be
apr_table_merge instead?)

Completely untested and clearly not thought through.  =)  -- justin
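
[Editorial sketch] On the set-versus-merge question: in APR, apr_table_set replaces any existing value for the key, while apr_table_merge concatenates the new value onto the existing one, separated by ", ". The plain-C model below imitates only that difference on a single header value. It does not use APR, and the helper names header_set and header_merge are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Model of "set": the new value replaces whatever was stored. */
static void header_set(char *dst, size_t len, const char *val)
{
    snprintf(dst, len, "%s", val);
}

/* Model of "merge": append the new value after ", " when the header
 * already has one; otherwise behave like set. dst must already hold
 * a NUL-terminated string shorter than len. */
static void header_merge(char *dst, size_t len, const char *val)
{
    size_t used = strlen(dst);

    if (used == 0)
        snprintf(dst, len, "%s", val);
    else
        snprintf(dst + used, len - used, ", %s", val);
}
```

If a response already carries "Cache-Control: max-age=3600", the merge model keeps that directive and yields "max-age=3600, private", whereas the set model discards it and leaves only "private".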


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-06 Thread Roy T . Fielding

On Nov 4, 2005, at 10:56 AM, William A. Rowe, Jr. wrote:

It leaves us wondering; how can allow from/deny from n.n.n.n be mapped to
RFC 2616 semantics, or at least, without running the many server hooks on
later requests?  The only way I can see, is that we should have any more
explicit allow from/deny from leave a marker in the request record from that
authorization phase, and mark it nocache if the request doesn't otherwise
set the authentication required headers.


Cache-control: private

is what should be added for any resource under access control.

Roy



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-06 Thread Paul Querna
Roy T.Fielding wrote:
> On Nov 4, 2005, at 10:56 AM, William A. Rowe, Jr. wrote:
>
>> It leaves us wondering; how can allow from/deny from n.n.n.n be mapped to
>> RFC 2616 semantics, or at least, without running the many server hooks on
>> later requests?  The only way I can see, is that we should have any more
>> explicit allow from/deny from leave a marker in the request record
>> from that authorization phase, and mark it nocache if the request doesn't
>> otherwise set the authentication required headers.
>
> Cache-control: private
>
> is what should be added for any resource under access control.

But.. But.. But.. I want to have my cake and eat it too.

This does imply we should be adding this header anytime _any_
mod_authz_* module is invoked.  That would suck.

I still like making it admin configurable. Allowing the admin to
configure mod_cache to run as a quick-handler, or a normal-handler.  It
puts the burden of breaking standards onto the Admin.

-Paul


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Graham Leggett

Nick Kew wrote:


I'm not convinced by that either.  In fact, I dislike the whole "run it in a
quick handler" principle - it runs a supertanker through the KISS principle,
and has consequently left us with a cache that never really worked.
Even if we fix this, it's sure to have a high bugrate for the foreseeable
future precisely because it violates KISS.


The principle behind running it in a quick handler in the original 
design was so that the cache could work as a 100% standards compliant 
HTTP/1.1 caching proxy, as described in section 13 of RFC2616.


RFC2616 is well understood, and already nails down all security issues 
to do with proxy caching. Making the cache follow a widely accepted and 
known to work standard follows the KISS principle.


Over time, the cache has slowly moved backwards from being as far 
forward on the frontend of the filter stack as possible, to further and 
further back into the webserver itself.


In the process, the cache becomes less and less compliant with RFC2616, 
which has made it more difficult to understand and more complex to 
implement.


The httpd cache is simply yet another cache in the chain of HTTP/1.1 
caches that are typically present when a browser accepts a page from a 
website. The authentication issue is handled by RFC2616 already, and as 
long as httpd mod_cache conforms to the correct headers handling, and 
works like the other proxies in the chain, then authentication is not a 
problem.


Regards,
Graham
--




Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Joshua Slive


Graham Leggett wrote:
The httpd cache is simply yet another cache in the chain of HTTP/1.1 
caches that are typically present when a browser accepts a page from a 
website. The authentication issue is handled by RFC2616 already, and as 
long as httpd mod_cache conforms to the correct headers handling, and 
works like the other proxies in the chain, then authentication is not a 
problem.


I agree with you about 90%.  The problem is that there are a very few 
things that aren't accounted for in standard HTTP caching rules.  One 
example is Varying access by client IP address.  Another example is 
changing protocol behavior when communicating with the client. 
Reasonable proxy administrators may want to do these things.


Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Brian Akins

Joshua Slive wrote:

I haven't looked carefully at the code, but I don't believe 
protocol-level things like the force-response-1.0 variable are stored in 
the cache.


If it's a global setenvif variable (runs in post-read, before 
quick-handler), then these adjustments work, because 
force-response-1.0 and others are at the protocol level, mod_cache 
doesn't cache whether it was a 1.0 or 1.1 response.


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Graham Leggett
Joshua Slive said:

> I agree with you about 90%.  The problem is that there are a very few
> things that aren't accounted for in standard HTTP caching rules.  One
> example is Varying access by client IP address.

I can't see how you could have any meaningful caching at all if the
content is varied by IP address, unless you had the IP address in a header
and did some clever caching of variants.

In this case you'd probably not use the cache at all for this part of the
URL space.

> Another example is
> changing protocol behavior when communicating with the client.

What protocol behaviour would change, can you give an example?

Regards,
Graham
--



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread William A. Rowe, Jr.

I almost tried to snip the comments below in my reply, and there was nothing
I could clip out - thank you Graham for explaining so clearly the entire design
principles of how and why mod_proxy does exactly what it does.

It leaves us wondering; how can allow from/deny from n.n.n.n be mapped to
RFC 2616 semantics, or at least, without running the many server hooks on
later requests?  The only way I can see, is that we should have any more
explicit allow from/deny from leave a marker in the request record from that
authorization phase, and mark it nocache if the request doesn't otherwise
set the authentication required headers.

Thoughts?

Graham Leggett wrote:

Nick Kew wrote:

I'm not convinced by that either.  In fact, I dislike the whole "run it in a
quick handler" principle - it runs a supertanker through the KISS principle,
and has consequently left us with a cache that never really worked.
Even if we fix this, it's sure to have a high bugrate for the foreseeable
future precisely because it violates KISS.



The principle behind running it in a quick handler in the original 
design was so that the cache could work as a 100% standards compliant 
HTTP/1.1 caching proxy, as described in section 13 of RFC2616.


RFC2616 is well understood, and already nails down all security issues 
to do with proxy caching. Making the cache follow a widely accepted and 
known to work standard follows the KISS principle.


Over time, the cache has slowly moved backwards from being as far 
forward on the frontend of the filter stack as possible, to further and 
further back into the webserver itself.


In the process, the cache becomes less and less compliant with RFC2616, 
which has made it more difficult to understand and more complex to 
implement.


The httpd cache is simply yet another cache in the chain of HTTP/1.1 
caches that are typically present when a browser accepts a page from a 
website. The authentication issue is handled by RFC2616 already, and as 
long as httpd mod_cache conforms to the correct headers handling, and 
works like the other proxies in the chain, then authentication is not a 
problem.


Regards,
Graham
--




Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Joshua Slive



Graham Leggett wrote:

Joshua Slive said:


I agree with you about 90%.  The problem is that there are a very few
things that aren't accounted for in standard HTTP caching rules.  One
example is Varying access by client IP address.


I can't see how you could have any meaningful caching at all if the
content is varied by IP address, unless you had the IP address in a header
and did some clever caching of variants.

In this case you'd probably not use the cache at all for this part of the
URL space.


This is the case we've been discussing where someone wishes to, for 
example, restrict a reverse proxy to a particular network.  I agree that 
it can't be done with standard caching rules, which is the problem.  I 
don't think it is a huge problem, but I'm sure there are people who wish 
to run a host-restricted proxy.





Another example is
changing protocol behavior when communicating with the client.


What protocol behaviour would change, can you give an example?


force-response-1.0, for example.  Brian points out that this could still 
work if it was global.  But you couldn't apply particular protocol 
adjustments to particular areas of the URL-space.  This probably isn't a 
big problem.



Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-04 Thread Ruediger Pluem


On 11/04/2005 08:20 PM, Joshua Slive wrote:
 
 
 Graham Leggett wrote:

[..cut..]


 In this case you'd probably not use the cache at all for this part of the
 URL space.
 
 
 This is the case we've been discussing where someone wishes to, for
 example, restrict a reverse proxy to a particular network.  I agree that
 it can't be done with standard caching rules, which is the problem.  I
 don't think it is a huge problem, but I'm sure there are people who wish
 to run a host-restricted proxy.

What about forward proxies that should only be usable from certain client IP's?
I admit I haven't tested so far, but these should fail also.

Regards

Rüdiger



cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Nick Kew
On Wednesday 02 November 2005 20:26, William A. Rowe, Jr. wrote:
 Colm MacCarthaigh wrote:
  I think the text Deny from all is a particularly dangerous thing to
  have not work as advertised! No matter how well documented :/

Nasty.  Is it necessarily a showstopper?

 The question though, is where can Deny from all be expected to work?

 Certainly not in <Directory /foo> - the cached entity no longer lives
 there.

I disagree.  If it came from there originally, then that's where it lives.
The principle we want here is that retrieving from cache should have
the same rules from a client PoV as retrieving an original unless a
sysop explicitly says otherwise.  Breaking that principle shouldn't
hit the sysop as a standard or default behaviour.

 Perhaps in <Location /foo> - but running the full handlers, dealing with
 all the regex'es all over again defeats the purpose of running a fast
 cache.

I'm not convinced by that either.  In fact, I dislike the whole "run it in a
quick handler" principle - it runs a supertanker through the KISS principle,
and has consequently left us with a cache that never really worked.
Even if we fix this, it's sure to have a high bugrate for the foreseeable
future precisely because it violates KISS.

The main purpose of caching is to relieve the pressure on a big, slow backend.
In real life, most of that bigness and slowness is pretty much always going to 
live in a handler or a backend, so what we save by running quick_handler is
inherently of secondary importance.  And there are several things we can
do about that if we make cache a normal handler:
  * provide quick versions of early hooks.  For example, an authn that looks 
up cached headers and thus bypasses any potential trip to DBD or LDAP.
Similarly we may be able to bypass rewriterules, content negotiation,
or any trip to htaccess based on cache lookup matching; maybe your
CachedLocation proposal.
In a sense, that's modularising the quick handler concept!
  * Write a caching performance doc that discusses the issue, and makes
 clear the effect of anything complex in a hook that can't be bypassed.

 Certainly in <VirtualHost www.cachedhost.example.com> ... although authnz
 doesn't work correctly there in the first place ;-)

I take it that just refers to standard per-dir-config behaviour?  A directive
that's not even syntactically valid outside a directory-context can be
forgiven for not working there!


 And certainly globally, if I ran a large mass vhost, yet knew full well
 that a list of proxies would corrupt my content, I might

Deny from 10.123.55.0/24

 but again, authn/authz doesn't work globally.

Making it do so is a feature-enhancement, not a bugfix.  And we'd need to
have a proposal for implementation without undue complexity to consider
such an enhancement.

 We can discuss 'enabling' the map to storage for <Location > and running
 the authz stack, but we would have to ensure we bypass the filesystem
 dir/files entities.  The deepest relevant level is <Location >.

In the present architecture that makes some sense.  But having chopped out
the hooks, adding them back in piecemeal seems reminiscent of Heath Robinson.

 And maybe, have you considered a <CachedLocation > / <CachedLocationMatch >
 container for mod_cache?  This would have the benefit that very long lists
 of directives would be ignored/not merged, in favor of a much shorter and
 very specific list that benefits the cache by keeping it fast, while giving
 the user the option to tweak the behavior of content, once cached.

You mean as a tool for sysops to accept/decline serving from cache?
That could potentially have merit, and would work best in a quick-translation
hook to bypass any more complex/expensive rules.
The danger is if it grows some nightmarishly confusing relationship
to normal Location semantics: the existing Location vs Directory
is bad enough for non-expert users!

-- 
Nick Kew


Re: [vote] 2.1.9 as beta

2005-11-03 Thread Joe Orton
On Wed, Nov 02, 2005 at 03:08:47PM -0500, Joshua Slive wrote:
 
 Colm MacCarthaigh wrote:
 I think the text Deny from all is a particularly dangerous thing to
 have not work as advertised! No matter how well documented :/
 
 Sure, but in truth, apache configuration is really complex and deny 
 from all doesn't always really mean what it says.
...
 All that to say, I'm fine with the "document that mod_cache ignores
 mod_access" solution to this problem.

Agreed, and I don't see why this is a showstopper either if this has 
been the behaviour of mod_cache forever anyway.  showstopper === 
regression

joe



Re: [vote] 2.1.9 as beta

2005-11-03 Thread Colm MacCarthaigh
On Thu, Nov 03, 2005 at 03:27:43PM +, Joe Orton wrote:
 Agreed, and I don't see why this is a showstopper either if this has
 been the behaviour of mod_cache forever anyway.  showstopper ===
 regression

I've taken this out of the show-stopper section, I'll just live with
documentation as a fix for now, and maybe fill that out a bit more.

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Ruediger Pluem


On 11/03/2005 11:01 AM, Nick Kew wrote:
 On Wednesday 02 November 2005 20:26, William A. Rowe, Jr. wrote:
 
Colm MacCarthaigh wrote:


[..cut..]


Certainly not in <Directory /foo> - the cached entity no longer lives
there.
 
 
 I disagree.  If it came from there originally, then that's where it lives.
 The principle we want here is that retrieving from cache should have
 the same rules from a client PoV as retrieving an original unless a
 sysop explicitly says otherwise.  Breaking that principle shouldn't
 hit the sysop as a standard or default behaviour.
 

I agree with Nick on this, as this creates the potential for confusion
on the user side. It also makes the behaviour inconsistent between already
cached data and data that gets cached for the first time. From a developer
perspective I understand the performance penalties of doing so.


 
 I'm not convinced by that either.  In fact, I dislike the whole "run it in a
 quick handler" principle - it runs a supertanker through the KISS principle,
 and has consequently left us with a cache that never really worked.
 Even if we fix this, it's sure to have a high bugrate for the foreseeable
 future precisely because it violates KISS.
 
 The main purpose of caching is to relieve the pressure on a big, slow backend.
 In real life, most of that bigness and slowness is pretty much always going to
 live in a handler or a backend, so what we save by running quick_handler is
 inherently of secondary importance.  And there are several things we can

In the cases you describe here (and for what I use it for personally) I agree,
but for other cases, e.g. the way Colm uses it (storing cached data on faster
disks than the uncached data), I think running in a handler is a pain.
I think in principle it would be fine to choose via a configuration directive
where to run mod_cache, as Paul's patch suggested. What currently bothers me
about this idea is that it is a nice thing for people with an inside view, but
it is not really transparent to users. So what this configuration option
actually does from the user's perspective must be expressed more clearly by
the configuration itself. Maybe a good name for this directive can be enough.


[..cut..]

 
And maybe, have you considered a <CachedLocation > / <CachedLocationMatch >
container for mod_cache?  This would have the benefit that very long lists
of directives would be ignored/not merged, in favor of a much shorter and
very specific list that benefits the cache by keeping it fast, while giving
the user the option to tweak the behavior of content, once cached.
 
 
 You mean as a tool for sysops to accept/decline serving from cache?
 That could potentially have merit, and would work best in a quick-translation
 hook to bypass any more complex/expensive rules.
 The danger is if it grows some nightmarishly confusing relationship
 to normal Location semantics: the existing Location vs Directory
 is bad enough for non-expert users!
 

I also agree with this. While I understand the performance benefits from the
developer perspective, I fear the confusion from the user's and administrator's
perspective. Having a clear configuration is not only about letting non-expert
users get it to work but also about easing the job of expert administrators who
need to understand what they configured a year or so after they did :-).

Regards

Rüdiger



Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Justin Erenkrantz
--On November 3, 2005 8:44:02 PM +0100 Ruediger Pluem [EMAIL PROTECTED] 
wrote:



I also agree with this. While I understand the performance benefits from
the developer perspective, I fear the confusion from the user and
administrators perspective. Having a clear configuration is not only
about having non-expert
users getting it work but also to ease the job of expert administrators
to understand what they configured a year or so after they did :-).


In my performance analyses that I did when redoing mod_cache last year, a 
substantial part of the time in httpd was spent in all of the hooks prior 
to the handler.  Things like BrowserMatch (which do regex's) are 
ridiculously expensive.


So, moving the cache to a regular handler is not a minor performance 
penalty - it's a major one.  And, probably to the point where there's *no* 
performance increase for even having a cache - unless you are combining it 
with a backend proxy.  -- justin


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Paul Querna
Justin Erenkrantz wrote:
 --On November 3, 2005 8:44:02 PM +0100 Ruediger Pluem
 [EMAIL PROTECTED] wrote:
 
 I also agree with this. While I understand the performance benefits from
 the developer perspective, I fear the confusion from the user and
 administrators perspective. Having a clear configuration is not only
 about having non-expert
 users getting it work but also to ease the job of expert administrators
 to understand what they configured a year or so after they did :-).
 
 In my performance analyses that I did when redoing mod_cache last year,
 a substantial part of the time in httpd was spent in all of the hooks
 prior to the handler.  Things like BrowserMatch (which do regex's) are
 ridiculously expensive.
 
 So, moving the cache to a regular handler is not a minor performance
 penalty - it's a major one.  And, probably to the point where there's
 *no* performance increase for even having a cache - unless you are
 combining it with a backend proxy.  -- justin

Or any other Dynamic source, like CGIs, PHP, etc.  This can still be a
major win, it just depends on your environment.  This is why it should
be configurable.  To get the best out of caching, you need local knowledge.

-Paul


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Joshua Slive

Justin Erenkrantz wrote:

In my performance analyses that I did when redoing mod_cache last year, 
a substantial part of the time in httpd was spent in all of the hooks 
prior to the handler.  Things like BrowserMatch (which do regex's) are 
ridiculously expensive.


Interesting to think, however, about what the purpose of all those 
regexes was (they are no longer in the 2.1 config).  They were meant to 
fix protocol problems.  Given that mod_cache doesn't run those hooks, 
it seems there is no way to work around client protocol problems.  (Just 
sending Vary: User-Agent wouldn't fix the problem, because when the user 
agent matched a cached variant, the protocol adjustments still wouldn't 
be applied.)


Important?  I don't know.  But if it is easy to make mod_cache 
configurable with regards to running all these hooks, I am sure that it 
would find uses.


Joshua.


Re: cache trouble (Re: [vote] 2.1.9 as beta)

2005-11-03 Thread Justin Erenkrantz
On Thu, Nov 03, 2005 at 08:03:56PM -0500, Joshua Slive wrote:
 it seems there is no way to work around client protocol problems.  (Just 
 sending Vary: User-Agent wouldn't fix the problem, because when the user 
 agent matched a cached variant, the protocol adjustments still wouldn't 
 be applied.)

Why wouldn't it?  -- justin


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joe Orton
On Sat, Oct 29, 2005 at 09:09:46PM -0700, Paul Querna wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/
 
 Please test and vote on releasing 2.1.9 as BETA.

+1 for beta, manual testing + httpd-test passes on all-the-Linuxes here.  
Thanks for RMing!

joe


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Sat, Oct 29, 2005 at 09:09:46PM -0700, Paul Querna wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/
 
 Please test and vote on releasing 2.1.9 as BETA.

+1 for beta, but some things that would apply to GA;

Doing a complete fresh install from tarball I got caught out by the
largefile support. Although we no longer need any magic CFLAGS for httpd
itself to handle >2GB files, we do for sendfile to work
- at least on my platform (Linux IA64).

APR isn't picking up sendfile64 for me. Besides getting APR fixed, we
may need to handle this more gracefully within httpd for the cases where
the bundled apr isn't being used. If we don't have a sendfile capable of
sending >2GB files, we shouldn't try to use it for files >2GB.

 As a reminder, if you know of any issues you consider a SHOW STOPPER for
 a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.

I'm tempted to suggest the mod_cache Vs mod_authz_host as a
show-stopper, but since this is going nowhere fast and the only way to
fix it has a veto, the only viable solution may be to remove mod_cache
prior to GA.

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joe Orton
On Wed, Nov 02, 2005 at 10:25:54AM +, Colm MacCarthaigh wrote:
 Doing a complete fresh install from tarball I got caught out by the
 largefile support. Although we no longer need any magic CFLAGS for httpd
 itself to handle >2GB files, we do for sendfile to work
 - at least on my platform (Linux IA64).

What is the problem that you are seeing? 
 
 APR isn't picking up sendfile64 for me. Besides getting APR fixed, we
 may need to handle this more gracefully within httpd for the cases where
 the bundled apr isn't being used. If we don't have a sendfile capable of
 sending >2GB files, we shouldn't try to use it for files >2GB.

This sounds very confused.  On 64-bit platforms there are never any
magic CFLAGS needed, no sendfile64() needed, and should be no problems
handling >2Gb files in the first place.

joe


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 11:21:18AM +, Joe Orton wrote:
 On Wed, Nov 02, 2005 at 10:44:02AM +, Colm MacCarthaigh wrote:
  On Wed, Nov 02, 2005 at 10:33:50AM +, Joe Orton wrote:
   This sounds very confused.  On 64-bit platforms there are never any
   magic CFLAGS needed, no sendfile64() needed, and should be no problems
   handling >2Gb files in the first place.
  
  Doesn't seem to be that way on IA64, my sys/sendfile.h has:
 
 off_t is 64-bit on IA64, the LFS stuff is probably just there for -m32 
 builds or compatibility.  I just tested a 3Gb download with 2.1.9 on my 
 x86_64 box and it worked fine.  (as it should with 2.0.x also; nothing 
 really has changed on this front in 2.1.x for 64-bit builds)

I think you're right, and I think I've found the source of the problem:
apr's network_io/unix/sendrecv.c has:

#if APR_HAS_LARGE_FILES && defined(HAVE_SENDFILE64)
    apr_off_t off = *offset;
#define sendfile sendfile64

#elif APR_HAS_LARGE_FILES && SIZEOF_OFF_T == 4
    /* 64-bit apr_off_t but no sendfile64(): fail if trying to send
     * past the 2Gb limit. */
    off_t off;

    if ((apr_int64_t)*offset + *len > INT_MAX) {
        return EINVAL;
    }

and it's the latter branch that gets triggered on IA64, because
APR_HAS_LARGE_FILES is defined but HAVE_SENDFILE64 isn't. The trouble is
that INT_MAX is:

#  define INT_MAX   2147483647

so, apr_socket_sendfile is returning EINVAL when it shouldn't be.
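
The effect of that guard can be sketched in isolation. This is a stand-alone
illustration, not APR code: the function name is hypothetical, and whether the
guarded branch is compiled at all depends on SIZEOF_OFF_T for the platform.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the guard in the excerpt above: returns EINVAL
 * when offset + len would run past INT_MAX, and 0 otherwise. */
int check_32bit_sendfile_range(int64_t offset, size_t len)
{
    assert(offset >= 0);
    if (offset + (int64_t)len > INT_MAX) {
        return EINVAL;
    }
    return 0;
}
```

With INT_MAX at 2147483647, any request whose end position lies beyond 2GB is
rejected before sendfile is ever called, which is the intended behaviour only
where off_t really is 32 bits wide.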

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joe Orton
On Wed, Nov 02, 2005 at 11:31:26AM +, Colm MacCarthaigh wrote:
 I think you're right, and I think I've found the source of the problem:
 apr's network_io/unix/sendrecv.c has:
 
 #if APR_HAS_LARGE_FILES && defined(HAVE_SENDFILE64)
 apr_off_t off = *offset;
 #define sendfile sendfile64
 
 #elif APR_HAS_LARGE_FILES && SIZEOF_OFF_T == 4
 /* 64-bit apr_off_t but no sendfile64(): fail if trying to send
  * past the 2Gb limit. */
...
 and it's the latter branch that gets triggered on IA64

How have you managed to get SIZEOF_OFF_T == 4 as true on IA64?  Can you 
upload the APR config.log somewhere?

joe


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 11:38:17AM +, Joe Orton wrote:
  and it's the latter branch that gets trigged on IA64
 
 How have you managed to get SIZEOF_OFF_T == 4 as true on IA64?  

Hmmm, no, it's 8. As is size_t. I'm going back to scratch to look at
what's up with gdb.

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 11:49:15AM +, Colm MacCarthaigh wrote:
 On Wed, Nov 02, 2005 at 11:38:17AM +, Joe Orton wrote:
   and it's the latter branch that gets trigged on IA64
  
  How have you managed to get SIZEOF_OFF_T == 4 as true on IA64?  
 
 Hmmm, no, it's 8. As is size_t. I'm going back to scratch at looking at
 what's up with gdb.

O.k., it seems sendfile() is buggy and really doesn't support files >2Gb
on Linux on IA64, at least with my kernel (2.6.12.1).

   sendfile(10, 11, [0], 4686706688) = -1 EINVAL (Invalid argument)

Messing with defines sufficiently that APR uses sendfile64() is no help,
and ends up back at the same system call anyway. 

Anyone else got an IA64 Linux box they can confirm this on?

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joe Orton
On Wed, Nov 02, 2005 at 01:13:41PM +, Colm MacCarthaigh wrote:
 On Wed, Nov 02, 2005 at 11:49:15AM +, Colm MacCarthaigh wrote:
  On Wed, Nov 02, 2005 at 11:38:17AM +, Joe Orton wrote:
and it's the latter branch that gets trigged on IA64
   
   How have you managed to get SIZEOF_OFF_T == 4 as true on IA64?  
  
  Hmmm, no, it's 8. As is size_t. I'm going back to scratch at looking at
  what's up with gdb.
 
 O.k., it seems sendfile() is buggy and really doesn't support files >2Gb
 on Linux on IA64, at least with my kernel (2.6.12.1).
 
sendfile(10, 11, [0], 4686706688) = -1 EINVAL (Invalid argument)
 
 Messing with defines sufficiently that APR uses sendfile64() is no help,
 and ends up back at the same system call anyway. 

 Anyone else got an IA64 Linux box they can confirm this on?

Seems to work OK for me with RHEL4/IA64 (2.6.9-22.0.1.EL) with my normal 
sendfile test app over loopback.

open("6G.sparse", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=6442450945, ...}) = 0
sendfile(1, 3, [0], 6442450945) = 6442450945

joe


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 01:34:56PM +, Joe Orton wrote:
 Seems to work OK for me with RHEL4/IA64 (2.6.9-22.0.1.EL) with my normal 
 sendfile test app over loopback.
 
 open(6G.sparse, O_RDONLY) = 3
 fstat(3, {st_mode=S_IFREG|0664, st_size=6442450945, ...}) = 0
 sendfile(1, 3, [0], 6442450945) = 6442450945

Thanks, I'll try a new kernel/glibc at some point. In the meantime, I think
doing;

Index: core_filters.c
===================================================================
--- core_filters.c  (revision 330237)
+++ core_filters.c  (working copy)
@@ -561,9 +561,16 @@
                     (void)apr_socket_opt_set(s, APR_TCP_NOPUSH, 0);
                 }
                 if (rv != APR_SUCCESS) {
-                    return rv;
+                    /* There are cases in which a buggy sendfile can fail, but
+                     * an ordinary write may succeed. Let the bucket pass through
+                     * to the non-sendfile write. If it's really a problem with
+                     * the socket, we'll find out quickly. */
+                    did_sendfile = 0;
+                }
+                else {
+                    break;
                 }
-                break;
             }
         }
 #endif /* APR_HAS_SENDFILE */
 
is reasonable. 

Basically, if sendfile() returns an error, rather than give up trying to write
the file, move on to using ordinary write/writev, which may or may not work.  If
it's a real problem (i.e. a dead socket or something), we'll find out anyway, and
if it works we get one less annoyance for our users.
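
The fallback idea can be sketched independently of httpd. Everything below is
hypothetical illustration: the callback type, the result struct and the stub
senders are invented for the example and stand in for sendfile() and
write/writev. The sketch also assumes the fast path sent nothing before it
failed, which a real implementation would have to check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport callback: returns 0 on success, nonzero on error. */
typedef int (*send_fn)(const char *data, size_t len);

typedef struct {
    int rv;            /* final status: 0 on success */
    int used_fallback; /* 1 if the plain-write path was tried */
} send_result;

/* Try the fast path first; on any error, retry once with the plain write
 * instead of giving up. A genuinely dead socket fails on the retry too. */
send_result send_with_fallback(send_fn fast_path, send_fn plain_write,
                               const char *data, size_t len)
{
    send_result res = { 0, 0 };
    assert(fast_path != NULL && plain_write != NULL);

    res.rv = fast_path(data, len);
    if (res.rv != 0) {
        res.used_fallback = 1;
        res.rv = plain_write(data, len);
    }
    return res;
}

/* Stub senders used only to exercise the sketch. */
int always_fails(const char *data, size_t len) { (void)data; (void)len; return -1; }
int always_succeeds(const char *data, size_t len) { (void)data; (void)len; return 0; }
```

The trade-off is that a broken sendfile is silently papered over, so the
administrator never learns that the fast path is not being used.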

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joe Orton
On Wed, Nov 02, 2005 at 02:39:12PM +, Colm MacCarthaigh wrote:
 On Wed, Nov 02, 2005 at 01:34:56PM +, Joe Orton wrote:
  Seems to work OK for me with RHEL4/IA64 (2.6.9-22.0.1.EL) with my normal 
  sendfile test app over loopback.
  
  open(6G.sparse, O_RDONLY) = 3
  fstat(3, {st_mode=S_IFREG|0664, st_size=6442450945, ...}) = 0
  sendfile(1, 3, [0], 6442450945) = 6442450945
 
 Thanks, I'll try a new kernel/glibc at some point. In the meantime, I think
 doing;

Really I don't think it's right to change the code at all to try to cope
with the Nth latest "sendfile is broken if..." issue.  Just
EnableSendfile off as should be default.

joe


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 02:50:55PM +, Joe Orton wrote:
 Really I don't think it's right to change the code at all to try to
 cope with the Nth latest "sendfile is broken if..." issue.  Just
 EnableSendfile off as should be default.

Definitely not the preferred option in my case, don't know how many
people would be likewise affected though (I'm guessing a handful). But
fair enough, I'll add it to list of reasons to disable sendfile, and
just run the patch myself.

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread William A. Rowe, Jr.

Colm MacCarthaigh wrote:



As a reminder, if you know of any issues you consider a SHOW STOPPER for
a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.


I'm tempted to suggest the mod_cache Vs mod_authz_host as a
show-stopper, but since this is going nowhere fast and the only way to
fix it has a veto, the only viable solution may be to remove mod_cache
prior to GA.


Well if you see only one way to fix it, yes, I'm guessing it all remains
at an impasse for the next few years.  You've claimed this is a bug, I claim
you propose an enhancement.  But I see merit in the desire to restrict
content based on physical authz topography (not on user based authnz which
is already defined by http caching headers.)  Also I suspect that there are
'other ways', not simply one way.

In any case, there's nothing in STATUS and I was prepared to see where the
discussion last ended, but didn't have time today to start trawling through
the maillist archives.  Try adding an entry in STATUS when an issue needs
to be addressed.

Bill


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Ruediger Pluem


On 11/02/2005 11:25 AM, Colm MacCarthaigh wrote:
 On Sat, Oct 29, 2005 at 09:09:46PM -0700, Paul Querna wrote:

[..cut..]

 
As a reminder, if you know of any issues you consider a SHOW STOPPER for
a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.
 
 
 I'm tempted to suggest the mod_cache Vs mod_authz_host as a
 show-stopper, but since this is going nowhere fast and the only way to

I do not regard this as a showstopper since we only have an admittedly serious
security problem in a *specific* configuration. I think it is enough to add a
big warning to the mod_cache documentation that protecting cached resources
with mod_authz_host does not work as expected. There are many ways to create an
insecure configuration if you do not take care, so this warning should be
enough, even more so as caching seems to me to be an advanced configuration
anyway that will mostly be done by more experienced people.

 fix it has a veto, the only viable solution may be to remove mod_cache

Just for my remembrance: This was the quick_handler vs. handler issue, correct?
Who actually vetoes this fix? As far as I remember the fix made it configurable
where to run the cache handler (quick_handler / handler), right?

 prior to GA.
 

If we remove it before GA no one can use it, and it would be a large step
backward, as

- It makes caching forward proxies impossible (a regression to 1.3 / 2.0.x).
- It is a major drawback for reverse proxy configurations, whose possibilities
  have been greatly improved in 2.1.
- It would be a large step backward compared to 2.0.x, where this problem is
  also present.

If we leave it in we only have a subgroup of users who cannot use it.
What is more important from my point of view is that we return to a discussion
of how to solve this problem and address the technical concerns expressed in
the veto of the fix.


Regards

Rüdiger



Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 01:10:09PM -0600, William A. Rowe, Jr. wrote:
 Well if you see only one way to fix it, yes, 

The only viable way anyway. I've been looking at this for a few months,
since I first reported to [EMAIL PROTECTED] (still waiting on a
response) and have tried to construct the logic on the authz side, but
it would require quite a complex ACL compiler and even then would not
solve the problem of Deny statements being added while content was
already cached.

 I'm guessing it all remains at an impasse for the next few years.
 You've claimed this is a bug, I claim you propose an enhancement.  But
 I see merit in the desire to restrict content based on physical authz
 topography (not on user based authnz which is already defined by http
 caching headers.)  Also I suspect that there are 'other ways', not
 simply one way.

Great!

 In any case, there's nothing in STATUS and I was prepared to see where
 the discussion last ended, but didn't have time today to start
 trawling through the maillist archives.  Try adding an entry in STATUS
 when an issue needs to be addressed.

I'll add one now. 

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Colm MacCarthaigh
On Wed, Nov 02, 2005 at 08:41:18PM +0100, Ruediger Pluem wrote:
 I do not regard this as a showstopper since we only have an admittedly
 serious security problem in a *specific* configuration. I think it is
 enough to add a big warning to the mod_cache documentation that
 protecting cached resources with mod_authz_host does not work as
 expected. There are many ways to create an insecure configuration if
 you do not take care, so this warning should be enough.  Even more as
 caching seems to me some sort of advanced configuration anyway that
 will mostly be done by more experienced people.

I think the text "Deny from all" is a particularly dangerous thing to
have not work as advertised! No matter how well documented :/

 Just for my remembrance: This was the quick_handler vs. handler issue,
 correct?  Who actually vetoes this fix? As far as I remember the fix
 made it configurable where to run the cache handler (quick_handler /
 handler), right?

Yes, basically the map to storage hook needs to be run before mod_cache
makes the decision to serve the content. Coming before the
map_to_storage hook is the real main difference between a quickhandler
and an ordinary handler, so inserting this hook into mod_cache itself
makes little sense. 

Additionally for a pure proxy environment we don't need the overhead of
the map to storage hook, it's only for local content that it matters in
this way. 

  prior to GA.
 
 If we remove it before GA no one can use it and it would be a large
 step backward as

It'd be awful!

 If we leave it in we only have a subgroup of users who cannot use it.
 What is more important from my point of view is that we return to a
 discussion how to solve this problem and solve the technical concerns
 expressed in the veto of the fix.

The patch that's vetoed is at:

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=111597814015667&w=2

And the concerns at:

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=111600137824345&w=2

In an ideal world, I agree with Bill's line of reasoning there (though
that's a slightly different problem in the actual thread): the best way
to solve this would be to have mod_authz_host detect that the rule for
the content being served would always be "Allow from all" - so it's safe
to cache.

But doing that is very impractical, because even if we could traverse
the entire tree of possible allow/deny directives, and then decided it
was cacheable, the admin might then add a Deny. This would silently
take no effect until the entity expired from the cache, which is the
original problem all over again :/

-- 
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: [vote] 2.1.9 as beta

2005-11-02 Thread Joshua Slive


Colm MacCarthaigh wrote:

I think the text Deny from all is a particularly dangerous thing to
have not work as advertised! No matter how well documented :/


Sure, but in truth, apache configuration is really complex and deny 
from all doesn't always really mean what it says.


This configuration:

<Location />
    Allow from all
</Location>
<Directory /usr/local/apache2/htdocs>
    Order Allow,Deny
    Deny from all
</Directory>

allows unlimited access -- which would surprise many users.
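
A rough way to see why, sketched in Python rather than httpd code. The
sketch rests on two assumptions that are not stated in this thread:
Location sections are applied after Directory sections, and the access
module's settings from a later section replace the earlier ones wholesale
instead of being combined. The dictionaries and the `effective` and
`evaluate` helpers are invented for illustration, and only "from all"
rules are modelled.

```python
def effective(sections):
    """Return the access settings of the last section that sets any."""
    result = {}
    for section in sections:      # sections are listed in merge order
        if section:
            result = section      # wholesale replacement, no combining
    return result


def evaluate(settings):
    """Decide access for a client, modelling only 'from all' rules."""
    order = settings.get("order", "deny,allow")   # default Order
    allow = settings.get("allow_all", False)
    deny = settings.get("deny_all", False)
    if order == "allow,deny":
        return allow and not deny   # default deny; a Deny match wins
    return allow or not deny        # default allow; an Allow match wins


directory = {"order": "allow,deny", "deny_all": True}
location = {"allow_all": True}      # no Order line, so the default applies

directory_only = evaluate(effective([directory]))
both_sections = evaluate(effective([directory, location]))
print(directory_only, both_sections)
```

Under those assumptions the Directory block on its own denies everyone,
but once the Location block is applied after it, the request is allowed.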

All that to say, I'm fine with the "document that mod_cache ignores
mod_access" solution to this problem.


Joshua.


Re: [vote] 2.1.9 as beta

2005-11-02 Thread William A. Rowe, Jr.

Colm MacCarthaigh wrote:


I think the text "Deny from all" is a particularly dangerous thing to
have not work as advertised! No matter how well documented :/


The question, though, is where can "Deny from all" be expected to work?

Certainly not in <Directory /foo> - the cached entity no longer lives there.

Perhaps in <Location /foo> - but running the full handlers, dealing with all
the regexes all over again, defeats the purpose of running a fast cache.

Certainly in <VirtualHost www.cachedhost.example.com> ... although authnz
doesn't work correctly there in the first place ;-)

And certainly globally, if I ran a large mass vhost, yet knew full well that
a list of proxies would corrupt my content, I might

  Deny from 10.123.55.0/24

but again, authn/authz doesn't work globally.

We can discuss 'enabling' the map to storage for <Location > and running the
authz stack, but we would have to ensure we bypass the filesystem dir/files
entities.  The deepest relevant level is <Location >.

And maybe, have you considered a <CachedLocation > / <CachedLocationMatch >
container for mod_cache?  This would have the benefit that very long lists
of directives would be ignored/not merged, in favor of a much shorter and
very specific list that benefits the cache by keeping it fast, while giving
the user the option to tweak the behavior of content, once cached.





security@ was Re: [vote] 2.1.9 as beta

2005-11-02 Thread Justin Erenkrantz
--On November 2, 2005 7:46:10 PM + Colm MacCarthaigh [EMAIL PROTECTED] 
wrote:



The only viable way anyway. I've been looking at this for a few months,
since I first reported to [EMAIL PROTECTED] (still waiting on a
response) and have tried to construct the logic on the authz side, but


Um, who do you think sits behind [EMAIL PROTECTED]?  (Hint: committers.)

It's not like we have a dedicated group of people hiding out that only work 
on security problems.  Well.  ;-)  -- justin


Re: [vote] 2.1.9 as beta

2005-11-01 Thread Oden Eriksson
On Sunday 30 October 2005 05.09, Paul Querna wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/

 Please test and vote on releasing 2.1.9 as BETA.

 As a reminder, if you know of any issues you consider a SHOW STOPPER for
 a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.

 Thanks,

Tested on Mandriva Linux 2006.0 x86_64; works well. I ran the perl-framework
from httpd-test (latest from SVN), and it only choked on three PHP tests:
t/php/arg.t, t/php/func5.t and t/php/virtual.t.


-- 
Regards // Oden Eriksson
Mandriva: http://www.mandriva.com
NUX: http://nux.se


Re: [vote] 2.1.9 as beta

2005-11-01 Thread Brad Nicholes
+1 NetWare

Brad

 On 10/29/2005 at 10:09:46 pm, in message [EMAIL PROTECTED],
 [EMAIL PROTECTED] wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/ 
 
 Please test and vote on releasing 2.1.9 as BETA.
 
 As a reminder, if you know of any issues you consider a SHOW STOPPER for
 a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.
 
 Thanks,
 
 Paul


Re: [vote] 2.1.9 as beta

2005-11-01 Thread Justin Erenkrantz
On Sat, Oct 29, 2005 at 09:09:46PM -0700, Paul Querna wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/
 
 Please test and vote on releasing 2.1.9 as BETA.

+1 for beta.

Passes httpd-test on Ubuntu breezy/ppc.  -- justin


Re: [vote] 2.1.9 as beta

2005-11-01 Thread Justin Erenkrantz
On Sun, Oct 30, 2005 at 10:14:29AM -0600, William A. Rowe, Jr. wrote:
 They persist on /trunk/ if anyone wants to revisit them.  In the interim,
 they can simply be blasted on /branches/2.1.x/ - no?

Yes, that is the plan I think we agreed on.  -- justin


Re: [vote] 2.1.9 as beta

2005-10-31 Thread Jim Jagielski

There is a semi-known issue with the balancer code which mistakenly
does a case-insensitive match on worker and URL. I have a patch
that will be applied today. Not a show-stopper, IMO, but
something that will need to be fixed :)
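
The balancer code itself isn't shown in this thread, so the following is
only a sketch of the general pitfall, in Python. The function
`routes_to_worker` and the example paths are hypothetical. It assumes a
backend whose URL paths are case-sensitive, so "/App" and "/app" name
different resources.

```python
def routes_to_worker(request_path, worker_path, ignore_case):
    """Return True if request_path falls under worker_path."""
    if ignore_case:
        request_path = request_path.lower()
        worker_path = worker_path.lower()
    return request_path.startswith(worker_path)


# Two distinct resources on a case-sensitive backend.
buggy = routes_to_worker("/App/report", "/app", ignore_case=True)
fixed = routes_to_worker("/App/report", "/app", ignore_case=False)
print(buggy, fixed)
```

With the case-insensitive comparison, a request for "/App/report" is
wrongly treated as belonging to the worker configured for "/app".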

On Oct 30, 2005, at 12:09 AM, Paul Querna wrote:


2.1.9-Beta is available from:
http://people.apache.org/~pquerna/dev/httpd-2.1.9/

Please test and vote on releasing 2.1.9 as BETA.

As a reminder, if you know of any issues you consider a SHOW STOPPER for
a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.


Thanks,

Paul






Re: [vote] 2.1.9 as beta

2005-10-30 Thread Nick Kew
On Sunday 30 October 2005 04:09, Paul Querna wrote:
 2.1.9-Beta is available from:
 http://people.apache.org/~pquerna/dev/httpd-2.1.9/

 Please test and vote on releasing 2.1.9 as BETA.

compiling 

 As a reminder, if you know of any issues you consider a SHOW STOPPER for
 a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.

Someone still needs to sort out experimental MPMs, and especially purge
perchild, as it tantalises the users so.

AFAICT leader and threadpool are also dead, and event is a potential
candidate for stable, yes/no?

Should we have an official attic for dead MPMs?

-- 
Nick Kew


Re: [vote] 2.1.9 as beta

2005-10-30 Thread William A. Rowe, Jr.

Nick Kew wrote:


Someone still needs to sort out experimental MPMs, and especially purge
perchild, as it tantalises the users so.

AFAICT leader and threadpool are also dead, and event is a potential
candidate for stable, yes/no?


They persist on /trunk/ if anyone wants to revisit them.  In the interim,
they can simply be blasted on /branches/2.1.x/ - no?

Bill


[vote] 2.1.9 as beta

2005-10-29 Thread Paul Querna
2.1.9-Beta is available from:
http://people.apache.org/~pquerna/dev/httpd-2.1.9/

Please test and vote on releasing 2.1.9 as BETA.

As a reminder, if you know of any issues you consider a SHOW STOPPER for
a 2.2.0 stable release, please add them to the branches/2.2.x STATUS file.

Thanks,

Paul