On 13/10/2014, at 11:26 PM, Benoit Chesneau <bchesn...@gmail.com> wrote:

> 
> 
> On Sun, Oct 12, 2014 at 11:38 PM, Robert Collins <robe...@robertcollins.net> 
> wrote:
> On 30 September 2014 11:47, Alan Kennedy <a...@xhaus.com> wrote:
> 
> > [Robert]
> >> So it sounds like it should be the responsibility of a middleware to
> >> renormalize the environment?
> >
> > In order for that to be the case, you have to strictly define what
> > "normalization" means.
> 
> For a given deployment it's well defined. I agree that in general it's not.
> 
> > I believe that it is not possible to fully specify "normalization", and that
> > any attempt to do so is futile.
> >
> > If you want to attempt it for the specific scenarios that your particular
> > application has to deal with, then by all means code your version of
> > "normalization" into your application. Or write some middleware to do it.
> >
> > But trying to make "normalization" a part of a WSGI-style specification is
> > impossible.
> 
> I don't recall proposing that it should be in a WSGI-style spec.
> 
> -Rob
> 
> --
> Robert Collins <rbtcoll...@hp.com>
> Distinguished Technologist
> HP Converged Cloud
> _______________________________________________
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: 
> https://mail.python.org/mailman/options/web-sig/bchesneau%40gmail.com
> 
> 
> This whole issue looks like the problem raised (and not yet solved) recently 
> in Gunicorn when REMOTE_ADDR handling was made stricter and we removed all 
> the X-Forwarded-* header handling:
> 
> https://github.com/benoitc/gunicorn/issues/797
> 
> There is another case to take into consideration: when your server is 
> listening on Unix sockets, you don't have any TCP address to present. For 
> now we answer with an empty field.
> 
> Also, some application frameworks recently removed the middleware handling 
> X-Forwarded-* headers. I wonder why.
> 
> 
> There is an RFC for forwarded headers: http://tools.ietf.org/html/rfc7239 . 
> For me, instead of trying to change the strict behaviour of REMOTE_ADDR, I 
> wonder if we shouldn't rather add a new field to the environ. Thoughts?

My prior thinking on this was that REMOTE_ADDR should be left alone.

If front end proxies support RFC 7239 and pass the Forwarded header through, 
you are all good.

If you are in a situation where a front end proxy doesn't support RFC 7239 but 
uses the prior convention of X-Forwarded-* headers, then one could take the 
older headers, construct the new RFC 7239 Forwarded header from them, and drop 
the old X-Forwarded-* headers.

In other words, converge on the new convention set by RFC-7239 by translating 
the old way of doing things to the new. This way a WSGI application can be 
coded up just to check for the new header and not have to deal with both.

The actual translation from old headers to new could be done by a WSGI 
middleware or an optionally enabled WSGI server feature. Either way it doesn't 
need to be part of the WSGI specification.
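As a rough illustration of such a translation, here is a minimal middleware 
sketch. The class name LegacyForwardedTranslator and the helper names are 
made up for this example. It assumes X-Forwarded-For is a comma separated list 
of addresses, that X-Forwarded-Proto and X-Forwarded-Host describe the first 
(client facing) hop, and that an existing Forwarded header should be left 
alone. All three are policy choices a real deployment would need to revisit, 
and the sketch does nothing about whether the values can be trusted.

```python
import re

# Characters allowed in an unquoted RFC 7239 parameter value (an HTTP token).
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def _quote(value):
    """Return value as a token, or as a quoted-string when it is not one."""
    if _TOKEN.fullmatch(value):
        return value
    return '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')


def _node(addr):
    """Format an address for a for= parameter; IPv6 must be bracketed."""
    addr = addr.strip()
    # Heuristic: more than one colon means a bare IPv6 address.
    if addr.count(":") > 1 and not addr.startswith("["):
        addr = "[%s]" % addr
    return _quote(addr)


class LegacyForwardedTranslator:
    """Hypothetical middleware: rewrite X-Forwarded-* into Forwarded."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Always drop the legacy headers so the application only ever
        # has to look at one convention.
        xff = environ.pop("HTTP_X_FORWARDED_FOR", "")
        proto = environ.pop("HTTP_X_FORWARDED_PROTO", "").strip()
        host = environ.pop("HTTP_X_FORWARDED_HOST", "").strip()

        hops = [h for h in (p.strip() for p in xff.split(",")) if h]
        # Assumption: an existing Forwarded header wins and is not merged.
        if hops and "HTTP_FORWARDED" not in environ:
            elements = ["for=%s" % _node(h) for h in hops]
            # Assumption: proto/host belong to the first (client facing) hop.
            if proto:
                elements[0] += ";proto=%s" % _quote(proto)
            if host:
                elements[0] += ";host=%s" % _quote(host)
            environ["HTTP_FORWARDED"] = ", ".join(elements)
        return self.app(environ, start_response)
```

Wrapping an application is then just application = 
LegacyForwardedTranslator(application), after which the application only needs 
to look at HTTP_FORWARDED.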

As noted by others, the issue though is how much you trust the information 
passed in by the headers, and whether it entirely captures the existence of 
multiple hops.

In the case of REMOTE_ADDR, it is added by the web server based on actual 
socket information, and so there is no way a client can override it.

The X-Forwarded-* and Forwarded headers have the problem that a client can set 
them itself.

With multiple ways now of denoting it, which takes precedence, and which do 
you trust? If your proxies use X-Forwarded-* but an HTTP client sets Forwarded, 
what do you do?

Ultimately, whether you use a WSGI middleware or a WSGI server which provides a 
built-in function for the typical case (optionally enabled), it has to be 
configurable to the point of an administrator being able to say which headers 
are trusted. You may also want to be able to say which proxy IPs you want to 
trust, if practical. This must be something an administrator can do, and not 
be dependent on developers embedding it within an application, which is why a 
built-in mechanism in a WSGI server may be preferred.

Anyway, this way a system administrator can say whether it is expected that a 
proxy only sets X-Forwarded-* and not Forwarded or vice versa and who to trust. 
You likely can't just have a default strategy if you want to be safe.
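To make the configuration idea concrete, here is a small sketch of what such 
an administrator supplied policy could look like. The class TrustedProxyPolicy 
and its parameters are hypothetical, not an existing API of any WSGI server. 
It assumes the single trusted header uses the X-Forwarded-For style comma 
separated list, and it only consults that header at all when the socket peer 
(REMOTE_ADDR) is itself a trusted proxy.

```python
import ipaddress


class TrustedProxyPolicy:
    """Hypothetical policy: one trusted header plus trusted proxy networks."""

    def __init__(self, trusted_header, trusted_proxies):
        # environ key of the one header the administrator trusts,
        # for example "HTTP_X_FORWARDED_FOR".
        self.trusted_header = trusted_header
        self.networks = [ipaddress.ip_network(n) for n in trusted_proxies]

    def is_trusted(self, addr):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            # Empty (e.g. unix socket) or unparseable: never trusted.
            return False
        return any(ip in net for net in self.networks)

    def client_address(self, environ):
        peer = environ.get("REMOTE_ADDR", "")
        # If the socket peer is not one of our proxies, the header could
        # have been written by anyone, so ignore it entirely.
        if not self.is_trusted(peer):
            return peer
        hops = [h.strip()
                for h in environ.get(self.trusted_header, "").split(",")]
        # Walk from the nearest hop outwards; the first address that is
        # not one of our proxies is the best available client address.
        for hop in reversed(hops):
            if not self.is_trusted(hop):
                return hop or peer
        return peer
```

For example, with TrustedProxyPolicy("HTTP_X_FORWARDED_FOR", ["10.0.0.0/8"]), a 
request arriving directly from a public address keeps its REMOTE_ADDR no 
matter what headers it carries, while a request arriving from 10.0.0.5 is 
attributed to the nearest address in the header that is outside 10.0.0.0/8.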

Another issue to consider is header spoofing, which not all WSGI servers 
protect against at the moment.

The spoofing problem arises from the CGI rule (RFC 3875, section 4.1.18) for 
how header names are converted. That is:

   Meta-variables with names beginning with "HTTP_" contain values read
   from the client request header fields, if the protocol used is HTTP.
   The HTTP header field name is converted to upper case, has all
   occurrences of "-" replaced with "_" and has "HTTP_" prepended to
   give the meta-variable name.  The header data can be presented as
   sent by the client, or can be rewritten in ways which do not change
   its semantics.  If multiple header fields with the same field-name
   are received then the server MUST rewrite them as a single value
   having the same semantics.  Similarly, a header field that spans
   multiple lines MUST be merged onto a single line.  The server MUST,
   if necessary, change the representation of the data (for example, the
   character set) to be appropriate for a CGI meta-variable.

So this means that X-Forwarded-For is translated to HTTP_X_FORWARDED_FOR. The 
problem is that if a client itself sends X_Forwarded_For, then it would also 
map to the same name.

By the rules above, if a proxy set one and the client sent the other, the two 
values would be merged into a single value, usually separated by a comma. If 
you are attempting to block certain clients based on this header, then the 
value could be poisoned and cause problems for such a scheme.

If using a WSGI middleware therefore, depending on the final usage, you may 
want to make sure the WSGI server deals with this form of header spoofing as 
well.
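The name collision and the resulting merge can be reproduced in a few lines. 
The functions cgi_name and build_environ are invented for this illustration 
and stand in for whatever a naive server does internally; the addresses are 
documentation examples.

```python
def cgi_name(header_name):
    """Apply the CGI rule: upper case, '-' to '_', prepend 'HTTP_'."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def build_environ(headers):
    """Naive environ construction that merges same-named headers."""
    environ = {}
    for name, value in headers:
        key = cgi_name(name)
        if key in environ:
            # Headers that map to the same key are joined with a comma.
            environ[key] += ", " + value
        else:
            environ[key] = value
    return environ


# The proxy sets the real header; the client smuggles in a look-alike.
request_headers = [
    ("X-Forwarded-For", "198.51.100.7"),  # added by the proxy
    ("X_Forwarded_For", "10.1.2.3"),      # sent by the client
]
environ = build_environ(request_headers)
print(environ["HTTP_X_FORWARDED_FOR"])
```

Both header names end up under HTTP_X_FORWARDED_FOR, so the application sees a 
single value containing the client supplied address alongside the genuine one 
and cannot tell which part came from the proxy.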

FWIW, the latest versions of mod_wsgi will only accept headers, and convert 
them using the above rule, where the names contain only alphanumerics and '-'. 
If any other characters are used, the header is thrown away.
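For servers that do not do this themselves, the same rule is easy to 
approximate before the CGI style conversion takes place. This is only a sketch 
of the rule as described above, not mod_wsgi's or Apache's actual code; the 
function name drop_unsafe_headers is made up.

```python
import re

# Header names made up solely of ASCII alphanumerics and '-'.
_SAFE_NAME = re.compile(r"[A-Za-z0-9-]+")


def drop_unsafe_headers(headers):
    """Discard any header whose name contains other characters.

    In particular this removes names containing '_', which would
    otherwise collide with legitimate headers after conversion.
    """
    return [(name, value) for name, value in headers
            if _SAFE_NAME.fullmatch(name)]
```

Applied to a request carrying both X-Forwarded-For and X_Forwarded_For, only 
the former survives, so nothing is left to merge with it.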

This behaviour is by virtue of Apache 2.4 doing the blocking.

There was however a bug in mod_wsgi which meant that spoofed headers still got 
through in the environ passed to mod_wsgi specific 
access/authentication/authorization hook extensions for Apache. This has been 
fixed in a recent release. At the same time it was decided to apply the 
stricter rules about what is allowed back to the older Apache 2.2 as well, 
since Apache 2.2 doesn't do the blocking that Apache 2.4 does.

Unfortunately, because Linux distros ship out of date mod_wsgi versions, it can 
still be an issue there. I have been pondering turning the issue into a CERT 
advisory just to force them to back port the fixes. :-)

Graham







