What about mod_security? It already has a lot of similar checks, and more.
-----Original Message-----
From: Stefan Fritsch
Sent: Wednesday, November 7, 2012 12:26
Newsgroups: gmane.comp.apache.devel
To: dev@httpd.apache.org
Subject: Rethinking "be liberal in what you accept"
Hi,
considering the current state of web security, the old principle of "be
liberal in what you accept" seems increasingly inadequate for web servers.
It causes lots of issues like response splitting, header injection,
cross-site scripting, etc. The book "The Tangled Web" by Michal Zalewski is a good
read on this topic, the chapter on HTTP is available for free download at
http://nostarch.com/tangledweb .
Also, nowadays performance bottlenecks are usually in other places than
request parsing. A few more cycles spent on additional checks won't make
much difference. Therefore, I think it would make sense to integrate some
sanity checks right into the httpd core. For a start, these would need to
be enabled in the configuration.
Examples of such checks [RFC 2616 sections in brackets]:
Request line:
- Don't interpret all kinds of junk as "HTTP/1.0" (like "HTTP/ab" or
"FOO") [3.1]
- If a method is not registered, bail out early.
This would prevent CGIs from answering requests with strange methods like
"HELO" or "http://foo/bar". This must be configurable or there must be
at least a directive to easily register custom methods. Otherwise, at
least forbid strange characters in the method. [The method is a token,
which should not contain control characters and separators; 2.2, 5.1]
- Forbid control characters in URL
- Forbid fragment parts in the URL (i.e. "#..." which should never be sent
by the browser)
- Forbid special characters in the scheme part of absoluteURI requests,
e.g. "<>"
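To make the request-line checks above concrete, here is a minimal sketch in Python (not httpd's C, and not a proposed patch). The function name, the error strings and the set of registered methods are all made up for illustration. The scheme pattern follows the RFC 2396 scheme syntax, which is stricter than just forbidding "<>".

```python
import re

# RFC 2616 section 2.2: separators may not appear in a token.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')
# Hypothetical registry; the proposal asks for this to be configurable.
REGISTERED_METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT",
                      "DELETE", "TRACE", "CONNECT"}
HTTP_VERSION = re.compile(r"HTTP/[0-9]+\.[0-9]+")    # section 3.1
URI_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")  # RFC 2396 scheme syntax


def is_ctl(c):
    """True for ASCII control characters (0-31 and DEL)."""
    return ord(c) < 32 or ord(c) == 127


def is_token(s):
    """True if s is a non-empty RFC 2616 token."""
    return bool(s) and all(
        ord(c) < 128 and not is_ctl(c) and c not in SEPARATORS for c in s
    )


def check_request_line(line, require_registered=True):
    """Return None if the request line passes, else a reason string."""
    parts = line.split(" ")
    if len(parts) != 3:
        return "malformed request line"
    method, target, version = parts
    if not HTTP_VERSION.fullmatch(version):
        return "bad protocol version"
    if not is_token(method):
        return "method is not a token"
    if require_registered and method not in REGISTERED_METHODS:
        return "unregistered method"
    if any(is_ctl(c) for c in target):
        return "control character in URL"
    if "#" in target:
        return "fragment in URL"
    # Only an absolute URI carries a scheme; abs_path starts with "/".
    if not target.startswith("/") and "://" in target:
        scheme = target.split("://", 1)[0]
        if not URI_SCHEME.fullmatch(scheme):
            return "bad scheme"
    return None
```

With this sketch, "GET / HTTP/ab" and "GET / FOO" are rejected instead of being treated as HTTP/1.0, and "HELO" only passes if require_registered is switched off.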
Request headers:
- In Host header, only allow reasonable characters, i.e. no control
characters, no "<>&". Maybe: only allow ascii letters, digits, and
"-_.:[]"
- Maybe replace the Host header with the request's hostname, if they are
different. In:
GET http://foo/ HTTP/1.1
Host: bar
The "Host: bar" MUST be ignored according to RFC 2616 [5.2]. As many webapps
likely don't do that, we could replace the Host header to avoid any confusion.
- Don't accept requests with multiple Content-Length headers. [4.2]
- Don't accept control characters in header values (in particular single
CRs, which we don't treat specially, but other proxies may). [4.2]
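The request-header checks above could look roughly like the following Python sketch. The function name, the error strings and the (name, value) list representation are assumptions for illustration. Horizontal tab is let through in values here because RFC 2616 permits it as linear whitespace; that is a choice of this sketch, not part of the proposal.

```python
import re
from urllib.parse import urlsplit

# ASCII letters, digits and "-_.:[]", as suggested for the Host header.
HOST_CHARS = re.compile(r"[A-Za-z0-9._:\[\]-]+")


def has_ctl(value):
    """True if value contains a control character other than HT."""
    return any((ord(c) < 32 and c != "\t") or ord(c) == 127 for c in value)


def check_request_headers(target, headers):
    """headers is a list of (name, value) pairs.

    Returns (error, headers): error is None on success, and headers may
    have its Host entry replaced by the host of an absolute request URI.
    """
    for _name, value in headers:
        if has_ctl(value):
            return "control character in header value", headers

    lengths = [v for n, v in headers if n.lower() == "content-length"]
    if len(lengths) > 1:
        return "multiple Content-Length headers", headers

    # For "GET http://foo/ HTTP/1.1" the URI host wins over "Host: bar".
    if not target.startswith("/") and "://" in target:
        try:
            uri_host = urlsplit(target).netloc
        except ValueError:
            return "malformed request URI", headers
        if uri_host:
            headers = [(n, v) for n, v in headers if n.lower() != "host"]
            headers.append(("Host", uri_host))

    for name, value in headers:
        if name.lower() == "host" and not HOST_CHARS.fullmatch(value):
            return "bad character in Host header", headers
    return None, headers
```

For the example request above, check_request_headers("http://foo/", [("Host", "bar")]) hands back a header list containing "Host: foo", so a webapp that only looks at the Host header sees the same host as the server.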
Response headers:
- Maybe error out if an output header value or name contains CR/LF/NUL (or
all control characters?) [4.2]
- Check that some headers appear only once, e.g. Content-Length.
- Potentially check in some headers (e.g. Content-Disposition) that
key=value pairs appear only once (this may go too far or be too expensive).
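A rough Python sketch of the response-header checks above, again only as an illustration. The function name and error strings are invented, the set of single-occurrence headers contains just the one example given, and the Content-Disposition parsing is deliberately naive (it splits on ";" and does not understand quoted strings).

```python
# Headers that may appear only once; only the example from the list above.
SINGLETON_HEADERS = {"content-length"}


def check_response_headers(headers):
    """headers is a list of (name, value) pairs.

    Returns None if all checks pass, else a reason string.
    """
    seen = set()
    for name, value in headers:
        # CR/LF/NUL in a name or value allows response splitting.
        if any(c in "\r\n\0" for c in name + value):
            return "CR/LF/NUL in header"

        key = name.lower()
        if key in SINGLETON_HEADERS:
            if key in seen:
                return "duplicate " + name + " header"
            seen.add(key)

        if key == "content-disposition":
            # Naive parameter split; a real parser must honour quoting.
            params = [
                p.split("=", 1)[0].strip().lower()
                for p in value.split(";")[1:]
                if "=" in p
            ]
            if len(params) != len(set(params)):
                return "duplicate parameter in Content-Disposition"
    return None
```

The Content-Disposition part shows why this check "may go too far": even the naive version needs an extra pass over the value, and a correct one needs a real parameter parser.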
Other:
- Maybe forbid control characters in username + password (after base64
decoding)
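The credentials check could be sketched like this in Python, assuming the value of an "Authorization: Basic ..." header as input. The function name and error strings are hypothetical.

```python
import base64
import binascii


def check_basic_credentials(header_value):
    """Return None if the Basic credentials pass, else a reason string."""
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None  # other auth schemes are not examined here
    try:
        decoded = base64.b64decode(payload.strip(), validate=True)
    except (binascii.Error, ValueError):
        return "invalid base64"
    # Look at the decoded "username:password" bytes.
    if any(b < 32 or b == 127 for b in decoded):
        return "control character in credentials"
    return None
```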
As a related issue, it should be possible to disable HTTP 0.9.
The dividing line to modules like mod_security should be that we only
check things that are forbidden by some standard and that we only look at
the protocol and not the body. Also, I would only allow switching the
checks on and off, with no further configurability. And the checks should be
implemented efficiently, i.e. don't parse things several times to do the
checks, normally don't use regexes, etc.
What do you think?
Cheers,
Stefan