On 07/18/2014 11:33 PM, Amos Jeffries wrote:
> On 19/07/2014 2:07 a.m., Eliezer Croitoru wrote:
>> This got my eyes but I am not reading all ietf httpbits mails and I
>> would like to get a reference for this thread please?
> There are two type of removable headers:
> a) headers which exist purely to bypass security
> b) headers which exist due to intermediaries breaking them
>
> The post describing why the (b) group occur is here:
> http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/0132.html

The above email is talking about a "nnCoection: close" header which
appears to be a result of a bug in some 15-year-old software.
Identifying that rare header would be overall harmful -- Squid would
spend more resources on detecting that header's presence than it would
save by removing that header when it is found.

> One of the posts which is making me think we could benefit from doing
> something is:
> http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1220.html
> This lists the existing headers found in the data sets being analysed by
> IETF as representative of HTTP web traffic.
> What I can see in that listing is the following headers (by type above).
>
> (A) group:
>
> x-powered-by / x-aspnet-version / x-aspnetmvc-version / x-pb-mii -
> exists to bypass server security measures applied on Server: header.

Sounds like those headers exist to implement some above-HTTP
functionality deemed useful to those who send and/or receive them. What
will Squid break by removing those valid HTTP headers? Why is breaking
that functionality a universally good thing that justifies making it the
default behavior in an HTTP proxy?

> x-served-by - same as X-powered-by but also crossing over to contain
> X-forwarded-for: and Forwarded: header contents (but without the
> security protections applied for them).
>
> x-host / x-forwarded-host - exist to bypass Browser same-origin
> security measures.
>
> x-li-uuid - tracking cookie created to bypass Cookie header security
> and legislative restrictions.
>
> x-fs-uuid - header for distributing the UUID of the server hard drive
> out to the public network (seriously, what could go wrong with that huh?)
>
> x-radid - seems to be another disk drive tracking ID method.

Same questions apply here. Please correct me if I am wrong, but it
sounds like you are dividing HTTP-compliant agents into two categories:
those that use HTTP the way you want HTTP to be used and all others. The
division appears to be based not on some HTTP MUSTs, but on your view of
which "security" model must be defended.

IMO, Squid should strive to support all HTTP-compliant agents by
default. We should not be the internet police because policing traffic
requires making judgments of who is the "bad" guy, which is outside
software developers' competence. Folks that want to enforce a particular
security model may propose optional features and configuration excerpts
that do so, of course.

There are some gray areas like defense against request smuggling, but
even there extreme care should be taken to avoid harming valid HTTP
traffic. It is certainly not the area of "delete all bad headers in the
parser" solutions.

> (b) group worry me for the reasons given below:
>
> nncoection / cneonction / x-cnection - reason described in the above
> email. I am a little bit worried that in HTTP/1.1 these may have
> actually contained lists of headers which were to be dropped by the
> earlier intermediary. But obscuring the "Connection:" name we are
> potentially transmitting headers like Upgrade: or with private details
> that should be elided.

I do not see why honoring _and_ then dropping what we think is a former
Connection header helps more than it hurts (by default). In fact, that
sounds like a useful smuggling attack vector to me -- "we know Squid
will drop these headers but others will pass them on, so let's use that
for our evil needs".

> ntcoent-length / cteonnt-length - Given the reason behind 16-bit rotate
> on header name any of the mandatory HTTP/1.1->1.0 and connection:close
> addition required to make this safe will alter the checksum. So will
> content adaptation if that was the point.

I do not understand how header changes affect content checksums. Those
checksums do not include headers.

> I am left with assuming that this is done to smuggle messages in a
> pipeline through the receiving server as a single request/reply.

Your assumption seems to contradict what we know for a fact is going on
in many (probably most!) cases of such header name adaptations --
converting standard header names into extension header names to avoid
buffer copies.

> There are also a bunch of other headers which can best be called
> "garbage". Relatively harmless though.
>
> Old HTTP features and mechanisms which are now not supposed to be sent:
>
> pragma:close - dead HTTP/1.0 feature. Not to be emitted by HTTP/1.1
> software.
> p3p - dead standard, removed from service due to privacy violations.
> x-pad - supposedly an HTTPS-only feature for "fixing"

IETF does not have the power to make something "dead" (thankfully!). If
some old software uses an old feature, we should default to supporting
it (all other factors being equal). Again, it is perfectly fine to offer
an "only good modern agents are proxied" feature/configuration in Squid,
but we are discussing

> proxy-connection - dead non-standard. we already drop this one

Dropping hop-by-hop headers (from old or new standards, does not matter)
is a requirement we should follow, of course. Is it a good idea to drop
Proxy-Connection in the parser, without an opportunity to honor it (in
some cases)? I am pretty sure that will break some installations.

> debug headers that are mostly useless (we could help clean this up by
> only enabling our x-cache headers based on a debug config option)
>
> x-cache / x-cache-lookup / x-cache-action / x-cache-hits / x-cache-age
> / x-fb-debug / x-mii-cache-hit / bk-server

I agree that these should be _emitted_ by Squid only if Squid is
configured to do so. We can discuss the right configuration option and
its default setting.
However, I disagree that we should drop them by default in the parser.
Doing so will break installations that rely on what you consider "mostly
useless" headers.

Finally, your RFC is about bandwidth savings. I bet that deleting all of
the "security-related" headers and the vast majority of other headers
you listed will not give you noticeable bandwidth savings. You may
adjust your RFC to focus on security or other aspects, of course, but if
bandwidth savings remain the goal, then the examples of rare
"security-bypass" and "dead" headers are not convincing at all!

Cheers,

Alex.

>> On 07/18/2014 10:32 AM, Amos Jeffries wrote:
>>> Some of the statisticas being brought up in the IETF HTTP/2 discussions
>>> is highlighting certain garbage headers which are unfortunately quite
>>> common.
>>>
>>> I have wondered about creating a registry of known garbage and simply
>>> dropping those headers on arrival in the parser. This would be in
>>> addition to the header registry lookup and masking process we have for
>>> hop-by-hop headers.
>>>
>>> Any other thoughts on this?
>>>
>>> Amos
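
For reference, the two properties of the renamed headers discussed above (nnCoection, ntcoent-length) can be checked with a small sketch. It assumes that the "checksum" mentioned in the thread is the 16-bit Internet checksum of RFC 1071 computed over the bytes on the wire; that reading is an assumption, not something this thread confirms. The helper name internet_checksum is made up for illustration and is not Squid code.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071.

    Hypothetical helper for illustration only.
    """
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"Connection: close\r\n"
renamed = b"nnCoection: close\r\n"          # "Co" and "nn" swapped

# Same length: the name can be overwritten in place, no buffer copy.
same_length = len(original) == len(renamed)

# Same checksum: swapping two-byte groups keeps every byte in the same
# (high or low) half of a 16-bit word, so the sum does not change. This
# holds whether the header starts at an even or an odd offset.
same_sum_even = internet_checksum(original) == internet_checksum(renamed)
same_sum_odd = (internet_checksum(b"X" + original)
                == internet_checksum(b"X" + renamed))

print(same_length, same_sum_even, same_sum_odd)
```

The same reasoning applies to "Content-Length" versus "ntCoent-Length". Any later change to the header block (for example, adding "Connection: close" during an HTTP/1.1 to 1.0 downgrade) changes the bytes and therefore the sum.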