On Sun, Jan 15, 2012 at 12:53:00PM -0800, Adam Barth wrote:
> On Sun, Jan 15, 2012 at 12:41 PM, Willy Tarreau <w...@1wt.eu> wrote:
> > On Sun, Jan 15, 2012 at 11:52:38AM -0800, Adam Barth wrote:
> >> The requirement in the spec is what we intend.  The rule applies only
> >> to that exact octet sequence.
> >
> > But then what are the impacts of not matching the correct content-type ?
>
> I'm not sure I understand your question.  Can you explain a scenario
> in which something happens that causes someone to be sad with the
> current requirements?

The draft presents an algorithm to determine a content type based on
available information, but it ignores some important information, such as
valid header values when they are presented in an unexpected form.

For instance, if I get a file advertised like this:

    Content-type: text/plain; charset=us-ascii

then it will not be interpreted as text/plain but rather based on what is
found in the contents. This can make it impossible to read some documents
purposely sent as plain text (e.g. HTML or PS source code).

It could also have security impacts, such as causing some plugins to be
fed with the mis-identified data. For instance, imagine that I'm browsing
behind a filtering proxy which checks that PDF documents are free of any
exploit. This proxy will most likely trust the advertised content-type and
won't fire the PDF analyser on text/plain documents. But the browser at the
end of the chain ignores the text/plain and decides to launch the plugin.

As you know, in general, having multiple components interpret the same
contents differently is dangerous for interoperability and for security.

Note that it is possible that I missed something, as it's not trivial to
get right the way it's written :-/

Best regards,
Willy

_______________________________________________
websec mailing list
websec@ietf.org
https://www.ietf.org/mailman/listinfo/websec