Hello.

This message asks for opinions and suggestions on how to make Mozilla products interpret the «Content-Type» of a Web resource exactly as specified in the HTTP(S) headers.

Content sniffing in browsers is a compromise between standards and interoperability with “poor” Web sites. It creates vulnerabilities and, generally, breaks compatibility with (original) HTTP/1.1. In some cases it hides protocol data from end-user tools such as Ctrl-I (information on page). See http://www.superstructure.info/browser/compromised/toxic-sniffing.html for some less widely known information about it.

I have particular concerns about two scenarios.
The first is “media type (a.k.a. MIME) sniffing”, where the browser overrides the media type/subtype. This is implemented in the toolkit/components/mediasniffer/nsMediaSniffer.cpp component (and possibly others; I don’t know). There is a proposal, https://bugzilla.mozilla.org/show_bug.cgi?id=471020 , to make Firefox’s behaviour compatible with MS Internet Explorer and with https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm , using «X-Content-Type-Options: nosniff» to switch the sniffing off.
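To make the first scenario concrete, here is a minimal sketch of the decision the nosniff proposal implies. It is a simplified model only, not the WHATWG algorithm and not Firefox code; the function name effective_media_type and the lower-cased header dictionary are assumptions made for illustration.

```python
def effective_media_type(headers, sniffed_type):
    """Pick the media type a browser would act on (simplified model).

    headers: dict of response headers with lower-cased names (assumption).
    sniffed_type: what a content sniffer guessed from the body, or None.
    """
    # Strip parameters such as "; charset=..." from the declared type.
    declared = headers.get("content-type", "").split(";")[0].strip().lower()
    nosniff = headers.get("x-content-type-options", "").strip().lower() == "nosniff"

    if declared and nosniff:
        # The server asked for no sniffing: the declared type is final.
        return declared
    # Otherwise the sniffer's guess may override the declared type.
    return sniffed_type or declared or "application/octet-stream"
```

With «X-Content-Type-Options: nosniff» present, a body that merely looks like HTML stays text/plain; without it, the guess wins.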

The second scenario is the less known “UTF sniffing”, which applies only to text media types. The browser respects the type itself, but overrides the «charset=» label with its own guesses. This is implemented in netwerk/base/nsUnicharStreamLoader.cpp ; that implementation is based on HTML5 encoding sniffing, which is not reasonably applicable to text/plain. For text/plain it leads to bugs. Simple test cases are available at http://course.irccity.ru/ya-yu-9-amp.txt (toxic UTF-16 “BOM”) and http://course.irccity.ru/p-guillemet-yi-ya.txt (toxic UTF-8 “BOM”). It poses a less immediate security risk, but can still cause data corruption whenever arbitrary data are allowed into (the beginning of) text/plain documents. The toxic UTF sniffing was observed in Firefox, MSIE, Google Chrome, and Safari, and does not appear to correlate with the «X-Content-Type-Options» header mentioned above.

Possible approaches to the toxic UTF sniffing include:
• Just fix it (this would certainly cause a backlash from people eager to burn anything except UTF-8).
• Something along the lines of the no-sniff flag.
• Add a new Firefox preference (e.g. network.http.charset_quirk_level) controlling the browser’s behaviour.
• Provide patches to the source code, to be used only by those who are interested.
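As an illustration of the preference-based approach, a quirk level could gate BOM sniffing roughly as follows. The three levels and their meanings are invented here for the sake of the sketch; only the preference name network.http.charset_quirk_level comes from the list above, and no such preference exists in Firefox today as far as I know.

```python
from enum import IntEnum

class CharsetQuirkLevel(IntEnum):
    """Hypothetical values for network.http.charset_quirk_level."""
    STRICT = 0             # the charset label always wins
    BOM_IF_UNLABELLED = 1  # a BOM is honoured only when no label was sent
    BOM_ALWAYS = 2         # a BOM always overrides the label (as observed now)

def may_sniff_bom(level: CharsetQuirkLevel, has_charset_label: bool) -> bool:
    """Decide whether a leading BOM may override the encoding."""
    if level == CharsetQuirkLevel.STRICT:
        return False
    if level == CharsetQuirkLevel.BOM_IF_UNLABELLED:
        return not has_charset_label
    return True
```

The middle level is meant as a compromise: unlabelled documents keep working as before, while an explicit «charset=» is never overridden.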

Possible approaches to the relation between the two scenarios include:
• Extend the meaning of «X-Content-Type-Options: nosniff» to ban the toxic UTF sniffing as well.
• Make the interpretation of «X-Content-Type-Options» depend on preferences.
• Invent a new value for X-Content-Type-Options, or a new header altogether, in the hope that other browsers and Web applications will ultimately adopt it.
• Treat the two problems completely separately.
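The first three options differ only in how the header value maps to the set of banned sniffing kinds, which a small sketch can show. The token "nocharsetsniff" is purely hypothetical and invented for this example, as are the function name and the assumption that the header holds a comma-separated token list.

```python
HYPOTHETICAL_CHARSET_TOKEN = "nocharsetsniff"  # invented; not in any spec

def sniffing_bans(header_value: str, nosniff_covers_charset: bool = False) -> set:
    """Map an X-Content-Type-Options value to the kinds of sniffing it bans.

    nosniff_covers_charset models a preference (or a changed default) that
    extends the meaning of "nosniff" to encoding sniffing as well.
    """
    tokens = {t.strip().lower() for t in header_value.split(",") if t.strip()}
    bans = set()
    if "nosniff" in tokens:
        bans.add("media-type")
        if nosniff_covers_charset:
            bans.add("charset")
    if HYPOTHETICAL_CHARSET_TOKEN in tokens:
        bans.add("charset")
    return bans
```

Treating the two problems completely separately corresponds to leaving nosniff_covers_charset off and never defining the extra token.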

Opinions?

Please note that I am not (yet) a browser developer; my main goal is to make a browser I can trust myself.

Regards, Incnis Mrsi

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
