Content sniffing: seeking reliable protection of a text HTTP resource

Incnis Mrsi Thu, 01 Oct 2015 14:41:22 -0700

Hello.

This message asks for opinions and suggestions on how to make Mozilla products understand the «Content-Type» of a Web resource exactly as specified in HTTP(S) headers.

Content sniffing in browsers is a compromise between standards and interoperability with “poor” Web sites. It creates vulnerabilities and, generally, breaks compatibility with (original) HTTP/1.1. In some cases it conceals protocol data from end-user tools such as Ctrl-I (information on page). See http://www.superstructure.info/browser/compromised/toxic-sniffing.html for some generally less known information about it.


I have particular concerns about two scenarios.

The first is “media type (a.k.a. MIME) sniffing”, when the browser overrides the media type/subtype. This is implemented in the toolkit/components/mediasniffer/nsMediaSniffer.cpp component (and possibly others, I don’t know). There is a proposal, https://bugzilla.mozilla.org/show_bug.cgi?id=471020 , to make the behaviour of Firefox compatible with MS Internet Explorer and https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm , using «X-Content-Type-Options: nosniff» to switch the sniffing off.
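To make the intended effect of the flag concrete, here is a toy model in Python. It is only a sketch of the behaviour I would like to see, not of what nsMediaSniffer.cpp or any other browser code actually does; the function name effective_media_type and the sniffed_type argument are made up for illustration.

```python
from email.message import Message


def effective_media_type(headers, sniffed_type):
    """Pick the media type a client would act on (toy model).

    headers      -- dict of HTTP response headers
    sniffed_type -- hypothetical result of content sniffing, or None
    """
    msg = Message()
    for name, value in headers.items():
        msg[name] = value

    # The declared type/subtype, lowercased and without parameters.
    declared = msg.get_content_type()

    # With the flag set, the declared type is taken at face value.
    options = (msg.get("X-Content-Type-Options") or "").strip().lower()
    if options == "nosniff":
        return declared

    # Otherwise the sniffer's guess, if there is one, wins.
    return sniffed_type or declared


headers = {
    "Content-Type": "text/plain; charset=windows-1251",
    "X-Content-Type-Options": "nosniff",
}
print(effective_media_type(headers, "text/html"))
```

Without the X-Content-Type-Options header, the same call would return the sniffed type instead.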

The second scenario is the less known “UTF sniffing”, applicable only to text media types. The browser respects the type proper, but overrides the «charset=» label with its own guesses. This is implemented in netwerk/base/nsUnicharStreamLoader.cpp ; that implementation is based on HTML5 encoding sniffing, which isn’t (reasonably) applicable to text/plain. In the case of text/plain it leads to bugs. Simple test cases are available at http://course.irccity.ru/ya-yu-9-amp.txt (toxic UTF-16 “BOM”) and http://course.irccity.ru/p-guillemet-yi-ya.txt (toxic UTF-8 “BOM”). It poses a less immediate security risk, but can still cause data corruption whenever arbitrary data are allowed into (the beginning of) text/plain documents. The toxic UTF sniffing was observed in Firefox, MSIE, Google Chrome, and Safari, and does not seem to correlate with the «X-Content-Type-Options» mentioned above.

Possible approaches to the toxic UTF sniffing include:
• Just fix it (certainly would cause backlash from people eager to burn anything except UTF-8).
• Something along the lines of the no-sniff flag.
• Make a new Firefox preferences value (e.g. network.http.charset_quirk_level) controlling the browser’s behaviour.
• Make patches for the source code to be used only by those who are interested.

Possible approaches to the relation between the two scenarios include:
• Extend the meaning of «X-Content-Type-Options: nosniff» to banning the toxic UTF sniffing as well.
• Make the interpretation of «X-Content-Type-Options» depend on preferences.
• Invent a new value for X-Content-Type-Options, or a new header altogether, in the hope that other browsers and Web applications will ultimately adopt it.
• Treat the two problems completely separately.

Opinions?

Please note, I’m not (yet) a browser developer, and my main agenda is making a browser I could trust myself.


Regards, Incnis Mrsi

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
