Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Boris Zbarsky

On 11/29/12 2:32 AM, Boris Zbarsky wrote:

On 11/29/12 2:07 AM, Gordon P. Hemsley wrote:

I imagine this ties in, too, to the issues with sniffing CSS files
that have been raised elsewhere:

https://bugzilla.mozilla.org/show_bug.cgi?id=560388
https://bugzilla.mozilla.org/show_bug.cgi?id=562377


Neither one of those has anything to do with application/octet-stream as
far as I can tell.  Those cover cases in which data is sent with either
no Content-Type header or with such a header which can't even be parsed
as major/minor.  Neither of which is true if the data says
application/octet-stream.


Oh, and the other important bit is that there is no sniffing CSS 
involved.  If the load is for a <link rel=stylesheet> and the server 
doesn't send a content type or sends one that can't be parsed, Gecko 
just treats the data as CSS.  That's not the same thing as sniffing the 
data.  ;)


-Boris
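
The distinction drawn above between default handling and sniffing can be illustrated with a minimal sketch. This is not Gecko's code; the function name `stylesheet_type` and the token pattern are assumptions made for illustration. The point is that the response body is never an input: the decision depends only on whether a usable Content-Type header is present.

```python
import re
from typing import Optional

# Hypothetical pattern for an HTTP token; used to recognise "major/minor".
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_MAJOR_MINOR = re.compile(r"^\s*(" + _TOKEN + r")/(" + _TOKEN + r")\s*(;.*)?$")


def stylesheet_type(content_type: Optional[str]) -> str:
    """Return the type a stylesheet load is handled as (illustrative only).

    The body bytes are deliberately not a parameter: this is default
    handling based on context, not content sniffing.
    """
    if content_type is None:
        # No Content-Type header at all: fall back to CSS.
        return "text/css"
    match = _MAJOR_MINOR.match(content_type)
    if match is None:
        # Header present but not parseable as major/minor: fall back to CSS.
        return "text/css"
    # A parseable type, including application/octet-stream, is kept as-is.
    return f"{match.group(1)}/{match.group(2)}".lower()
```

Under these assumptions, a response labeled application/octet-stream is not covered by the fallback, because that label parses as major/minor.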


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Gordon P. Hemsley
On Thu, Nov 29, 2012 at 2:32 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 11/29/12 2:07 AM, Gordon P. Hemsley wrote:

 So perhaps a more useful question would be what to do in situations
 like that—should mimesniff treat application/octet-stream as a type
 supported by the browser for the purposes of sniffing images, audio
 or video, fonts, or other media types?


 The way it works right now is that
 http://www.whatwg.org/specs/web-apps/current-work/#mime-types says:

   The MIME type application/octet-stream with no parameters is never
   a type that the user agent knows it cannot render. User agents must
   treat that type as equivalent to the lack of any explicit
   Content-Type metadata when it is used to label a potential media
   resource.

 So for the purpose of sniffing media loads specifically, that type is
 treated just like no type at all.

 But first you have to know it's a media load.

Oh, this is probably the location where the HTML spec doesn't
currently, but eventually should, reference the rules for sniffing
audio and video specifically in mimesniff. (Is this where Opera
implements such rules?)

Is it just me (and my late-night reading), or is that section
contradictory on how to treat application/octet-stream?

At one point it says, "The MIME type application/octet-stream with
no parameters is never a type that the user agent knows it cannot
render. User agents must treat that type as equivalent to the lack of
any explicit Content-Type metadata when it is used to label a
potential media resource."

But later it says, "The canPlayType(type) method must return the empty
string if type is a type that the user agent knows it cannot render or
is the type application/octet-stream;"

This seems to me to be unclear as to when sniffing of the audio/video
resource occurs, and what it is used for.

 I imagine this ties in, too, to the issues with sniffing CSS files
 that have been raised elsewhere:

 https://bugzilla.mozilla.org/show_bug.cgi?id=560388
 https://bugzilla.mozilla.org/show_bug.cgi?id=562377

 Neither one of those has anything to do with application/octet-stream as far
 as I can tell.  Those cover cases in which data is sent with either no
 Content-Type header or with such a header which can't even be parsed as
 major/minor.  Neither of which is true if the data says
 application/octet-stream.

I was grouping them together because they both rely on context clues
for modifying the sniffing (fallback) behavior, but we can discuss
them separately if that's easier.

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Boris Zbarsky

On 11/29/12 2:53 AM, Gordon P. Hemsley wrote:

At one point it says, "The MIME type application/octet-stream with
no parameters is never a type that the user agent knows it cannot
render. User agents must treat that type as equivalent to the lack of
any explicit Content-Type metadata when it is used to label a
potential media resource."

But later it says, "The canPlayType(type) method must return the empty
string if type is a type that the user agent knows it cannot render or
is the type application/octet-stream;"


What's the contradiction?  We have set S = { types the user agent knows 
it cannot render }.  We have set T = S union { application/octet-stream }


What the above statements tell us so far is:

1)  T != S
2)  canPlayType(type) must return empty string for all types in T.

But later on in the resource selection algorithm there are certain 
actions taken for elements of S only.
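
The set argument can be written out as a small sketch. The contents of S below are made-up examples, and `can_play_type` is a simplified stand-in for the real method (which can also return "probably"); only the relationship between S and T is taken from the message above.

```python
# S: types the user agent knows it cannot render (example contents, assumed).
CANNOT_RENDER = frozenset({"text/css", "text/plain"})

# T = S union { application/octet-stream }
EMPTY_STRING_FOR = CANNOT_RENDER | {"application/octet-stream"}


def can_play_type(mime_type: str) -> str:
    """Simplified stand-in: depends only on the string passed in."""
    if mime_type.lower() in EMPTY_STRING_FOR:
        return ""
    return "maybe"


def known_unrenderable(mime_type: str) -> bool:
    """Membership in S, which later steps of resource selection rely on."""
    return mime_type.lower() in CANNOT_RENDER
```

With these definitions, application/octet-stream gets the empty string from `can_play_type` yet is not in S, so any step that acts only on elements of S does not apply to it.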



This seems to me to be unclear as to when sniffing of the audio/video
resource occurs, and what it is used for.


It's used for actually showing the video even if it's sent as 
application/octet-stream.



I was grouping them together because they both rely on context clues
for modifying the sniffing (fallback) behavior


So first of all, sniffing and default handling are not the same 
thing at all.


But yes, context matters for determining default handling and also for 
determining sniffing.


-Boris


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Gordon P. Hemsley
On Thu, Nov 29, 2012 at 3:02 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 11/29/12 2:53 AM, Gordon P. Hemsley wrote:

 At one point it says, "The MIME type application/octet-stream with
 no parameters is never a type that the user agent knows it cannot
 render. User agents must treat that type as equivalent to the lack of
 any explicit Content-Type metadata when it is used to label a
 potential media resource."

 But later it says, "The canPlayType(type) method must return the empty
 string if type is a type that the user agent knows it cannot render or
 is the type application/octet-stream;"


 What's the contradiction?  We have set S = { types the user agent knows it
 cannot render }.  We have set T = S union { application/octet-stream }

 What the above statements tell us so far is:

 1)  T != S
 2)  canPlayType(type) must return empty string for all types in T.

 But later on in the resource selection algorithm there are certain actions
 taken for elements of S only.


 This seems to me to be unclear as to when sniffing of the audio/video
 resource occurs, and what it is used for.


 It's used for actually showing the video even if it's sent as
 application/octet-stream.

The apparent contradiction occurs when, e.g., an Opus file is tagged
as application/octet-stream.

If I understand correctly, a UA would return "" when canPlayType() is
called against such a file—but then the file would actually play
because it is later sniffed as application/ogg.

Am I missing something?

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Boris Zbarsky

On 11/29/12 12:45 PM, Gordon P. Hemsley wrote:

The apparent contradiction occurs when, e.g., an Opus file is tagged
as application/octet-stream.

If I understand correctly, a UA would return "" when canPlayType() is
called against such a file


canPlayType is not called against a file.  It's called with a single 
argument which is a string MIME type.  If you pass 
application/octet-stream, it will return "".  Its behavior does not 
depend on any state of the element it's called on (like what it's 
actually pointing to, etc); only on the string passed in.



but then the file would actually play
because it is later sniffed as application/ogg.

Am I missing something?


I think you're misunderstanding what canPlayType does?

-Boris



Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Gordon P. Hemsley
On Thu, Nov 29, 2012 at 12:57 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 canPlayType is not called against a file.  It's called with a single
 argument which is a string MIME type.  If you pass
 application/octet-stream, it will return "".  Its behavior does not depend
 on any state of the element it's called on (like what it's actually pointing
 to, etc); only on the string passed in.

Oh, I see. My mistake. (One should never attempt to understand
something after 2 AM.)

So... are there any additional places where application/octet-stream
should be treated as if the media type was undefined? Or is this
conversation moot now?

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Boris Zbarsky

On 11/29/12 1:11 PM, Gordon P. Hemsley wrote:

So... are there any additional places where application/octet-stream
should be treated as if the media type was undefined? Or is this
conversation moot now?


To my knowledge, the only places in the web platform that special-case 
application/octet-stream like this are media and <object>... And for 
<object> I believe it falls back to @type, not to data sniffing, but 
it's been a while.


-Boris


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Ian Hickson
On Thu, 29 Nov 2012, Gordon P. Hemsley wrote:
 
 The apparent contradiction occurs when, e.g., an Opus file is tagged as 
 application/octet-stream.
 
 If I understand correctly, a UA would return "" when canPlayType() is 
 called against such a file—but then the file would actually play 
 because it is later sniffed as application/ogg.

canPlayType() isn't called against files, it's called against MIME type 
strings.

The type application/octet-stream isn't a video or audio file type, so 
we know that the browsers can't play files actually of that type, any more 
than the browsers can play videos of type text/css.

But when a file is labeled with that type, we know it's probably 
mislabeled, so we try to do something more useful.
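
As a rough sketch of "something more useful" for the Opus example in this thread: when the label is missing or is application/octet-stream, look at the bytes instead. The function name and the single Ogg check are assumptions for illustration; real user agents check many more signatures, and the label here is assumed to be the bare type with no parameters.

```python
from typing import Optional

# Ogg pages begin with the capture pattern "OggS".
OGG_MAGIC = b"OggS"


def media_type_for_playback(label: Optional[str], body: bytes) -> Optional[str]:
    """Pick a type for a media load (illustrative sketch only)."""
    if label is not None and label.lower() != "application/octet-stream":
        # An explicit, specific label is taken at face value in this sketch.
        return label.lower()
    # Missing or generic label: treat as unlabeled and inspect the data.
    if body.startswith(OGG_MAGIC):
        return "application/ogg"
    # Nothing recognised; the caller would have to give up or try elsewhere.
    return None
```

This is separate from canPlayType(), which never sees the bytes at all.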


On another note, the spec's current behavior with media elements and 
sniffing is actually very much up in the air, since the last time I worked 
on this I could not get browser vendors to agree on what to implement. 
Search for the note starting "This specification does not currently say 
whether or how to check the MIME types of the media resources."

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Adam Barth
On Wed, Nov 28, 2012 at 10:30 PM, Gordon P. Hemsley gphems...@gmail.com wrote:
 Based on my reading of the source code, it seems that Gecko treats a
 resource served as 'application/octet-stream' as an unknown type which
 is sniffed as if no Content-Type was specified.

 Are there security implications with doing this?

Yes, there are very large security consequences.  I'm sorry that I
don't have time to respond to all of these threads in detail, but I'm
worried that you don't understand the consequences of the changes
you're proposing to this specification.

I'm not sure how to help you succeed here, but tweaking things in the
spec without a compelling reason for doing so is not likely to lead to
a useful specification.  I spent a great deal of time and effort
studying the behaviors of many user agents and of a massive amount of
content on the web.  I'm certainly willing to believe that the spec
can be improved, but if you don't understand these sorts of basic
things about content sniffing, I worry that changes that you make to
the spec won't be improvements.

Adam


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-29 Thread Gordon P. Hemsley
On Thu, Nov 29, 2012 at 2:30 PM, Adam Barth w...@adambarth.com wrote:
 On Wed, Nov 28, 2012 at 10:30 PM, Gordon P. Hemsley gphems...@gmail.com 
 wrote:
 Based on my reading of the source code, it seems that Gecko treats a
 resource served as 'application/octet-stream' as an unknown type which
 is sniffed as if no Content-Type was specified.

 Are there security implications with doing this?

 Yes, there are very large security consequences.  I'm sorry that I
 don't have time to respond to all of these threads in detail, but I'm
 worried that you don't understand the consequences of the changes
 you're proposing to this specification.

 I'm not sure how to help you succeed here, but tweaking things in the
 spec without a compelling reason for doing so is not likely to lead to
 a useful specification.  I spent a great deal of time and effort
 studying the behaviors of many user agents and of a massive amount of
 content on the web.  I'm certainly willing to believe that the spec
 can be improved, but if you don't understand these sorts of basic
 things about content sniffing, I worry that changes that you make to
 the spec won't be improvements.

 Adam

I and others have already made clear that I was misreading the Mozilla
source code.

I'm aware of the security implications of interpreting a resource as
something other than what the Content-Type header says. The whole
reason I sent the original e-mail was because I thought Mozilla was
sniffing application/octet-stream in a way that it shouldn't, and I
wanted to clarify whether there was something I was missing.

I think you need to tone down your worry about my changes to the spec.
If I didn't have concern for the security implications for a change, I
wouldn't be sending an e-mail to the list about them, would I?

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-28 Thread Michal Zalewski
There are substantial negative security consequences to sniffing
content on MIME types that are commonly used as default fallback
values by web servers or web application developers. This includes
text/plain and application/octet-stream.

/mz
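
A minimal sketch of the hazard described above, assuming a site that serves user uploads with a fallback type. The sniffer here is deliberately naive and is not any browser's algorithm; all names are hypothetical.

```python
# Types commonly emitted as defaults by servers and applications.
FALLBACK_TYPES = {"text/plain", "application/octet-stream"}


def naive_sniff(body: bytes) -> str:
    """Deliberately naive sniffer: guesses HTML from the first bytes."""
    head = body[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html", b"<script")):
        return "text/html"
    return "application/octet-stream"


def type_for_document_load(label: str, body: bytes, sniff_fallback_types: bool) -> str:
    """Type a navigation would be handled as under the given policy."""
    if sniff_fallback_types and label in FALLBACK_TYPES:
        # The server's label is overridden by a guess from the content.
        return naive_sniff(body)
    return label


# An attacker-controlled upload that the server labels with its default type.
upload = b"<script>alert(1)</script>"
```

With sniffing enabled for fallback types, the upload is handled as text/html in the hosting site's origin; with it disabled, the label stands and the script never runs.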


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-28 Thread Boris Zbarsky

On 11/29/12 1:30 AM, Gordon P. Hemsley wrote:

Based on my reading of the source code, it seems that Gecko treats a
resource served as 'application/octet-stream' as an unknown type which
is sniffed as if no Content-Type was specified.


Only for media (video and audio) loads.  Note that the HTML spec 
requires this behavior for those.



Are there security implications with doing this?


In general, yes.  Doing this for document loads would be a security 
nightmare, for example.


-Boris


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-28 Thread Gordon P. Hemsley
On Thu, Nov 29, 2012 at 1:30 AM, Gordon P. Hemsley gphems...@gmail.com wrote:
 Based on my reading of the source code, it seems that Gecko treats a
 resource served as 'application/octet-stream' as an unknown type which
 is sniffed as if no Content-Type was specified.

Oh, wait, I forgot what I was reading—Gecko does this specifically in
the context of sniffing for an audio or video resource. So, if a
resource tagged as 'application/octet-stream' is included in <audio>
or <video>, for example, it will be treated as unknown for the
purposes of identifying its true nature. This never follows a path of
scriptable privilege escalation, AFAICT.

So perhaps a more useful question would be what to do in situations
like that—should mimesniff treat application/octet-stream as a type
supported by the browser for the purposes of sniffing images, audio
or video, fonts, or other media types?

I imagine this ties in, too, to the issues with sniffing CSS files
that have been raised elsewhere:

https://bugzilla.mozilla.org/show_bug.cgi?id=560388
https://bugzilla.mozilla.org/show_bug.cgi?id=562377
https://bugzilla.mozilla.org/show_bug.cgi?id=808593

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing

2012-11-28 Thread Boris Zbarsky

On 11/29/12 2:07 AM, Gordon P. Hemsley wrote:

So perhaps a more useful question would be what to do in situations
like that—should mimesniff treat application/octet-stream as a type
supported by the browser for the purposes of sniffing images, audio
or video, fonts, or other media types?


The way it works right now is that 
http://www.whatwg.org/specs/web-apps/current-work/#mime-types says:


  The MIME type application/octet-stream with no parameters is never
  a type that the user agent knows it cannot render. User agents must
  treat that type as equivalent to the lack of any explicit
  Content-Type metadata when it is used to label a potential media
  resource.

So for the purpose of sniffing media loads specifically, that type is 
treated just like no type at all.


But first you have to know it's a media load.
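
The dependence on context can be sketched as follows. The function name and the boolean flag are assumptions for illustration; the rule itself (application/octet-stream with no parameters counts as no label, but only for a potential media resource) is the one quoted from the HTML spec above.

```python
from typing import Optional


def label_for_sniffing(label: Optional[str], is_media_load: bool) -> Optional[str]:
    """Return the label the sniffer should see, or None for "no label"."""
    if (
        is_media_load
        and label is not None
        and label.strip().lower() == "application/octet-stream"
    ):
        # Media load with the bare generic type: same as no Content-Type.
        return None
    # Any other context, or a label carrying parameters, is left alone.
    return label
```

The caller has to supply `is_media_load`; nothing in the response itself says whether it is being fetched for a media element or for a document.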


I imagine this ties in, too, to the issues with sniffing CSS files
that have been raised elsewhere:

https://bugzilla.mozilla.org/show_bug.cgi?id=560388
https://bugzilla.mozilla.org/show_bug.cgi?id=562377


Neither one of those has anything to do with application/octet-stream as 
far as I can tell.  Those cover cases in which data is sent with either 
no Content-Type header or with such a header which can't even be parsed 
as major/minor.  Neither of which is true if the data says 
application/octet-stream.


-Boris