[websec] #17: Use the magic numbers in the media type IANA registry instead of an explicit table

2011-10-23 Thread websec issue tracker
#17: Use the magic numbers in the media type IANA registry instead of an explicit table The Internet Media Type IANA registry contains a field for Magic Number which is intended to represent how the type can be sniffed. The tables in this document and the IANA registry should be aligned,
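
Note on #17: a magic-number lookup of the kind the issue refers to can be sketched as a prefix match against a signature table. This is a minimal sketch only. The table below is a small illustrative sample of well-known file signatures, not the IANA registry and not the table in the draft, and the names MAGIC_NUMBERS and sniff_by_magic are hypothetical.

```python
# Illustrative signature table (an assumption for this sketch):
# NOT the IANA registry contents and NOT the table from the draft.
MAGIC_NUMBERS = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_by_magic(data: bytes):
    """Return the media type whose signature is a prefix of data, or None."""
    for signature, media_type in MAGIC_NUMBERS:
        if data.startswith(signature):
            return media_type
    # No known signature matched; the caller keeps the labeled type.
    return None
```

If the draft's table and the registry were aligned as the issue proposes, only the contents of the table would change, not the lookup.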

[websec] #18: Describe use of file extension in sniffing from file: and ftp: URIs

2011-10-23 Thread websec issue tracker
#18: Describe use of file extension in sniffing from file: and ftp: URIs while file extensions should not be used in sniffing data over http:, this isn't the case for ftp and file system access, I don't believe.
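
Note on #18: the distinction drawn in the issue (extensions may inform the type for file: and ftp:, but not for http:) might look roughly like the sketch below. The scheme set, the extension table, and the function name type_from_extension are all assumptions made for illustration; the draft does not define them.

```python
import posixpath
from urllib.parse import urlsplit

# Schemes where no Content-Type label travels with the data
# (assumption for this sketch, following the issue text).
EXTENSION_SCHEMES = {"file", "ftp"}

# Hypothetical extension table, kept tiny on purpose.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".pdf": "application/pdf",
    ".png": "image/png",
}


def type_from_extension(url: str):
    """Guess a media type from the URL's extension, for file: and ftp: only."""
    parts = urlsplit(url)
    if parts.scheme not in EXTENSION_SCHEMES:
        # For http: and other schemes the extension is deliberately ignored.
        return None
    _, ext = posixpath.splitext(parts.path)
    return EXTENSION_TYPES.get(ext.lower())
```

For example, type_from_extension("ftp://example.org/pub/report.pdf") yields "application/pdf", while the same path under http: yields None.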

[websec] #19: Do not sniff PDF

2011-10-23 Thread websec issue tracker
#19: Do not sniff PDF There should be strong advice not to sniff PDF -- if the data is mislabeled as something else, then sending it to a PDF interpreter is likely just an error. Reporter: masinter@… | Owner:

[websec] #20: Sniffing should be opt in on a case-by-case basis

2011-10-23 Thread websec issue tracker
#20: Sniffing should be opt in on a case-by-case basis The way the document is written as a normative algorithm makes it hard to say this, but: Every implementation should be free to opt out of sniffing based on other information it has (previous experience with the site, information based

[websec] #21: sniffing of text/html shouldn't override polyglot label of application/xhtml+xml

2011-10-23 Thread websec issue tracker
#21: sniffing of text/html shouldn't override polyglot label of application/xhtml+xml (I have to double check that this is true): In general, sniffing is dangerous in polyglot cases where the same content CAN be served with different media types, where the meaning is the same or related.

[websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread websec issue tracker
#22: content-type sniffing should include charset sniffing the HTML5 spec contains some algorithms for sniffing charset, overriding labeled charset, etc. MIME parameters like charset are as much a part of the content-type as the base internet media type, and any sniffing of parameters and

[websec] #24: ensure XML packaging is in scope

2011-10-23 Thread websec issue tracker
#24: ensure XML packaging is in scope http://www.w3.org/TR/widgets/ Widget Packaging and XML Configuration makes normative reference to (some version of) this document, as: [SNIFF] Media Type Sniffing. A. Barth and I. Hickson. IETF (Work in Progress). Make sure it is clear that the use

Re: [websec] Are all the issues filed? (was: Re: Using IETF Tracker for issues on MIME sniffing?)

2011-10-23 Thread Larry Masinter
I'd meant to do more careful write-ups of the issues I put into the tracker, but I've put in the main issues left over from draft-masinter-mime-sniff. The document also contains several sections (on sniffing fonts, for example) which are left as TBD. I suppose each of those is a separate issue

Re: [websec] #19: Do not sniff PDF

2011-10-23 Thread Tobias Gondrom
hat=individual Am not sure I understand this issue: - in which way is it more certain that there is no mislabeled PDF than a mislabeled jpg or mislabeled rtf? - what about scenarios in which there is no content-type (e.g. ftp, filesystem), should in this case sniffing not be done? Kind

Re: [websec] #20: Sniffing should be opt in on a case-by-case basis

2011-10-23 Thread Tobias Gondrom
hat=individual Agree with this one. With one addition: it must be clear that if you opt in to sniffing, then you MUST (SHOULD?) follow the mime-sniffing algorithm. Kind regards, Tobias On 24/10/11 00:48, websec issue tracker wrote: #20: Sniffing should be opt in on a case-by-case basis

[websec] font sniffing - Re: Are all the issues filed? (was: Re: Using IETF Tracker for issues on MIME sniffing?)

2011-10-23 Thread Tobias Gondrom
On 24/10/11 01:04, Larry Masinter wrote: I'd meant to do more careful write-ups of the issues I put into the tracker, but I've put in the main issues left over from draft-masinter-mime-sniff. The document also contains several sections (on sniffing fonts, for example) which are left as TBD. I

Re: [websec] #19: Do not sniff PDF

2011-10-23 Thread Larry Masinter
- in which way is it more certain that there is no mislabeled PDF than a mislabeled jpg or mislabeled rtf? I don't think this is relevant. There is likely mislabeled PDF. But I had specific feedback from implementors of PDF readers that sniffing from other content-type resulted in a worse

Re: [websec] font sniffing - Re: Are all the issues filed? (was: Re: Using IETF Tracker for issues on MIME sniffing?)

2011-10-23 Thread Adam Barth
On Sun, Oct 23, 2011 at 8:17 PM, Tobias Gondrom tobias.gond...@gondrom.org wrote: On 24/10/11 01:04, Larry Masinter wrote: I'd meant to do more careful write-ups of the issues I put into the tracker, but I've put in the main issues left over from draft-masinter-mime-sniff. The document also

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Tobias Gondrom
hat=individual I tend not to agree with that. The fact that charset sniffing might happen at the same time as mime-sniffing does not seem like a strong argument to include this in the draft. Furthermore I would rather have these issues separate: First you determine the content-type and then

Re: [websec] #19: Do not sniff PDF

2011-10-23 Thread Adam Barth
On Sun, Oct 23, 2011 at 8:21 PM, Larry Masinter masin...@adobe.com wrote: - in which way is it more certain that there is no mislabeled PDF than a mislabeled jpg or mislabeled rtf? I don't think this is relevant. There is likely mislabeled PDF. But I had specific feedback from implementors

Re: [websec] #20: Sniffing should be opt in on a case-by-case basis

2011-10-23 Thread Larry Masinter
Agree with this one. With one addition: it must be clear that if you opt in to sniffing, then you MUST (SHOULD?) follow the mime-sniffing algorithm. I don't think that's possible. I think the crux of this issue is that I don't think the mime-sniffing algorithm is currently structured in a

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
The charset sniffing is also complicated by the fact that sometimes user agents need to parse some of the HTML to find a meta element. In some situations, user agents need to restart the parsing algorithm, which is quite delicate and better to describe in the same document as HTML parsing (at

Re: [websec] #20: Sniffing should be opt in on a case-by-case basis

2011-10-23 Thread Adam Barth
On Sun, Oct 23, 2011 at 8:26 PM, Larry Masinter masin...@adobe.com wrote: Agree with this one. With one addition: it must be clear that if you opt in to sniffing, then you MUST (SHOULD?) follow the mime-sniffing algorithm. I don't think that's possible. I think the crux of this issue is

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Larry Masinter
First you determine the content-type and then after that you may want to determine the charset used within that content-type That's wishful thinking that doesn't match what has to happen ... the mime-sniffing document ALREADY is looking at the charset, by looking for byte-order-mark
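
Note on the byte-order-mark point: a BOM check of the kind this message mentions can be sketched as below. This is an illustration under stated assumptions, not the algorithm from the draft or from HTML5. It covers only the UTF-8 and UTF-16 marks, and the name charset_from_bom is hypothetical.

```python
import codecs

# BOM-to-charset pairs checked in order. UTF-32 is intentionally left out,
# so a UTF-32 LE stream would be reported as UTF-16 LE by this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def charset_from_bom(data: bytes):
    """Return the charset implied by a leading byte-order mark, or None."""
    for bom, charset in BOMS:
        if data.startswith(bom):
            return charset
    # No BOM present; the charset has to come from somewhere else.
    return None
```

A result other than None already tells the sniffer that the payload is text in a known encoding, which is the overlap between type sniffing and charset sniffing that the message points at.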

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
I mean, that's how the code works, so it must be possible. :) Adam On Sun, Oct 23, 2011 at 8:32 PM, Larry Masinter masin...@adobe.com wrote: I know it's complicated, but scanning text is necessarily part of determining which application/something+xml you have. I think (but should really