Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-24 Thread Anne van Kesteren
On Mon, 24 Oct 2011 15:47:46 +0900, Larry Masinter wrote: The charset sniffing documentation in the HTML5 document isn't all that complicated, anyway. You have to run the HTML parser for it. It is orders of magnitude more complicated than MIME type sniffing. Sniffing for an encoding alwa

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-24 Thread Tobias Gondrom
ters -- well, that's just a superficial work-around. Larry -Original Message- From: "Martin J. Dürst" [mailto:due...@it.aoyama.ac.jp] Sent: Sunday, October 23, 2011 11:37 PM To: Larry Masinter Cc: Adam Barth; websec@ietf.org Subject: Re: [websec] #22: content-type sniffing sh

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Larry Masinter
.aoyama.ac.jp] Sent: Sunday, October 23, 2011 11:37 PM To: Larry Masinter Cc: Adam Barth; websec@ietf.org Subject: Re: [websec] #22: content-type sniffing should include charset sniffing I agree with Adam and Tobias that we should not pull all of charset sniffing into this document. Many charset detai

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Martin J. Dürst
mbarth.com] Sent: Sunday, October 23, 2011 8:37 PM To: Larry Masinter Cc: Tobias Gondrom; websec@ietf.org Subject: Re: [websec] #22: content-type sniffing should include charset sniffing I mean, that's how the code works, so it must be possible. :) Adam On Sun, Oct 23, 2011 at 8:32 PM,

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
ns must follow this pattern.) > > Larry > > > > > -Original Message- > From: Adam Barth [mailto:i...@adambarth.com] > Sent: Sunday, October 23, 2011 8:37 PM > To: Larry Masinter > Cc: Tobias Gondrom; websec@ietf.org > Subject: Re: [websec] #22: content-type

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Larry Masinter
23, 2011 8:37 PM To: Larry Masinter Cc: Tobias Gondrom; websec@ietf.org Subject: Re: [websec] #22: content-type sniffing should include charset sniffing I mean, that's how the code works, so it must be possible. :) Adam On Sun, Oct 23, 2011 at 8:32 PM, Larry Masinter wrote: > I kno

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
lf Of > Adam Barth > Sent: Sunday, October 23, 2011 8:28 PM > To: Tobias Gondrom > Cc: websec@ietf.org > Subject: Re: [websec] #22: content-type sniffing should include charset > sniffing > > The charset sniffing is also complicated by the fact that sometimes user > agents

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Larry Masinter
Message- From: websec-boun...@ietf.org [mailto:websec-boun...@ietf.org] On Behalf Of Adam Barth Sent: Sunday, October 23, 2011 8:28 PM To: Tobias Gondrom Cc: websec@ietf.org Subject: Re: [websec] #22: content-type sniffing should include charset sniffing The charset sniffing is also complica

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
On Sun, Oct 23, 2011 at 8:29 PM, Larry Masinter wrote: >> First you determine the content-type and then after that you may want to >> determine the charset used within that content-type > > That's wishful thinking that doesn't match what has to happen ... the > mime-sniffing document ALREADY is

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Larry Masinter
> First you determine the content-type and then after that you may want to > determine the charset used within that content-type That's wishful thinking that doesn't match what has to happen ... the mime-sniffing document ALREADY is looking at the charset, by looking for byte-order-mark signatu

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Adam Barth
The charset sniffing is also complicated by the fact that sometimes user agents need to parse some of the HTML to find a <meta> element. In some situations, user agents need to restart the parsing algorithm, which is quite delicate and better to describe in the same document as HTML parsing (at least for

Re: [websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread Tobias Gondrom
I tend not to agree with that. The fact that charset sniffing might happen at the same time as mime-sniffing does not seem like a strong argument to include this in the draft. Furthermore I would rather have these issues separate: First you determine the content-type and then after that you

[websec] #22: content-type sniffing should include charset sniffing

2011-10-23 Thread websec issue tracker
#22: content-type sniffing should include charset sniffing. The HTML5 spec contains some algorithms for sniffing charset, overriding labeled charset, etc. MIME parameters like charset are as much a part of the content-type as the base internet media type, and any sniffing of parameters and oth