Henri Sivonen wrote:
> If there is a meta element whose http-equiv attribute has the value "Content-Type" (compared case-insensitively) and whose content attribute has a value that begins with "text/html; charset=", the string in the content attribute following the prefix "text/html; charset=" is taken, white space is trimmed from both sides, and the result is considered the tentative encoding name.

This will need to handle common mistakes such as the following:

<meta ... content="application/xhtml+xml;charset=X">
<meta ... content="foo/bar;charset=X">
<meta ... content="foo/bar;charset='X'">
<meta ... content="charset=X">
<meta ... charset="X">

I'm not sure which browsers support each one; they'll all need to be tested.
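A tolerant extraction along these lines could be sketched as follows. This is purely illustrative (it is not any particular browser's algorithm, and the function name is my own); it handles the mistake patterns listed above, except the last one, where the charset appears in its own attribute rather than inside content.

```python
import re

def extract_charset(content):
    """Tolerantly extract a charset name from a meta content attribute value.

    Handles common author mistakes: arbitrary (or missing) MIME types
    before the charset parameter, missing spaces after the semicolon,
    and single- or double-quoted charset values.

    Illustrative sketch only, not a browser's actual parsing algorithm.
    """
    match = re.search(
        r"charset\s*=\s*['\"]?\s*([^'\";\s]+)",  # charset=, optional quotes/space
        content,
        re.IGNORECASE,
    )
    return match.group(1) if match else None

# Examples corresponding to the mistakes above:
#   extract_charset("text/html; charset=UTF-8")      -> "UTF-8"
#   extract_charset("application/xhtml+xml;charset=X") -> "X"
#   extract_charset("foo/bar;charset='X'")           -> "X"
#   extract_charset("charset=X")                     -> "X"
```

The `<meta ... charset="X">` case would have to be handled at the attribute level by the parser, before content sniffing of this kind is attempted.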

> Authors are advised not to use the UTF-32 encoding or legacy encodings. (Note: I think UTF-32 on the Web is harmful and utterly pointless,

I agree about it being pointless, but why is it considered harmful?

> I'd like to have some text in the spec that justifies whining about legacy encodings.

What are your reasons for whining about legacy encodings and what would you like the spec to say?

> Also, the spec should probably give guidance on what encodings need to be supported. That set should include at least UTF-8, US-ASCII, ISO-8859-1 and Windows-1252.

And probably UTF-16 as well.

--
Lachlan Hunt
http://lachy.id.au/
