2011-12-01 1:28, Faruk Ates wrote:

> My understanding is that all browsers* default to Western Latin (ISO-8859-1)
> encoding by default (for Western-world downloads/OSes) due to legacy content on the web.

Browsers default to various encodings, often windows-1252 (rather than ISO-8859-1). They may also investigate the actual data and make a guess based on it.
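The difference matters: the bytes 0x80–0x9F are C1 control characters in ISO-8859-1 but printable characters (smart quotes, dashes, the euro sign) in windows-1252. A small Python sketch to illustrate:

```python
# Byte 0x93/0x94 are curly double quotes in windows-1252,
# but unprintable C1 control characters in ISO-8859-1.
data = b"\x93quoted\x94"

as_cp1252 = data.decode("windows-1252")
as_latin1 = data.decode("iso-8859-1")

print(repr(as_cp1252))  # '“quoted”'
print(repr(as_latin1))  # '\x93quoted\x94'  (control characters)
```

This is why browsers treat a declared ISO-8859-1 as windows-1252 in practice: the latter renders legacy content sensibly, the former does not.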

> I'm wondering if it might not be good to start encouraging defaulting to UTF-8,

It would not. There’s no reason to recommend any particular defaulting, especially not something that deviates from past practices.

It might be argued that browsers should do better error detection and reporting, e.g. informing the user when a document’s encoding has not been declared at all and cannot be inferred fairly reliably (say, from a BOM). But I’m afraid the general feeling is that browsers should avoid warning users, as that tends to contradict authors’ purposes – and, in fact, things that are serious problems in principle mostly aren’t that serious in practice.
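Inferring an encoding from a BOM is about the only reliable case. A minimal sketch (function name and shape are mine, not any browser’s actual code):

```python
# Sketch: infer an encoding from a byte-order mark, if one is present.
def sniff_bom(data: bytes):
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return None  # no BOM: the encoding must be declared or guessed

print(sniff_bom("\ufeffhello".encode("utf-8")))  # utf-8-sig
print(sniff_bom(b"hello"))                       # None
```

Without a BOM or a declaration, anything else is heuristics over the byte statistics, which is exactly where the guessing (and misguessing) happens.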

> We like to think that “every web developer is surely building things in UTF-8
> nowadays”, but this is far from true.

There are plenty of pages declared as UTF-8 but containing ASCII only, as well as pages mislabeled as UTF-8 but actually containing e.g. ISO-8859-1 data.
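Both failure modes are easy to demonstrate in Python. ASCII-only content is byte-identical under either label, so the declaration proves nothing; and a mismatched label produces the familiar mojibake:

```python
# Pure ASCII is indistinguishable: the same bytes under either encoding.
assert "hello".encode("utf-8") == "hello".encode("iso-8859-1")

# "ş" (U+015F) encoded as UTF-8 but read as windows-1252: classic mojibake.
utf8_bytes = "Ateş".encode("utf-8")      # b'Ate\xc5\x9f'
print(utf8_bytes.decode("windows-1252"))  # AteÅŸ
```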

> I still frequently break websites and webapps simply by entering my name (Faruk
> Ateş).

That’s because the server-side software (and possibly client-side software) cannot handle the letter “ş”. It would not help if the page were interpreted as UTF-8. If the author knows that a server-side form handler cannot process such characters, the page’s declared encoding will not save it.
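The point is that the failure is in the processing chain, not the page encoding: “ş” (U+015F) simply does not exist in ISO-8859-1, so any component that stores or transcodes text in that encoding must drop, mangle, or reject it. For example:

```python
# A backend that handles text as ISO-8859-1 has no code point at all
# for "ş" (U+015F); encoding to it fails outright.
name = "Faruk Ateş"
try:
    name.encode("iso-8859-1")
except UnicodeEncodeError as exc:
    print("cannot store:", exc)
```

Declaring the page as UTF-8 changes nothing here; the backend would fail the same way.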

Yucca
