To resolve the situation, it would be smarter if the W3C referenced not the Microsoft standard itself but a standardized version of it, explaining explicitly how to handle the unassigned code positions. The W3C could describe the expected mapping of these positions explicitly in its own standard, or could publish an RFC and, possibly, register another code in the IANA registry (but that would cause another nightmare, because it would need a new alias other than "windows-1252").
My opinion is that the W3C was wrong to indicate that the "ISO-8859-1" charset had to be handled specially (as well as "windows-1252"). In my opinion it should simply reference the "cp1252" charset name, change its aliasing to "windows-1252", and make it a full charset in its own right, endorsed by the W3C and described in its standard or in a published RFC, making sure that all unassigned code positions of "windows-1252" are mapped **irrevocably** to C1 controls (even if Microsoft later decides to map other characters in its own "windows-1252" charset, as it has done several times, notably when the Euro sign was added). Charset aliases supported in HTML5 would then point to a single new canonical charset. If "cp1252" is not appropriate as the canonical charset name, the W3C should propose another canonical name such as "w3c-1252". The HTML5 standard would then explicitly list the complete set of charset names to support as aliases of this canonical charset, and should instruct web designers to use the canonical charset name, and nothing else, for HTML5 development. If web designers are not satisfied because HTML5 pages would lose their compatibility with HTML4, they should use UTF-8 instead, which is still treated identically in both HTML4 and HTML5. In any case, full support of HTML4 renderers is almost impossible to achieve with web designs made for HTML5, and most often web sites made for HTML5 will offer an alternate navigation or presentation for HTML4 renderers (with their known "quirks" modes that complicate things a lot). But if web designers have good skills (and enough money!) they can attempt to build sites that work on HTML4-only renderers as well as HTML5 renderers.
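To make the intended mapping concrete, here is a minimal Python sketch (the function name and structure are mine, not from any standard): it decodes bytes as windows-1252 while mapping the five positions left unassigned by Microsoft (0x81, 0x8D, 0x8F, 0x90, 0x9D) to the corresponding C1 control characters, which is exactly the "irrevocable" behavior argued for above. Note that Python's built-in "cp1252" codec instead raises an error on these bytes, which is why they are handled separately here:

```python
# The five code positions that Microsoft leaves unassigned in windows-1252.
UNASSIGNED = {0x81, 0x8D, 0x8F, 0x90, 0x9D}

def decode_windows1252(data: bytes) -> str:
    """Decode windows-1252, mapping unassigned positions to C1 controls."""
    out = []
    for b in data:
        if b in UNASSIGNED:
            # Map the unassigned byte to the C1 control with the same value,
            # e.g. 0x81 -> U+0081, instead of treating it as an error.
            out.append(chr(b))
        else:
            # All other bytes use the normal cp1252 mapping
            # (e.g. 0x80 -> U+20AC, the Euro sign).
            out.append(bytes([b]).decode("cp1252"))
    return "".join(out)

# e.g. decode_windows1252(b"\x80") == "\u20ac" (the Euro sign)
# and  decode_windows1252(b"\x81") == "\u0081" (a C1 control)
```

With this mapping, decoding never fails for any byte sequence, and round-tripping the 256 byte values stays stable even if Microsoft assigns new characters later.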
But this challenge requires HUGE and complex development with lots of tiny adjustments (in scripts and stylesheets), an extensive system for handling user support requests, and testing of the various solutions on separate development and test platforms. Today this challenge has only been met successfully by Wikimedia sites (which also support multiple viewing modes, including for mobile devices). Fortunately, Wikimedia chose to support these multiple viewing modes in MediaWiki only with UTF-8 (so it does not have to care about how to handle windows-1252, which is not even supported). Even large corporations are unable to support both HTML4 and HTML5: sites generally move from one to the other, even knowing that their content may no longer be accessible to users of older browsers. The challenge lies elsewhere today: in the proliferation of mobile browsing modes. As these do not work very well, web designers prefer developing mobile "applications" that are installed on tablets and smartphones, using separate (and incompatible) developments for each client application, but at least allowing the applications to work with the same server-side application (communicating over a more basic protocol, most often XML or JSON data requests, plus HTTP to download images). Mobile applications have completely changed the way web sites are developed. There is now a clear separation between client-side developments and server-side back-ends (with the advantage of possibly removing the server-side front end entirely, if the web site will only be available to users of mobile apps, or to users of modern browsers supporting an API for client-side deployments, such as Chrome and now IE8+ and the recent Windows 8 application store). In that case the server-side part of the website only needs ONE charset for everything: UTF-8 (with data compression where needed).
It is simply easier for web designers to develop a few separate client-side "application" front-ends (for iPhone, Android, Windows, and sometimes a few other brands) than to support zillions of web browsers and their versions. The bad side is the absence of a common standard really supported across these client-side environments, so they tend to split the Internet into separate worlds (and the dream of interoperability becomes just that: a dream; even these proprietary platforms are starting to fragment as well, including in the iPhone world, due to their versions and growing APIs not supported in many older versions). HTML5 is certainly the standard that should reconcile these proprietary platforms around a common interoperable framework. But nobody knows whether it will be really well supported by those proprietary client-side development platforms and by the device manufacturers implementing them. The need for interoperability has never been as acute as it is today. But the battle is no longer about charsets. Everybody now knows that UTF-8 is THE charset for everything.

2012/11/20 Doug Ewell <d...@ewellic.org>

> Buck Golemon <buck at yelp dot com> wrote:
>
> > What effort has been spent? This is not an either/or type of
> > proposition. If we can agree that it's an improvement (albeit small),
> > let's update the mapping.
> > Is it much harder than I believe it is?
>
> ISO/IEC 8859-1 is, uh, an ISO/IEC standard. CP1252 is a Microsoft
> corporate standard. One does not simply "update" someone else's
> standard, the WHATWG document and mapping tables notwithstanding.