> No-one would be more happy than me if we could just ditch all the legacy 
> encodings and all switch to Unicode everywhere, but that will never happen. 
> There is enough legacy content out there that will never be converted.

That's sort of exactly the point: 

*NEW* content should be UTF-8 (or UTF-16) because everyone's learned how nasty 
encodings are.

*LEGACY* content plays by whatever old rules were in effect when it was 
created.  You can't fix that by updating or changing the standards it may 
have depended on, whether it followed them correctly or not.  All that does 
is add more ambiguity to the existing content.
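
To make the ambiguity concrete, here is a rough Python sketch.  The byte 
value and the helper name decode_candidates are made up for illustration, 
and the three code pages are just ones I picked.  The same unlabeled byte 
reads as a different character under each one:

```python
# Hypothetical legacy payload: a single byte with no encoding label attached.
legacy_bytes = b"\xe9"


def decode_candidates(data, encodings=("latin-1", "cp1251", "cp437")):
    """Decode the same bytes under several candidate legacy encodings."""
    return {name: data.decode(name) for name in encodings}


readings = decode_candidates(legacy_bytes)
for name, text in readings.items():
    # Each code page maps 0xE9 to a different character.
    print(name, repr(text))

# The same byte on its own is not valid UTF-8, so decoding it fails loudly.
try:
    legacy_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

All three legacy decodes succeed, so nothing in the bytes themselves says 
which reading was intended.  Only the rules in force when the content was 
written can tell you that.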

- Shawn

