> No-one would be more happy than me if we could just ditch all the legacy
> encodings and all switch to Unicode everywhere, but that will never happen.
> There is enough legacy content out there that will never be converted.

That's sort of exactly the point: *NEW* content should be UTF-8 (or UTF-16) because everyone has learned how nasty encodings are. *LEGACY* content plays by whatever old rules were in effect when it was created. You can't fix that by updating or changing the standards it may have depended on, correctly or incorrectly. All that does is add more ambiguity to the existing content.

- Shawn