> With UTF-8 you ... reduce to (almost) zero the chance the site
> will be viewed with a wrong encoding.

Oh yeah?

Just yesterday I ran into a page that looked at first like
unrecognized windows-1255 or iso-8859-8 (with Hebrew characters
appearing as lowercase accented Latin characters), but it turned
out that the page really did contain those Latin characters,
encoded in UTF-8.

I was even able to read the text by performing mental substitutions
(a-grave = alef, a-acute = bet, etc.), but I wasn't able to find a
way to convert these characters to Hebrew. What I needed and didn't
have was a UTF-8 to Latin-1 filter, whose single-byte output could
then be viewed as windows-1255.
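
For what it's worth, such a filter can be sketched in a few lines of
Python. This is only a sketch under assumptions: the function names are
made up for illustration, the page is assumed to be valid UTF-8, and the
original bytes are assumed to have been windows-1255 misread as Latin-1
(not, say, windows-1252). Characters that don't fit in Latin-1 will make
it raise an error rather than guess.

```python
def mojibake_to_hebrew(text: str) -> str:
    """Undo 'Hebrew shown as accented Latin' for already-decoded text."""
    # Step 1: the UTF-8 > Latin-1 filter. Each accented Latin character
    # goes back to the single byte it stood for (a-grave -> 0xE0, ...).
    raw = text.encode("latin-1")
    # Step 2: read those bytes in the encoding they were written in.
    return raw.decode("cp1255")


def fix_page(data: bytes) -> str:
    """Same thing, starting from the UTF-8 bytes of the page."""
    return mojibake_to_hebrew(data.decode("utf-8"))


# a-grave, a-acute: should come out as alef, bet
print(mojibake_to_hebrew("\u00e0\u00e1"))
```

The two steps mirror the mental substitution: first get back the single
bytes, then read them as Hebrew.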

Granted, it's pretty dumb to write Hebrew with UTF-8-encoded
lowercase accented Latin characters, but these monstrosities do
exist on the web.

-Ron.


P.S. Here's my guess as to how the monstrosity came into being:
it originated as an RTF file in windows-1255 encoding, and was
converted automatically to html. The conversion program was not
smart enough to recognize the original encoding, but it was
"smart" enough to convert what it thought to be Latin-1 into
UTF-8, which defeats any attempt by the user to view the file
with the appropriate encoding...
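
If that guess is right, the damage can be reproduced like this. Again
only a sketch: `make_mojibake` is a made-up name, and the real converter
may well have done something different (for example, treated the input
as windows-1252 rather than Latin-1).

```python
def make_mojibake(hebrew: str) -> bytes:
    """Simulate the guessed conversion: cp1255 bytes misread as Latin-1,
    then 'helpfully' re-encoded as UTF-8."""
    raw = hebrew.encode("cp1255")      # what the RTF file contained
    misread = raw.decode("latin-1")    # converter assumes Latin-1
    return misread.encode("utf-8")     # and writes the result as UTF-8


page = make_mojibake("\u05e9\u05dc\u05d5\u05dd")  # shin, lamed, vav, final mem
print(page.decode("utf-8"))  # accented Latin letters instead of Hebrew

# Reversing the two steps gets the original text back.
restored = page.decode("utf-8").encode("latin-1").decode("cp1255")
print(restored)
```

Once the bytes are UTF-8, telling the browser "this is windows-1255"
no longer helps, because each Hebrew letter is now two bytes.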


--------------------------------------------------------------------------
Haifa Linux Club Mailing List (http://www.haifux.org)
To unsub send an empty message to [EMAIL PROTECTED]

