Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Erik van der Poel

Frank da Cruz wrote:
 
 Doug Ewell wrote:
  
  That last paragraph echoes what Frank said about "reversing the layers,"
  performing the UTF-8 conversion first and then looking for escape
  sequences.  True UTF-8 support, in terminal emulators and in other
  software as well, really should depend on UTF-8 conversion being
  performed first.
 
 The irony is, when using ISO 2022 character-set designation and invocation,
 you have to handle the escape sequences first to know if you're in UTF-8.
 Therefore, this pushes the burden onto the end-user to preconfigure their
 emulator for UTF-8 if that is what is being used, when ideally this should
 happen automatically and transparently.

I may be misunderstanding the above, but ISO 2022 says:

  ESC 2/5 F shall mean that the other coding system uses
  ESC 2/5 4/0 to return;

  ESC 2/5 2/15 F shall mean that the other coding system
  does not use ESC 2/5 4/0 to return (it may have an alternative
  means to return or none at all).

Registration number 196 is for UTF-8 without implementation level, and
its escape sequence is ESC 2/5 4/7. I believe that ISO 2022 was designed
that way so that a decoder that does not know UTF-8 (or any other coding
system invoked by ESC 2/5 F) could simply "skip" the octets in that
encoding until it gets to the octets ESC 2/5 4/0.

This means that it does not need to decode UTF-8 just to find the escape
sequence ESC 2/5 4/0. UTF-8 does not do anything special with characters
below U+0080 anyway (they're just single-byte ASCII), so it works, no?

Of course, if you wanted to include any C1 controls inside the UTF-8
segment, they would have to be encoded in UTF-8, but ESC 2/5 4/0 is
entirely in the ASCII range (less than 128), so those octets are encoded
as is.

Erik



Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Frank da Cruz

Erik van der Poel wrote:
 Frank da Cruz wrote:
  The irony is, when using ISO 2022 character-set designation and invocation,
  you have to handle the escape sequences first to know if you're in UTF-8.
  Therefore, this pushes the burden onto the end-user to preconfigure their
  emulator for UTF-8 if that is what is being used, when ideally this should
  happen automatically and transparently.
 
 I may be misunderstanding the above, but ISO 2022 says:
 
   ESC 2/5 F shall mean that the other coding system uses
   ESC 2/5 4/0 to return;
 
   ESC 2/5 2/15 F shall mean that the other coding system
   does not use ESC 2/5 4/0 to return (it may have an alternative
   means to return or none at all).
 
 Registration number 196 is for UTF-8 without implementation level, and
 its escape sequence is ESC 2/5 4/7. I believe that ISO 2022 was designed
 that way so that a decoder that does not know UTF-8 (or any other coding
 system invoked by ESC 2/5 F) could simply "skip" the octets in that
 encoding until it gets to the octets ESC 2/5 4/0.
 
 This means that it does not need to decode UTF-8 just to find the escape
 sequence ESC 2/5 4/0. UTF-8 does not do anything special with characters
 below U+0080 anyway (they're just single-byte ASCII), so it works, no?
 
Yes, but I was thinking more about the ISO 2022 invocation features than the
designation ones:  LS2, LS3, LS1R, LS2R, LS3R, SS2, and SS3 are C1 controls.
The situation *could* arise where these would be used prior to announcing
(or switching to) UTF-8.  In this case, the end-user would have to configure
the software in advance to know whether the incoming byte stream is UTF-8.

Not a big deal; just an illustration of what happens when we can't use the
normal layering.

- Frank




Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Erik van der Poel

Frank da Cruz wrote:
 
 Yes, but I was thinking more about the ISO 2022 invocation features than the
 designation ones:  LS2, LS3, LS1R, LS2R, LS3R, SS2, and SS3 are C1 controls.
 The situation *could* arise where these would be used prior to announcing
 (or switching to) UTF-8.  In this case, the end-user would have to configure
 the software in advance to know whether the incoming byte stream is UTF-8.

Shouldn't the UTF-8 segment switch back to ISO 2022 before invoking any
of those C1 controls? This way, the decoder wouldn't have to know UTF-8,
and could skip over it reliably.

Erik



Re: Euro character in ISO

2000-07-12 Thread Roozbeh Pournader



On Tue, 11 Jul 2000, Asmus Freytag wrote:

 The only safe way to encode a Euro in HTML appears to be to use Unicode - 
 e.g. by using 8859-1 together with the numeric character reference (NCR) of 
 #x20AC;

euro; is much safer. Netscape 4 doesn't recognize hexadecimal character
references.

--roozbeh




Re: Euro character in ISO

2000-07-12 Thread Michael Everson

Ar 15:30 -0800 2000-07-11, scríobh Asmus Freytag:
At 01:25 PM 7/11/00 -0800, Leon Spencer wrote:
Has ISO addressed the Euro character?

Yes. It's at 0x20AC in ISO/IEC 10646-1.

This is not a standard notation. Please use U+20AC or one of the other
standard notations to refer to UCS code positions.

ME





Re: Euro character in ISO

2000-07-12 Thread Michael Everson

Ar 18:19 -0800 2000-07-11, scríobh Robert A. Rosenberg:

The problem would go away if the ISO would get their heads out of
their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
put the CP125x codes there.

Excuse me, but that is not appropriate. The ISO/IEC 8859 series is
conformant with ISO/IEC 2022, and protocols which adhere to that standard
should not be compromised by what you suggest.

Then when you said you used 8859-21 you'd get CP-1252 and Windows
would no longer need to lie (or tell the truth by admitting it is
CP-1252).

The problem is that some companies do/did not correctly identify their code
pages. The world can live with Latin-1 and CP-1252. It shouldn't have to
live with CP-1252 being identified as Latin-1.

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire





Re: Euro character in ISO

2000-07-12 Thread Antoine Leca

Robert A. Rosenberg wrote:
 
 At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
 character in ISO:
 
 There has been an attempt to create a series of 'touched up' 8859
 standards. The problem with these is that you get all the issues of
 character set confusion that abound today with e.g. Windows CP 1252
 mistaken for 8895-1 with a vengeance:
 
 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.

Sorry. It may work for CP1252/iso-8859-1, and CP1254/iso-8859-9,
but won't for the others. Since Windows starts with the same letter as
Word --or is the reason that they both come from the same company.
No! I cannot believe that-- there are a couple of requirements
that makes effectively the "other" codepages slighty incompatible,
such as the necessary presence for · at position B5 (because this
is the character Word uses when you ask it to "display" the spaces,
and this is hard-coded in the product).


 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).

Even if 8859-21 is defined to be exactly the same as some stage of
CP1252, and everyone in the standardization community admits this
as such, habits are so much entrenched, and love against Microsoft
so rare in the Unix world, that you may bet a lot that such a
standard will never gain wide acceptance.

Furthermore, this is completely unnecessary, as nowadays such
a standard exists, and it is used to be called 'charset=windows-1252'...

The real problem is that:
- Windows browsers/MAs did not know that until 1999 (as it seems)
- Windows HTML-tools/MAs are reluctant to add the test for presence
of non-Latin1 characters to either tag as iso-8859-1 or
windows-1252. Apparently they are too lazy (because they already
did such a test for ASCII).
Well, I am angry, because probably nowadays browsers do the job correctly.


Antoine



Re: Euro character in ISO

2000-07-12 Thread brendan_murray



Robert A. Rosenberg wrote:
Then when you said you used 8859-21 you'd get CP-1252 and Windows
would no longer need to lie (or tell the truth by admitting it is
CP-1252).
And because certain companies had (and still have) bugs in their comms products, incorrectly identifying CP1252 data as ISO 8859-1, ISO standards should reject ISO-2022 and populate C1 with graphic characters?

I suppose other inconsiderate incompatibilities such as the incorrect encoding of half-pitch kana in ISO-2022-JP is the fault of ISO too?

Perhaps those companies that have these major bugs in their software, all of which have been repeatedly pointed out, should fix the probems there. The rest of the industry bends over backwards to accomodate these corrupt data, so a little effort on the part of the guilty would help a lot, and might prevent misguided postings like the above.

B=


Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-12 Thread Frank da Cruz

 Frank da Cruz [EMAIL PROTECTED] wrote:
 
  . If you send a code in the 0x80-8x9f range to such a terminal or 
emulator, it properly treats it as a control code.  If it was
intended as a graphic character ("smart quote" or somesuch) the
result is a fractured screen, sometimes even a frozen session.
 
 This is the widely reported compatibility problem between UTF-8 and
 terminals.  I know I read somewhere, possibly on Markus Kuhn's Unicode
 page, possibly somewhere else, that ISO 2022 codes exist to switch out
 of "ISO 2022 mode" and into "UTF-8 mode" and to either allow or prevent
 switching back to 2022.  Is there any progress on implementing this so
 terminals and emulators can live with UTF-8?
 
Maybe Markus can clarify.  I would be surprised if there's anything in
ISO 2022 about UTF8, except that it does provide a way to switch out of
and back into ISO 2022 mode, allowing the use of character sets that do
not comply with ISO 2022 and 4873.  That's what the designating escape
sequences "with standard return" and "without standard return" are for.

But that's not quite the same thing.  There is no good reason why UTF-8
couldn't be used by (say) a VT320 emulator without switching out of the
ISO 2022 regime, except that UTF-8 contains C1 control codes as data.
This was discussed here a while back and "the other Markus" showed how
a C1-safe form of UTF-8 could have been designed:

  http://www.mindspring.com/~markus.scherer/utf-8c1.html

But, as they say, "it's too late now".  Therefore, those of us who want
to make use of UTF-8 within the ISO 2022 regime must reverse the layers.
First decode the UTF-8, then parse for escape sequences.  Of course your
emulator can get into awful trouble that way if the data stream isn't
really UTF-8.  But overall it's not that bad; we can live with it, and
in fact have done it this way in practice in our own emulator.

- Frank




Re: Euro character in ISO

2000-07-12 Thread Robert A. Rosenberg

At 04:27 AM 07/12/2000 -0800, Michael Everson wrote:
Ar 18:19 -0800 2000-07-11, scríobh Robert A. Rosenberg:

 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.

Excuse me, but that is not appropriate. The ISO/IEC 8859 series is
conformant with ISO/IEC 2022, and protocols which adhere to that standard
should not be compromised by what you suggest.

 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).

The problem is that some companies do/did not correctly identify their code
pages. The world can live with Latin-1 and CP-1252. It shouldn't have to
live with CP-1252 being identified as Latin-1.

Which is what I am saying when I talk about admitting that you are using 
CP-1252 not
ISO-8859-1 (in your MIME/HTML headers) at least in the case where there are 
glyphs in the
x80-x9F range in use. If a system can claim US-ASCII if no codes in the 
x80-xFF range appear and ISO-8859-1 otherwise (as many MUAs do), it should 
have the smarts to claim CP-1252 if in its scan it found a x80-x9F glyph).


Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire




Re: Euro character in ISO

2000-07-12 Thread Robert A. Rosenberg

At 08:56 PM 07/11/2000 -0800, Geoffrey Waigh wrote:
On Tue, 11 Jul 2000, Robert A. Rosenberg wrote:

  At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
  character in ISO:
 
  There has been an attempt to create a series of 'touched up' 8859
  standards. The problem with these is that you get all the issues of
  character set confusion that abound today with e.g. Windows CP 1252
  mistaken for 8895-1 with a vengeance:
 
  The problem would go away if the ISO would get their heads out of
  their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
  put the CP125x codes there.

Except that would break all the systems that understand that C1 "junk,"
and a number of systems do so because they are adhering to other
ISO standards.  If you are going to force someone to change their
datastreams to something new, they might as well go to some flavour
of Unicode anyways.

Who is going to get broken if I say on my MIME header (or HTML) that my 
CHARSET is (example) ISO-8859-21? You are talking about uses where the 
computer is talking to a device and needs the C1 range to tell it what to 
do not another computer (where it is just passing a text stream). The C1 
codes are DEVICE CONTROL and have no purpose (except to occupy slots that 
are better used for extra GLYPHS) in EMAIL or HTML transfer. I am NOT 
asking for anyone to change their mode of operation - only for ISO-8859-x 
codes that are designed for transfer of printable data. UNICODE is not a 
viable option since all we are talking about is the ability to select from 
a number of 256 codepoint 8-bit tables not go over to UTF-8 or UTF-16 
(which would require changes to the program code).


Geoffrey
"tilting at terminal emulators, err windmills."




Re: Euro character in ISO

2000-07-12 Thread Frank da Cruz

On Wed, 12 Jul 2000 10:43:59 -0800, Robert A. Rosenberg wrote:
 At 08:56 PM 07/11/2000 -0800, Geoffrey Waigh wrote:
 On Tue, 11 Jul 2000, Robert A. Rosenberg wrote:
   At 15:30 -0800 on 07/11/00, Asmus Freytag wrote:
   There has been an attempt to create a series of 'touched up' 8859
   standards. The problem with these is that you get all the issues of
   character set confusion that abound today with e.g. Windows CP 1252
   mistaken for 8895-1 with a vengeance:
  
   The problem would go away if the ISO would get their heads out of
   their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
   put the CP125x codes there.
 
 Except that would break all the systems that understand that C1 "junk,"
 and a number of systems do so because they are adhering to other
 ISO standards.  If you are going to force someone to change their
 datastreams to something new, they might as well go to some flavour
 of Unicode anyways.
 
 Who is going to get broken if I say on my MIME header (or HTML) that my 
 CHARSET is (example) ISO-8859-21?

We go through this exercise about twice a year.  First, let's recognize
that ISO is not about to revoke Standards 4873 and 2022, so there's not
much point in suggesting it.  Second, think of a terminal that complies
with these standards.  A physical terminal such as a VT320.  I am using it
to access my mail host in text mode, and I'm reading mail with (say) Unix
'mail'.  The terminal does not interpret the MIME headers.  It doesn't
parse HTML.  It implements a very straightforward finite state automaton
that implements the ISO 2022 based terminal.  Unix 'mail' sends to my
terminal the bytes of the message, period.

Perhaps you're suggesting the Unix 'mail' should become a translation
agent between the character set of the mail and that of the user's
terminal?  I hope not, since given that practically any character set
anybody can dream up is "MIME-compliant" as long as it's tagged, then
every mail program must know how to convert from every character set in
existence to every other one.  Or is it the mail transfer agent?  Or both?
It's really quite a mess; let's not go out of our way to make it worse.

To understand the implications of using 8-bit character sets that contain
graphic characters in the C1 area FOR INTERCHANGE, imagine trying to do
the same thing to the C0 area.

- Frank




Re: Euro character in ISO

2000-07-12 Thread Frank da Cruz

 On Wed, 12 Jul 2000, Frank da Cruz wrote:
 
  Perhaps you're suggesting the Unix 'mail' should become a translation
  agent between the character set of the mail and that of the user's
  terminal?  I hope not, since given that practically any character set
  anybody can dream up is "MIME-compliant" as long as it's tagged, then
  every mail program must know how to convert from every character set in
  existence to every other one.
 
 Yes, it damn well should. And this is easy, as there is a standard Unix
 function that knows how to do this. (it's called iconv).
 
I'm logged into unix right now:

  $ iconv
  bash: iconv: command not found
  $

How standard can it be?  And what about VMS, VMS/CMS, VOS, OS/390, OS/400,
Tandem, and all the others?

How does the mail client know what character set my terminal has?

Anyway, between you and me, there are potentially lots of places where
character-set conversion can occur.  Your mail client, your MTA, my MTA,
my mail client, my Telnet server, my Telnet client, my terminal emulator.
Let's think carefully about this before we have random combinations of
these clients, agents, and servers stepping on each others' toes.

- Frank




Re: Euro character in ISO

2000-07-12 Thread Rick McGowan

 There are lots of Unixes:
   http://www.columbia.edu/kermit/unix.html
 How many of them have an iconv function?

rangda 47: man iconv
man: no entry for iconv in the manual.
rangda 48: cat /etc/motd
Welcome to Darwin!
rangda 49: well, hmmm...
zsh: command not found: well,
rangda 50: 



RE: Euro character in ISO

2000-07-12 Thread Chris Wendt

The trick is HTML4.

Since you sent the message in HTML format, the Euro is encoded as numeric
character reference. Exchange knows how to decode HTML and generate RTF,
depending on what your email client needs.

If you had sent plain text, the Euro would have turned into ?. As is the
case in the plain text part of the multipart message.

This is the case for Outlook Express 5. Older versions of OE treated
Windows-1252 and iso-8859-1 the same.

Here is the source of the message from my Outlook Express Sent Mail folder.
(To see the source, open message and press Ctrl-F3).

From: "Chris Wendt" [EMAIL PROTECTED]
To: "Chris Wendt" [EMAIL PROTECTED]
Subject: Euro test
Date: Wed, 12 Jul 2000 15:17:49 -0700
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="=_NextPart_000_0005_01BFEC14.57202A10"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400

This is a multi-part message in MIME format.

--=_NextPart_000_0005_01BFEC14.57202A10
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

abcdef ? abcdef

--=_NextPart_000_0005_01BFEC14.57202A10
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
HTMLHEAD
META content=3D"text/html; charset=3Diso-8859-1" =
http-equiv=3DContent-Type
META content=3D"MSHTML 5.00.3103.1000" name=3DGENERATOR
STYLE/STYLE
/HEAD
BODY bgColor=3D#ff
DIVFONT color=3D#008000 face=3DVerdana size=3D2abcdef #8364;=20
abcdef/FONT/DIV/BODY/HTML

--=_NextPart_000_0005_01BFEC14.57202A10--


-Original Message-
From: Leon Spencer [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 12, 2000 2:38 PM
To: Unicode List
Subject: RE: Euro character in ISO


Is Microsoft playing tricks in MS Outlook or IE?
If I send text from Outlook Express to my exchange
account, with charset set to iso-8859-1 but containing
the Trademark symbol ((tm)) in the body, it shows up
okay. The body of the message is in text/html.

Is it possible that MS Outlook's HTML ActiveX control
(which I'm assuming to be the same used for IE) is
defaulting to Cp1252/Windows-1252 when it sees iso-8859-1?

Leon

BTW, The body also contains the Euro!



RE: Euro character in ISO

2000-07-11 Thread Leon Spencer


Does anyone know where I can easily download the
latest ISO-8859-X specs? The ones at ftp.unicode.org
seem to be dated 1996.

Also, does anyone know which ISO-8859-X contains 
the Euro?

Thanks.
  Leon

 -Original Message-
 From: Murray Sargent [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, July 11, 2000 2:44 PM
 To: Unicode List
 Cc: Unicode List
 Subject: RE: Euro character in ISO
 
 
 The two statements are correct. ISO has addressed the problem 
 by adding more
 ISO-8859-x standards, since changing 8859-1 would cause 
 problems.  The best
 thing to do is to use Unicode and avoid the codepage confusion :-)
 
 Murray
 
  -Original Message-
  From:   Leon Spencer [SMTP:[EMAIL PROTECTED]]
  Sent:   Tuesday, July 11, 2000 2:26 PM
  To: Unicode List
  Subject:Euro character in ISO
  
  The Euro does not exist in iso-8859-1. It
  is in Cp1252 (WinLatin1) - Microsoft's code page
  superset of iso-8859-1. 
  
  Is this correct? Has ISO addressed the Euro character?
  If so, it the issue more of vendors implementing it?
  
  Leon
 



Re: Euro character in ISO

2000-07-11 Thread Asmus Freytag

At 01:25 PM 7/11/00 -0800, Leon Spencer wrote:
Has ISO addressed the Euro character?

Yes. It's at 0x20AC in ISO/IEC 10646-1.

There has been an attempt to create a series of 'touched up' 8859 
standards. The problem with these is that you get all the issues of 
character set confusion that abound today with e.g. Windows CP 1252 
mistaken for 8895-1 with a vengeance: Not only is 8859-15 slightly 
different from 8859-1, but the difference involves codes that are perfectly 
valid in 8859-1.

Because for 99% of all text, there is no difference, people are almost 
certain to mix them up, mislabel HTML files etc. etc.

The only safe way to encode a Euro in HTML appears to be to use Unicode - 
e.g. by using 8859-1 together with the numeric character reference (NCR) of 
#x20AC;

A./



Re: Euro character in ISO

2000-07-11 Thread Robert A. Rosenberg

At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro 
character in ISO:

There has been an attempt to create a series of 'touched up' 8859
standards. The problem with these is that you get all the issues of
character set confusion that abound today with e.g. Windows CP 1252
mistaken for 8895-1 with a vengeance:

The problem would go away if the ISO would get their heads out of 
their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and 
put the CP125x codes there.
Then when you said you used 8859-21 you'd get CP-1252 and Windows 
would no longer need to lie (or tell the truth by admitting it is 
CP-1252).



Re: Euro character in ISO

2000-07-11 Thread Michael \(michka\) Kaplan

Robert,

I am a big fan of the Windows code pages, they often make my life easier.
However, there is a disadvantage to the fact that even over the course of a
few service packs (let alone a few operating systems!) the code pages have
changed, and there is simply no good documentation that will tell you when
(for example) Farsi characters U+06A9 and U+06AF were added to Windows
CP1256 (Arabic) . All that one knows for certain is that it was before
Windows 98 SE and before NT4 SP5 (although it did no ship with NT4).

When you cannot figure out why an application works on one platform and not
another, it can make you pine for a more stationary standard! :-)

My ISP moved to Windows 2000 so I do not have to worry about making them
install things like newer code page files on the web server, but for a long
time thse differences plagued me heavily.

michka


- Original Message -
From: "Robert A. Rosenberg" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "Unicode List" [EMAIL PROTECTED]
Sent: Tuesday, July 11, 2000 7:19 PM
Subject: Re: Euro character in ISO


 At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
 character in ISO:

 There has been an attempt to create a series of 'touched up' 8859
 standards. The problem with these is that you get all the issues of
 character set confusion that abound today with e.g. Windows CP 1252
 mistaken for 8895-1 with a vengeance:

 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.
 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).