Michael Everson wrote:
> At 06:09 +0430 2001-04-05, Roozbeh Pournader wrote:
> >No. Such a mapping table does not respect the difference between the Rial
> >sign and the string "Reh Yeh Alef Lam" in the ISIRI 3342 encoded file.
> 
> Why would you want to preserve that distinction, if you are satisfied 
> to use the string only in Unicode text?

There are many uses for Unicode; one of them is to act as the internal
representation of text in the *internationalized* LOW layers of software, as
opposed to the *localized* HIGH layers, which use legacy charsets.

The "low layer" (using Unicode) consists of things like operating system APIs,
database engines, network transport components, etc.

The "high layer" (using legacy charsets) consists of things like keyboard
drivers, plain text files, fonts, etc.

This is, e.g., the way Windows NT works: Unicode is used to handle text in
the OS core, but it is mapped to/from "code pages" as soon as it has to
become visible to the user.

But if the mapping is not faithful in both directions, the user is going to
have problems. In particular, irreversible one-to-many mappings can be
dangerous, because they may push old applications past their size limits.
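
To make the "not faithful in both directions" point concrete, here is a
minimal sketch in Python. It assumes a tiny, made-up mapping table: the byte
values are placeholders, not the real ISIRI 3342 codes, and the names
TO_UNICODE and decode are hypothetical. It only shows that a one-to-many
mapping makes two different legacy inputs indistinguishable once converted.

```python
# All byte values below are made-up placeholders, NOT real ISIRI 3342 codes.
RIAL_SIGN = 0xA0
REH, YEH, ALEF, LAM = 0xB1, 0xB2, 0xB3, 0xB4

# Hypothetical mapping table: the single rial sign is mapped to the
# four-letter string Reh, Farsi Yeh, Alef, Lam (a one-to-many mapping).
TO_UNICODE = {
    RIAL_SIGN: "\u0631\u06CC\u0627\u0644",
    REH: "\u0631",
    YEH: "\u06CC",
    ALEF: "\u0627",
    LAM: "\u0644",
}


def decode(legacy: bytes) -> str:
    """Convert legacy bytes to Unicode using the table above."""
    return "".join(TO_UNICODE[b] for b in legacy)


sign_text = decode(bytes([RIAL_SIGN]))               # file used the sign
spelled_text = decode(bytes([REH, YEH, ALEF, LAM]))  # file spelled the word

# Both inputs produce the same Unicode string, so no reverse mapping can
# tell which one the legacy file originally contained.
print(sign_text == spelled_text)  # True
print(len(sign_text))             # 4
```

Whatever the reverse table does with that four-letter string, it will be
wrong for one of the two original files.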

Take the example of the Iranian rial. An old application running on an old
operating system used the rial sign in a single-character field <Currency
Symbol>. When the application is ported to a newer operating system, which
uses Unicode internally, it will break, because it will attempt to insert a
4-character string into that 1-character field. The programmer reviews the
code and is puzzled, because she sees that the program actually puts 1
character there! So who changes it to a 4-character word? Sooner or
later, she will discover that it is the Unicode conversion inside the
operating system that causes the problem, and she will of course blame
Unicode -- who else?
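
The failure can be sketched in a few lines of Python. Everything here is an
assumption made for illustration: the placeholder byte for the rial sign
(not the real ISIRI 3342 code), the os_to_unicode stand-in for the operating
system's conversion, and the 1-character <Currency Symbol> field.

```python
# Placeholder byte for the legacy rial sign; NOT the real ISIRI 3342 code.
LEGACY_RIAL = b"\xa0"
RIAL_AS_STRING = "\u0631\u06CC\u0627\u0644"  # Reh, Farsi Yeh, Alef, Lam


def os_to_unicode(legacy: bytes) -> str:
    """Hypothetical stand-in for the OS conversion (one-to-many for rial)."""
    if legacy == LEGACY_RIAL:
        return RIAL_AS_STRING
    return legacy.decode("ascii")  # the sketch treats everything else as ASCII


def set_currency_symbol(record: dict, symbol: str) -> None:
    """The old application's field: exactly one character wide."""
    if len(symbol) != 1:
        raise ValueError(f"<Currency Symbol> holds 1 character, got {len(symbol)}")
    record["currency_symbol"] = symbol


def try_store(legacy: bytes) -> str:
    """Pass one legacy character through the OS layer into the field."""
    record = {}
    try:
        set_currency_symbol(record, os_to_unicode(legacy))
        return "stored"
    except ValueError:
        return "rejected"


print(try_store(b"$"))          # stored
print(try_store(LEGACY_RIAL))   # rejected
```

The application code hands over one legacy character in both calls; only the
conversion layer in between turns the second one into four.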

My natural reaction to such a story would be: "Why don't you fix that bloody
program? How can a <Currency Symbol> field be limited to 1 character? What
will you do with 'FF', 'DM', 'L.', 'US$', 'GB£', etc.?"

And that is also my first reaction with X keyboards: just go and change them
to allow an arbitrary string to be attached to each key.  But I know that,
in practice, it is not always possible to do these "small" fixes. The
existing software base can be enormous, and fixing all of it may be very
complicated.

Do you remember the millions of dollars and man-hours spent fixing the Y2K
bug? Well, that was the most stupid bug I have ever heard of: it simply
required that all 2-digit year fields be changed to 4 digits...

_ Marco
