Consider the botched French-language text

A MontrÃ©al, Ã  la fin des annÃ©es 80, . . .

It should of course be

A Montréal, à la fin des années 80 . . .

The difficulty arises when a convention that represents 'é' as two
successive byte values of the form

<minuscule-e code point><accent-aigu code point>

in one code page collides with the single-byte representations of 'Ã'
and '©', which occupy just these two byte values in another code page.
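
One common way to get exactly this 'Ã' and '©' pair is for the two bytes
that UTF-8 uses for 'é' (X'C3A9') to be read under ISO 8859-1. The
following is only a minimal sketch under that assumption; the choice of
UTF-8 and ISO 8859-1 as the two encodings, and all variable names, are
mine and are not taken from the original text.

```python
# Assumption: the colliding encodings are UTF-8 and ISO 8859-1 (Latin-1).
# Escapes are used so the example does not depend on this file's encoding.
correct = "A Montr\u00e9al, \u00e0 la fin des ann\u00e9es 80"

# In UTF-8, the single character 'é' becomes two bytes.
e_bytes = "\u00e9".encode("utf-8")
print(e_bytes.hex())                # c3a9

# Read under Latin-1, each of those bytes is a character in its own right.
print(e_bytes.decode("latin-1"))    # Ã©

# The same misreading applied to the whole phrase gives the botched text.
botched = correct.encode("utf-8").decode("latin-1")
print(botched)

# As long as no bytes were lost, reversing the two steps repairs it.
repaired = botched.encode("latin-1").decode("utf-8")
print(repaired == correct)          # True
```

The repair works only while every byte survives the round trip; once a
translation table has substituted or dropped a byte, the original text
cannot be recovered this way.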

Regrettably, Unicode has carried forward alternative support for the
generic

<basic alphabetic code point><modifier code point>

scheme; its availability and heavy use in some contexts need to figure
in the sorts of discussions that have been going on here during the
last few days. It greatly complicates translation in a fashion that is
messy though of no conceptual interest.
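
To make the two Unicode spellings concrete, here is a small sketch using
Python's standard unicodedata module. It assumes that the scheme in
question corresponds to a base letter followed by a combining mark
(U+0301 for the acute accent); the variable names are illustrative only.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
combining = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT, two

# The two strings display alike but do not compare equal as they stand.
same_raw = precomposed == combining
print(same_raw)                          # False
print(len(precomposed), len(combining))  # 1 2

# Their UTF-8 byte sequences differ as well.
print(precomposed.encode("utf-8").hex())  # c3a9
print(combining.encode("utf-8").hex())    # 65cc81

# Normalization maps one spelling onto the other:
# NFC composes base + mark, NFD decomposes the precomposed character.
nfc = unicodedata.normalize("NFC", combining)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed, nfd == combining)  # True True
```

A translation step that expects one character per code point would
presumably have to normalize to the composed form first, which is part
of the mess described above.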


John Gilmore, Ashland, MA 01721 - USA

