Ernest Cline <ernestcline at mindspring dot com> wrote:

> It would have been better in my opinion to have encoded upper and
> lower case forms of both characters separate from the ordinary I.
> That would have placed language-specific burdens not on the casing
> algorithm of Unicode but on the transfer of data from legacy
> character sets.

This added expense *needed* to be at the Unicode end, not in the
character-set conversion process.  Unicode-aware processes are supposed
to be able to understand such things anyway.  And what do you do about
keyboards if Turkish undotted I and dotted i are different code points
from "ordinary" I and i?

I used to think I understood the situation with dotless j, when Unicode
experts stated confidently that it would not be encoded because no
writing system used it.  Now that it is scheduled for encoding on
math-notation grounds, but in a normal Latin-extensions block -- not
just as a U+1Dxxx math symbol -- I don't know what to believe.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

