Re: The Case Against Autodecode

docandrew via Digitalmars-d Sun, 05 Jun 2016 11:41:04 -0700

On Saturday, 4 June 2016 at 08:12:47 UTC, Walter Bright wrote:

On 6/3/2016 11:17 PM, H. S. Teoh via Digitalmars-d wrote:
On Fri, Jun 03, 2016 at 08:03:16PM -0700, Walter Bright via Digitalmars-d wrote:
It works for books.
Because books don't allow their readers to change the font.
Unicode is not the font.
This madness already exists *without* Unicode. If you have a page with a single glyph 'm' printed on it and show it to an English speaker, he will say it's lowercase M. Show it to a Russian speaker, and he will say
it's lowercase Т.  So which letter is it, M or Т?
It's not a problem that Unicode can solve. As you said, the meaning is in the context. Unicode has no context, and tries to solve something it cannot.
('m' doesn't always mean m in english, either. It depends on the context.)
Ya know, if Unicode actually solved these problems, you'd have a case. But it doesn't, and so you don't :-)
If you're going to represent both languages, you cannot get away from
needing to represent letters abstractly, rather than visually.
Books do visually just fine!
So should O and 0 share the same glyph or not? They're visually the same
thing,
No, they're not. Not even on old typewriters where every key was expensive. Even without the slash, the O tends to be fatter than the 0.
The very fact that we distinguish between O and 0, independently of what
Unicode did/does, is already proof enough that going by visual
representation is inadequate.
Except that you right now are using a font where they are different enough that you have no trouble at all distinguishing them without bothering to look it up. And so am I.
In other words toUpper and toLower does not belong in the standard
library. Great.
Unicode and the standard library are two different things.

Even if characters in different languages share a glyph or look identical, though, it makes sense to duplicate them with different code points/units/whatever.

Simple functions like isCyrillicLetter() can then do a simple less-than / greater-than comparison instead of having a lookup table to check different numeric representations scattered throughout the Unicode table. Functions like toUpper and toLower become easier to write as well (for SOME languages anyhow), it's simply myletter +/- numlettersinalphabet. Redundancy here is very helpful.
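
To make that concrete, here is a rough Python sketch of the range-check and offset idea. The function names, the constants, and the decision to cover only the contiguous basic Russian run U+0410 to U+044F are my own assumptions for illustration; this is not how Phobos or any real library does it.

```python
# Hypothetical helpers illustrating range checks on contiguous code points.
# Assumption: only the basic Russian letters are covered:
#   U+0410..U+042F  uppercase А..Я
#   U+0430..U+044F  lowercase а..я
CYR_UPPER_FIRST = 0x0410
CYR_UPPER_LAST = 0x042F
CYR_LOWER_FIRST = 0x0430
CYR_LOWER_LAST = 0x044F

# Distance between a lowercase letter and its uppercase form (32 here).
CASE_OFFSET = CYR_LOWER_FIRST - CYR_UPPER_FIRST


def is_basic_cyrillic_letter(ch: str) -> bool:
    """Two comparisons, no lookup table."""
    return CYR_UPPER_FIRST <= ord(ch) <= CYR_LOWER_LAST


def to_upper_basic_cyrillic(ch: str) -> str:
    """Subtract a fixed offset for letters in the lowercase run."""
    cp = ord(ch)
    if CYR_LOWER_FIRST <= cp <= CYR_LOWER_LAST:
        return chr(cp - CASE_OFFSET)
    return ch  # anything else is returned unchanged


def to_lower_basic_cyrillic(ch: str) -> str:
    """Add the same offset for letters in the uppercase run."""
    cp = ord(ch)
    if CYR_UPPER_FIRST <= cp <= CYR_UPPER_LAST:
        return chr(cp + CASE_OFFSET)
    return ch
```

With this, Latin 'm' fails the check and Cyrillic 'т' passes it, even if a font draws them alike. The "for SOME languages anyhow" caveat shows up right away too: Ё/ё (U+0401/U+0451) sit outside the contiguous run, so the simple offset leaves them untouched and a real implementation would need extra handling.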


Maybe instead of Unicode they should have called it Babel... :)

"The Lord said, “If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. Come, let us go down and confuse their language so they will not understand each other.”"


-Jon
