RE: Clones (was RE: Hexadecimal)

Jim Allan Tue, 19 Aug 2003 09:39:46 -0700

Jill Ramonsky posted on the minus sign:

> Yeah, I know. But like I said, who uses this?

Books are normally produced today using computer typesetting. Look in any mathematics text or any well-printed book for minus signs. Hyphens and minus signs are distinct (except when computer code is shown in a monospaced font). Hyphen and minus sign have always been different characters.

TeX, SGML, and other pre-Unicode legacy typographical systems support this difference, which has always existed.
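To make the distinction concrete, here is a minimal Python sketch that prints the code points and official names of the characters under discussion, using the standard unicodedata module. The choice of which dash-like characters to list is my own, for illustration only.

```python
import unicodedata

# Characters that are easily confused with one another.
# The selection is illustrative, not exhaustive.
DASH_LIKE = ["\u002D", "\u2010", "\u2013", "\u2212"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in DASH_LIKE:
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+2010 HYPHEN
# U+2013 EN DASH
# U+2212 MINUS SIGN
```

The ASCII character is named HYPHEN-MINUS precisely because it has to do double duty; the others are separate characters with separate code points.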

On common computer systems like the Macintosh and Windows, which didn't support the difference globally in their standard character sets in pre-Unicode days, it was customary to use the en dash instead of a minus sign in formatted text, or to switch to special math-symbol fonts when entering mathematical signs and other symbols.

Style sheets and books of tips for word processing and desktop publishing almost always go into some detail about the various kinds of dashes and the minus sign. So does the Unicode manual in its section on punctuation.

> And I also have to ask ... if I am actually WRITING a C++ compiler, should I allow the use of MINUS SIGN to mean minus sign? (Actually, that question may be answered by the specification of C++, so let's push it a bit further. If I am inventing some successor language to C++, and am free to invent my own specification, should I _then_ allow the use of MINUS SIGN?)

The symbols to be used for any computer language are part of the definition of that computer language. Currently you can't legally use U+2212 in any computer language I know of.

However, I will be surprised if computer languages do not start to take advantage of the additional characters that are universally available through Unicode.
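As one illustration, Python 3 rejects U+2212 as an operator because its language definition does not include it. The sketch below checks that, and then shows a hypothetical pre-pass (normalize_minus is a name made up here, not part of any real compiler) of the kind a language designer could adopt if the specification chose to allow MINUS SIGN.

```python
MINUS_SIGN = "\u2212"     # U+2212 MINUS SIGN
HYPHEN_MINUS = "\u002D"   # U+002D HYPHEN-MINUS

def accepts(source):
    """Return True if Python can compile `source` as an expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

def normalize_minus(source):
    """Hypothetical pre-pass: treat MINUS SIGN as HYPHEN-MINUS.

    Deliberately naive: it would also rewrite U+2212 inside string
    literals, which a real lexer would have to leave alone.
    """
    return source.replace(MINUS_SIGN, HYPHEN_MINUS)

expr = "7 \u2212 2"
print(accepts(expr))                    # False: not part of the language
print(accepts(normalize_minus(expr)))   # True
print(eval(normalize_minus(expr)))      # 5
```

Whether such a mapping belongs in the lexer at all is exactly the kind of decision that the language's own specification has to make.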

> I only ask that the charts make clear what each character is FOR, in sufficient detail that the answer to questions like the above becomes obvious.

Currently the manual assumes that a user who wants to use a character will mostly already know what it is FOR or the user wouldn't want to use it. That's a reasonable assumption to make to avoid expanding the manual to five or six volumes at least. A small amount of typographical and usage information on some characters is provided for the convenience of font makers.

I would personally love to see an expanded version of the Unicode manual, a sort of multi-volume encyclopedia of characters and their history and uses.

Meanwhile, Unicode tells us that a particular glyph is a normal glyph for MINUS SIGN. That really should be enough. Most people know that math symbols are generally not (yet?) implemented to actually DO their function on computers. And it is hardly necessary for the purpose of the manual that, for example, under % we should be told about its use for modulus or for introducing a comment in some computer languages.

You don't complain that the charts don't tell you what U+00D7 MULTIPLICATION SIGN is for, or U+00F7 DIVISION SIGN, or U+0026 AMPERSAND.

As to supporting all of Unicode, see 2.12 in http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf.

Must a cell phone, for example, support all of Unicode?

Must every font contain every Unicode character?

Partial support is quite conformant provided that what is supported is supported according to the standard and data is not corrupted.

That doesn't mean that full support and impeccable rendering are not desirable. They are, in the long run. But a laptop user who generally uses only English may not wish to have disk space taken up by East Asian fonts or top-of-the-line publishing software that handles East Indian scripts impeccably.

Government software for various governments may purposely support only a particular subset of the Unicode character set.
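Partial support of the kind described above can be pictured as a check of text against a declared repertoire. The following Python sketch is only an illustration: the REPERTOIRE set and the unsupported function are invented for this example and do not come from the standard. The point is that characters outside the supported subset are reported rather than silently altered.

```python
def unsupported(text, repertoire):
    """Return the characters of `text` not in `repertoire`,
    in order of first appearance, without duplicates."""
    seen = []
    for ch in text:
        if ch not in repertoire and ch not in seen:
            seen.append(ch)
    return seen

# Hypothetical repertoire: printable ASCII plus EN DASH and MINUS SIGN.
REPERTOIRE = {chr(cp) for cp in range(0x20, 0x7F)} | {"\u2013", "\u2212"}

sample = "x \u2212 y \u2260 0"   # contains U+2260 NOT EQUAL TO
missing = unsupported(sample, REPERTOIRE)
print([f"U+{ord(ch):04X}" for ch in missing])   # ['U+2260']
```

An implementation that supports only a subset can still pass the unsupported characters through untouched, which is what keeps the data from being corrupted.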

Jim Allan
