Re: Clones (was RE: Hexadecimal)
On Roman number signs Jill Ramonsky scripsit:

> I confess, I hadn't read ch14.pdf, and I probably should have done. My fault. But I still believe that there should be something in the machine-readable code charts themselves that says, of the Roman numerals, "Don't use these characters - use the normal Latin letters instead". If they really are there _SOLELY_ for round trip compliance with East Asian standards, then, if I wish to put the year MMIII in a web page, I should _NOT_ use the Roman letters. Furthermore, if I write software to interpret Roman numbers, I only need to interpret the Basic Latin letters, not the Roman ones. My life as a webmaster and programmer is made so much SIMPLER by not having to use the Roman letters. I would really like it if these, and every single other character which is "only there for reasons of round ...

In German quality printing (and not only German, I think) the Roman numerals and the related letters are usually not identical. At the least, the numerals get a reduced advance width. Metal fonts usually had no extra punches for Roman numerals, but the typesetters filed the letter punches a bit slimmer. The I, the V, and the X may also have connecting top and bottom bars, the latter not necessarily at the baseline. So you cannot say they are simply cloned letters. This might be a matter for smart font technologies, hopefully available one day in standard PC applications, but as there are code points defined for these numerals, they are and certainly will be used in Latin script for a well understandable reason. Is there another solution for non-smart fonts? In my opinion, the advice not to use these code points will not solve the problem. There actually are fonts containing very clearly distinct Roman numerals, for example the Titus Cyberbit font of the TITUS project at the Frankfurt (Main) university.

Gerd Schumacher
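For implementers weighing Gerd's point: the compatibility Roman numerals are full characters with their own UCD properties, including numeric values the Latin letters lack. A quick illustrative check with Python's standard unicodedata module (a sketch, not a recommendation either way on using the code points):

```python
import unicodedata

# The compatibility Roman numerals carry numeric values in the UCD,
# unlike the Latin letters I, V, X they resemble.
one   = '\u2160'  # ROMAN NUMERAL ONE
fifty = '\u216C'  # ROMAN NUMERAL FIFTY

print(unicodedata.name(one))      # ROMAN NUMERAL ONE
print(unicodedata.numeric(one))   # 1.0
print(unicodedata.numeric(fifty)) # 50.0

try:
    unicodedata.numeric('I')      # the plain Latin letter has no numeric value
except ValueError:
    print("letter I: no UCD numeric value")
```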
Re: Clones (was RE: Hexadecimal)
> "John" == John Jenkins <[EMAIL PROTECTED]> writes:
John> (Apple's LastResort font [contains every Unicode character],
John> of course, but by virtue of rampant reuse of glyphs.)

Does this generate glyphs like the following ASCII- & UTF-8 art?

+--+  ┌──┐
|AB|  │AB│
|CD|  │CD│
+--+  └──┘

(Both included for the benefit of the utf8-impaired.) I find it interesting, if so, that Apple uses a font to achieve that rather than a bit of code in the rendering libs. I believe that pango (Παν語) does it in the lib.

-JimC
Re: Clones (was RE: Hexadecimal)
On Tuesday, 19 August 2003, at 9:18 AM, Jim Allan wrote (rhetorically):

> Must every font contain every Unicode character?

FWIW, it's no longer possible for a TrueType/OpenType font to contain every Unicode character with a distinct glyph. (Apple's LastResort font does it, of course, but by virtue of rampant reuse of glyphs.)

John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jhjenkins/
RE: Clones (was RE: Hexadecimal)
Jill Ramonsky posted on the minus sign:

> Yeah, I know. But like I said, who uses this?

Books are normally produced today using computer typesetting. Look in any mathematics text, or any well-printed book, for minus signs. Hyphens and minus signs are distinct (except when showing computer programming in a fixed-width font). Hyphen and minus sign have always been different characters. TeX and SGML and other pre-Unicode legacy typographical systems support this difference, which has always existed. On common computer systems like the Macintosh and Windows, which didn't support the difference globally in their standard character sets in pre-Unicode days, it was customary to use the en dash instead of a minus sign in formatted text. Or you switched to special math-symbol fonts when entering mathematical signs and other symbols.

Style sheets and books of tips for word processing and desktop publishing almost always go into some detail about the various kinds of dashes and the minus sign. So does the Unicode manual in its section on punctuation.

> And I also have to ask ... if I am actually WRITING a C++ compiler, should I allow the use of MINUS SIGN to mean minus sign? (Actually, that question may be answered by the specification of C++, so let's push it a bit further. If I am inventing some successor language to C++, and am free to invent my own specification, should I _then_ allow the use of MINUS SIGN?)

The symbols to be used for any computer language are part of the definition of that computer language. Currently you can't legally use U+2212 in any computer language I know of. However, I will be surprised if computer languages do not start to take advantage of the additional characters that are universally available through Unicode.

> I only ask that the charts make clear what each character is FOR, in sufficient detail that the answer to questions like the above becomes obvious.
Currently the manual assumes that a user who wants to use a character will mostly already know what it is FOR, or the user wouldn't want to use it. That's a reasonable assumption to make to avoid expanding the manual to five or six volumes at least. A small amount of typographical and usage information on some characters is provided for the convenience of font makers. I would personally love to see an expanded version of the Unicode manual, a sort of multi-volume encyclopedia of characters and their history and uses.

Meanwhile Unicode tells us that a particular glyph is a normal glyph for MINUS SIGN. That really should be enough. Most people know that math symbols are generally not (yet?) implemented to actually DO their function on computers. And it is hardly necessary for the purpose of the manual that, for example, under % we should be told about its use for modulus or for introducing a comment in some computer languages. You don't complain that the charts don't tell you what U+00D7 MULTIPLICATION SIGN is for, or U+00F7 DIVISION SIGN, or U+0026 AMPERSAND.

As to supporting all of Unicode, see 2.12 in http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf. Must a cell phone, for example, support all of Unicode? Must every font contain every Unicode character? Partial support is quite conformant, provided that what is supported is supported according to the standard and data is not corrupted. That doesn't mean that full support and impeccable rendering are not desirable. They are, in the long run. But a laptop user who generally uses only English may not wish to have disk space taken up by East Asian fonts or top-of-the-line publishing software that handles East Indian scripts impeccably. Government software for various governments may purposely support only a particular subset of the Unicode character set.

Jim Allan
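The distinction Jim describes is recorded in the character properties themselves: U+002D is dash punctuation while U+2212 is a math symbol. A small sketch with Python's standard unicodedata module (the `is_minus` helper for a hypothetical successor language is this example's invention):

```python
import unicodedata

hyphen_minus = '\u002D'  # HYPHEN-MINUS
minus_sign   = '\u2212'  # MINUS SIGN

# General categories: Pd = dash punctuation, Sm = math symbol
print(unicodedata.category(hyphen_minus))  # Pd
print(unicodedata.category(minus_sign))    # Sm

# A lexer for a hypothetical future language could simply accept both:
def is_minus(ch):
    return ch in (hyphen_minus, minus_sign)
```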
RE: Clones (was RE: Hexadecimal)
Compatibility characters:

The recommendations for compatibility characters are necessarily vague, since their use in legacy data (and legacy environments) is strongly dependent on what is (or was) customary in a given environment. If a process merely warehouses text data (or parses only a very small subset of characters for a special purpose, as an HTML parser does) then merely preserving legacy characters is often the best strategy. However, take the opposite example, of a process that actually scans the text for roman numerals. In that case, ignoring the compatibility characters would be a mistake, since legacy data of the kind for which these compatibility characters were added would *only* contain roman numerals in this form. It would *not* use the ASCII characters. Processes that modify legacy data for re-export to a legacy system obviously need to be intimately familiar with the legacy conventions, in a way that could not possibly be documented in the Unicode Standard in all details for every character and every legacy system.

Documentation in the code charts:

I agree with several of the comments that "hiding" the information about special characters in running text makes it unnecessarily difficult to work with the information. On the other hand, not everything can be succinctly expressed in machine-readable tables (some characters have complicated usages), and even annotations in the name list have limits. They are definitely not the place for lengthier discussions. For Unicode 4.0 we have attempted to improve the situation by systematically extracting the line-breaking related information into UAX #14, which at least allows task-focused access. Information about mathematical usage of characters is now collected in one place in UTR #25, partially duplicating and partially extending the information in the text of the standard, but providing a single place of access. Further improvements are possible.
Personally I'd be in favor of some icon in the character names list that simply indicates that a character is more fully discussed elsewhere - that would make the code charts more useful as an index into the description of the characters.

Mathematical operators:

Future extensions of programming languages should allow not only the MINUS SIGN as an operator, but many other characters, for example LOGICAL AND and LOGICAL OR, and as many other operators as appropriate for the language. Input of the operators doesn't necessarily have to be done via a special-purpose keyboard. The use of input macros, editor substitution or similar input technologies (e.g. turning && into LOGICAL AND) would be more flexible. Some editors already support the display of highly formatted program source code even though the underlying text backbone uses the standard ASCII conventions of current programming languages. Just one example is Source Insight from www.sourceinsight.com, which not only represents >= etc. by single symbols, but can also correctly increase the size of outer parentheses for nested expressions.

A./
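The example of a process that scans text for roman numerals can be sketched concretely. An illustrative sketch using Python's standard unicodedata module (the block range and the NFKC folding strategy are this sketch's assumptions, not anything the standard mandates):

```python
import unicodedata

def roman_numeral_chars(text):
    """Yield compatibility Roman numerals (U+2160..U+217F) found in text."""
    for ch in text:
        if 0x2160 <= ord(ch) <= 0x217F:
            yield ch

# U+2162 ROMAN NUMERAL THREE, as it would appear in East Asian legacy data
legacy = "Chapter \u2162"
found = list(roman_numeral_chars(legacy))

# NFKC applies the compatibility decomposition, folding the single numeral
# character to the ASCII letters "III"
folded = unicodedata.normalize('NFKC', legacy)
print(found, folded)
```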
RE: Clones (was RE: Hexadecimal)
Well that just proves my point then. There are indeed some things that DO need to support the whole of Unicode (more or less). Jill -Original Message- From: Peter Kirk [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 19, 2003 10:30 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: Clones (was RE: Hexadecimal) On 19/08/2003 01:58, [EMAIL PROTECTED] wrote: >I disagree. > >A post-Windows, post-Linux, Operating System for the 21st century intended >for global use, should ideally support the whole of Unicode. > >There are, in fact, people working on such projects. >Jill > > > > Well, whatever might be new about this OS, it is not its Unicode support. Windows XP and Linux already support the whole of Unicode, more or less. -- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
Re: Clones (was RE: Hexadecimal)
On 19/08/2003 01:58, [EMAIL PROTECTED] wrote: I disagree. A post-Windows, post-Linux, Operating System for the 21st century intended for global use, should ideally support the whole of Unicode. There are, in fact, people working on such projects. Jill Well, whatever might be new about this OS, it is not its Unicode support. Windows XP and Linux already support the whole of Unicode, more or less. -- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
RE: Clones (was RE: Hexadecimal)
I disagree. A post-Windows, post-Linux, Operating System for the 21st century intended for global use, should ideally support the whole of Unicode. There are, in fact, people working on such projects. Jill -Original Message- From: Jim Allan [mailto:[EMAIL PROTECTED] Sent: Monday, August 18, 2003 11:41 PM To: [EMAIL PROTECTED] Subject: Re: Clones (was RE: Hexadecimal) No system has to support all of Unicode.
RE: Clones (was RE: Hexadecimal)
Yeah, I know. But like I said, who uses this? I have a QWERTY keyboard in front of me. I use a standard en-GB key mapping. Now I _could_ customise my keymap such that Right-Alt + HYPHEN MINUS yielded MINUS SIGN. Wouldn't that be great? Then I could write things like "x = -5;" unambiguously. But it would completely screw my C++ compiler. And I also have to ask ... if I am actually WRITING a C++ compiler, should I allow the use of MINUS SIGN to mean minus sign? (Actually, that question may be answered by the specification of C++, so let's push it a bit further. If I am inventing some successor language to C++, and am free to invent my own specification, should I _then_ allow the use of MINUS SIGN?) I'm not being Devil's advocate. I don't necessarily even expect anyone to have a definitive answer. I only ask that the charts make clear what each character is FOR, in sufficient detail that the answer to questions like the above becomes obvious. Jill -Original Message- From: John Cowan [mailto:[EMAIL PROTECTED] Sent: Monday, August 18, 2003 4:39 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: Clones (was RE: Hexadecimal) > U+2212 (minus sign) - an obvious clone of U+002D (hyphen-minus). Who > uses this? The ASCII characters, because they have had to do double or triple duty over the years when we had a very limited 7-bit character set, often have several near-equivalents in Unicode that disambiguate their *typographically* different purposes. Thus hyphen, minus sign, en dash, and em dash have separate Unicode representations, though in ASCII they are often written -, -, -- or -, and --- or -- respectively.
Re: Clones (was RE: Hexadecimal)
Peter Kirk posted:

> Well, that's what was puzzling me about the recommendations not to use these characters. In my opinion, there needs to be a clear statement with each character definition (not somewhere in the text not linked to it) of its status in such respects. Is it for compatibility use only? Is it a presentation form not for use in general information interchange? Is it a formatting variant of another character, which should be used if that special formatting is to be indicated although the two might be collated together?

Perhaps a cross-reference to areas in the main text where that particular character or kind of character is discussed, when there is some special mention in the main text. Otherwise the various indications of distinction and compatibility decomposition and canonical decomposition usually indicate a lot, if the reader looks at them and learns to understand them. But indeed the standard is somewhat inconsistent in sometimes coming close to recommending not using compatibility characters at all and in other cases recommending particular ones.

> For example, if I want a superscript 2 to indicate "squared" (which someone used on this list recently), am I supposed to use U+00B2, or should I avoid using it and instead use a higher level markup (which implies I need to use HTML e-mail)? Maybe the text tells me somewhere, but it certainly doesn't in the code chart.

Well, if you are using unformatted text and want to use a superscript 2 then you don't have much choice. I suppose I could have sent "E=mc^2" or "E=mc{squared}" or "E=mc2" or something, but why would I when I have Unicode? :-) Actually superscript 2 is also in the Latin-1 character set. :-)

In http://www.unicode.org/versions/Unicode4.0.0/ch14.pdf it states:

<< Therefore, the preferred means to encode superscripted letters or digits, such as "1st" or "DC00₁₆", is by style or markup in rich text.
>>

I would think that statement obvious, since in technical and mathematical writing it is theoretically possible for any displayable character in Unicode to be superscripted or subscripted, and even superscripted or subscripted to an already superscript or subscript character, and so on.

Also in the code chart (http://www.unicode.org/charts/PDF/U0080.pdf) U+00B2 SUPERSCRIPT TWO is given a compatibility decomposition to "<super> 0032" (DIGIT TWO). Similarly with other superscript characters.

But beyond all recommendations in the Unicode standard, what is done depends on what the user wants to do for a particular purpose in a particular environment with particular fonts. There is no one correct way that fits all users at all places and times, nor should there be. If I am printing out a document on a particular system, with particular software and fonts, in which plain-text superscripts look better to me than superscripts created by formatting regular numbers in the word processor I am using, then I will naturally, in that time and place, use Unicode plain-text superscripts. That Unicode gives me the choice is a benefit I should take advantage of without worrying whether formatting regular numbers as superscript is theoretically better than using compatibility characters.

Unicode is messy and complex mostly because character usage is messy and complex, display technology is messy and complex, and there are always edge cases and things that don't fit well.

> But Unicode's keeping deprecated individual character encodings while allowing applications to freely throw away non-deprecated canonical decomposable encodings (which supposedly only exist because they should not be thrown away) confuses me also. I thought even deprecated ones were supposed to be usable, in that a system should process them correctly.

It depends on what is meant by "usable" and the "system" and "correctly". No system has to support all of Unicode.
Accordingly I would not expect systems to support deprecated control characters, or fonts to go out of their way to support deprecated characters. A system that does not support deprecated control codes (and even some of the non-deprecated control codes) and does not support particular characters (perhaps only because there are no fonts on the system with those characters) can still be conformant to Unicode in what it supports. A text editor that supports only fixed-width fonts will probably not support the special-width spaces properly, but may still be Unicode conformant.

Jim Allan
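The superscript-2 compatibility decomposition discussed above can be exercised directly: NFKC normalization replaces the plain-text superscript with a regular digit (losing the superscripting), while NFC leaves it alone. A sketch using Python's standard unicodedata module:

```python
import unicodedata

sq = 'E=mc\u00B2'  # U+00B2 SUPERSCRIPT TWO

# The UCD decomposition carries the <super> compatibility tag
print(unicodedata.decomposition('\u00B2'))   # <super> 0032

# NFKC folds the compatibility character; the superscripting is lost
print(unicodedata.normalize('NFKC', sq))     # E=mc2

# NFC preserves compatibility characters untouched
print(unicodedata.normalize('NFC', sq))      # E=mc²
```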
Re: Clones (was RE: Hexadecimal)
Someone suggested...

> It would be much simpler if each such character were clearly labelled in the code charts etc. DO NOT USE!, and with its glyph presented on a grey background or in some other way to indicate its special status.

Well, sure, I agree that it might be nice to document somewhere the discouraged and deprecated characters in a way that people could find easily, but putting gray boxes in the charts isn't the way. Perhaps we should also put in blinking bold neon letters the disclaimer that is posted at the top of every chart PDF file:

> Disclaimer
> These charts are provided as the on-line reference to the character
> contents of the Unicode Standard, Version 4.0 but do not provide all
> the information needed to fully support individual scripts using the
> Unicode Standard. For a complete understanding of the use of the
> characters contained in this excerpt file, please consult the
> appropriate sections of The Unicode Standard, Version 4.0
> (ISBN 0-321-18578-1), as well as Unicode Standard Annexes #9,
> #11, #14, #15, #24 and #29, the other Unicode Technical Reports
> and the Unicode Character Database, which are available on-line.

Before using things in the standard, people really should check out what they are using! There are lοts of things that look really similar but have wildly different semantics and оne might n০t want t੦ use things indiscriminately based s๐lely ᅌn what's in the charts...

Rick
Re: Clones (was RE: Hexadecimal)
On 18/08/2003 11:32, Jim Allan wrote:

>> It would be much simpler if each such character were clearly labelled in the code charts etc. DO NOT USE!, and with its glyph presented on a grey background or in some other way to indicate its special status.
>
> I don't think people should be told so directly to NOT use an official Unicode character unless the character is actually deprecated.

OK, DO NOT USE! is too strong, but something like NOT RECOMMENDED! could be used instead.

> Over the years recommendations about particular characters in the standard have sometimes changed and no-one can see all possible uses for characters or all ways that applications might use some of them.

Well, such things need not be frozen from version to version. And a note could read NOT RECOMMENDED except in the case of... But greying the chart area for deprecated characters and singleton canonical decomposable characters seems to me a good idea.

> As to compatibility characters, remember some of them, for example spaces with varying widths, make essential differences in formatting. The standard warns applications not to be hasty in unifying compatibility characters for presentation.

Well, that's what was puzzling me about the recommendations not to use these characters. In my opinion, there needs to be a clear statement with each character definition (not somewhere in the text not linked to it) of its status in such respects. Is it for compatibility use only? Is it a presentation form not for use in general information interchange? Is it a formatting variant of another character, which should be used if that special formatting is to be indicated although the two might be collated together? For example, if I want a superscript 2 to indicate "squared" (which someone used on this list recently), am I supposed to use U+00B2, or should I avoid using it and instead use a higher level markup (which implies I need to use HTML e-mail)?
Maybe the text tells me somewhere, but it certainly doesn't in the code chart.

> If it is not deprecated a character should be usable.

I thought even deprecated ones were supposed to be usable, in that a system should process them correctly.

> But some more obvious graphic indication would be nice, to more obviously indicate that perhaps a user should think carefully about using that particular encoded character.

Agreed.

> Jim Allan

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
Re: Clones (was RE: Hexadecimal)
Peter Kirk posted:

> It would be much simpler if each such character were clearly labelled in the code charts etc. DO NOT USE!, and with its glyph presented on a grey background or in some other way to indicate its special status.

I don't think people should be told so directly to NOT use an official Unicode character unless the character is actually deprecated. Over the years recommendations about particular characters in the standard have sometimes changed and no-one can see all possible uses for characters or all ways that applications might use some of them. But greying the chart area for deprecated characters and singleton canonical decomposable characters seems to me a good idea.

As to compatibility characters, remember some of them, for example spaces with varying widths, make essential differences in formatting. The standard warns applications not to be hasty in unifying compatibility characters for presentation. If it is not deprecated, a character should be usable. But some more obvious graphic indication would be nice, to more obviously indicate that perhaps a user should think carefully about using that particular encoded character.

Jim Allan
Re: Clones (was RE: Hexadecimal)
On 18/08/2003 09:06, Jim Allan wrote:

>> I would really like it if these, and every single other character which is "only there for reasons of round trip compatibility" with something else, were explicitly marked in the machine-readable charts with something meaning "Don't introduce this character, at all, ever. Don't try to interpret it. Just preserve it, in case it ever gets turned back to its original character set".
>
> That would probably be too strong. If characters are available then some people will use them. :-( See section 2.3 at http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf
>
> Unicode 3.0 contained under section D21 on compatibility characters: << Their use is discouraged other than for legacy data. >> I don't know whether this statement was intentionally removed or accidentally dropped in the changes in 4.0 which distinguish "compatibility character" from "compatibility composite character". In any case people can't be prevented from doing things that are officially discouraged, especially as for some particular use it might be wrong to discourage them. So if you are handling Roman numerals in an application and wish your handling to be complete then unfortunately you do have to take the compatibility Roman numerals into account.

Yes, but people can be clearly discouraged from using them, and that is not currently happening. It seems that currently, if you come across a character by browsing through the charts and want to discover whether its use is officially discouraged, you have to wade through huge databases and hundreds of pages of text to find out whether a particular set of properties implies that use is discouraged. Even that won't tell me definitively, for I read, "The compatibility decomposable characters are precisely defined in the Unicode Character Database, whereas the compatibility characters in the more inclusive sense are not." (from section 2.3) - and it is the latter whose use is discouraged.
But is it in fact safe to assume that the list of such characters includes, but is not limited to, those which have defined compatibility mappings? It would be much simpler if each such character were clearly labelled in the code charts etc. DO NOT USE!, and with its glyph presented on a grey background or in some other way to indicate its special status. -- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
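The narrower question, whether a character has a defined compatibility mapping, is at least mechanically answerable from the Unicode Character Database: compatibility decomposable characters are exactly those whose decomposition carries a formatting tag. A sketch using Python's standard unicodedata module (note this finds only the precisely-defined subset; the broader set of "compatibility characters" from section 2.3 cannot be derived from the database, which is exactly the difficulty raised above):

```python
import unicodedata

def has_compat_decomposition(ch):
    """True if ch's UCD decomposition carries a tag like <compat> or <super>,
    i.e. ch is a compatibility decomposable character."""
    return unicodedata.decomposition(ch).startswith('<')

print(has_compat_decomposition('\u2160'))  # True:  ROMAN NUMERAL ONE, '<compat> 0049'
print(has_compat_decomposition('\u00B2'))  # True:  SUPERSCRIPT TWO, '<super> 0032'
print(has_compat_decomposition('\u00C5'))  # False: A WITH RING ABOVE decomposes canonically
print(has_compat_decomposition('A'))       # False: no decomposition at all
```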
Re: Clones (was RE: Hexadecimal)
Jill Ramonsky posted:

> I would really like it if these, and every single other character which is "only there for reasons of round trip compatibility" with something else, were explicitly marked in the machine-readable charts with something meaning "Don't introduce this character, at all, ever. Don't try to interpret it. Just preserve it, in case it ever gets turned back to its original character set".

That would probably be too strong. If characters are available then some people will use them. :-( See section 2.3 at http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf

Unicode 3.0 contained under section D21 on compatibility characters: << Their use is discouraged other than for legacy data. >> I don't know whether this statement was intentionally removed or accidentally dropped in the changes in 4.0 which distinguish "compatibility character" from "compatibility composite character". In any case, people can't be prevented from doing things that are officially discouraged, especially as for some particular use it might be wrong to discourage them. So if you are handling Roman numerals in an application and wish your handling to be complete, then unfortunately you do have to take the compatibility Roman numerals into account.

> U+2212 (minus sign) - an obvious clone of U+002D (hyphen-minus). Who uses this?

People concerned with the proper appearance of the symbol in proportional fonts. Almost all proportional fonts use a narrow hyphen dash rather than a minus-width dash for the hyphen-minus character. In some older-style fonts it is even a slanting character. See http://www.unicode.org/versions/Unicode4.0.0/ch06.pdf in 6.2 for a detailed discussion of the various dash characters.

> U+2217 (asterisk operator) - an equally obvious clone of U+002A (asterisk)

They look much the same in a typewriter-style font. They don't in proportional fonts, where the regular asterisk tends to appear somewhat like a superscript.
Unicode provides support both for good typographical usage and for traditional data-processing typographical usage based on typewriter technology.

> U+223C (tilde operator) - a clone of U+007E (tilde)

See http://www.unicode.org/versions/Unicode4.0.0/ch07.pdf and look for "Spacing Clones of Diacritics". The ASCII tilde was originally intended to be a non-spacing diacritic tilde, to be applied to other characters by backspacing. In part because of the low resolution of many early data-processing printers, it was often realized in a tilde-operator form. That has now become its most normal form in fonts. But for good typography you do want to distinguish them, and the overloading of tilde as ASCII 7E means that a font may render a mathematical full-width tilde when you want to show a diacritic, or render a spacing diacritic when you wanted a mathematical operator. Unicode is intended for typesetting applications as well as for entering computer code in a traditional typewriter-style character set with typewriter limitations.

> and then there's U+2223 (divides) - hell, that looks to me remarkably like U+007C (vertical line)

They do look close. But U+007C usually extends below the baseline and U+2223 usually doesn't. For example:

> U+2264 (less than or equal to) - compare with U+2A7D (less than or slanted equal to)

I have no idea. You will probably have to ask the MathML people about that one. See http://www.w3.org/TR/2001/REC-MathML2-20010221. Mathematicians seem to think they need to distinguish the two. As a non-mathematician I find many of these distinctions bewildering and seemingly only typographical. But if mathematicians in some field make fine distinctions based on such differences, then it is important that Unicode allow such distinctions to be maintained in plain text.
> In defence of this argument, I point out that the complementary relation, NOT equal to, has codepoint U+2270, and this is represented in the code charts as having a slanted equal to, so it OUGHT to be the complement of U+2A7D. (Unless I've missed it, there appears to be no "not equal to with horizontal equals" character.)

The chart at http://www.unicode.org/charts/PDF/U2200.pdf does not show a slanted equals. For some discussion of the math symbols see also http://www.unicode.org/unicode/reports/tr25/tr25-5.html.

Part of the problem is that differences which in most environments are only typographical style differences may indicate semantic differences in particular disciplines. It is impossible to establish a firm line as to how important or common a stylistic variation must be before it should be encoded in Unicode for plain-text distinctions. For example, open-loop _g_ is distinguished from closed-loop _g_ in the International Phonetic Alphabet, and so Unicode encodes it separately at U+0261. A normal Latin-letter font would probably not have U+0261 in it at all, and might display U+0067 with either a closed or an open loop.
Re: Clones (was RE: Hexadecimal)
[EMAIL PROTECTED] scripsit:

> "Don't use these characters - use the normal Latin letters instead".

That's essentially the implication of being a compatibility character.

> Secondly, I believe that the code charts SHOULD provide machine-readable information about the hexadecimal values of the letters "A" to "F".

0030;0
0031;1
0032;2
0033;3
0034;4
0035;5
0036;6
0037;7
0038;8
0039;9
0041;10
0042;11
0043;12
0044;13
0045;14
0046;15
0061;10
0062;11
0063;12
0064;13
0065;14
0066;15
FF10;0
FF11;1
FF12;2
FF13;3
FF14;4
FF15;5
FF16;6
FF17;7
FF18;8
FF19;9
FF21;10
FF22;11
FF23;12
FF24;13
FF25;14
FF26;15
FF41;10
FF42;11
FF43;12
FF44;13
FF45;14
FF46;15

Thuryago.

> U+2212 (minus sign) - an obvious clone of U+002D (hyphen-minus). Who uses this?

The ASCII characters, because they have had to do double or triple duty over the years when we had a very limited 7-bit character set, often have several near-equivalents in Unicode that disambiguate their *typographically* different purposes. Thus hyphen, minus sign, en dash, and em dash have separate Unicode representations, though in ASCII they are often written -, -, -- or -, and --- or -- respectively.

> Conversely, there are also things that look different, but mean the same. For example:
> U+2264 (less than or equal to) - compare with U+2A7D (less than or slanted equal to)

It turns out that in some math contexts one or the other is strongly enough preferred that it's worth having two characters so as to avoid getting the "wrong" glyph.

> So, yes, I agree with Jim. Let's not have too many duplicates. But I still have to ask why there are so many already?

"History there is, and no history." --The High Inquest
"Every character has its story." --various Unicode tribal elders

--
John Cowan <[EMAIL PROTECTED]>
http://www.reutershealth.com  http://www.ccil.org/~cowan
.e'osai ko sarji la lojban.  Please support Lojban!  http://www.lojban.org
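A table like the one above can also be derived at runtime rather than shipped as data: NFKC folds the fullwidth forms (U+FF10 etc.) to their ASCII counterparts, after which an ordinary base-16 lookup applies. A sketch in Python (the `hex_digit_value` helper is this example's invention):

```python
import unicodedata

def hex_digit_value(ch):
    """Value 0-15 of a hex digit character, accepting fullwidth forms;
    returns None for anything that is not a hex digit."""
    folded = unicodedata.normalize('NFKC', ch)  # e.g. U+FF21 'Ａ' -> 'A'
    try:
        return int(folded, 16)
    except ValueError:
        return None

print(hex_digit_value('\uFF21'))  # 10  (FULLWIDTH LATIN CAPITAL LETTER A)
print(hex_digit_value('f'))       # 15
print(hex_digit_value('G'))       # None
```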