At 2001-12-01 11:24:04 Pacific Standard Time, [EMAIL PROTECTED] (Stefan Persson) wrote:
> I was thinking if this was encoded:
>
> 1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
> with a line below. In handwritten text it is almost always used instead of
> &, in machine-written text I don't think I've ever seen it.

This might be a character in its own right, as different from the
ampersand as U+204A TIRONIAN SIGN ET. Or it might be simply a glyph
variant of the ampersand. If you have never seen o-underbar in
machine-written text, I doubt that this will help your cause much.

You might try U+006F U+0332, though this will probably not give you the
vertical spacing you expect.

(As a side note, this "o-underbar" form reminds me of the "c-underbar"
which is sometimes used in handwritten English to mean "with." Does
anyone know the origin of this symbol? Is it possibly derived from the
Latin word cum, meaning "with"? Does it have any claim to being a
character in its own right?)

> 2.) Fractions with any number, see "bråk.bmp."

U+2044 FRACTION SLASH is exactly what you are looking for. Whether your
browser or other rendering engine will display it the way you want is
another matter.

On page 154 of TUS 3.0, there is a two-paragraph description of the use
of U+2044. Note particularly the sentence:

"The standard form of a fraction built using the fraction slash is
defined as follows: Any sequence of one or more decimal digits, followed
by the fraction slash, followed by any sequence of one or more decimal
digits."

This would give you the results you expect for "123/456" but not for
"x/y" or even "14658.48/13789". However, it is not clear to me that this
"standard form" is normative, and it is conceivable that a
fraction-slash-aware renderer could generalize this to "one or more
non-space characters, fraction slash, one or more non-space characters."

> 3.) Roman numerals. I know I-XII are encoded, but what if you want to use
> higher numbers? Typing "XX," you might suggest.

The set of Roman numerals, at least through 4999, can be completely
specified with the characters U+2160 "I", U+2164 "V", U+2169 "X",
U+216C "L", U+216D "C", U+216E "D", and U+216F "M" (or, of course, with
the equivalent Latin letters).

According to TUS 3.0, page 299, "Upper- and lowercase variants of the
Roman numerals through 12, plus L, C, D, and M, have been encoded for
compatibility with East Asian standards." Requests for additional
precomposed Roman numerals will almost certainly be denied.

> This is not always
> sufficient; in Sweden we often put a line under and one above the numbers,
> see "Roma.bmp."

Sounds like a glyph-variant issue. Font designers might want to ensure
that the glyphs for the Roman numeral forms do have the over- and
underlines. Then, if a user doesn't want them, she can always use the
plain Latin letters instead.

> And what about ten thousands? Neither "X¯" nor "X¯" are
> displayed properly!

They should be; that's what the combining characters are there for.
(Hint: you want U+0305 COMBINING OVERLINE, not U+0304 COMBINING MACRON.)

To be fair to Stefan, most rendering engines have a long way to go to
catch up with the Unicode ideal of being able to attach arbitrary
combining marks (like U+0305) to arbitrary base characters (like
U+2169). Many renderers simply replace the sequence with a precomposed
glyph. This approach looks really sharp IF such a glyph is available,
but breaks down otherwise.

-Doug Ewell
 Fullerton, California
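
For anyone who wants to experiment with the sequences discussed above,
here is a minimal Python sketch. It is only an illustration, not part of
any standard: the names o_underbar, overlined_ten, is_standard_fraction
and to_roman_numeral_chars are hypothetical, and reading "decimal
digits" as Python's \d (any Unicode decimal digit) is an assumption.
It says nothing about how a given renderer will display the results.

```python
import re
import unicodedata

# Code points named in the message above.
COMBINING_LOW_LINE = "\u0332"
COMBINING_OVERLINE = "\u0305"
FRACTION_SLASH = "\u2044"

# "o-underbar": U+006F followed by U+0332 (two code points, no precomposed form).
o_underbar = "o" + COMBINING_LOW_LINE

# Roman numeral ten with a line above: U+2169 followed by U+0305.
overlined_ten = "\u2169" + COMBINING_OVERLINE

# The quoted "standard form": digits, fraction slash, digits.
# Assumption: "decimal digits" is taken to mean what \d matches in Python.
STANDARD_FRACTION = re.compile(rf"\d+{FRACTION_SLASH}\d+")


def is_standard_fraction(text: str) -> bool:
    """Return True if the whole string is digits + U+2044 + digits."""
    return STANDARD_FRACTION.fullmatch(text) is not None


# Value/numeral pairs built only from the seven characters listed above.
ROMAN_PAIRS = [
    (1000, "\u216F"), (900, "\u216D\u216F"),
    (500, "\u216E"), (400, "\u216D\u216E"),
    (100, "\u216D"), (90, "\u2169\u216D"),
    (50, "\u216C"), (40, "\u2169\u216C"),
    (10, "\u2169"), (9, "\u2160\u2169"),
    (5, "\u2164"), (4, "\u2160\u2164"),
    (1, "\u2160"),
]


def to_roman_numeral_chars(n: int) -> str:
    """Spell n (1 through 4999) with U+2160, U+2164, U+2169, U+216C..U+216F."""
    if not 1 <= n <= 4999:
        raise ValueError("supported range is 1 through 4999")
    out = []
    for value, numeral in ROMAN_PAIRS:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


if __name__ == "__main__":
    for ch in o_underbar + overlined_ten + FRACTION_SLASH:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    for sample in ("123\u2044456", "x\u2044y", "14658.48\u204413789"):
        print(sample, is_standard_fraction(sample))
    print(to_roman_numeral_chars(20))
```

Whether any of these strings displays as intended still depends entirely
on the font and rendering engine in use.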