Re: Character identities
David Starner wrote: > Likewise, ä is printed as a with e above in old texts.* Would it be > acceptable to make a font with a a^e glyph for ä? It's not even > changing the meaning of the character in any way. Indeed, that is exactly what Sütterlin fonts do. (Then again, Sütterlin fonts assign the long-s glyph to U+0073 and make you type $ to get a round s, so they may not be the best example.) Stefan Persson replied: > Unicode defines "a^e" as U+0061 U+0364 (though it's exactly the same > character as "ä"). Why? They're not exactly the same, except in this particular German example. Combining superscript e was encoded along with combining superscript a, i, o, u, c, d, h, m, r, t, v, and x, none of which evolved into a "real" diacritical mark the way e did. Combining e had non-German uses as well, as in early modern English "Yͤ" (which did not become "Ÿ"). As for the diaeresis, its use in French, English ("coöperate"), and other languages often has no relationship to the letter e. Indeed, in the sequence "güe" in Spanish, the diaeresis serves as a sort of anti-e, ensuring the separate pronunciation of the u when the e would otherwise prevent it! Historically speaking, I and J were once equivalent, and U and V were once equivalent, but they are all encoded today. -Doug Ewell Fullerton, California
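As an illustrative aside, the claim that "a^e" and "ä" are not the same character can be checked against the normalization data. The sketch below assumes Python's standard unicodedata module; the helper name canonically_equivalent is made up for this example and is not part of any library.

```python
import unicodedata

A_UMLAUT = "\u00E4"      # ä, LATIN SMALL LETTER A WITH DIAERESIS
A_DIAERESIS = "a\u0308"  # a + COMBINING DIAERESIS
A_SUPER_E = "a\u0364"    # a + COMBINING LATIN SMALL LETTER E


def canonically_equivalent(x: str, y: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD (canonical decomposition) forms are identical."""
    return unicodedata.normalize("NFD", x) == unicodedata.normalize("NFD", y)


# ä decomposes to a + diaeresis, so these two spellings are equivalent.
print(canonically_equivalent(A_UMLAUT, A_DIAERESIS))

# a + superscript e is a different sequence; no normalization form maps it to ä.
print(canonically_equivalent(A_UMLAUT, A_SUPER_E))

print(unicodedata.name("\u0364"))
```

A font remains free to draw ä with a small e above it; the check only shows that the two encodings are distinct as far as text processing is concerned.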
Re: Character identities
On Wed, Oct 23, 2002 at 06:49:38PM -0400, David J. Perry wrote: > > First, is it compliant with Unicode for an Antiqua font to use an s > > glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur > > fonts possible, and it is arguably the glyph given to the middle s > > in modern Antiqua fonts. > > If you are sure that the font will only be used for printing German > this might be OK as a stopgap. Why? Yes, if you want to use a true long s, you're going to need a different font. But I can see this paired with an old Antiqua font, too, if you want to use it for an exact copy of the American Constitution or something. > However, even with German, here's the > problem: if a user searched for a word containing -s at the end, and > typed it using the s key, then it would not be matched (unless the > search engine already knew that long s and s are equivalent). You've got the long s and s reversed. In old printing, the s is the letter that appears at the end. I don't see it as a problem; if you typed in the long s, search for the long s. It might get confusing if more general purpose fonts started doing this, but unless you have a need to exactly reproduce the original document, you probably shouldn't use the long s anyway. > An > OpenType font that is smart enough to substitute a long s glyph at the > right spots is the much superior long-term solution. There are two problems with this. First, German has had a number of orthography changes, each time changing slightly when you're supposed to use the long s (IIRC). Second, no matter what the convention, it requires a dictionary lookup in various cases; I'm not sure you can do that in an OpenType font, and it's not something I'm sure I want a renderer doing in the first place. -- David Starner - [EMAIL PROTECTED] Great is the battle-god, great, and his kingdom-- A field where a thousand corpses lie. -- Stephen Crane, "War is Kind"
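For the search concern raised in this exchange, here is a minimal sketch of how a search routine could treat long s and round s as equivalent. It assumes Python's standard unicodedata module; search_key and loose_find are hypothetical names invented for this example, not functions from any search engine.

```python
import unicodedata

LONG_S = "\u017F"  # ſ, LATIN SMALL LETTER LONG S


def search_key(text: str) -> str:
    """Hypothetical folding step: NFKC replaces long s with its
    compatibility equivalent s, and casefold removes case differences."""
    return unicodedata.normalize("NFKC", text).casefold()


def loose_find(haystack: str, needle: str) -> bool:
    """Report whether needle occurs in haystack after folding both."""
    return search_key(needle) in search_key(haystack)


old_spelling = "Wa" + LONG_S + LONG_S + "er"  # Waſſer

# A plain substring test misses the word typed with the ordinary s key.
print("Wasser" in old_spelling)

# After folding, the two spellings match.
print(loose_find(old_spelling, "wasser"))
```

Note that casefold also turns ß into ss, which may or may not be wanted; a stricter key could apply only the NFKC step.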
Unicode Display
Hi, can you tell me how to display Unicode in a RichEdit VB application? I am trying to develop a multilingual application, and I would be very glad of any help with displaying Unicode. Thanks a lot. Bye for now.
Re: Character identities
David Starner wrote: > First, is it compliant with Unicode for an Antiqua font to use an s > glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur > fonts possible, and it is arguably the glyph given to the middle s in > modern Antiqua fonts. > > Likewise, ä is printed as a with e above in old texts.* Would it be > acceptable to make a font with a a^e glyph for ä? It's not even changing > the meaning of the character in any way. In my opinion, this is all reasonable and should be allowed. Viel Erfolg! > As a third case, I looked briefly at information and advocacy of the > duodecimal system. Chi and epsilon have been used as glyphs for 10 and ... I assume that the answer will be that these things are just alternate uses of existing characters. markus -- Opinions expressed here may not reflect my company's positions unless otherwise noted.
Re: Character identities
- Original Message - From: "David Starner" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Wednesday, October 23, 2002 7:00 PM Subject: Character identities > Likewise, ä is printed as a with e above in old texts.* Would it be > acceptable to make a font with a a^e glyph for ä? It's not even changing > the meaning of the character in any way. Unicode defines "a^e" as U+0061 U+0364 (though it's exactly the same character as "ä"). Why? Stefan
Character identities
I have several questions about character identities. First, is it compliant with Unicode for an Antiqua font to use an s glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur fonts possible, and it is arguably the glyph given to the middle s in modern Antiqua fonts. Likewise, ä is printed as a with e above in old texts.* Would it be acceptable to make a font with an a^e glyph for ä? It's not even changing the meaning of the character in any way. (I suspect the answer is it's not technically compliant, but nobody cares.) * (To my surprise, I came across a text from 1920 that used the e-above instead of a diaeresis. The only other texts I've seen with this date from before 1810. It was "Islands Kultur zur Wikingerzeit" by Felix Niedner, in the series (?) "Thule: Altnordische Dichtung und Prosa", which leads me to believe, based on my limited German, that it's a deliberate anachronism. Right?) As a third case, I looked briefly at information and advocacy of the duodecimal system. Chi and epsilon have been used as glyphs for 10 and 11, as well as an upside-down 2 and 3, a chi and reversed pound symbol (? I'd need to look at that one again . . .) and * and #. Unified, they might make a proposal here, if someone still cares enough to make it. Would it be unreasonable to unify them? There's quite a disparity in glyphs, but not much argument against them all being the same character, and I don't think there's anyone wanting to make the distinction. -- David Starner - [EMAIL PROTECTED] Great is the battle-god, great, and his kingdom-- A field where a thousand corpses lie. -- Stephen Crane, "War is Kind"
Taiwanese proposal
The WG2 home page was updated today to add a link to document N2507, "Draft of Proposal to add Latin characters required by Latinized Taiwanese Holo language to ISO/IEC 10646" [1], by a group called the Department of Language Education of National Taitung Teachers College. The document is dated either 2002-03-11 or 2002-03-31, depending on what part of the title page you look at. This document proposes a COMBINING RIGHT DOT ABOVE for use in a popular Latin-script orthography of the Taiwanese Holo language. Some time ago (I can't look up exactly when because the unicode.org archives are unavailable), I wrote that this combining character should be added in lieu of a largish collection of precomposed characters. Ken Whistler responded that the issue had already been debated, and a solution already presented to use U+0307 COMBINING DOT ABOVE (possibly incorporating a Taiwanese font-specific glyph variation to move the dot to the right). Evidently the Taiwanese teachers did not consider this satisfactory, as they have responded with this new proposal to encode a separate COMBINING RIGHT DOT ABOVE. Whether or not this new combining character makes sense, the rest of the proposal clearly does not. The group has proposed no fewer than 42 precomposed Latin characters, all of which can be formed using existing Latin letters and combining marks (together with the proposed RIGHT DOT ABOVE). The 42 precomposed letters are proposed "to be added to Latin Extended-B," which is a puzzle to me since that block has only 25 available code positions as of Unicode 4.0. Much more troubling, however, is the fact that this group has apparently ignored or disregarded the Unicode/10646 policy against standardizing new precomposed letters that can be composed with existing characters. The document says: "The precomposed characters are proposed to ensure compatibility with the existing font "HoloWin" in the word-processing software HOTSYS widely employed in the user community.
We have been promised composing characters in major (Microsoft etc.) implementations since 1997. Now, 5 years later, we still have nothing." Compatibility with 8-bit legacy fonts and software is *not* sufficient cause for encoding new precomposed characters. The WG2 "Principles and Procedures" document [2] specifically states that a precomposed character should not be encoded "if solely intended to overcome short-term deficiency of rendering technology." The Taiwanese document does not say which "major (Microsoft etc.) implementation" fails to support composition using combining marks, but as a previous thread on this list has shown, there is at least some support in Internet Explorer for such characters. Try this experiment: One of the precomposed characters proposed by the Taiwanese teachers is LATIN SMALL LETTER N WITH CIRCUMFLEX. Here it is, encoded properly as U+006E U+0302: n̂ Some of you will be able to see this character, others will not. Rendering technology is not perfect yet. But this is the correct way to create new accented letters in Unicode/10646, not by adding more precomposed characters. The proposal for a new COMBINING RIGHT DOT ABOVE may or may not have merit -- I'm not going to commit firmly to the idea that it does, like I did last time -- but the 42 precomposed letters have no business being encoded and should not be debated further. -Doug Ewell Fullerton, California [1] http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2507.pdf [2] http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2352r.pdf
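To make the experiment above concrete, the following sketch builds the same letter from U+006E and U+0302 and inspects it. It assumes only Python's standard unicodedata module; the variable names are illustrative.

```python
import unicodedata

# n followed by COMBINING CIRCUMFLEX ACCENT, as in the experiment above.
n_circumflex = "n\u0302"

# NFC composes a sequence only when a precomposed character exists.
# None exists for n with circumflex, so the sequence stays two code points.
composed_n = unicodedata.normalize("NFC", n_circumflex)
print(len(composed_n))
print([unicodedata.name(ch) for ch in composed_n])

# For contrast, e + circumflex does have a precomposed form (U+00EA).
composed_e = unicodedata.normalize("NFC", "e\u0302")
print(len(composed_e), unicodedata.name(composed_e))
```

Whether the two-code-point sequence displays as a single accented letter is then up to the font and rendering engine, which is the point made in the message.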
Another candidate for the squared precomposed Latin block
Ken Whistler especially may be pleased to learn that the following was discussed on the TYPO-L list today: At 09:06 -0400 2002-10-23, Richard Kegler wrote: > From: Joe Clark <[EMAIL PROTECTED]> > Reply-To: Discussion of Type and Typographic Design > Date: Sun, 20 Oct 2002 20:33:12 -0400 > To: [EMAIL PROTECTED] > Subject: Obsessive typography in Building Accessible Websites > > We used every ligature you can name wouldn't a ligature for "http://" be nice and handy. :^) Richard Kegler -- Michael Everson * * Everson Typography * * http://www.evertype.com 48B Gleann na Carraige; Cill Fhionntain; Baile Átha Cliath 13; Éire Telephone +353 86 807 9169 * * Fax +353 1 832 2189 (by arrangement)