Encode THIS in the PUA
To promote the new "Men in Black II" movie, Burger King is handing out kids' toys with "secret messages" displayed in these glyphs: http://www.burgerking.com/mibdecoder/ It's a straight cipher for the Latin alphabet, so don't bother suggesting it for ConScript. They have a policy against ciphers, even historic ones like the Utopian "alphabet" originally printed in 1516: http://www.adh.brighton.ac.uk/schoolofdesign/MA.COURSE/05/LL47.html ConScript is also not the place to propose ciphers invented for other recent movies, such as the Mara "alphabet" from "Indiana Jones and the Temple of Doom": http://www.mouseplanet.com/al/docs/indy.htm or the 29-letter Atlantean script from "Atlantis: The Lost Empire": http://omniglot.com/writing/atlantean.htm (Note: Unicode hobbyists who go to the Disney site and choose "Character Gallery" may not find what they expect.) But of course someone could still encode them in the PUA. Is anyone planning to start up that separate PUA mailing list? -Doug Ewell Fullerton, California
Re: Can browsers show text? I don't think so!
> http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

I loaded the beginning of that document, and it looks like just a bunch of characters from the start of a list of characters in "aiueo-jun" (Japanese "alphabetical order"), not a real "document".

Is what you want something like what you can find at www.shodouka.com? It helps when, for example, you are trying to view your message board on an American library computer and all you get is mojibake instead of a Japanese message. Shodouka will display images for text.

There is also a Web site that can do furigana, kind of. (Its mistakes are sometimes funny, but if you are a student of Japanese trying to read Japanese Web pages, it can be a lot of help.) If you do a search on kids.goo.ne.jp and choose "furigana ari", it will give you your furigana. I wonder if there is a romaji version we could use?
Re: RE: Can browsers show text? I don't think so!
>My point is that it would be great if browsers supported all languages, no >matter how complicated the language is. Still, even with languages that does >not require shaping, you have problems. For example, a typical Western >Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite >language here) font on his machine. This is a problem that is solved with >CSS 2. Still, there is not any wide spread support for web fonts in modern >browsers. I wonder why? Most people have fonts to display their own language - they came with their operating system. I'm a Unicode geek, and it doesn't really matter if I can't see whether it displays correctly or not. My friends couldn't care less. >The link below will take you to a web page that shows 500 Japanese >characters (courtesy of Morisawa Co Ltd) and a fairly large point size >(18px). And I can't change the point size, which sucks. I can install a language pack, which will let me change the point size, and work for all pages, whether or not they share the font resources. >This scales up very well as well, because pages may share font resources. A >font with 2000 characters would be 80kb in this case, and would perhaps work >for hundres of pages. And would fail the instant someone added a new character. >Also, what is it with people and the lack of interest in using fonts. Do >people actually think that you only need one font, possibly in bold, italic >and regular style? Do they think that other languages, e.g. Chinese, do not >use styles? Text should be beautiful to look at too! But text should be readable first. Typographers will probably flame me for this, but for English, there's only two or three distinct readable fonts (with a thousand minor variations on the form.) I'd usually prefer to see my serif font, instead of some bitmap font someone else chose, as mine will be scalable and anti-aliased. Pictures work better than fonts for fancy titles, and are already used for that.
Re: ZWJ and Latin Ligatures
Michael Everson scripsit: > I have to confess I don't understand what you are talking about at > all. Get me them tools, John! Ligature tables at a high level tell you things like "The glyph 'a' and the glyph 'acute accent' should be merged to form the glyph 'aacute'." Internally, though, it reads more like "A #502 followed by a #397 should be replaced by a #929", where the numbers (or names, in some contexts) *represent* the actual glyph outlines. You could write "#202 followed by #999 becomes SHAVIAN PEEP glyph" without there being any actual outlines for #202 or #999, but as John says, if something actually called for a #202 to be imaged, the rendering software would go belly-up. I hope this helps. -- John Cowan[EMAIL PROTECTED] At times of peril or dubitation, http://www.ccil.org/~cowan Perform swift circular ambulation,http://www.reutershealth.com With loud and high-pitched ululation.
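The glyph-number view of a ligature table described above can be sketched in a few lines of Python. This is only an illustration: the glyph numbers 502, 397, 929, 202 and 999 come from the message, while glyph ID 1500, the `OUTLINES` dictionary, and the function names `apply_ligatures` and `render` are made up for the example and do not reflect how the real GSUB or morx formats are laid out.

```python
# Hypothetical sketch: glyph IDs are plain integers, and an outline may or
# may not exist for a given ID.
LIGATURES = {
    (502, 397): 929,    # 'a' + 'acute accent' -> 'aacute'
    (202, 999): 1500,   # two virtual glyphs -> SHAVIAN PEEP (ID 1500 is invented)
}

# Only these glyph IDs have outlines; 202 and 999 deliberately have none.
OUTLINES = {502: "a", 397: "acute", 929: "aacute", 1500: "shavian-peep"}


def apply_ligatures(glyphs):
    """Replace each adjacent pair listed in LIGATURES with its ligature glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def render(glyphs):
    """Look up an outline for every glyph; a glyph without one raises KeyError."""
    return [OUTLINES[g] for g in glyphs]
```

As long as 202 and 999 are always replaced before rendering, nothing goes wrong; calling `render([202])` directly fails with a `KeyError`, which is the toy equivalent of the rendering software going belly-up.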
Re: (long) Re: Chromatic font research
[*groans in the audience*]

I know, I know -- another contribution in the endless thread...

In re:

> The Respectfully Experiment

> I used it as evidence that ideas about what should not be
> included in Unicode can change over a period of time as new scientific
> evidence is discovered.

Having been intimately involved in nearly all the decisions made about what was included in Unicode over the last 13 years, and also being formally trained as a scientist, I think I may be qualified to dispute this conclusion.

Most of the change in ideas about what can be included in Unicode has been the result of two types of influence:

A. The encountering of legacy practice in preexisting character encodings which had to be accommodated for interoperability reasons. This accounts for many, if not all, of the hinky little edge cases where Unicode appears to depart from its general principles for how to encode characters.

B. The development of new processing requirements that required special kinds of encoded characters. This accounted for strange animals such as the bidi format controls, the BOM, the object replacement character, and the like.

There is a very narrow window of opportunity for *scientific* evidence contributing to this -- namely, the result of graphological analysis of previously poorly studied ancient or minority scripts, which conceivably could turn up some obscure new principle of writing systems that would require Unicode to consider adding a new type of character to accommodate it. But at this point, with Unicode having managed to encode everything from Arabic to Mongolian to Han to Khmer..., I consider it rather unlikely that scientific graphological study is going to turn up many new fundamental principles here.
As a scientific *hypothesis* I think this surmise is proving to hold up rather well, as our premier encoder of historic and minority scripts, Michael Everson, has managed to successfully pull together encoding proposals, based on current principles in Unicode, for dozens more scripts, with little difficulty except for that inherent in extracting information about rather poorly documented writing systems. > it just seems to me that some > extra ligature characters in the U+FB.. block would be useful. Best practice, and near unanimous consensus in the Unicode Technical Committee and among the correspondents on this list, would be aligned with exactly the opposite opinion. > In the > light of this new evidence, I am wondering whether the decision not to > encode any new ligatures in regular Unicode could possibly be looked at > again. As others have pointed out, "The Respectfully Experiment" did not constitute new *evidence* of anything in this regard. In any case, the UTC is quite unlikely to look at that decision again. The exception that the UTC *has* considered recently was the Arabic bismillah ligature, and the reason for doing so again was the result of considering legacy practice. This thing exists in implemented character encodings as a single encoded character. And furthermore, it is used as a unitary symbol, in such a way that substituting out an actual (long) string of Arabic letters and expecting the software to ligate it correctly precisely in the contexts where it was being used as a symbol, would place an unnecessary burden on both users and on software implementations. That is *quite* different from the position that claims that one, two, or dozens more Latin ligatures of two letters need to be given standard Unicode encodings. >if it cannot be done or would cause great anguish and > arguments, well, that is that, forget it. Good idea. --Ken
Re: Can browsers show text? I don't think so!
- Original Message -
From: "Michael Jansson" <[EMAIL PROTECTED]>
To: "'David Starner'" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, July 02, 2002 11:16 PM
Subject: RE: Can browsers show text? I don't think so!

> http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

Let me see... And if you'd like to copy some text from that page and paste it into some document...?

Stefan
RE: Can browsers show text? I don't think so!
Michael Jansson says: > There are no technical reasons for why css/html4/xhtml can not produce every bit as high quality > as any other page layout format. Sadly this is currently far from the case. HTML/CSS even including CSS3 is far from a professional document publishing format. It doesn't even have center/right/decimal tabs and tab leaders, which virtually all WP systems have. The list of DTP omissions goes on and on. Defining their own XMLs is the direction that WP systems are going in for interchange. XSLT can be used to translate between these XMLs to the extent that the features are translatable. XHTML/CSS is only used as a fallback for browsers. Which isn't to say that XHTML/CSS isn't cool. It is. But currently it's a weak DTP format at best. Murray
RE: Can browsers show text? I don't think so!
See below.

> -Original Message-
> From: David Starner [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 02, 2002 11:50 PM
> To: Michael Jansson; '[EMAIL PROTECTED]'
> Subject: Re: Can browsers show text? I don't think so!

> This is text. Changing fonts is just flash. And frankly, no modern browser has
> any trouble with anything from Latin, Cyrillic, Greek, or CJK; Hebrew, Thai,
> Arabic and any script that doesn't shape or combine is usually supported too.
> I'd say they show text just fine.

Are you saying that browsers do support all languages, as long as you exclude all languages that are not supported? Well ;-)

My point is that it would be great if browsers supported all languages, no matter how complicated the language is. Still, even with languages that do not require shaping, you have problems. For example, a typical Western Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite language here) font on his machine. This is a problem that is solved with CSS 2. Still, there is no widespread support for web fonts in modern browsers. I wonder why?

Being able to manually install fonts is not helpful in many cases either. Mere mortals (you know, "ordinary" people who just want to surf the net) don't know how to install fonts, nor would they know where to find fonts, or even know what they are.

> >(I do not consider solutions where you have
> >to download a 10MB+ language package to see a page in a foreign language.
> >It's not a viable solution.)
>
> So you'd rather download the fonts every time you want to view a page,
> rather than just once? It's not like any one can't afford 10MB of space
> anymore.

You would not need 10MB to show a single web page, or even a full web site. The link below will take you to a web page that shows 500 Japanese characters (courtesy of Morisawa Co Ltd) at a fairly large point size (18px).
The file size of the page and the font data is roughly 20kb (font: <20kb, page: ~1kb) if you are using a popular browser on Mac OS9 or Windows. It would take a modem user ~3-7s to load this page. Size is not an issue w.r.t. web fonts.

http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

This scales up very well, because pages may share font resources. A font with 2000 characters would be 80kb in this case, and would perhaps work for hundreds of pages. Note also that I would only need to download this data _once_ (it would stay in the browser cache), and it would be done without any user interaction (i.e. no manual labour). You click on the link and you see the text. It should not have to be more complicated than that.

> >So what we have today are applications called "web browsers" that are very
> >good at showing images, and animations. They are not very good at showing
> >text, other than unformatted English text.
>
> If you want to nitpick, they aren't that good at showing images;
> look at how modern browsers fail the PNG transparency test one of these days.
> And for most animations, you have to download 10MB+ plugins.

OK, so they are good for nothing then... (just kidding ;-)

> Every web browser since the beginning of time has supported at least bold,
> italics and headings. And HTML has become a very common medium for formatted
> text, and not just for English. Yes, they have failures in complex situations
> that haven't had much work in them; no, not every font has or will have every
> language in it. And if you want Adobe Acrobat, you know where to find it; web
> browsing was never intended to give full control over fonts and display to
> the creator of the documents; it was intended to give control over _meaning_.

I think that pretty much sums up people's expectations of web browsers, which is a shame.
There are no technical reasons why css/html4/xhtml cannot produce every bit as high quality as any other page layout format.

Also, what is it with people and the lack of interest in using fonts? Do people actually think that you only need one font, possibly in bold, italic and regular style? Do they think that other languages, e.g. Chinese, do not use styles? Text should be beautiful to look at too!

Regards,
em2 Solutions
Michael Jansson
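The size and timing figures quoted in this message can be checked with a little arithmetic. The sketch below is not from the message: it assumes nominal modem speeds of 28.8 and 56 kbit/s, takes 1 kb to mean 1024 bytes, and ignores protocol overhead and modem compression.

```python
def download_seconds(size_kb, link_kbit_per_s):
    """Ideal transfer time for size_kb kilobytes (1 kb = 1024 bytes assumed)."""
    bits = size_kb * 1024 * 8
    return bits / (link_kbit_per_s * 1000)


def bytes_per_glyph(font_kb, glyph_count):
    """Average font data per character, under the same 1 kb = 1024 bytes assumption."""
    return font_kb * 1024 / glyph_count


# Assumed modem speeds; the message only says "a modem user".
for speed in (28.8, 56):
    print(f"{speed} kbit/s: {download_seconds(20, speed):.1f} s for 20 kb")

# 500 characters in 20 kb and 2000 characters in 80 kb are the same ratio.
print(bytes_per_glyph(20, 500), bytes_per_glyph(80, 2000))
```

Under these assumptions the 20 kb page takes a little under 3 s at 56 kbit/s and a little under 6 s at 28.8 kbit/s, which is in line with the "~3-7s" estimate, and both font sizes work out to about 41 bytes per character.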
Q: Online multilingual text projects and handling missingchars./variants
I am assembling a list of online multilingual text projects, including online foreign language instruction projects. My current interest is in projects created at or for the university, but is not limited to this category. I was wondering how such real-life projects (if indeed their creators read this list) currently handle (a) missing Unicode characters, and (b) being able to specify needed variants of characters. I'd be very grateful for any input. With many thanks, Deborah Anderson Researcher, Dept. of Linguistics UC Berkeley
Re: ZWJ and Latin Ligatures
On Tuesday, July 2, 2002, at 12:51 PM, Marco Cimarosti wrote:

> The next step could be standardizing the values of the glyph indexes, so
> that the entire "GSUB"/"morx" table can be copied in from a template, and
> type designers can concentrate on drawing the outlines.

The typical approach these days is for the tools that provide advanced layout table support to be keyed to glyph name. Apple's tools allow glyph name, glyph number, or Unicode code point as glyph identifiers. As you say, it makes it possible to cut-and-paste source files and is very handy.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: Can browsers show text? I don't think so!
At 03:39 PM 7/2/02 +0200, Michael Jansson wrote: >Modern browsers know how to show the characters 'A'-'Z' and a few other >characters as long as you don't expect to format the text with a specific >font. You will get into trouble as soon as you want to use a font or >characters from other languages. You may find a solution for some languages >and some fonts on some platforms. Yet again, this is far from claiming that >modern browsers can show text. This is text. Changing fonts is just flash. And frankly, no modern browser has any trouble with anything from Latin, Cyrillic, Greek, or CJK; Hebrew, Thai, Arabic and any script that doesn't shape or combine is usually supported too. I'd say they show text just fine. >(I do not consider solutions where you have >to download a 10MB+ language package to see a page in a foreign language. >It's not a viable solution.) So you'd rather download the fonts every time you want to view a page, rather than just once? It's not like any one can't afford 10MB of space anymore. >So what we have today are applications called "web browsers" that are very >good at showing images, and animations. They are not very good at showing >text, other than unformatted English text. If you want to nitpick, they aren't that good at showing images; look at how modern browsers fail the PNG transparency test one of these days. And for most animations, you have to download 10MB+ plugins. Every web browser since the beginning of time has supported at least bold, italics and headings. And HTML has become a very common medium for formatted text, and not just for English. Yes, they have failures in complex situations that haven't had much work in them; no, not every font has or will have every language in it. And if you want Adobe Acrobat, you know where to find it; web browsing was never intended to give full control over fonts and display to the creator of the documents; it was intended to give control over _meaning_.
Re: ZWJ and Latin Ligatures
At 12:15 -0600 2002-07-02, John H. Jenkins wrote: >On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote: > >> >>>1) If you map directly from multiple characters to a single glyph, you don' >>>t have to include glyphs in your font for all the "pieces" if they're >>>never supposed to appear by themselves. As an extreme example, if I >>>implemented astral character support via ligating surrogate pairs, I'd >>>need to include glyphs for the unpaired surrogates. >> >>More precisely, you need to have glyph *indexes* that are never mapped >>to glyphs. The actual outlines themselves don't need to exist, AFAIK. >> > >True. I tend to avoid that, because if something goes wrong and the >system attempts to actually *display* one of these virtual glyphs, >disaster would ensue. (Dave Opstad and I have had long debates on >the safety of doing this.) I have to confess I don't understand what you are talking about at all. Get me them tools, John! -- Michael Everson *** Everson Typography *** http://www.evertype.com
RE: ZWJ and Latin Ligatures
John Cowan wrote: > More precisely, you need to have glyph *indexes* that are never mapped > to glyphs. The actual outlines themselves don't need to exist, AFAIK. Yes, of course. E.g., I guess that the ZWJ "glyph" can be a pseudo-index which doesn't actually index anything. The next step could be standardizing the values of the glyph indexes, so that the entire "GSUB"/"morx" table can be copied in from a template, and type designers can concentrate on drawing the outlines. :-) _ Marco
Re: Inappropriate Proposals FAQ
At 10:01 AM 7/2/2002 -0400, Suzanne M. Topping wrote: >I have a few ideas for fictional proposals to use as examples (my room >layout idea, and Mark's 3-D Mr. Potato Head representation), but I could >use another one or two if anyone feels creative. The closer to being >believable, the better, I suppose. (An alternative would be to use >real-life proposals, and state why they were not accepted, but I thought >it more politic to keep it fictional...) There was a discussion last year about a symbol to represent pi/2 or pi/4 or something like that. If you want to fictionalize that to some other fraction of a mathematical constant, that might work (e/2 perhaps?) Barry Caplan www.i18n.com
Re: ZWJ and Latin Ligatures
On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote:

>> 1) If you map directly from multiple characters to a single glyph, you
>> don't have to include glyphs in your font for all the "pieces" if they're
>> never supposed to appear by themselves. As an extreme example, if I
>> implemented astral character support via ligating surrogate pairs, I'd
>> need to include glyphs for the unpaired surrogates.
>
> More precisely, you need to have glyph *indexes* that are never mapped
> to glyphs. The actual outlines themselves don't need to exist, AFAIK.

True. I tend to avoid that, because if something goes wrong and the system attempts to actually *display* one of these virtual glyphs, disaster would ensue. (Dave Opstad and I have had long debates on the safety of doing this.)

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: Inappropriate Proposals FAQ
How about symbols from electronics and hydraulics? Schematic symbols.

Wm Seán Glen

- Original Message -
From: Suzanne M. Topping
To: Unicode (E-mail)
Sent: Tuesday, 02 July, 2002 7:01
Subject: Inappropriate Proposals FAQ

I have a few ideas for fictional proposals to use as examples (my room layout idea, and Mark's 3-D Mr. Potato Head representation), but I could use another one or two if anyone feels creative.

Thanks in advance for your input,
Suzanne Topping
BizWonk Inc.
[EMAIL PROTECTED]
Re: Inappropriate Proposals FAQ
At 12:38 -0400 2002-07-02, ÇÎÅZÅZÅZÅZ ÇÎÅZÅZÅZ wrote: >I have a few ideas: > >Fictional scripts that would probably be rejected, such as the >script of the Codex Seraphinianus Certainly not. Tengwar and Cirth are certain to be encoded. The Codex script would probably not be encoded because it occurs in only one manuscript and is undeciphered. -- Michael Everson *** Everson Typography *** http://www.evertype.com
Re: ZWJ and Latin Ligatures
John H. Jenkins scripsit:

> 1) If you map directly from multiple characters to a single glyph, you
> don't have to include glyphs in your font for all the "pieces" if they're
> never supposed to appear by themselves. As an extreme example, if I
> implemented astral character support via ligating surrogate pairs, I'd
> need to include glyphs for the unpaired surrogates.

More precisely, you need to have glyph *indexes* that are never mapped to glyphs. The actual outlines themselves don't need to exist, AFAIK.

--
John Cowan http://www.ccil.org/~cowan [EMAIL PROTECTED]
To say that Bilbo's breath was taken away is no description at all. There are no words left to express his staggerment, since Men changed the language that they learned of elves in the days when all the world was wonderful. --The Hobbit
Re: ZWJ and Latin Ligatures
On Tuesday, July 2, 2002, at 10:55 AM, Marco Cimarosti wrote:

> I mean: isn't this two-step mapping:
>
> code point -> glyph ID
> component glyph ID's -> ligature glyph ID
>
> functionally equivalent to a hypothetical one-step mapping?
>
> component code points -> ligature glyph ID
>
> Am I missing something?

Functionally, the two are equivalent. There are, however, two subtle differences:

1) If you map directly from multiple characters to a single glyph, you don't have to include glyphs in your font for all the "pieces" if they're never supposed to appear by themselves. As an extreme example, if I implemented astral character support via ligating surrogate pairs, I'd need to include glyphs for the unpaired surrogates. As it is, Windows and the Mac *do* support mapping paired surrogates directly to glyphs, so you don't need these extra glyphs which are never seen.

2) A mapping directly from multiple characters to single glyphs expressly makes the process something not to percolate up to the UI. The indirect process means that there are some actions in glyph space which *are* optional and which the user can turn on and off, and others which aren't. In OpenType, this is less of an issue since this was always the case and applications are expected to do the UI work themselves. In AAT, we originally assumed (back in the days of the Technology That Must Not Be Named) that all layout features are optional and can be turned on and off, and that the UI would always reflect the entire suite of available features. We had to rewrite our tools to allow for required actions which cannot be turned off.

Poor Michael is saddled with older versions of our tools which are hard to use and don't let him do this. We're working on getting newer and better ones to him.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
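For readers unfamiliar with the "paired surrogates" mentioned in point 1, the sketch below shows the standard UTF-16 arithmetic that turns a high and a low surrogate into a single supplementary ("astral") code point. The function name `combine_surrogates` is made up for illustration, and nothing here describes how Windows or the Mac actually implement the lookup.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one supplementary code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+10400 is stored in UTF-16 as the pair D801 DC00.
code_point = combine_surrogates(0xD801, 0xDC00)
print(f"U+{code_point:04X}")
```

Once the pair has been combined like this, a font only needs one entry for the resulting code point, so no glyphs for the individual surrogates are required.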
RE: ZWJ and Latin Ligatures
Michael Everson wrote:

> At 09:41 -0600 2002-07-02, John H. Jenkins wrote:
>
> >Alas, but that's technically impossible. Both OT and AAT (I'm not
> >sure about Graphite) require that single characters map to single
> >glyphs, which are then processed.

I am confused by this statement; perhaps some expert in fonts can help me check my understanding.

The OpenType specs published on the Adobe site state that table GSUB has a subtable to handle ligatures ("LookupType 4: Ligature Substitution Subtable": http://partners.adobe.com/asn/developer/opentype/gsub.html#LSF1). It says that "A Ligature Substitution (LigatureSubst) subtable identifies ligature substitutions where a single glyph replaces multiple glyphs" (multiple *glyphs*, not multiple characters).

OK: literally speaking, it is true that OT maps single characters to single glyphs, but then it maps multiple glyphs to ligature glyphs, so what's the difference?

I mean: isn't this two-step mapping:

    code point -> glyph ID
    component glyph ID's -> ligature glyph ID

functionally equivalent to a hypothetical one-step mapping?

    component code points -> ligature glyph ID

Am I missing something?

_ Marco
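The question can be made concrete with a toy model. The sketch below is purely illustrative: the glyph IDs, the three-letter `CMAP`, and the function names are invented, and real GSUB lookups are considerably richer than a pair table. It builds the hypothetical one-step table from the two-step tables and shows that both produce the same glyph run.

```python
# Step 1 table: code point -> glyph ID (glyph IDs are invented).
CMAP = {"f": 10, "i": 11, "l": 12}

# Step 2 table: component glyph IDs -> ligature glyph ID.
GLYPH_LIGATURES = {(10, 11): 50, (10, 12): 51}   # fi, fl

# Hypothetical one-step table, derived by inverting the cmap.
GLYPH_TO_CHAR = {gid: ch for ch, gid in CMAP.items()}
CHAR_LIGATURES = {
    tuple(GLYPH_TO_CHAR[g] for g in pair): lig
    for pair, lig in GLYPH_LIGATURES.items()
}


def two_step(text):
    """Map characters to glyphs first, then ligate in glyph space."""
    glyphs = [CMAP[ch] for ch in text]
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in GLYPH_LIGATURES:
            out.append(GLYPH_LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def one_step(text):
    """Ligate in character space, falling back to the cmap for single characters."""
    out, i = [], 0
    while i < len(text):
        pair = tuple(text[i:i + 2])
        if pair in CHAR_LIGATURES:
            out.append(CHAR_LIGATURES[pair])
            i += 2
        else:
            out.append(CMAP[text[i]])
            i += 1
    return out
```

In this model the results agree for every input, but the two-step version needs a `CMAP` entry (a glyph index) for every component character, even one that only ever appears inside a ligature.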
Re: Inappropriate Proposals FAQ
I have a few ideas:

Fictional scripts that would probably be rejected, such as the script of the Codex Seraphinianus

A "fictional" Hanzi (specifically, a Hanzi made up of the "woman" radical plus the character for "walk"), which I am attaching a crude image of. The proposer either (1) used this character in a novel once (or has seen it used in a novel), or (2) he wants to use it as a symbol for the length unit of the new system of measurement he invented.
Here is the attachment
Here is an image of the "fake Hanzi" I described in my last E-mail.

fakehanzi.bmp
Description: Windows bitmap
Re: ZWJ and Latin Ligatures
On Tuesday, July 2, 2002, at 09:49 AM, Michael Everson wrote: > At 09:41 -0600 2002-07-02, John H. Jenkins wrote: > >> Alas, but that's technically impossible. Both OT and AAT (I'm not sure >> about Graphite) require that single characters map to single glyphs, >> which are then processed. > > Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? Surely > that is a sequence of characters mapping to a single glyph. > Same process. In OT, of course, you could count on the glyph being prenormalized (but this only works for stuff already in Unicode), or you could use the GPOS table to properly form the accented form on-the-fly. But neither technology allows the decomposed sequence to be mapped directly to a single glyph. > Just goes to show that I don't make proper Unicode fonts yet because the > tools just aren't up to snuff. > We're working on it. :-) >> (In OT, of course, you are also supposed to do some preprocessing in >> character space, but that doesn't solve this problem.) It would be nice >> to have a cmap format which maps multiple characters to single glyphs >> initially. > > I always thought there was. Now I'm really confused as to how I would > make a complex Indic syllable. > Same sort of thing. You put the glyph in the font and the instructions for what sequence forms it in the GSUB or morx table. == John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://homepage.mac.com/jenkins/
Re: ZWJ and Latin Ligatures
On Tuesday, July 2, 2002, at 06:51 AM, Michael Everson wrote:

> That is absolutely true. I have never argued that the only way to turn
> ligatures on or off is in plain text. I saw that there were difficult
> edge cases and sought blessing for the ZWJ/ZWNJ mechanism to handle them,
> and won the day. But it would certainly be my view that those should
> only be used where predictable ligation does not occur. A Runic font
> which had an AAT/OpenType/Graphite ligatures-on mechanism would, in my
> view, be inappropriate, because ligation is unusual in Runic, never the
> norm, and should only be used on a case-by-case basis. Runic fonts should
> have the ZWJ pairs encoded in the glyph tables.

Alas, but that's technically impossible. Both OT and AAT (I'm not sure about Graphite) require that single characters map to single glyphs, which are then processed. (In OT, of course, you are also supposed to do some preprocessing in character space, but that doesn't solve this problem.) It would be nice to have a cmap format which maps multiple characters to single glyphs initially.

The way we deal with this is to have the ligatures with the ZWJ inserted as part of a ligature table which is on by default and which isn't revealed to the UI so that the user can't turn them off.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: ZWJ and Latin Ligatures
At 09:41 -0600 2002-07-02, John H. Jenkins wrote: >Alas, but that's technically impossible. Both OT and AAT (I'm not >sure about Graphite) require that single characters map to single >glyphs, which are then processed. Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? Surely that is a sequence of characters mapping to a single glyph. Just goes to show that I don't make proper Unicode fonts yet because the tools just aren't up to snuff. >(In OT, of course, you are also supposed to do some preprocessing in >character space, but that doesn't solve this problem.) It would be >nice to have a cmap format which maps multiple characters to single >glyphs initially. I always thought there was. Now I'm really confused as to how I would make a complex Indic syllable. >The way we deal with this is to have the ligatures with the ZWJ >inserted as part of a ligature table which is on by default and >which isn't revealed to the UI so that the user can't turn them off. I am not sure I understand, but then I haven't been able to make use of the AAT ligature tables yet. ;-) -- Michael Everson *** Everson Typography *** http://www.evertype.com
RE: Inappropriate Proposals FAQ
Suzanne M. Topping wrote:

> I have a few ideas for fictional proposals to use as examples (my room
> layout idea, and Mark's 3-D Mr. Potato Head representation),
> but I could use another one or two if anyone feels creative.

Today I don't feel very creative, perhaps because deliberately inventing bad ideas does not appeal too much to my creativeness. :-)

But perhaps I have some suggestions for the less creative part of the FAQ, which is: listing the existing policies for excluding some classes of proposals. In my understanding, a few such policies are:

- No precomposed ligatures which can be encoded using a sequence of existing characters (possibly joined by ZWJ's);

- No precomposed "accented characters" which can be composed using an existing character and one or more existing combining diacritics;

- No clones of existing characters whose sole purpose is making a *logical* differentiation from some existing characters (e.g., hex digits looking identical to existing characters "0..9" and "A...F"; or a symbol for "meter" looking identical to Latin "m");

- No clones of existing characters whose sole purpose is making a *graphical* differentiation from some existing characters (e.g., a Serbian letter "t", disunified from Russian on the basis that italics looks different in the two languages);

- No presentation glyphs for shapes that can already be obtained using regular characters in conjunction with ZWJ or ZWNJ.

_ Marco
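The second policy in the list above can be illustrated with Python's standard `unicodedata` module. This is a small sketch, not part of the proposed FAQ text: it shows that a base letter followed by a combining diacritic is canonically equivalent to an existing precomposed character where one exists, and that a combination with no precomposed form is simply left as a sequence. The choice of "q" with an acute accent as the second example is mine.

```python
import unicodedata

# A + U+0301 COMBINING ACUTE ACCENT composes to the existing U+00C1.
decomposed = "A\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print([f"U+{ord(c):04X}" for c in composed])

# Decomposing the precomposed letter gives the two-character sequence back.
roundtrip = unicodedata.normalize("NFD", "\u00C1")
print([f"U+{ord(c):04X}" for c in roundtrip])

# "q" + acute has no precomposed form, so NFC leaves it as two code points.
no_precomposed = unicodedata.normalize("NFC", "q\u0301")
print(len(no_precomposed))
```

Since the sequence already represents the accented letter, a new precomposed code point would add nothing that the base-plus-diacritic sequence cannot already express.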
Re: Inappropriate Proposals FAQ
But would not using rejected proposals (as well as the fictional ones) be closer to the truth and therefore more accurate?

John

> from:"Suzanne M. Topping" <[EMAIL PROTECTED]>
> date:Tue, 02 Jul 2002 15:01:16
> to: [EMAIL PROTECTED]
> subject: Re: Inappropriate Proposals FAQ
>
> (An alternative would be to use
> real-life proposals, and state why they were not accepted, but I thought
> it more politic to keep it fictional...)
Inappropriate Proposals FAQ
As no good deed goes unpunished, my suggestion re. an FAQ entry regarding inappropriate candidates for encoding resulted in my being asked to begin a draft.

I see the need for perhaps two entries: one which states clearly what Unicode is NOT, and another which lists a few examples of inappropriate proposals and why they would not be considered. This section would probably refer to the "what Unicode isn't" entry for support of the "why"s.

I have a few ideas for fictional proposals to use as examples (my room layout idea, and Mark's 3-D Mr. Potato Head representation), but I could use another one or two if anyone feels creative. The closer to being believable, the better, I suppose. (An alternative would be to use real-life proposals, and state why they were not accepted, but I thought it more politic to keep it fictional...)

I'm also looking for key points to include in the "what Unicode isn't" section, and would appreciate input. I'm particularly looking for issues that have created ongoing repetitive arguments, since the goal of the FAQ entries is to help eliminate them.

Thanks in advance for your input,

Suzanne Topping
BizWonk Inc.
[EMAIL PROTECTED]
Can browsers show text? I don't think so!
Postings on this list have recently touched on the topic of using various languages in web pages. Comments have been made on the use of embedded fonts (eot and pfr), as well as the lack of support for these font formats in popular browsers. This is a topic which I am very enthusiastic about, so I cannot help but add a few comments myself.

Let me start by posing a question: "Can modern browsers show text?" Specifically, can they show text of any language and formatting on all platforms? I have to say: no, they cannot (possibly with the exception of the browser Nophus).

The problem with browsers today is that although they may support Unicode encoding schemes (e.g. UTF-8), they typically rely on the platform/OS they run on to show text. Platforms without complete Unicode 3.x support will thus not be able to show text correctly. For example, IE6 (or any other modern browser) supports UTF-8, but Win98 does not support Unicode 3.x. IE6 is thus not able to show all Unicode text on Win98. You may of course be able to show some Unicode text on some platforms. This is far from claiming that a browser supports Unicode, though. At most, you may claim that a browser on a particular platform supports some part of Unicode.

Furthermore, even if a browser knew how to render text (e.g. knew about the nitty-gritty, language-specific details of glyph ordering, positioning and shaping), you need something called a font to show text. Fonts can be provided as web resources through CSS 2, through a construct known as @font-face rules. However, there is no browser that fully supports CSS 2 today, and in particular @font-face rules. There are browsers that support @font-face on some platforms (e.g. for eot files on Windows). Again, this is far from claiming that a browser supports fonts on the web.

Modern browsers know how to show the characters 'A'-'Z' and a few other characters, as long as you don't expect to format the text with a specific font. You will get into trouble as soon as you want to use a font or characters from other languages. You may find a solution for some languages and some fonts on some platforms. Yet again, this is far from claiming that modern browsers can show text. (I do not consider solutions where you have to download a 10MB+ language package to see a page in a foreign language to be viable.)

So what we have today are applications called "web browsers" that are very good at showing images and animations. They are not very good at showing text, other than unformatted English text.

Fortunately, there are third-party solutions to work around some of the problems I mention above. These include Bitstream's "FontPlayer" (for pfr fonts in IE 5.x and Nav 4.x on Windows), MS Typography's WEFT tools (for eot fonts in IE 5.x on Windows), and our own FAIRY server solution (for eot fonts and language support in IE 5.x, Nav 4.x, Nav 6.x and Opera 5.x on Mac and Win).

I do admire the work that people have done in creating quite outstanding web browsers through the years, sometimes with no other reward than people's appreciation. I only wish that time were spent on supporting text, and not just flashy content.

Regards,
em2 Solutions
Michael Jansson
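The distinction drawn above between supporting an encoding and being able to show text can be sketched in a few lines of Python. This is a toy model under stated assumptions: `BASIC_LATIN_ONLY` is a hypothetical stand-in for the glyph coverage of a platform's fonts, and both function names are invented for the example; no real browser or OS API is being described.

```python
# Hypothetical coverage: a platform whose fonts only cover printable Basic Latin.
BASIC_LATIN_ONLY = set(range(0x20, 0x7F))

def decode_page(raw: bytes) -> str:
    """Decode UTF-8 bytes; this step needs no font or OS text support."""
    return raw.decode("utf-8")

def uncovered_characters(text, covered):
    """Return the characters of text that the (hypothetical) platform cannot draw."""
    return [ch for ch in text if ord(ch) not in covered]

# A page containing "Text: " followed by the same word in Cyrillic.
page = "Text: \u0442\u0435\u043a\u0441\u0442".encode("utf-8")

text = decode_page(page)                                 # decoding succeeds
missing = uncovered_characters(text, BASIC_LATIN_ONLY)   # display would not

print(len(text), len(missing))
```

Decoding succeeds for every character, yet five of them have no glyph in the assumed coverage set, which mirrors the claim that UTF-8 support alone does not amount to showing Unicode text.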
Re: ZWJ and Latin Ligatures
At 11:00 -0600 2002-07-01, John H. Jenkins wrote:

>I guess one thing that's frustrating for me personally in this
>perennial discussion is the creation of this false dichotomy, that
>ligation control either *must* be in plain text or *must* be
>expressly forbidden in plain text. I would agree, Michael, that
>your arguments that some degree of ligation control belongs in plain
>text were unanswerable. You did a good job there. But at the same
>time, I've never heard you argue that the only way to turn ligatures
>on or off is in plain text.

That is absolutely true. I have never argued that the only way to turn ligatures on or off is in plain text. I saw that there were difficult edge cases and sought blessing for the ZWJ/ZWNJ mechanism to handle them, and won the day. But it would certainly be my view that those should only be used where predictable ligation does not occur. A Runic font which had an AAT/OpenType/Graphite ligatures-on mechanism would, in my view, be inappropriate, because ligation is unusual in Runic, never the norm, and should only be used on a case-by-case basis. Runic fonts should have the ZWJ pairs encoded in the glyph tables.

>And under no circumstances should new Latin ligatures be added to Unicode.

I agree.

I wonder if it wouldn't be useful at some stage for me to pick the best bits out of my papers and do them up as a Unicode Technical Note.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com
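For readers unfamiliar with the ZWJ/ZWNJ mechanism mentioned above, here is a minimal Python sketch of what it looks like in plain text: U+200D ZERO WIDTH JOINER placed between two letters asks for a ligature, and U+200C ZERO WIDTH NON-JOINER asks that none be formed. The function names are invented for this example, and whether a ligature is actually displayed depends on the font and renderer, which this sketch does not model.

```python
ZWJ = "\u200d"    # ZERO WIDTH JOINER: requests a ligature between neighbours
ZWNJ = "\u200c"   # ZERO WIDTH NON-JOINER: requests that no ligature be formed

def request_ligature(left: str, right: str) -> str:
    """Mark one specific pair as ligated, case by case."""
    return left + ZWJ + right

def forbid_ligature(left: str, right: str) -> str:
    """Mark one specific pair as explicitly not ligated."""
    return left + ZWNJ + right

def strip_join_controls(text: str) -> str:
    """Remove both controls, recovering the underlying letters."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")

ligated = request_ligature("c", "t")     # three code points: c, ZWJ, t
separated = forbid_ligature("f", "i")    # three code points: f, ZWNJ, i

print(len(ligated), len(separated))
print(strip_join_controls(ligated + separated))
```

Because the controls are ordinary characters in the text stream, the request travels with the plain text and applies only to the pair it sits between, which suits scripts where ligation is the exception.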
Re: Radicals in CNS 11643-1992, Plane 1, Rows 7,8,9
"John H. Jenkins" <[EMAIL PROTECTED]> wrote: >Use the KangXi radicals in the KangXi radical block (U+2Fxx). Hmm, that is pretty obvious. I should have noted that myself. Thanks! --Torsten
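A small Python sketch of the suggestion above, assuming the 214 KangXi radicals occupy U+2F00 through U+2FD5 in radical-number order. The helper names `kangxi_radical` and `unified_equivalent` are illustrative only, and no CNS 11643 mapping table is implied here.

```python
import unicodedata

KANGXI_FIRST = 0x2F00   # KANGXI RADICAL ONE
KANGXI_COUNT = 214      # number of KangXi radicals

def kangxi_radical(number: int) -> str:
    """Return the KangXi radical character for a radical number (1 to 214)."""
    if not 1 <= number <= KANGXI_COUNT:
        raise ValueError("radical number must be between 1 and 214")
    return chr(KANGXI_FIRST + number - 1)

def unified_equivalent(radical: str) -> str:
    """Return the ideograph the radical folds to under NFKC normalization."""
    return unicodedata.normalize("NFKC", radical)

first = kangxi_radical(1)
last = kangxi_radical(214)

print(unicodedata.name(first))                 # KANGXI RADICAL ONE
print(hex(ord(last)))                          # 0x2fd5
print(hex(ord(unified_equivalent(first))))     # 0x4e00
```

The radical characters stay distinct from the unified ideographs they resemble unless compatibility normalization is applied, so text that needs to keep them as radicals should avoid NFKC.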