Re: [OT] What happened to the OpenType list?
The OpenType list is still active, although there has not been much discussion for the past couple of weeks. See the Microsoft Typography website -- www.microsoft.com/typography -- for subscription information. John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] Type is something that you can pick up and hold in your hand. - Harry Carter
Re: GB18030
On Wed, Sep 26, 2001 at 06:17:15PM -0700, Yung-Fong Tang wrote:
> Sure Unicode defined those planes, but defining planes without defining the characters in it mean not too much to people. How can you implement case conversion, property mapping without knowing what is inside.

How do you do that for BMP characters? There's a whole lot you can do without knowing the identity of a character. You can draw the glyph from a font, which will suffice for a lot of purposes.

> In particular, DOES GB18030 define code point to code point mapping (beyond BMP) between Unicode? Unless you can said that is YES and show me the specification how to map between them, there are no way people can implement code set conversion between GB18030 and Unicode.

Have you looked for the specification? Or are you just going to complain on the list? According to GNU libc, the algorithm for converting a Unicode character ch outside the BMP to the four GB18030 bytes outptr (1 .. 4) is:

    idx := ch + 16#1E248#;
    outptr (4) := (idx mod 10) + 16#30#;
    idx := idx / 10;
    outptr (3) := (idx mod 126) + 16#81#;
    idx := idx / 126;
    outptr (2) := (idx mod 10) + 16#30#;
    outptr (1) := (idx / 10) + 16#81#;

--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb, we'll still be freakin' friends. - "Freakin' Friends"
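For anyone who wants to try the arithmetic from the message above, here is a minimal Python sketch of it. It is not the glibc source: the function name `gb18030_from_supplementary` is made up for illustration, and the only things taken from the message are the offset 16#1E248# and the 10 / 126 / 10 digit structure, with the remainder of each division supplying the byte. Python's built-in `gb18030` codec is used as an independent cross-check.

```python
def gb18030_from_supplementary(cp: int) -> bytes:
    """Four-byte GB18030 form of a code point in U+10000..U+10FFFF."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points outside the BMP are handled here")
    idx = cp + 0x1E248              # linear index into the four-byte space
    idx, b4 = divmod(idx, 10)       # fourth byte, offset from 0x30
    idx, b3 = divmod(idx, 126)      # third byte, offset from 0x81
    b1, b2 = divmod(idx, 10)        # second byte from 0x30, first from 0x81
    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])


if __name__ == "__main__":
    for cp in (0x10000, 0x4FF3A, 0x10FFFF):
        mine = gb18030_from_supplementary(cp)
        # The standard-library codec serves as the reference answer.
        assert mine == chr(cp).encode("gb18030")
        print(f"U+{cp:04X} -> {mine.hex()}")
```

Note that U+4FF3A, which has no character assigned, converts just as well as an assigned code point; the mapping is pure arithmetic on the code point value.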
Re: GB18030
On Wed, Sep 26, 2001 at 06:19:51PM -0700, Yung-Fong Tang wrote: > how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ? How do you support tolower (U+0220) without knowing what U+0220 is? But conforming to the Unicode Standard still means that you don't mess with the character if you don't have to (C10). GB18030, if it claims to support Unicode, needs to round-trip both characters. -- David Starner - [EMAIL PROTECTED] Pointless website: http://dvdeug.dhis.org When the aliens come, when the deathrays hum, when the bombers bomb, we'll still be freakin' friends. - "Freakin' Friends"
Re: GB18030
From: "Geoffrey Waigh" <[EMAIL PROTECTED]>
> It shouldn't require honest-to-goodness we-weren't-kidding see-here's-one-defined-now characters

In many cases, it did.

> for developers to slap themselves on the head

They did -- and they are slapping others around them, too.

> and start developing support for these things.

Better late than never, I guess. :-)

MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/
Re: GB18030
Frank,

> Sure Unicode defined those planes, but defining planes without defining the characters in it mean not too much to people.

Which is exactly the complacency that Doug Ewell was warning about. Too many people assumed that even though UTF-16 was defined in Unicode 2.0 they could ignore it indefinitely, since no encoded characters had been assigned on the other planes. But if they weren't getting prepared in their code, they have been left flatfooted now that suddenly there *are* 48,000 or so characters defined on the other planes.

> How can you implement case conversion, property mapping without knowing what is inside.

This is a fundamental misconception about Unicode (and character encodings in general) that unfortunately seems to be spreading. There are differences between operations on *characters* (such as case conversion), which obviously require the characters themselves to be defined to make any sense, and operations on *code points* (such as UTF-8 <--> UTF-16 conversion), which make no reference to characters. Many programmers get hopelessly confused about this distinction, apparently, since the APIs just pass around the code points associated with the characters, and not the encoded characters per se. (And this is a disease that was inflicted on the world 23 years ago when Kernighan and Ritchie published a certain language that unfortunately chose to call its 8-bit numeric data type a "char".)

> In particular, DOES GB18030 define code point to code point mapping (beyond BMP) between Unicode?

Yes. Absolutely it does. It is spelled out in the standard itself. GB 18030 <--> Unicode conversion is basically like a big UTF, with an enormous table for all the GBK part of the encoding, and a bunch of offset ranges to convert all the other code points.

> Unless you can said that is YES and show me the specification how to map between them, there are no way people can implement code set conversion between GB18030 and Unicode.

http://www-106.ibm.com/developerworks/library/u-china.html

Markus Scherer's excellent documentation of GB 18030, with code snippets and a pointer to a complete ICU implementation.

> That question is not whether they should define the relationship or not, but have they defined it yet.

They have.

--Ken
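The code point versus character distinction drawn above can be made concrete with a small Python sketch. The helper names `utf16_units` and `utf8_bytes` are invented for this illustration; they do plain arithmetic on the code point value and never consult a character database, so they work for U+4FF3A even though nothing is assigned there. Only the last two lines ask a character-level question, and that is the only place where the Unicode data shipped with Python matters.

```python
import unicodedata


def utf16_units(cp: int) -> list:
    """UTF-16 code units for one code point (surrogate inputs not checked)."""
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def utf8_bytes(cp: int) -> bytes:
    """UTF-8 bytes for one code point (surrogate inputs not checked)."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    cp = 0x4FF3A
    # Code point operations: no knowledge of "what is inside" is needed.
    print([hex(u) for u in utf16_units(cp)], utf8_bytes(cp).hex())
    # Character operations: these depend on the character database.
    print(unicodedata.category(chr(cp)))   # general category lookup
    print(chr(cp).lower() == chr(cp))      # case mapping lookup
```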
Re: GB18030
On Wed, 26 Sep 2001, Yung-Fong Tang wrote:
> how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ?

With a data table. One set of debugged code that handles surrogates, composing characters, bidirectionality etc. coupled with a datafile that gets upgraded with each release of Unicode.

How many years does it take to implement some of these concepts? It shouldn't require honest-to-goodness we-weren't-kidding see-here's-one-defined-now characters for developers to slap themselves on the head and start developing support for these things.

Geoffrey
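The "data table" answer can be sketched in a few lines of Python. Everything named here is hypothetical: `LOWER_V31` and `LOWER_V32` are tiny hand-written stand-ins for tables that a real implementation would generate from the Unicode Character Database, and `to_lower` is an invented helper. The point is only that the code stays the same while the data file is upgraded, and that a code point missing from the table passes through untouched.

```python
# Hypothetical excerpts of lowercase mappings, keyed by code point.
# A real table would be generated from the Unicode data files.
LOWER_V31 = {
    0x0041: 0x0061,    # LATIN CAPITAL LETTER A
    0x0391: 0x03B1,    # GREEK CAPITAL LETTER ALPHA
    0x10400: 0x10428,  # DESERET CAPITAL LETTER LONG I (outside the BMP)
}

# A later data release adds an entry; the code below does not change.
LOWER_V32 = dict(LOWER_V31)
LOWER_V32[0x0220] = 0x019E  # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG


def to_lower(cp: int, table: dict) -> int:
    """Map a code point to lowercase, leaving unknown code points alone."""
    return table.get(cp, cp)


if __name__ == "__main__":
    for cp in (0x0041, 0x0220, 0x10400, 0x4FF3A):
        old, new = to_lower(cp, LOWER_V31), to_lower(cp, LOWER_V32)
        print(f"U+{cp:04X}: old data -> U+{old:04X}, new data -> U+{new:04X}")
```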
Re: GB18030
David Starner wrote:
> On Mon, Sep 24, 2001 at 06:18:19PM -0700, Yung-Fong Tang wrote:
> > Markus Scherer wrote:
> > > Correction: "to encode _all_ of Unicode", not just "all Unicode BMP" - GB 18030 covers all 17 planes, not just the BMP.
> >
> > Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes? I don't think so, since Unicode have not define them yet, right ?

Sure Unicode defined those planes, but defining planes without defining the characters in it mean not too much to people. How can you implement case conversion, property mapping without knowing what is inside. In particular, DOES GB18030 define code point to code point mapping (beyond BMP) between Unicode? Unless you can said that is YES and show me the specification how to map between them, there are no way people can implement code set conversion between GB18030 and Unicode.

> Unicode defined all the planes, a long long time ago. It's added characters for 3 of them - Plane 1 (basically the overflow area for the non-CJK part of the BMP), Plane 2 (more ideographs) and Plane 14 (special tag characters). IIRC, GB18030 does map the non-BMP area.
> Why wouldn't GB18030 define the relationship between itself and the non-BMP planes? It's needed to properly handle Unicode (since extra Private Use planes sit way out there), now and in the future, and it takes less work to do it now than hack it on later.

That question is not whether they should define the relationship or not, but have they defined it yet.

> --
> David Starner - [EMAIL PROTECTED]
> Pointless website: http://dvdeug.dhis.org
> When the aliens come, when the deathrays hum, when the bombers bomb, we'll still be freakin' friends. - "Freakin' Friends"
Re: GB18030
Do you know where I can get the mapping table between GB18030 and Planes 1 to 16? I can only get the mapping between Plane 0 and GB18030. Tom Emerson wrote: > Yung-Fong Tang writes: > > Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 > > planes? I don't think so, since Unicode have not define them yet, > > right ? > > Sure it does. We know what the code points are, even if they don't > have characters assigned to them yet. This allows GB18030 to support > future versions of Unicode without having to undergo modification. > > And yes, it does support the characters added in Planes 1 and 2, and > the language tags in Plane 14. > > -tree > > -- > Tom Emerson Basis Technology Corp. > Sr. Sinostringologist http://www.basistech.com > "Beware the lollipop of mediocrity: lick it once and you suck forever"
Re: GB18030
how can you implement tolower(U+4ff3a) without knowing what U+4ff3a is ? [EMAIL PROTECTED] wrote: > In a message dated 2001-09-24 20:50:25 Pacific Daylight Time, > [EMAIL PROTECTED] writes: > > >> Does GB18030 DEFINED the mapping between GB18030 and the rest of 11 planes? > >> I don't think so, since Unicode have not define them yet, right ? > > > > Unicode defined all the planes, a long long time ago. It's added > > characters for 3 of them - Plane 1 (basically the overflow area for the > > non-CJK part of the BMP), Plane 2 (more ideographs) and Plane 14 > > (special tag characters). > > David's absolutely right. This is another common misconception, about > Unicode "not defining" the code space unless characters are actually assigned > to all the code points. > > This kind of thinking led, in part, to all the complacency on the part of > database vendors and others concerning the need to support surrogate code > points. They thought that just because no characters had YET been assigned > to non-BMP code points, they could safely ignore the whole issue of surrogate > processing. Then, when non-BMP characters became a reality, we began to see > kludges like CESU-8. > > -Doug Ewell > Fullerton, California
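For readers who have not met CESU-8, the following Python sketch shows how it differs from UTF-8 for a character outside the BMP. The function name `cesu8` is invented for this example and the code is only an illustration of the byte layout, not a reference implementation: each UTF-16 surrogate is written out as its own three-byte sequence, where UTF-8 would use a single four-byte sequence.

```python
def cesu8(text: str) -> bytes:
    """Encode text the CESU-8 way: supplementary code points become two
    three-byte sequences, one per UTF-16 surrogate."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            out += ch.encode("utf-8")          # same bytes as UTF-8
        else:
            v = cp - 0x10000
            for unit in (0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)):
                # "surrogatepass" lets Python emit the bytes for a
                # lone surrogate, which strict UTF-8 forbids.
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


if __name__ == "__main__":
    sample = "\U00010000"
    print("UTF-8 :", sample.encode("utf-8").hex())
    print("CESU-8:", cesu8(sample).hex())
```

Software that handled surrogates from the start has no need for the second form, which is the point being made about complacency.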
RE: DerivedAge.txt
> At the request of someone working with ICU, I regenerated a derived file that shows the "age" of Unicode characters -- when they came into Unicode. Does anyone think this might be useful to have in the UCD?

It is definitely useful information that could go into UNIDATA. Here is a good use for it (and my reason for asking Mark to regenerate it for me): when one uses a library such as ICU that manipulates 3.1 data but wants to store some data in a database that won't like anything after 2.x. Using this, one can validate data before sending it to the database as needed.

It doesn't necessarily have to get into the UCD, except if it helps me make a smaller change to ICU to support the version as a character property ;-)

YA
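The validation use described above might look roughly like this in Python. This is a sketch under assumptions: `SAMPLE` is a few hand-written lines in a `range ; version` shape rather than the real DerivedAge.txt, and `parse_ages` and `too_new` are invented names. A real check would load the full file for the Unicode version in use.

```python
# Hand-written excerpt in a "range ; version" layout (not the real file).
SAMPLE = """\
0041..005A    ; 1.1   # LATIN CAPITAL LETTER A..Z
20AC          ; 2.1   # EURO SIGN
10300..1031E  ; 3.1   # OLD ITALIC letters
"""


def parse_ages(text: str) -> list:
    """Return (first, last, (major, minor)) tuples from age data lines."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        span, version = (part.strip() for part in line.split(";"))
        first, _, last = span.partition("..")
        major, minor = version.split(".")
        ranges.append((int(first, 16), int(last or first, 16),
                       (int(major), int(minor))))
    return ranges


def too_new(text: str, ranges: list, max_version: tuple) -> list:
    """Characters whose age is later than max_version, or not listed."""
    rejected = []
    for ch in text:
        cp = ord(ch)
        age = next((v for lo, hi, v in ranges if lo <= cp <= hi), None)
        if age is None or age > max_version:
            rejected.append(ch)
    return rejected


if __name__ == "__main__":
    ages = parse_ages(SAMPLE)
    # Anything returned here would be held back from a 2.1-only database.
    print(too_new("A\u20ac\U00010300", ages, (2, 1)))
```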
RE: Re: A pun - will this work?
> From: Kenneth Whistler [mailto:[EMAIL PROTECTED]] > Sent: Wednesday, September 26, 2001 11:34 AM > > Actually, if he's half Jamaican, I think you have to > say "Go mon", > > which is also the Japanese for 50,000, yes? > > No, actually, it is Japanese for "5th question", although > that seems to be only your first question. goman is Japanese for > 50,000. Hmmm - this illustrates an interesting problem in linguistics. I was using English semantics for the letters "a" and "o", whereas Ken was using Romaji semantics. Maybe I should have written it in IPA - except I don't know IPA. Such is life, /|/|ike
Re: Egyptian Transliteration Characters
Is this the same Unicode that encodes characters and not glyphs?

じゅういっちゃん (Juuitchan)
Well, I guess what you say is true, I could never be the right kind of girl for you, I could never be your woman - White Town

--- Original Message ---
From: Mark Davis <[EMAIL PROTECTED]>;
To: [EMAIL PROTECTED]; Michael Everson <[EMAIL PROTECTED]>;
Cc: [EMAIL PROTECTED];
Date: 01/09/26 16:33
Subject: Re: Egyptian Transliteration Characters

>For
>
>1. LATIN CAPITAL LETTER EGYPTOLOGICAL YOD
>LATIN SMALL LETTER EGYPTOLOGICAL YOD
>2. LATIN CAPITAL LETTER EGYPTOLOGICAL AYIN
>LATIN SMALL LETTER EGYPTOLOGICAL AYIN
>
>I strongly suspect that current diacritics (for 1) and modifier letters (for 2) are similar enough in shape to what is required that they can be used. Are there any other characters used by Egyptologist that are so close in shape to ỉ and ʻ or ʿ that they cannot be used?
Re: _ÿpënïdïäërïsäbövë
Hi Rick,

RM> The hyphen & minus with umlaut exemples below look GREAT on the system I'm
RM> running right now. The umlauts are not too high, not too low, but just
RM> right. And they are perfectly centered. Unicode didn't do that; the
RM> software did it.

What ARE you running, then? With me, it looks blargh.

Greetings
Philipp
mailto:[EMAIL PROTECTED]
__
Miten tämä vaikuttaa? - Tappaa. [Kaurismäki]
Re: _ÿpënïdïäërïsäbövë
Here we go again... Before everyone goes off and starts blaming Unicode for bad rendering...

When you render a combining character sequence and it "doesn't look right", that is not the fault of the Unicode Standard; it is the fault of your font and/or rendering software (and the people who designed them). So please don't blame Unicode. A decent font rendered with decent software should produce decent results for combining character sequences. And when it _does_ produce decent results, the Unicode Standard can't take credit for it.

The hyphen & minus with umlaut examples below look GREAT on the system I'm running right now. The umlauts are not too high, not too low, but just right. And they are perfectly centered. Unicode didn't do that; the software did it.

Rick

> > I think that was David's point, that these things are always possible using combining characters, and the argument "but it's easier with a precomposed character" doesn't stand up to the concerns about proliferation and normalization.
>
> It doesn't look correct either:
>
> -̈ –̈ —̈
>
> In the first case, it's too far to left. In the last case it's too far to the right. In all three cases it's too far high above the hyphens (at least in the font I'm displaying this message with).
RE: Re: A pun - will this work?
Mike,

> > From: Kenneth Whistler <[EMAIL PROTECTED]>;
> >
> > >Go man!
>
> Actually, if he's half Jamaican, I think you have to say "Go mon", which is also the Japanese for 50,000, yes?

No, actually, it is Japanese for "5th question", although that seems to be only your first question. goman is Japanese for 50,000.

But at this point, what I think you two fellers, and perhaps the Jamaican, too, should say to the list is gomen, for continuing this farce.

--Ken

> /|/|ike
RE: a joke- with no typos or end in sight
Tex, > > ok i'll quit > I figured that you would drag some GIFTS (Poison) from your MIST (Manure) ridden mind. Carl
Ḧÿp̈ḧën̈ ̈ẅïẗḧ ̈d̈ïäër̈ïs̈ ̈äb̈öv̈ë
- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: 26 September 2001 17:20
Subject: Re: Egyptian Transliteration Characters

> In a message dated 2001-09-26 8:09:18 Pacific Daylight Time, [EMAIL PROTECTED] writes:
>
> >> The problem is, I have a couple of German texts that I plan to transcribe, where all I need is HYPHEN WITH DIARESIS.
> >
> > So, you type HYPHEN or EN DASH and then COMBINING DIAERESIS ABOVE.
>
> I think that was David's point, that these things are always possible using combining characters, and the argument "but it's easier with a precomposed character" doesn't stand up to the concerns about proliferation and normalization.

It doesn't look correct either:

-̈ –̈ —̈

In the first case, it's too far to the left. In the last case it's too far to the right. In all three cases it's too high above the hyphens (at least in the font I'm displaying this message with).

Stefan

(äb̈c̈d̈ëf̈g̈ḧïj̈k̈l̈m̈n̈öp̈q̈r̈s̈ẗüv̈ẅẍÿz̈å̈ä̈ö̈... diaeresis above the last three letters in the Swedish alphabet, åäö, doesn't work very well ;))
RE: Re: A pun - will this work?
> From: Kenneth Whistler <[EMAIL PROTECTED]>;
> Date: 01/09/26 2:23
>
> >Go man!

Actually, if he's half Jamaican, I think you have to say "Go mon", which is also the Japanese for 50,000, yes?

/|/|ike
Re: Egyptian Transliteration Characters
For 1. LATIN CAPITAL LETTER EGYPTOLOGICAL YOD LATIN SMALL LETTER EGYPTOLOGICAL YOD 2. LATIN CAPITAL LETTER EGYPTOLOGICAL AYIN LATIN SMALL LETTER EGYPTOLOGICAL AYIN I strongly suspect that current diacritics (for 1) and modifier letters (for 2) are similar enough in shape to what is required that they can be used. Are there any other characters used by Egyptologist that are so close in shape to ỉ and ʻ or ʿ that they cannot be used? Mark — Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης [http://www.macchiato.com] - Original Message - From: "Michael Everson" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Wednesday, September 26, 2001 7:50 AM Subject: Re: Egyptian Transliteration Characters > At 07:20 -0700 2001-09-26, Mark Davis wrote: > > >2. something that looks like a right half ring with a tail egyptologists > >have represented it with something that looks like two right half rings > >stacked on top of each other. > > > >3. a capital and small glottal stop and reversed glottal stop > > > >For (2), (3), we would need a submission with documentation of usage. We do > >add capital/small versions of characters when there is sufficient evidence > >of their usage. This happens, for example, when an IPA is pressed into > >service in the regular orthography of a language. > > Pleas http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2241.pdf, my N2241: > Proposal to add 6 Egyptological characters to the UCS > -- > Michael Everson *** Everson Typography *** http://www.evertype.com > 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland > Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement) > >
Re: a joke- with no typos or end in sight
Not if they RENDER him senseless first. Of course, if they get caught, they might have trouble getting a lawyer to take their CASE. (YA YA, they need a SENSITIVE one, no need to go for the really cheap shots...) Hey, (not the hebrew letter) if they get tried in one of the lower courts, is it a LOWER CASE? I heard they were tried as a CAPITAL offense and SENTENCED to be deCAPitated.

(I started uppercasing the puns, because I was having trouble determining which words were puns. I wondered about "barman" for a while... ;-) )

ok i'll quit
tex

"Ayers, Mike" wrote:
> What I want to know is: if the fonts cause trouble, will the barman call the serif?
>
> > From: Tex Texin [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, September 25, 2001 10:01 AM
> >
> > I nearly had a stroke when I read this!
> >
> > Michael Everson typed:
> > Three fonts walk into a bar. The barman, wiping a glass, shakes his head and says to them: "I'll have none of your type in here."
> >
> > Suzanne M. Topping, tried topping him:
> > Gee, and I thought he was going to say: "Why the long face?"
> >
> > "Michael (michka) Kaplan" descended below the baseline:
> > What a bunch of characters!

--
Tex Texin, Director, International Business
mailto:[EMAIL PROTECTED]  Tel: +1-781-280-4271
the Progress Company  Fax: +1-781-280-4655
Re: Egyptian Transliteration Characters
In a message dated 2001-09-26 8:09:18 Pacific Daylight Time, [EMAIL PROTECTED] writes: >> The problem is, I have a couple of German texts that I plan to >> transcribe, where all I need is HYPHEN WITH DIARESIS. > > So, you type HYPHEN or EN DASH and then COMBINING DIAERESIS ABOVE. I think that was David's point, that these things are always possible using combining characters, and the argument "but it's easier with a precomposed character" doesn't stand up to the concerns about proliferation and normalization. -Doug Ewell Fullerton, California
Re: Egyptian Transliteration Characters
At 09:13 -0500 2001-09-26, David Starner wrote: >The problem is, I have a couple of German texts that I plan to >transcribe, where all I need is HYPHEN WITH DIARESIS. So, you type HYPHEN or EN DASH and then COMBINING DIAERESIS ABOVE. -- Michael Everson *** Everson Typography *** http://www.evertype.com 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Egyptian Transliteration Characters
At 07:20 -0700 2001-09-26, Mark Davis wrote:
>2. something that looks like a right half ring with a tail egyptologists have represented it with something that looks like two right half rings stacked on top of each other.
>
>3. a capital and small glottal stop and reversed glottal stop
>
>For (2), (3), we would need a submission with documentation of usage. We do add capital/small versions of characters when there is sufficient evidence of their usage. This happens, for example, when an IPA is pressed into service in the regular orthography of a language.

Please see http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2241.pdf, my N2241: Proposal to add 6 Egyptological characters to the UCS

--
Michael Everson *** Everson Typography *** http://www.evertype.com
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Egyptian Transliteration Characters
Of your three issues: 1. LATIN CAPITAL LETTER H WITH LINE BELOW 2. something that looks like a right half ring with a tail egyptologists have represented it with something that looks like two right half rings stacked on top of each other. 3. a capital and small glottal stop and reversed glottal stop For (1), they are already representable in Unicode, as you state. The policy is not to introduce new precomposed characters, because of normalization stability. A new precomposed character is disallowed in NFC, so it would end up being decomposed in NFC systems in any event: with XML, etc. For (2), (3), we would need a submission with documentation of usage. We do add capital/small versions of characters when there is sufficient evidence of their usage. This happens, for example, when an IPA is pressed into service in the regular orthography of a language. To submit a proposal, go to www.unicode.org, click on "submitting proposals" (you may already be following that, since it recommends discussing proposals on this list!) Mark — Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης [http://www.macchiato.com] - Original Message - From: <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Wednesday, September 26, 2001 12:42 AM Subject: Egyptian Transliteration Characters > Hello One and All, > > Before setting off down the path of submitting a couple of new characters I > would like to run them past you for your consideration. If I have ben blind > as a bat and these characters already exist please correct me in my error. > But first, a little context... > > I am an Egyptologist and, as you can imagine, transliteration is big in > Egyptology since it is not only essential in language teaching but a major > convenience in its own right. While complete unanimity is lacking amongst > egyptologists concerning the conventions for transliteration there is way > better than 95% agreement on the basics. 
Not surprisingly the Unicode > character-set already addresses nearly every character required to > transliterate Ancient Egyptian according to any of the alternative schemes > which may be used. > > However, it appears that one character is missing (OK, 2 characters if we > say uncial and diminuative) and another is not available in the form in > which egyptologists are accustomed to encounter it. > > The missing characters can be characterised as follows: > > LATIN CAPITAL LETTER H WITH LINE BELOW > LATIN SMALL LETTER H WITH LINE BELOW > > I model these descriptions on those of 1E0E, 1E6E, 1E2A, 1E24 (at least > insofar as the capital is concerned). > > Now, I know that the correct appearance could be achieved using combining > characters, but it seems a pain to have to do this for one character only. > > The other character - the one that just does not appear in a form commonly > used in egyptology - corresponds in function to the glottal stop (02C0),but > rather than represent this as something that looks like a right half ring > with a tail egyptologists have represented it with something that looks > like two right half rings stacked on top of each other. To illustrate this > rather poor description a little more graphically let me say that in > typescript egyptologists often just fake it by typing a "3". By the way we > typically refer to this character as "aleph", modelled on the Hebrew. > ... Then there is the small issue that we like to use capitals in > transliterating proper nouns - but does it even make sense to have a > capital and small glottal stop and reversed glottal stop? I will stop now > before I embarass myself. > > Many thanks to all who will reply. > > - Spencer Tasker > > >
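The normalization argument in Mark's reply can be observed with Python's standard `unicodedata` module. This is a small sketch, and the results depend on the Unicode data bundled with the Python build in use. It assumes U+0331 COMBINING MACRON BELOW is the mark behind the "WITH LINE BELOW" names and U+0308 is the combining diaeresis; the helper `nfc` is just shorthand introduced here.

```python
import unicodedata


def nfc(s: str) -> str:
    """Shorthand for Normalization Form C."""
    return unicodedata.normalize("NFC", s)


def describe(label: str, s: str) -> None:
    composed = nfc(s)
    points = " ".join(f"U+{ord(c):04X}" for c in composed)
    print(f"{label}: {len(s)} code points in, {len(composed)} out ({points})")


if __name__ == "__main__":
    # A precomposed form exists for this one, so NFC composes the pair.
    describe("small h + line below", "h\u0331")
    # No precomposed form exists for these, so NFC leaves the sequences alone.
    describe("capital H + line below", "H\u0331")
    describe("hyphen + diaeresis", "-\u0308")
```

Because existing NFC text has to stay stable, a precomposed character added later would not be produced by NFC; the combining sequence remains the normalized form either way.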
Re: Egyptian Transliteration Characters
On Wed, Sep 26, 2001 at 09:42:32AM +0200, [EMAIL PROTECTED] wrote:
> The missing characters can be characterised as follows:
>
> LATIN CAPITAL LETTER H WITH LINE BELOW
> LATIN SMALL LETTER H WITH LINE BELOW
>
> I model these descriptions on those of 1E0E, 1E6E, 1E2A, 1E24 (at least insofar as the capital is concerned).
>
> Now, I know that the correct appearance could be achieved using combining characters, but it seems a pain to have to do this for one character only.

The problem is, I have a couple of German texts that I plan to transcribe, where all I need is HYPHEN WITH DIAERESIS. (It's used in a vocabulary list to indicate mutation of the vowel for the plural form.) The Lithuanians only needed a few more combining characters for pedagogical reasons, as put forth in their proposal a few years ago. There are so many places that could use just one or two more combining characters that Unicode has basically drawn a line in the sand. (Also, it messes with the Composition/Decomposition algorithm to add more composed characters.)

--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb, we'll still be freakin' friends. - "Freakin' Friends"
Egyptian Transliteration Characters
Hello One and All,

Before setting off down the path of submitting a couple of new characters I would like to run them past you for your consideration. If I have been blind as a bat and these characters already exist please correct me in my error. But first, a little context...

I am an Egyptologist and, as you can imagine, transliteration is big in Egyptology since it is not only essential in language teaching but a major convenience in its own right. While complete unanimity is lacking amongst egyptologists concerning the conventions for transliteration there is way better than 95% agreement on the basics. Not surprisingly the Unicode character-set already addresses nearly every character required to transliterate Ancient Egyptian according to any of the alternative schemes which may be used.

However, it appears that one character is missing (OK, 2 characters if we say uncial and diminutive) and another is not available in the form in which egyptologists are accustomed to encounter it.

The missing characters can be characterised as follows:

LATIN CAPITAL LETTER H WITH LINE BELOW
LATIN SMALL LETTER H WITH LINE BELOW

I model these descriptions on those of 1E0E, 1E6E, 1E2A, 1E24 (at least insofar as the capital is concerned).

Now, I know that the correct appearance could be achieved using combining characters, but it seems a pain to have to do this for one character only.

The other character - the one that just does not appear in a form commonly used in egyptology - corresponds in function to the glottal stop (02C0), but rather than represent this as something that looks like a right half ring with a tail egyptologists have represented it with something that looks like two right half rings stacked on top of each other. To illustrate this rather poor description a little more graphically let me say that in typescript egyptologists often just fake it by typing a "3". By the way we typically refer to this character as "aleph", modelled on the Hebrew.

... Then there is the small issue that we like to use capitals in transliterating proper nouns - but does it even make sense to have a capital and small glottal stop and reversed glottal stop? I will stop now before I embarrass myself.

Many thanks to all who will reply.

- Spencer Tasker