Deseret keyboard (was:Re: Special Type Sorts Tray 2001)
In a message dated 2001-10-02 22:04:41 Pacific Daylight Time, [EMAIL PROTECTED] writes: >> I still live in hopes that someone, John or someone else, will one >> day send me a Deseret keyboard layout that is at least SLIGHTLY >> standard (meaning more than one person has ever used it). >> >> I need something I can download and read on a Windows machine. >> Text or a GIF would be fine. > > I have a hard time picturing what such a layout would *look* like... what > the heck would someone who uses the language expect, anyway? :-) Well, careful now. The language is English. You mean "someone who uses the script." I tried creating a Deseret keyboard for (and with) SC UniPad, using the Dvorak keyboard layout as a loose model. By that I do not at all mean that I mapped Latin letters on the Dvorak keyboard to "equivalent" Deseret letters, but rather that I put the most common letters (as determined from a large chunk of text in Deseret) on the home row and relegated the least common letters to Alt+Gr (Ctrl+Alt) combinations. The biggest problem, of course, is that there are 38 of the buggers and so these Alt+Gr combinations are necessary. My keyboard is all right, I guess, but it is completely my own invention and I really know nothing about the engineering that goes into proper keyboard design. I'd feel better with something designed by someone who had a clue, and/or something that has seen some actual use. Not that there are an awful lot of users, mind you. -Doug Ewell Fullerton, California
Re: Special Type Sorts Tray 2001
From: <[EMAIL PROTECTED]> > I still live in hopes that someone, John or someone else, will one > day send me a Deseret keyboard layout that is at least SLIGHTLY > standard (meaning more than one person has ever used it). > > I need something I can download and read on a Windows machine. > Text or a GIF would be fine. I have a hard time picturing what such a layout would *look* like... what the heck would someone who uses the language expect, anyway? :-) > I noticed that the LDS Church is listed as an associate member of Unicode. I > wonder if their representative might have anything. I don't know if the reps have ever participated here or in the UTC? MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/
Code points for "al-Qaeda"
Like everyone else, I have suddenly become familiar in the past three weeks with the name "al-Qaeda," Arabic for "the base" and the name of Osama bin Laden's terror network. I have also noticed the variations in pronunciation and romanized spelling, and being a bit more interested in such things than the typical American, it makes me curious: How is "al-Qaeda" spelled in Arabic? I know there are several list members who know the small amount of Arabic necessary to answer this question. Please specify Unicode code points in the U+0600 block. Thanks, -Doug Ewell Fullerton, California
Re: Special Type Sorts Tray 2001
In a message dated 2001-10-02 10:46:47 Pacific Daylight Time, [EMAIL PROTECTED] writes: >> And I am sure Apple is hard at work on the Desert font and keyboard for Mac >> OS 11? :-) > > We've already added a Deseret glyph to the Last Resort font in 10.1. > Beyond that, the Deseret Language Kit remains available at my Web > site but doesn't work on X. Yet. I still live in hopes that someone, John or someone else, will one day send me a Deseret keyboard layout that is at least SLIGHTLY standard (meaning more than one person has ever used it). I need something I can download and read on a Windows machine. Text or a GIF would be fine. I noticed that the LDS Church is listed as an associate member of Unicode. I wonder if their representative might have anything. -Doug Ewell Fullerton, California
Re: Special Type Sorts Tray 2001
At 09:27 10/2/2001, John H. Jenkins wrote: >The current generation of font tools does not generally allow the creation >of a glyph in a font without assigning it a code-point of some sort. As a >result, there are a number of fonts out there that have PUA code points >assigned to them, but *not* as a means of promoting interchange of these >glyphs in plain text, but as a means of easing the font production process. That is about to change dramatically with the release of FontLab 4.0, in which the presence or absence of codepoints for glyphs is explicit and controlled by the font developer. John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] Type is something that you can pick up and hold in your hand. - Harry Carter
Re: Special Type Sorts Tray 2001 (derives from Egyptian TransliterationCharacters)
>I feel that this is a matter that needs to be formally resolved one way or >the other, so that, if such a refusal has been declared then people who wish >to have these characters encoded may act knowing that the Unicode Consortium >will have legally estopped itself from making any future complaint that it >has some right to set the standards in such a matter and that those people >who would like to see the problem solved and ligatured characters encoded as >single characters so that a font can be produced may proceed accordingly... >perhaps approaching the international standards body directly if the Unicode >Consortium refuses to do so without a process of even considering individual >submissions on their individual merits... This is all based on false assumptions and reflects a lack of understanding of Unicode and the technologies it is designed to work together with. The problem *has* been solved and does not require ligated *glyphs* to be encoded as distinct characters. You can see implementations working very nicely, for example, with Arabic or Devanagari ligatures in Notepad (or MS Word 2000) on any Windows 2000 system. Not only *can* production of fonts proceed accordingly, but such fonts already exist and are distributed in shipping products. MS products do not yet support Latin ligatures, but that is not an encoding problem -- it is a problem with the particular products in question (and MS is working to address it -- expect to see support for Latin ligatures in the next version of Office due out next year). There are other software products that do support Latin ligatures today without requiring them to be encoded as distinct characters. Moreover, the Unicode Consortium does not have to concern itself with legal rights regarding what does or does not get encoded -- it owns the Unicode standard, and can decide to encode or not to encode as it sees fit. 
The Consortium has entrusted those decisions to its Technical Committee, and that committee has decided to work with implementation principles that do not in general require ligature glyphs to be encoded as distinct characters. Furthermore, the Unicode Technical Committee will always, and does, consider *any* submission on their individual merits. Submitters do not always end up satisfied with the conclusions reached by the Committee, but that is another issue. Also, trying to by-pass UTC by going directly to ISO is not going to change anything since the corresponding ISO committee uses the same implementation principles (they are the ones that wrote the character-glyph model document, ISO/IEC TR15285 -- can be obtained for free from http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/ITTF.htm), and by mutual agreement nothing gets encoded by one committee unless ratified by the other. If you're needing to see something in print, try section 2.2 "Unicode Design Principles" of TUS3.0, specifically the sub-section entitled "Characters, Not Glyphs". >I feel that it would be quite wrong to pull up the ladder on the possibility >of adding characters such as the ct ligature as U+FB07 without the >possibility of consideration of each case on its merits at the time that a >possibility arises. The merits have been considered, weighed in the balance and found wanting. The fact that a ct ligature at FB07 is *not* needed is illustrated by the fact that you can produce that ligature from an encoded sequence of < c, t > in (for example) Adobe InDesign using appropriate fonts (such as Adobe Minion Pro). 
>If the possibility of fair consideration is, however, still open, then the >ct ligature could be defined as U+E707 within the private use area and >published as part of an independent private initiative amongst those members >of the unicode user community that would like to be able to use that >character in a document by the character being encoded as a character in an >ordinary font file. That would enable font makers to add in the ct >character if they so choose. You can look for others with which to make a private agreement if you so choose, but don't expect the major type foundries to encode a ct-ligature glyph at e707: they already know that they don't need to, and a number of fonts already include it without having resorted to direct encoding. >My point is that the specification purports to lay down the rules, yet there >seems to be many other pieces of information that seem to be "understood" on >a nudge nudge basis Not at all. If you were to attend a conference, you would find sessions discussing some of these implementation issues. If you were a professional font developer, then you would find these issues discussed at professional conferences such as ATypI, and you would probably already know of resources that explain them on the web. This is not secretive stuff; the VOLT user community, for example, has over 1700 members -- these are people interested in development of OpenType fonts that handle exactly the kind of
Re: Special Type Sorts Tray 2001
Doug Ewell wrote: >You might start by checking existing fonts, especially those shipped with >major operating systems, to see what PUA code points are commonly used >internally for glyphs not associated with a standard Unicode character. Fonts that are designed to work with advanced rendering technologies and that contain presentation-form glyphs such as a ct-ligature do not have to encode those glyphs in the PUA. The transformations that convert sequences of characters into sequences of positioned glyphs are all done entirely in terms of glyph identifiers (such as Postscript names), which are purely a font-internal thing and have nothing whatsoever to do with character encoding. - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: <[EMAIL PROTECTED]>
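As a rough illustration of the point above, here is a toy Python sketch of shaping done purely in terms of glyph names. The tables CMAP and LIGATURES, the glyph names c_t and f_i, and the function shape are all hypothetical; real engines such as OpenType, AAT or Graphite use far richer font tables. The only thing the sketch is meant to show is that the ligature glyph never needs a code point, PUA or otherwise.

```python
# Toy model: a character-to-glyph map (a "cmap") plus a ligature table
# keyed on glyph names. Every name here is invented for illustration.
CMAP = {"a": "a", "c": "c", "t": "t", "f": "f", "i": "i"}

# Substitutions are expressed in glyph names only; "c_t" has no code point.
LIGATURES = {("c", "t"): "c_t", ("f", "i"): "f_i"}


def shape(text, cmap=CMAP, ligatures=LIGATURES):
    """Map characters to glyph names, then merge pairs that have a ligature."""
    glyphs = [cmap[ch] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in ligatures:
            out.append(ligatures[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


print(shape("act"))                # ['a', 'c_t']
print(shape("act", ligatures={}))  # ['a', 'c', 't']
```

The second call stands in for a font with no ligatures: the same encoded text simply comes out as separate glyphs.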
Re: Shape of the US Dollar Sign
I can't resist transcribing the following, which is a quotation from _Love_and_Sleep_ by John Crowley (Bantam Books, 1994). (It's fiction.) |There are many Monarchs, and many Princes, but only one Emperor. Rudolf | II, King in his own right of Hungary and Bohemia, Archduke of Austria, | became Emperor by election and the chrism with which the Pope had anointed | his head: Singular and Universal Monarch of the Whole Wide World. Or at | least his shadow. | |His grandfather Charles, who had been king of all the lands Rudolf was | king of, had been king in Spain too, ruler of the Netherlands and Low | Countries; he was king of Savoy, lord of Naples and Sicily, he had had the | Pope at his feet and sacked His City, Rome. God's scourge. Charles had had a | device made for him, of all the famous devices and signs and emblems of | great rulers the most famous, known and seen throughout Christendom and in | lands around the world that the old emperors in Rome had never known | existed. Charles's emblem showed two pillars---they were the Pillars of | Hercules that stand at the Gates of the Sea, the gates to the New World. | Around these pillars ran a banner, that bore these words: _Plus_oultra_, | "Even farther." The emblem was cut on medals and embossed on shields and | breastplates, it was engraved on wood and printed on the title pages of | geographies of the New World, and it was stamped on coins made of gold | that was dug on the other side of the world. The emblem was so famous that | it went on being stamped on gold coins for long after Charles was dead, for | so long that the dies lost their details, and the words of the motto were | worn away, and still it kept being stamped on Spanish coins, though all that | was left to be seen were the two pillars and the twining banner, no longer | meaning "Empire" or "Charles" or "Even farther" but only "dollar": | | $ | | No kingdom is eternal.
Emails in Chinese
-Original Message- From: Jennifer David [mailto:[EMAIL PROTECTED]] Sent: Friday, September 28, 2001 10:22 PM To: [EMAIL PROTECTED] Subject: Hello friends at Unicode, I am wondering if you could tell me why I can send an E-mail in Chinese characters to a friend in China who can receive it clearly, but when they write me in Chinese I receive a scrambled message that doesn't resemble Chinese writing. I am currently using universal translator 2000 supported by unicode to send them E-mails in Chinese. If they don't use unicode could this be why their messages are scrambled to me? If so, please let me know what I must do to set them up for proper communication. Your time and consideration is greatly appreciated. Respectfully, Shawn David.
RE: Unicode IPA chart
Rick McGowan wrote: > > Anyone knows where I could find an online chart of the International > > Phonetic Alphabet encoded in Unicode (plain text or HTML)? > > Thanks in advance. > > _ Marco > > Try the charts! > > http://www.unicode.org/charts/ Seeking a page to explain what I was looking for... I simply found it: http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm Thank you. _ Marco
RE: Special Type Sorts Tray 2001
MichKa, > > And I am sure Apple is hard at work on the Desert font and > keyboard for Mac > OS 11? :-) > Getting the scripts defined will allow third parties to add support to most operating systems for specific languages that are not supported by the standard offerings. The big deal will be getting all the Unicode software that was written for UCS-2 changed to support UTF-16 or UTF-32. I think that GB18030 will be the big factor. You can not convert it to Unicode without extended plane support. Processing it in code page is a mess. Carl
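A small Python sketch of why UCS-2-only code has trouble here. It uses U+20000, the first code point of CJK Extension B (added in Unicode 3.1), as an example of a character outside the BMP; the choice of example character is mine, not Carl's. The codec names are the ones Python's standard library ships.

```python
# U+20000 lies outside the BMP, so it cannot be one 16-bit code unit.
ch = "\U00020000"

utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")
gb = ch.encode("gb18030")

print(utf16.hex())  # d840dc00: a surrogate pair, two 16-bit code units
print(utf32.hex())  # 00020000: a single 32-bit code unit
print(len(gb))      # 4: GB18030 uses its four-byte form for this character

# The GB18030 bytes round-trip to the same single code point.
assert gb.decode("gb18030") == ch
```

Software that treats every 16-bit unit as a whole character would see two "characters" in the UTF-16 form, which is the kind of breakage the move from UCS-2 to UTF-16 has to address.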
Re: Egyptian Transliteration Characters
>3. a capital and small glottal stop and reversed glottal stop >For (2), (3), we would need a submission with documentation of usage. We do >add capital/small versions of characters when there is sufficient evidence >of their usage. This happens, for example, when an IPA is pressed into >service in the regular orthography of a language. > >To submit a proposal, go to www.unicode.org, click on "submitting proposals" >(you may already be following that, since it recommends discussing proposals >on this list!) I recently learned of some languages using upper and lower case glottal stops. I don't have details at the moment, but have anticipated writing a proposal once the linguists involved provide further info. - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: <[EMAIL PROTECTED]>
Re: Egyptian Transliteration Characters
>At 09:13 -0500 2001-09-26, David Starner wrote: > >>The problem is, I have a couple of German texts that I plan to >>transcribe, where all I need is HYPHEN WITH DIARESIS. > >So, you type HYPHEN or EN DASH and then COMBINING DIAERESIS ABOVE. It isn't obvious to me that this is the correct solution: first, one needs to decide whether 002d, 2010, 2011, 2012, 2013 or 2212 will be used, and then try to ensure that that is what is consistently used. More importantly, though, there is a question as to whether any of these has the appropriate character properties. For instance, I'm guessing that the line-breaking properties would be wrong for this usage. It would be possible to add a new character DASH WITH DIAERESIS as long as it does not have any decomposition. - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: <[EMAIL PROTECTED]>
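For reference, a short Python sketch that lists the six candidate base characters named above and checks what happens when each is followed by U+0308 (whose formal name is COMBINING DIAERESIS). It relies only on the standard unicodedata module, so it reflects whatever Unicode version the local Python ships rather than Unicode 3.1. That module does not expose line-breaking classes, so the line-breaking concern raised above is not tested here. The function name survey is made up for the example.

```python
import unicodedata

CANDIDATES = ["\u002d", "\u2010", "\u2011", "\u2012", "\u2013", "\u2212"]
DIAERESIS = "\u0308"  # COMBINING DIAERESIS


def survey():
    """Return (code point, name, general category, NFC length) per candidate."""
    rows = []
    for ch in CANDIDATES:
        composed = unicodedata.normalize("NFC", ch + DIAERESIS)
        rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch),
                     unicodedata.category(ch), len(composed)))
    return rows


for row in survey():
    print(row)
```

Each sequence stays two code points long under NFC, since no precomposed dash-with-diaeresis exists for it to compose into.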
Re: Ḧÿp̈ḧën̈ ̈ẅïẗḧ ̈d̈ïäër̈ïs̈ ̈äb̈öv̈ë
>It doesn't look correct either: > >[sample text garbled in the archive: three hyphen or dash characters, each followed by a combining diaeresis] > >In the first case, it's too far to left. In the last case it's too far to >the right. In all three cases it's too far high above the hyphens (at least >in the font I'm displaying this message with). This naively assumes that rendering of combining marks can be done using default glyph metrics alone. This is simply not the case -- complex rendering requires a "smart font" rendering technology like AAT, Graphite or OpenType+Uniscribe|CoolType. All three combinations can be made to look good using any of the three technologies mentioned. These things are well understood by people implementing support for scripts like Devanagari, Arabic or Myanmar. What many people still need to learn is that Latin is also a "complex" script. - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: <[EMAIL PROTECTED]>
Re: Special Type Sorts Tray 2001
At 10:13 AM -0700 10/2/01, Michael (michka) Kaplan wrote: >And I am sure Apple is hard at work on the Desert font and keyboard for Mac >OS 11? :-) > We've already added a Deseret glyph to the Last Resort font in 10.1. Beyond that, the Deseret Language Kit remains available at my Web site but doesn't work on X. Yet. -- John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://homepage.mac.com/jenkins/
Re: Unicode IPA chart
At 10:20 -0700 2001-10-02, Rick McGowan wrote: > > Anyone knows where I could find an online chart of the International >> Phonetic Alphabet encoded in Unicode (plain text or HTML)? > > Thanks in advance. > >Try the charts! > > http://www.unicode.org/charts/ No, he meant the IPA chart, not the IPA page in Unicode. -- Michael Everson *** Everson Typography *** http://www.evertype.com 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Special Type Sorts Tray 2001
From: "John H. Jenkins" <[EMAIL PROTECTED]> > At 5:28 PM +0100 10/2/01, Michael Everson wrote: > > > >The CSUR is maintained to support scripts of various kinds. Some of > >those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate" > >into Unicode. > > And one of them already has! And I am sure Apple is hard at work on the Desert font and keyboard for Mac OS 11? :-) MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/
Re: Unicode IPA chart
> Anyone knows where I could find an online chart of the International > Phonetic Alphabet encoded in Unicode (plain text or HTML)? > Thanks in advance. > _ Marco Try the charts! http://www.unicode.org/charts/ Rick
Re: Shape of the US Dollar Sign
> From: Michael Everson <[EMAIL PROTECTED]> > > I find the double-barred dollar sign a bit old-fashioned looking. > Reminds me of money clips and monopoly games. I rather like it. Especially in handwriting -- Jeff Guévin Staff Coordinator The University Professors Boston University
Re: Special Type Sorts Tray 2001
At 5:28 PM +0100 10/2/01, Michael Everson wrote: > >The CSUR is maintained to support scripts of various kinds. Some of >those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate" >into Unicode. And one of them already has! -- John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://homepage.mac.com/jenkins/
Unicode IPA chart
Anyone knows where I could find an online chart of the International Phonetic Alphabet encoded in Unicode (plain text or HTML)? Thanks in advance. _ Marco
Re: plane business
At 10:42 PM 10/1/01 -0700, Bernard Miller wrote: >--- Asmus Freytag <[EMAIL PROTECTED]> wrote: > > There are 66 non-characters as of Unicode 3.1, there > > were 34 non-characters > > before. > >I understand now.. the non characters in 16 higher >planes were defined first, then the ones in the arabic >presentation forms block. In this case it is as I >suspected, just a documentation problem. The book says >"None of these surrogate pairs has been ASSIGNED in >this version of the standard" (emphasis mine). There are three types of things that can be stated for a code point (code point, not character) - allocation - designation - assignment Allocation refers to whether the code point is part of the standard - allocation changed once in the life of Unicode to include the range 0x10000-0x10FFFF. Designation refers to the status as character, non-character, surrogate, private use character, etc. Designation changed twice in Unicode, once to designate the surrogates, and once to designate the 32 code points on the BMP as non-characters. Assignment refers to assigning a character to a code point. New assignments are made all the time, as new characters are added to the standard. In the early history of Unicode, assignments changed twice, once to reflect the merger with 10646, and once to add the Korean Hangul. Future assignment changes are restricted to adding new assignments. Because people easily confuse code points and characters, few people make the distinction between allocation, designation, and assignment. New text being drafted for Unicode 4.0 will clarify these terms. >It >would merely be misleading to not mention 32 non >characters in the section called "non characters" and >to state that there are no characters in the higher >planes as of Unicode 3.0; but I think we have a bona >fide incorrect statement to say that no surrogate pair >has been ASSIGNED when in fact 32 surrogate pairs were >assigned the status of non characters. 
As you can see from the above, they were "designated" and not "assigned". > > The reason to put the additional (defined in 3.1) > > non-characters into the BMP is to allow them to > > have single codes for UTF-16 implementation - > > something that doesn't > > work so well if they are on the higher planes. > >I don't understand this, the "arabic" non characters >are supposed to REPRESENT the "hidden" non characters? No, implementors in the UTC simply demonstrated a need to have 32 non-character code points - code points that they would be free to use internally because they would never be a legal part of any interchanged data. For UTF-16 implementations, using the 32 supplementary non-characters would have forced them to use surrogate pairs, which is awkward for the kinds of use intended for internal-use code points. That's why 32 code points in the BMP were re-designated from 'reserved' to 'non-character'. A./
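The counts in this exchange (34 before, 66 as of Unicode 3.1) can be checked with a few lines of Python. The sketch assumes the usual definition: the last two code points of each of the 17 planes, plus the 32 BMP code points U+FDD0..U+FDEF inside the Arabic Presentation Forms-A block. The exact range is not spelled out in the messages above, and the function name is made up for the example.

```python
def is_noncharacter(cp):
    """True for the 66 noncharacter code points as of Unicode 3.1."""
    # The 32 BMP code points re-designated from 'reserved' in 3.1.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
bmp_block = [cp for cp in noncharacters if 0xFDD0 <= cp <= 0xFDEF]

print(len(noncharacters))                   # 66
print(len(bmp_block))                       # 32
print(len(noncharacters) - len(bmp_block))  # 34
```

Of the 34 plane-final noncharacters, only U+FFFE and U+FFFF fit in a single UTF-16 code unit, which is why a contiguous run of 32 in the BMP is more convenient for internal use.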
Re: Special Type Sorts Tray 2001
At 11:43 AM -0400 10/2/01, [EMAIL PROTECTED] wrote: > >You might start by checking existing fonts, especially those shipped with >major operating systems, to see what PUA code points are commonly used >internally for glyphs not associated with a standard Unicode character. I >know that several Windows fonts have privately assigned glyphs, and I assume >the same is true for Macintosh fonts. The current generation of font tools does not generally allow the creation of a glyph in a font without assigning it a code-point of some sort. As a result, there are a number of fonts out there that have PUA code points assigned to them, but *not* as a means of promoting interchange of these glyphs in plain text, but as a means of easing the font production process. >Also, maybe the various font makers >who haunt this list could contribute any guidelines they know of for >quasi-standardizing these code points. Adobe has a list somewhere at its site of how it uses the PUA. Apple also publishes its PUA use. >Obviously, you are hoping that >standardizing the code points could lead to some measure of interoperability; >otherwise there would be no discussion. If all you want is to encode the ct >ligature in a font, you can use any old PUA character you wish, conformantly. > >OTOH, private creation of quasi-standards on the part of vendors is not >necessarily a good thing. It is the sort of thing that the public tends to >vilify Microsoft for doing. The purpose for both Adobe and Apple, at least, in making their PUA use public is to avoid collision more than to promote interchange. There is near-universal agreement that the way to get MS Word to handle ligatures correctly is for it to beef up its OT/AAT support. >If you want to interchange the ct ligature and the long-s ligatures, you can >do that right now. Just encode or . 
>Then, rendering engines that have a glyph for the desired ligature can render >it, and those that don't will fall back to the individual characters >(assuming they are conformant). This approach has at least three major >advantages: > >(1) It is already supported by the Unicode Standard. >(2) It provides a standard interchange mechanism without requiring font >vendors to agree on the code point used for the precomposed glyph. >(3) It provides a sensible fallback mechanism for the great majority of >fonts that, let's admit it, will not have these specialized glyphs. BTW, I'm not aware that anybody is revising their fonts to handle ZWJ this way. Anyway, there is a long-standing argument on this subject, and unless I misremember the official position of the UTC, this approach --specifying ligation control in plain text -- is not considered the best mechanism in Latin typography. The problem is that ligation control is *very* font-specific in Latin type. Different fonts will have different sets of ligatures available to them -- you can compare the set of ligatures in a font like Courier (which has fi and fl only because MacRoman forced them to be present and should, typographically, have no ligatures at all), with the set in a font like Adobe Garamond Pro, with the set in Hoefler Text, with the set in Zapfino. On the whole, one cannot assume that the user can even anticipate the set of ligatures that the type designer will consider appropriate for their typeface. It's only when you have the typeface specified that you can meaningfully begin to specify the set of ligatures to use. The consistent approach of font vendors towards the problem of ligation is not to include the request for them in plain text, and definitely *not* to use distinct code points to represent them. -- John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://homepage.mac.com/jenkins/
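For readers unfamiliar with the plain-text mechanism being debated, here is a minimal Python sketch. It assumes the request takes the form c + ZWJ + t; the exact sequences are missing from the quoted text as archived, so that form is an assumption. The sketch only shows what the encoded text looks like and how the fallback described in the quote would behave. It takes no position on whether this is the right mechanism for Latin.

```python
import unicodedata

ZWJ = "\u200d"
requested = "c" + ZWJ + "t"  # plain text asking for a ct ligature

print(len(requested))             # 3 code points
print(unicodedata.name(ZWJ))      # ZERO WIDTH JOINER
print(unicodedata.category(ZWJ))  # Cf (a format character)

# A renderer with no ct glyph can ignore the joiner and draw c and t.
fallback = requested.replace(ZWJ, "")
print(fallback)                   # ct
```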
RE: Shape of the US Dollar Sign
I find the double-barred dollar sign a bit old-fashioned looking. Reminds me of money clips and monopoly games. -- Michael Everson *** Everson Typography *** http://www.evertype.com 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Special Type Sorts Tray 2001
At 11:43 -0400 2001-10-02, [EMAIL PROTECTED] wrote: >[EMAIL PROTECTED] writes: > >>> You might want to take a look at the ConScript Unicode Registry, which was >>> originally intended for "constructed" and artificial scripts, but which >>> could also be used for this purpose. >> >> No, it couldn't. It's for constructed and artificial scripts, not for >> precomposed Latin glyphs. > >I stand corrected. But there is no reason William couldn't initiate his own >registry, along the lines of CSUR, for the purpose of assigning PUA code >points to precomposed Latin glyphs. Just don't expect the characters thus >added to "graduate" somehow into Unicode. The CSUR is maintained to support scripts of various kinds. Some of those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate" into Unicode. But those are legitimate scripts with legitimate users, and they can't be represented in Unicode otherwise. William, I have a number of papers about using the ZWJ to force ligation. I am interested in the problem; perhaps those papers may be of interest to you. -- Michael Everson *** Everson Typography *** http://www.evertype.com 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Chara...
Oops, I forgot something. In a message dated 2001-10-02 4:50:03 Pacific Daylight Time, [EMAIL PROTECTED] writes: > if such a refusal has been declared then people who wish > to have these characters encoded may act knowing that the Unicode Consortium > will have legally estopped itself from making any future complaint that it > has some right to set the standards in such a matter The Unicode Consortium is a private, not-for-profit organization. ISO/IEC JTC1/SC2/WG2 is an international standards working group. I don't believe either is subject to the legal principle of estoppel. Essentially, if they want to they can play Calvinball with the standard they are creating, although we all hope that does not happen. -Doug Ewell Fullerton, California ("Calvinball" comes from the American comic strip "Calvin and Hobbes," in which a young boy plays a game with his stuffed tiger who comes to life, the main rule of which game is that the boy, Calvin, can change the rules at any time.)
Re: Special Type Sorts Tray 2001
In a message dated 2001-10-02 4:50:03 Pacific Daylight Time, [EMAIL PROTECTED] writes: > Is there an official Unicode Consortium statement that states, for the > record, that the Unicode Consortium refuses to encode more ligatures and > precomposed characters please? I'm pretty sure there is, since it has been brought up so often by UTC members on this list. If there is no such statement, then one should be drafted. > I feel that this is a matter that needs to be formally resolved one way or > the other, so that, if such a refusal has been declared then people who wish > to have these characters encoded may act knowing that the Unicode Consortium > will have legally estopped itself from making any future complaint that it > has some right to set the standards in such a matter and that those people > who would like to see the problem solved and ligatured characters encoded as > single characters so that a font can be produced may proceed accordingly, > perhaps approaching the international standards body directly if the Unicode > Consortium refuses to do so without a process of even considering individual > submissions on their individual merits. On the other hand, if no such > formal statement has been issued, then those people who would like to see > the problem solved and ligatured characters encoded as single characters so > that a font can be produced for use with software such as Microsoft Word may > proceed to define characters in the private use area in a manner compatible > with their possible promotion to being regular unicode characters in the > presentation forms section. Was that only two sentences? Wow Regarding the "refusal" to encode more ligatures and precomposed presentation forms: It is not arbitrary. There is a reason why Unicode will not encode these things. They would interfere with the established standard for decomposition. 
Now that Unicode has reached its present level of popularity, some vendors and implementations (and standards) require a stable set of decomposable code points. That set is Unicode 3.0. If new precomposed characters were added, engines and standards that were built to the new standard would decompose them differently from those built to the old standard, and this is not acceptable to those who need decomposition to work at all. Precomposed characters and ligatures won't be considered "on their individual merits," and they won't be "promoted" from a private standard to true Unicode character status, because the decomposition problem is bigger than the individual merits. Note that I personally like the ct ligature and think it would be a great thing to have in a font. If this were 1993, perhaps it might have been encoded. Regarding fonts: Nothing is stopping you or anyone else from making a font with these precomposed glyphs and associating them with Unicode PUA (Private Use Area) code points. That is an excellent illustration of a possible use of the PUA, and many, many font vendors do just that. > I feel that it would be quite wrong to pull up the ladder on the possibility > of adding characters such as the ct ligature as U+FB07 without the > possibility of consideration of each case on its merits at the time that a > possibility arises. A situation would then exist that several ligatures > have been defined as U+FB00 through to U+FB06 including one long s ligature, > yet that U+FB07 through to U+FB12 must remain unused even though they could > be quite reasonably used for ct and various long s ligatures so as to > produce a set of characters that could be used, if desired, for transcribing > the typography of an 18th Century printed book. 
Yet, if the ladder has been > pulled up, perhaps U+FB07 can be defined as the ct ligature directly by the > international standards organization and the international standards > organization could decide directly about including the long s ligatures. The organization you are talking about is ISO/IEC JTC1/SC2/WG2. They are firmly committed to maintaining compatibility between Unicode and ISO/IEC 10646. Sorry, but this is a good thing. > If the possibility of fair consideration is, however, still open, then the > ct ligature could be defined as U+E707 within the private use area and > published as part of an independent private initiative amongst those members > of the unicode user community that would like to be able to use that > character in a document by the character being encoded as a character in an > ordinary font file. That would enable font makers to add in the ct > character if they so choose. You might start by checking existing fonts, especially those shipped with major operating systems, to see what PUA code points are commonly used internally for glyphs not associated with a standard Unicode character. I know that several Windows fonts have privately assigned glyphs, and I assume the same is true for Macint
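The decomposition argument above can be illustrated with a minimal Python sketch. It is not from the original message: it uses the standard `unicodedata` module, and the helper name `normal_forms` is an assumption made for illustration. It shows that an already-encoded ligature has a fixed compatibility decomposition that normalizers rely on, while a Private Use Area code point such as the U+E707 suggested in the thread passes through every normal form unchanged.

```python
import unicodedata

def normal_forms(text):
    """Return the four Unicode normal forms of `text`, keyed by form name."""
    return {form: unicodedata.normalize(form, text)
            for form in ("NFC", "NFD", "NFKC", "NFKD")}

FI_LIGATURE = "\uFB01"  # LATIN SMALL LIGATURE FI, an existing compatibility character
PUA_CT = "\uE707"       # PUA code point suggested in the thread for a ct ligature

fi_forms = normal_forms(FI_LIGATURE)
pua_forms = normal_forms(PUA_CT)

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    # ascii() keeps the output readable regardless of the terminal font.
    print(form, ascii(fi_forms[form]), ascii(pua_forms[form]))
```

The compatibility forms (NFKC, NFKD) turn the ligature into the two letters "fi"; the PUA character has no decomposition at all, so its interpretation is left entirely to private agreement.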
RE: [OT] Roman numeral arithmetic
> From: Edward Cherlin [mailto:[EMAIL PROTECTED]] > Sent: Saturday, September 29, 2001 05:55 PM > > If we omit the later use of subtractive notation (iv=4, xc=90 > etc.), the original Roman numerals are exactly equivalent to > the Chinese abacus where each wire holds four beads below the > bar (value I, X, C, M) and one above (value V, L, D, U+2181). > It is well known that practiced abacists could beat users of > mechanical adding machines in multi-column addition and > subtraction. The same technique is taught (under the Korean > name Chisanpeop) for two-column finger arithmetic, using the > thumbs for the five beads and the other fingers for the one beads. Maybe, but I think you may be missing a point: subtractive notation was an improvement (or so I believe). I will use the same examples and my finest ASCII graphics. Note: fixed width font required. > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED]]On > > Behalf Of James Kass > > Sent: Sat, September 22, 2001 4:52 PM > > > > Doug Ewell wrote: > > > > > >> I would be fascinated to see some sort of evidence > > that addition and > > > >> subtraction is easier in Roman numerals than in > > Hindu-Arabic ("European") > > > >> numerals. 
> > > >
> > > > I + I = II
> > > > X + X = XX
> > > > X + X + X = XXX
> > > > C + X = CX
> > > > CX - X = C
> > >
> > > For these carefully chosen examples, sure, but what about:
> > >
> > > III + IX = XII
> > III+IX = III + V = VIII = XII

First, subtractive notation gives us the opportunity to perform preoperations on each final higher order place, so that some calculations occur solely during the simplification:

[ASCII diagram not preserved: arrows labeled "Cancel from the right" and "Keep" over III + IX = XII]

> > > XXIV + XXVII = LI
> > XXIV + XXVII = XX + XXVII = VII = XI = LI

Next, we see that like types are handled individually for the first part of the operation, followed by combination:

[ASCII diagram not preserved: arrows labeled "Cancel" over XXIV + XXVII = VVI = XI = LI]

> > > C - I = XCIX
> > C - I = LX - I = LVI - I = LV = XCIX

Now, a bit that seems quite tricky at first. Each symbol can (during computation only) be expanded into a subtractive-additive form, using the next lower level symbol (ignoring groupings of five):

C - I = XCX - I = XCIXI - I = XCIX

> > Let's get serious. Try 1984 + 1066.
> > MCMLXXXIV + MLXVI = MDLXXX + MLXVI = MM DLL VI
> = MMML = 3050

Ugh. That was an anticase. Try:

[ASCII diagram not preserved: arrows labeled "Cancel" and "C" over MCMLXXXIV + MLXVI = MMCMLLVV = MMML]

> > > etc. This is no better than European digits, and it feels a little like
> > > doing math with pounds, shillings, and pence.

Actually, get a little used to it and you'll find it easier than decimal addition and subtraction. This is the force of habit at play. Decimal mathematics are done purely by rote memorization of the tables, then using combining techniques. It is those combining techniques that give it power and flexibility, especially where higher order operations are concerned, and make up for the poor results of the memorized tables.

> Lsd is a simple mixed base.

Says you and Timothy Leary. ;-)

/|/|ike
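The expand, combine, and simplify style of Roman-numeral addition discussed in this message can be sketched in Python. This is only an illustration, not code from the thread: the function name `roman_add` and the three lookup tables are assumptions, it handles addition only, and it assumes well-formed numerals that use the usual six subtractive pairs (IV, IX, XL, XC, CD, CM).

```python
ORDER = "MDCLXVI"  # symbols from largest to smallest value

# Subtractive pairs and their purely additive spellings.
EXPAND = [("CM", "DCCCC"), ("CD", "CCCC"), ("XC", "LXXXX"),
          ("XL", "XXXX"), ("IX", "VIIII"), ("IV", "IIII")]

# Carries, applied from the smallest symbol upward (like abacus beads).
GROUP = [("IIIII", "V"), ("VV", "X"), ("XXXXX", "L"),
         ("LL", "C"), ("CCCCC", "D"), ("DD", "M")]

# Re-introduce subtractive notation; longer patterns must come first.
COMPRESS = [("DCCCC", "CM"), ("CCCC", "CD"), ("LXXXX", "XC"),
            ("XXXX", "XL"), ("VIIII", "IX"), ("IIII", "IV")]

def roman_add(a, b):
    """Add two Roman numerals by symbol manipulation only (no integers)."""
    for subtractive, additive in EXPAND:
        a = a.replace(subtractive, additive)
        b = b.replace(subtractive, additive)
    # Pool all symbols and sort them from largest to smallest.
    total = "".join(sorted(a + b, key=ORDER.index))
    for small, big in GROUP:
        total = total.replace(small, big)
    for additive, subtractive in COMPRESS:
        total = total.replace(additive, subtractive)
    return total

print(roman_add("III", "IX"))
print(roman_add("XXIV", "XXVII"))
print(roman_add("MCMLXXXIV", "MLXVI"))
```

The three calls reuse the examples from the thread (3 + 9, 24 + 27, and 1984 + 1066). No step converts to a number; everything is done by regrouping symbols, which is the point being argued.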
RE: Shape of the US Dollar Sign
> From: G. Adam Stanislav [mailto:[EMAIL PROTECTED]]
> Sent: Monday, October 01, 2001 12:07 PM

> Send him a check instead. Every single US check I have ever seen had
> a dollar sign printed to the left of the field where the numeric amount
> is to be entered. They all use the same glyph regardless of the rest
> of the design of the check.

Not necessarily. I don't recall looking, but any commonality here is bound to be coincidence.

> That glyph is the S with a single vertical bar. That does not make it
> the official legal glyph (I doubt we have one), though.

There is no "official" dollar sign, unless it's a really well kept secret. In fact the dollar sign rarely appears in governmental publications (it probably shows up a bit these days, but previously has been very rare).

> I grew up in Slovakia, and we were taught to draw the US dollar sign
> with two vertical bars. I recall my surprise when I came to the US
> and saw the single-bar dollar sign. I asked my American born friend
> about it, and he insisted that in America the dollar sign is always
> drawn with a single vertical bar.

He was either putting you on or misinformed. One bar, two bars - as we say in America, "it's all good".

> Heh, then the computer revolution started, and suddenly I started
> noticing dollar signs that looked like S with just a tiny scratch
> above and below but not all the way across. Go figure. :)

That has nothing to do with the computer revolution. The "S with vertical bars above and below" has been around a while. Let me digress a bit...

As I understand it, the original dollar sign did have two bars. The single bar version came into play because it was easier to make movable type with the single bar glyph, the double bar glyph requiring very hard metals and the appropriate tools, and therefore requiring more expensive type.
Likewise, even the single bar was a bit too much for cheap rubber type, so the bar was removed from inside the "S" curves of the single bar glyph to accommodate that (otherwise the ink would run into the blank semicircles and you would just have a blob). If you go to a minimart that still uses a price gun, you may have the treat of seeing this glyph, along with its sister, the "c with vertical bars above and below". /|/|ike
Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)
From: "William Overington" <[EMAIL PROTECTED]>

> Is there an official Unicode Consortium statement that states, for the
> record, that the Unicode Consortium refuses to encode more ligatures and
> precomposed characters please?

I think it is quite clearly stated that the ones that ARE present are there for backwards compatibility with pre-existing standards. Not sure why you feel that it is important to do more than this? Perhaps the standard is not applying as much verbiage to it as you would like it to -- but the point is just as valid in a sentence as in a chapter.

If you like, you can propose such characters -- even a completely preposterous proposal (which this is not!) would not be ignored. If it is refused, then you can understand that the people here are trying to guide your noble (but in my humble opinion misplaced) effort to use Unicode in some way (any way) that is not in fact how its customers need to use it.

> It is unfortunate that an attempt to quite
> happily seek to use the private use area as set out in the specification,
> where the word "published" is used, seems to become controversialized.

I think you are misunderstanding the intentions of the people who have been commenting. Your ideas are not "bad" or "wrong" or "controversial". Some of them simply do not mesh with the intentions of Unicode in every case. People who comment are not claiming "controversy" since these decisions have already been made and do not need to be made again.

I think I stated a long time ago that there is much useful work that COULD be done, long before anyone will be bored enough to want to invent new standards such as STST2001 which really do not mesh with the present goals of Unicode. Will you not apply some of the boundless energy that you give to STST into some of those items?

Obviously Unicode is not a place to go for fame or glory, or to be remembered for all time as the person who invented __ (fill in the blank here).
But it is still useful work that many people will use. And people appreciate Unicode best when they do not notice it. :-) MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/
Re: Special Type Sorts Tray 2001
At 14:22 -0400 2001-09-30, [EMAIL PROTECTED] wrote: >You might want to take a look at the ConScript Unicode Registry, which was >originally intended for "constructed" and artificial scripts, but which could >also be used for this purpose. No, it couldn't. It's for constructed and artificial scripts, not for precomposed Latin glyphs. -- Michael Everson *** Everson Typography *** http://www.evertype.com 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)
Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)
>> Maybe someday some of the characters might be promoted to become regular >> unicode characters by the Unicode Consortium, maybe not. > >Not likely. Unicode refuses to encode more ligatures and precomposed >characters. > Is there an official Unicode Consortium statement that states, for the record, that the Unicode Consortium refuses to encode more ligatures and precomposed characters please? I feel that this is a matter that needs to be formally resolved one way or the other, so that, if such a refusal has been declared then people who wish to have these characters encoded may act knowing that the Unicode Consortium will have legally estopped itself from making any future complaint that it has some right to set the standards in such a matter and that those people who would like to see the problem solved and ligatured characters encoded as single characters so that a font can be produced may proceed accordingly, perhaps approaching the international standards body directly if the Unicode Consortium refuses to do so without a process of even considering individual submissions on their individual merits. On the other hand, if no such formal statement has been issued, then those people who would like to see the problem solved and ligatured characters encoded as single characters so that a font can be produced for use with software such as Microsoft Word may proceed to define characters in the private use area in a manner compatible with their possible promotion to being regular unicode characters in the presentation forms section. The absence of a formal statement coupled to an informal nudge nudge wink wink everybody knows what is meant but it will not be set out as a formal statement is not, in my own opinion, an acceptable situation, so I ask please for formal clarification of the claimed refusal one way or the other. 
I feel that it would be quite wrong to pull up the ladder on the possibility of adding characters such as the ct ligature as U+FB07 without the possibility of consideration of each case on its merits at the time that a possibility arises. A situation would then exist that several ligatures have been defined as U+FB00 through to U+FB06 including one long s ligature, yet that U+FB07 through to U+FB12 must remain unused even though they could be quite reasonably used for ct and various long s ligatures so as to produce a set of characters that could be used, if desired, for transcribing the typography of an 18th Century printed book. Yet, if the ladder has been pulled up, perhaps U+FB07 can be defined as the ct ligature directly by the international standards organization and the international standards organization could decide directly about including the long s ligatures. If the possibility of fair consideration is, however, still open, then the ct ligature could be defined as U+E707 within the private use area and published as part of an independent private initiative amongst those members of the unicode user community that would like to be able to use that character in a document by the character being encoded as a character in an ordinary font file. That would enable font makers to add in the ct character if they so choose. My point is that the specification purports to lay down the rules, yet there seems to be many other pieces of information that seem to be "understood" on a nudge nudge basis and that words that are in the specification about the private use area such as "published" seem to be overlooked in discussions of using the private use area. It is unfortunate that an attempt to quite happily seek to use the private use area as set out in the specification, where the word "published" is used, seems to become controversialized. William Overington 2 October 2001
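The code points named in this message can be inspected directly. The following is a minimal Python sketch, not part of the original post, using the standard `unicodedata` module; the helper `describe` and the `<no name>` placeholder are assumptions for illustration, and the output reflects whichever Unicode version ships with the Python interpreter in use.

```python
import unicodedata

def describe(code_point):
    """Return (general category, character name) for a code point."""
    ch = chr(code_point)
    # name() would raise ValueError for unnamed code points without a default.
    return unicodedata.category(ch), unicodedata.name(ch, "<no name>")

# U+FB00..U+FB06 are the Latin ligatures; U+FB07..U+FB12 is the gap discussed above.
for code_point in range(0xFB00, 0xFB13):
    category, name = describe(code_point)
    print(f"U+{code_point:04X} {category} {name}")

# The private-use code point suggested for a ct ligature.
print("U+E707", *describe(0xE707))
```

Category `Cn` marks an unassigned code point and `Co` marks private use, which is the distinction the thread keeps returning to: the gap after U+FB06 belongs to the standard, while U+E707 is open to private agreement.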
RE: Currency symbols (was RE: Shape of the US Dollar Sign)
Yves Arrouye wrote:

> > About "£" (L with two bars = "Italian lira" or "Egypt/Cyprus pound") and
> > "£" (L with one bar = "Pound Sterling" or "Irish punt"), I think that the
> > Unicode distinction is not valid because:
> >
> > [...]
> >
> > For these reasons, I suggest that font designers ignore the distinction
> > between U+00A3 (POUND SIGN) and U+20A4 (LIRA SIGN) and use the same glyph
> > for both. The glyphs should have one or two bars depending on the font
> > style and on the choice made for other currency symbols.
>
> Interesting comment. Isn't the Unicode distinction simply one of characters,

Sure. In fact, I did not discuss the existence of these two different versions of "£" in Unicode. There may be lots of reasons for Unicode to have defined two duplicates for the same symbol; a frequently seen reason is compatibility with existing standards. What I say is that I see no reason to keep them visually distinguished in fonts.

But I also dispute the correctness of the annotations on U+00A3 (POUND SIGN) and U+20A4 (LIRA SIGN):

00A3 POUND SIGN
     = pound sterling, Irish punt
     x (lira sign - 20A4)
...
20A4 LIRA SIGN
     * Italy, Turkey
     x (pound sign - 00A3)

I'd find the entries more correct like this:

00A3 POUND SIGN
     * Britain, Egypt, Ireland, Italy, etc.
     x (number sign - 0023)
     x (lira sign - 20A4)
     x (l b bar symbol - 2114)
     x (square pondo - 3340)
     x (square rira - 3352)
     x (fullwidth pound sign - FFE1)
...
20A4 LIRA SIGN
     * Italy, Turkey, etc.
     x (number sign - 0023)
     x (pound sign - 00A3)
     x (l b bar symbol - 2114)
     x (square pondo - 3340)
     x (square rira - 3352)
     x (fullwidth pound sign - FFE1)

I know that this is probably impossible, but I'd also add a compatibility mapping:

20A4 LIRA SIGN
     ...
     # 00A3

> and the difference in glyphs shown in the standard simply a reflection of
> the preferences of the designer of the fonts used to print the character
> tables? I'd think so.

Yes, and no.
I think that the choice of fonts for the charts reflects many editorial needs. One of these criteria was clearly to choose fonts which are quite "classic" and neutral (e.g. a roman type for Western scripts). Another criterion was probably to deliberately show some little difference between similar characters, in order to distinguish them in indexes (e.g., I know that this was the reason for choosing a sans-serif font for the KangXi radicals, as opposed to a more classical font for other Han characters). But, in some cases, I think that the representative glyph on the charts is intended as a precise (although not mandatory) indication to type designers. In this sense, I find it wrong that U+00A3 (POUND SIGN) and U+20A4 (LIRA SIGN) are shown with different glyphs: I'd suggest both glyphs for both characters, separated by a "|". _ Marco
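For reference, the current status of the mapping discussed above can be checked with a short Python sketch. It is an illustration added here, not part of the original message: it uses the standard `unicodedata` module, and the helper name `lookup` is an assumption. It reports each character's name, its decomposition field, and its NFKC form.

```python
import unicodedata

def lookup(ch):
    """Return (name, decomposition field, NFKC form) for one character."""
    return (unicodedata.name(ch),
            unicodedata.decomposition(ch),
            unicodedata.normalize("NFKC", ch))

for ch in ("\u00A3", "\u20A4", "\uFFE1"):
    name, decomposition, nfkc = lookup(ch)
    # An empty decomposition field means the character maps only to itself.
    print(f"U+{ord(ch):04X} {name}: decomposition={decomposition!r}, "
          f"NFKC=U+{ord(nfkc):04X}")
```

In the character database, U+FFE1 FULLWIDTH POUND SIGN carries a compatibility mapping to U+00A3, whereas U+20A4 LIRA SIGN has none, so the two signs stay distinct under every normalization form.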