Deseret keyboard (was:Re: Special Type Sorts Tray 2001)

2001-10-02 Thread DougEwell2

In a message dated 2001-10-02 22:04:41 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>> I still live in hopes that someone, John or someone else, will one
>> day send me a Deseret keyboard layout that is at least SLIGHTLY
>> standard (meaning more than one person has ever used it).
>>
>> I need something I can download and read on a Windows machine.
>> Text or a GIF would be fine.
>
>  I have a hard time picturing what such a layout would *look* like... what
>  the heck would someone who uses the language expect, anyway? :-)

Well, careful now.  The language is English.  You mean "someone who uses the 
script."

I tried creating a Deseret keyboard for (and with) SC UniPad, using the 
Dvorak keyboard layout as a loose model.  By that I do not at all mean that I 
mapped Latin letters on the Dvorak keyboard to "equivalent" Deseret letters, 
but rather that I put the most common letters (as determined from a large 
chunk of text in Deseret) on the home row and relegated the least common 
letters to Alt+Gr (Ctrl+Alt) combinations.  The biggest problem, of course, 
is that there are 38 of the buggers and so these Alt+Gr combinations are 
necessary.
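
The frequency-driven approach described above can be sketched roughly as follows. This is only an illustration, not the UniPad layout itself: the key tiers (HOME_ROW, OTHER_ROWS), the AltGr assignments, and the function names are all hypothetical, and the sample text is whatever Deseret corpus one happens to have.

```python
from collections import Counter

# Deseret small letters as encoded in Unicode 3.1: U+10428..U+1044D (38 letters).
DESERET_SMALL = [chr(cp) for cp in range(0x10428, 0x1044E)]

# Hypothetical key tiers on a US-style keyboard; not taken from the original post.
HOME_ROW = list("asdfghjkl;")
OTHER_ROWS = list("qwertyuiopzxcvbnm,./")


def rank_letters(sample):
    """Return the 38 letters, most frequent in `sample` first (ties by code point)."""
    wanted = set(DESERET_SMALL)
    counts = Counter(ch for ch in sample.lower() if ch in wanted)
    return sorted(DESERET_SMALL, key=lambda ch: (-counts[ch], ch))


def build_layout(sample):
    """Map each letter to a key: home row, then other rows, then AltGr combinations."""
    slots = HOME_ROW + OTHER_ROWS + ["AltGr+" + key for key in HOME_ROW]
    return dict(zip(rank_letters(sample), slots))
```

With 30 unshifted keys in the two tiers, the 8 least frequent letters necessarily land on AltGr combinations, which is the problem noted above.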

My keyboard is all right, I guess, but it is completely my own invention and 
I really know nothing about the engineering that goes into proper keyboard 
design.  I'd feel better with something designed by someone who had a clue, 
and/or something that has seen some actual use.  Not that there are an awful 
lot of users, mind you.

-Doug Ewell
 Fullerton, California




Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael \(michka\) Kaplan

From: <[EMAIL PROTECTED]>

> I still live in hopes that someone, John or someone else, will one
> day send me a Deseret keyboard layout that is at least SLIGHTLY
> standard (meaning more than one person has ever used it).
>
> I need something I can download and read on a Windows machine.
> Text or a GIF would be fine.

I have a hard time picturing what such a layout would *look* like... what
the heck would someone who uses the language expect, anyway? :-)

> I noticed that the LDS Church is listed as an associate member of Unicode. I
> wonder if their representative might have anything.

I don't know whether their reps have ever participated here or in the UTC.


MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/






Code points for "al-Qaeda"

2001-10-02 Thread DougEwell2

Like everyone else, I have suddenly become familiar in the past three weeks 
with the name "al-Qaeda," Arabic for "the base" and the name of Osama bin 
Laden's terror network.

I have also noticed the variations in pronunciation and romanized spelling, 
and being a bit more interested in such things than the typical American, it 
makes me curious:  How is "al-Qaeda" spelled in Arabic?

I know there are several list members who know the small amount of Arabic 
necessary to answer this question.  Please specify Unicode code points in the 
U+0600 block.

Thanks,

-Doug Ewell
 Fullerton, California




Re: Special Type Sorts Tray 2001

2001-10-02 Thread DougEwell2

In a message dated 2001-10-02 10:46:47 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>> And I am sure Apple is hard at work on the Deseret font and keyboard for Mac
>> OS 11? :-)
>
>  We've already added a Deseret glyph to the Last Resort font in 10.1. 
>  Beyond that, the Deseret Language Kit remains available at my Web 
>  site but doesn't work on X.  Yet.

I still live in hopes that someone, John or someone else, will one day send 
me a Deseret keyboard layout that is at least SLIGHTLY standard (meaning more 
than one person has ever used it).

I need something I can download and read on a Windows machine.  Text or a GIF 
would be fine.

I noticed that the LDS Church is listed as an associate member of Unicode.  I 
wonder if their representative might have anything.

-Doug Ewell
 Fullerton, California




Re: Special Type Sorts Tray 2001

2001-10-02 Thread John Hudson

At 09:27 10/2/2001, John H. Jenkins wrote:

>The current generation of font tools does not generally allow the creation 
>of a glyph in a font without assigning it a code-point of some sort.  As a 
>result, there are a number of fonts out there that have PUA code points 
>assigned to them, but *not* as a means of promoting interchange of these 
>glyphs in plain text, but as a means of easing the font production process.

That is about to change dramatically with the release of FontLab 4.0, in 
which the presence or absence of codepoints for glyphs is explicit and 
controlled by the font developer.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

Type is something that you can pick up and hold in your hand.
   - Harry Carter





Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)

2001-10-02 Thread Peter_Constable



>I feel that this is a matter that needs to be formally resolved one way or
>the other, so that, if such a refusal has been declared then people who wish
>to have these characters encoded may act knowing that the Unicode Consortium
>will have legally estopped itself from making any future complaint that it
>has some right to set the standards in such a matter and that those people
>who would like to see the problem solved and ligatured characters encoded as
>single characters so that a font can be produced may proceed accordingly...
>perhaps approaching the international standards body directly if the Unicode
>Consortium refuses to do so without a process of even considering individual
>submissions on their individual merits...  

This is all based on false assumptions and reflects a lack of understanding of Unicode and the technologies it is designed to work together with. The problem *has* been solved and does not require ligated *glyphs* to be encoded as distinct characters. You can see implementations working very nicely, for example, with Arabic or Devanagari ligatures in Notepad (or MS Word 2000) on any Windows 2000 system. Not only *can* production of fonts proceed accordingly, but such fonts already exist and are distributed in shipping products. MS products do not yet support Latin ligatures, but that is not an encoding problem -- it is a problem with the particular products in question (and MS is working to address it -- expect to see support for Latin ligatures in the next version of Office due out next year). There are other software products that do support Latin ligatures today without requiring them to be encoded as distinct characters.

Moreover, the Unicode Consortium does not have to concern itself with legal rights regarding what does or does not get encoded -- it owns the Unicode standard, and can decide to encode or not to encode as it sees fit. The Consortium has entrusted those decisions to its Technical Committee, and that committee has decided to work with implementation principles that do not in general require ligature glyphs to be encoded as distinct characters.

Furthermore, the Unicode Technical Committee does, and always will, consider *any* submission on its individual merits. Submitters do not always end up satisfied with the conclusions reached by the Committee, but that is another issue. Also, trying to bypass the UTC by going directly to ISO is not going to change anything, since the corresponding ISO committee uses the same implementation principles (they are the ones that wrote the character-glyph model document, ISO/IEC TR15285 -- can be obtained for free from http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/ITTF.htm), and by mutual agreement nothing gets encoded by one committee unless ratified by the other.

If you need to see something in print, try section 2.2 "Unicode Design Principles" of TUS3.0, specifically the sub-section entitled "Characters, Not Glyphs".



>I feel that it would be quite wrong to pull up the ladder on the possibility
>of adding characters such as the ct ligature as U+FB07 without the
>possibility of consideration of each case on its merits at the time that a
>possibility arises.  

The merits have been considered, weighed in the balance and found wanting. The fact that a ct ligature at FB07 is *not* needed is illustrated by the fact that you can produce that ligature from an encoded sequence of < c, t > in (for example) Adobe InDesign using appropriate fonts (such as Adobe Minion Pro).



>If the possibility of fair consideration is, however, still open, then the
>ct ligature could be defined as U+E707 within the private use area and
>published as part of an independent private initiative amongst those members
>of the unicode user community that would like to be able to use that
>character in a document by the character being encoded as a character in an
>ordinary font file.  That would enable font makers to add in the ct
>character if they so choose.

You can look for others with whom to make a private agreement if you so choose, but don't expect the major type foundries to encode a ct-ligature glyph at e707: they already know that they don't need to, and a number of fonts already include it without having resorted to direct encoding.


>My point is that the specification purports to lay down the rules, yet there
>seems to be many other pieces of information that seem to be "understood" on
>a nudge nudge basis 

Not at all. If you were to attend a conference, you would find sessions discussing some of these implementation issues. If you were a professional font developer, then you would find these issues discussed at professional conferences such as ATypI, and you would probably already know of resources that explain them on the web. This is not secretive stuff; the VOLT user community, for example, has over 1700 members -- these are people interested in development of OpenType fonts that handle exactly the kind of 

Re: Special Type Sorts Tray 2001

2001-10-02 Thread Peter_Constable



Doug Ewell wrote:

>You might start by checking existing fonts, especially those shipped with
>major operating systems, to see what PUA code points are commonly used
>internally for glyphs not associated with a standard Unicode character.  

Fonts that are designed to work with advanced rendering technologies and that contain presentation-form glyphs such as a ct-ligature do not have to encode those glyphs in the PUA. The transformations that convert sequences of characters into sequences of positioned glyphs are all done entirely in terms of glyph identifiers (such as Postscript names), which are purely a font-internal thing and have nothing whatsoever to do with character encoding.
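
As a rough illustration of that point, here is a toy model of the two steps: characters are first mapped to glyph names, and the ligature substitution is then expressed purely in terms of those names. The tables CMAP and LIGATURES and the glyph name "c_t" are made up for the example; real smart-font technologies are far richer than this sketch.

```python
# Toy character-to-glyph map (a stand-in for a font's cmap); names are hypothetical.
CMAP = {"a": "a", "c": "c", "t": "t", "f": "f", "i": "i"}

# Toy ligature rules keyed by glyph-name pairs; no code points are involved here.
LIGATURES = {("c", "t"): "c_t", ("f", "i"): "f_i"}


def shape(text):
    """Map characters to glyph names, then apply pairwise ligature substitutions."""
    glyphs = [CMAP[ch] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # ligature glyph has a name, not a code point
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out
```

In this model shape("act") yields the glyph list ["a", "c_t"] while the stored text remains the three characters a, c, t.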



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>


Re: Shape of the US Dollar Sign

2001-10-02 Thread Jonathan Coxhead

   I can't resist transcribing the following, which is a quotation from 
_Love_and_Sleep_ by John Crowley (Bantam Books, 1994). (It's fiction.)

 |There are many Monarchs, and many Princes, but only one Emperor. Rudolf
 | II, King in his own right of Hungary and Bohemia, Archduke of Austria,
 | became Emperor by election and the chrism with which the Pope had anointed
 | his head: Singular and Universal Monarch of the Whole Wide World. Or at
 | least his shadow.
 |
 |His grandfather Charles, who had been king of all the lands Rudolf was
 | king of, had been king in Spain too, ruler of the Netherlands and Low
 | Countries; he was king of Savoy, lord of Naples and Sicily, he had had the
 | Pope at his feet and sacked His City, Rome. God's scourge. Charles had had a
 | device made for him, of all the famous devices and signs and emblems of
 | great rulers the most famous, known and seen throughout Christendom and in
 | lands around the world that the old emperors in Rome had never known
 | existed. Charles's emblem showed two pillars---they were the Pillars of
 | Hercules that stand at the Gates of the Sea, the gates to the New World.
 | Around these pillars ran a banner, that bore these words: _Plus_oultra_,
 | "Even farther." The emblem was cut on medals and embossed on shields and
 | breastplates, it was engraved on wood and printed on the title pages of
 | geographies of the New World, and it was stamped on coins made of gold
 | that was dug on the other side of the world. The emblem was so famous that
 | it went on being stamped on gold coins for long after Charles was dead, for
 | so long that the dies lost their details, and the words of the motto were
 | worn away, and still it kept being stamped on Spanish coins, though all that
 | was left to be seen were the two pillars and the twining banner, no longer
 | meaning "Empire" or "Charles" or "Even farther" but only "dollar":
 |
 |   $
 |
 | No kingdom is eternal.





Emails in Chinese

2001-10-02 Thread Magda Danish (Unicode)

-----Original Message-----
From: Jennifer David [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 28, 2001 10:22 PM
To: [EMAIL PROTECTED]
Subject:

Hello friends at Unicode,
 
I am wondering if you could tell me why I can send an E-mail 
in Chinese characters to a friend in China who can receive it clearly, but when 
they write me in Chinese I receive a scrambled message that doesn't resemble 
Chinese writing. I am currently using universal translator 2000 supported by 
unicode to send them E-mails in Chinese. If they don't use unicode could this be 
why their messages are scrambled to me? If so, please let me know what I must do 
to set them up for proper communication. Your time and consideration are greatly 
appreciated. 
 

  
Respectfully,
   
Shawn David.


RE: Unicode IPA chart

2001-10-02 Thread Marco Cimarosti

Rick McGowan wrote:
> > Anyone knows where I could find an online chart of the International
> > Phonetic Alphabet encoded in Unicode (plain text or HTML)?
> > Thanks in advance.
> > _ Marco
> 
> Try the charts!
> 
>   http://www.unicode.org/charts/

Seeking a page to explain what I was looking for... I simply found it:

http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm

Thank you.

_ Marco




RE: Special Type Sorts Tray 2001

2001-10-02 Thread Carl W. Brown

MichKa,

>
> And I am sure Apple is hard at work on the Deseret font and
> keyboard for Mac
> OS 11? :-)
>

Getting the scripts defined will allow third parties to add support to most
operating systems for specific languages that are not supported by the
standard offerings.

The big deal will be getting all the Unicode software that was written for
UCS-2 changed to support UTF-16 or UTF-32.  I think that GB18030 will be the
big factor.  You can not convert it to Unicode without extended plane
support.  Processing it in code page is a mess.
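
A small sketch of why UCS-2-only code falls short here, using a Deseret letter as the test case. The helper names are hypothetical; the codecs are Python's built-in "utf-16-be" and "gb18030".

```python
DESERET_LONG_I = "\U00010400"  # DESERET CAPITAL LETTER LONG I, outside the BMP


def utf16_units(text):
    """Return the UTF-16 code units of `text` as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(text):
    """True only if every character is a single 16-bit value (BMP only)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# GB18030 reaches the supplementary planes through its four-byte sequences.
gb_bytes = DESERET_LONG_I.encode("gb18030")
```

Here utf16_units(DESERET_LONG_I) gives a surrogate pair (two code units), fits_in_ucs2 is false, and gb_bytes is four bytes long, so a converter limited to one 16-bit unit per character has nowhere to put the result.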

Carl






Re: Egyptian Transliteration Characters

2001-10-02 Thread Peter_Constable



>3. a capital and small glottal stop and reversed glottal stop

>For (2), (3), we would need a submission with documentation of usage. We do
>add capital/small versions of characters when there is sufficient evidence
>of their usage. This happens, for example, when an IPA is pressed into
>service in the regular orthography of a language.
>
>To submit a proposal, go to www.unicode.org, click on "submitting proposals"
>(you may already be following that, since it recommends discussing proposals
>on this list!)

I recently learned of some languages using upper and lower case glottal stops. I don't have details at the moment, but have anticipated writing a proposal once the linguists involved provide further info.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>


Re: Egyptian Transliteration Characters

2001-10-02 Thread Peter_Constable



>At 09:13 -0500 2001-09-26, David Starner wrote:
>
>>The problem is, I have a couple of German texts that I plan to
>>transcribe, where all I need is HYPHEN WITH DIARESIS.
>
>So, you type HYPHEN or EN DASH and then COMBINING DIAERESIS ABOVE.

It isn't obvious to me that this is the correct solution: first, one needs to decide whether U+002D, U+2010, U+2011, U+2012, U+2013 or U+2212 will be used, and then try to ensure that that is what is consistently used. More importantly, though, there is a question as to whether any of these has the appropriate character properties. For instance, I'm guessing that the line-breaking properties would be wrong for this usage.

It would be possible to add a new character DASH WITH DIAERESIS as long as it does not have any decomposition.
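
For reference, the candidates can be inspected with a short script. This is only a sketch: Python's unicodedata module exposes names and general categories but not the line-breaking property, so that part of the question is not answered here. The combining mark is taken to be U+0308 COMBINING DIAERESIS, and the helper names are hypothetical.

```python
import unicodedata

# The six candidate base characters listed in the message above.
CANDIDATES = [0x002D, 0x2010, 0x2011, 0x2012, 0x2013, 0x2212]
COMBINING_DIAERESIS = "\u0308"


def describe(cp):
    """Return (code point label, character name, general category)."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


def has_precomposed_form(cp):
    """True if NFC would fold base + diaeresis into a single character."""
    seq = chr(cp) + COMBINING_DIAERESIS
    return unicodedata.normalize("NFC", seq) != seq


for cp in CANDIDATES:
    print(describe(cp), has_precomposed_form(cp))
```

The candidates do not even share one general category (U+2212 is a math symbol, the others are dash punctuation), which is one way of seeing that the choice of base character is not neutral.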



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>


Re: Ḧÿp̈ḧën̈ ̈ẅïẗḧ ̈d̈ïäër̈ïs̈ ̈äb̈öv̈ë

2001-10-02 Thread Peter_Constable


>It doesn't look correct either:
>
>-̈ –̈ —̈
>
>In the first case, it's too far to left. In the last case it's too far to
>the right. In all three cases it's too far high above the hyphens (at least
>in the font I'm displaying this message with).

This naively assumes that rendering of combining marks can be done using default glyph metrics alone. This is simply not the case -- complex rendering requires a "smart font" rendering technology like AAT, Graphite or OpenType+Uniscribe|CoolType. All three combinations can be made to look good using any of the three technologies mentioned.

These things are well understood by people implementing support for scripts like Devanagari, Arabic or Myanmar. What many people still need to learn is that Latin is also a "complex" script.


- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>



Re: Special Type Sorts Tray 2001

2001-10-02 Thread John H. Jenkins

At 10:13 AM -0700 10/2/01, Michael (michka) Kaplan wrote:
>And I am sure Apple is hard at work on the Deseret font and keyboard for Mac
>OS 11? :-)
>

We've already added a Deseret glyph to the Last Resort font in 10.1. 
Beyond that, the Deseret Language Kit remains available at my Web 
site but doesn't work on X.  Yet.

-- 

John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/




Re: Unicode IPA chart

2001-10-02 Thread Michael Everson

At 10:20 -0700 2001-10-02, Rick McGowan wrote:
>  > Anyone knows where I could find an online chart of the International
>>  Phonetic Alphabet encoded in Unicode (plain text or HTML)?
>  > Thanks in advance.
>
>Try the charts!
>
>   http://www.unicode.org/charts/

No, he meant the IPA chart, not the IPA page in Unicode.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)




Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael \(michka\) Kaplan

From: "John H. Jenkins" <[EMAIL PROTECTED]>

> At 5:28 PM +0100 10/2/01, Michael Everson wrote:
> >
> >The CSUR is maintained to support scripts of various kinds. Some of
> >those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate"
> >into Unicode.
>
> And one of them already has!

And I am sure Apple is hard at work on the Deseret font and keyboard for Mac
OS 11? :-)


MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/






Re: Unicode IPA chart

2001-10-02 Thread Rick McGowan

> Anyone knows where I could find an online chart of the International
> Phonetic Alphabet encoded in Unicode (plain text or HTML)?
> Thanks in advance.
> _ Marco

Try the charts!

http://www.unicode.org/charts/

Rick





Re: Shape of the US Dollar Sign

2001-10-02 Thread Jeff Guevin

> From: Michael Everson <[EMAIL PROTECTED]>
> 
> I find the double-barred dollar sign a bit old-fashioned looking.
> Reminds me of money clips and monopoly games.

I rather like it.  Especially in handwriting.
-- 
Jeff Guévin
Staff Coordinator
The University Professors
Boston University








Re: Special Type Sorts Tray 2001

2001-10-02 Thread John H. Jenkins

At 5:28 PM +0100 10/2/01, Michael Everson wrote:
>
>The CSUR is maintained to support scripts of various kinds. Some of 
>those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate" 
>into Unicode.

And one of them already has!

-- 

John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/




Unicode IPA chart

2001-10-02 Thread Marco Cimarosti

Anyone knows where I could find an online chart of the International
Phonetic Alphabet encoded in Unicode (plain text or HTML)?

Thanks in advance.
_ Marco





Re: plane business

2001-10-02 Thread Asmus Freytag

At 10:42 PM 10/1/01 -0700, Bernard Miller wrote:

>--- Asmus Freytag <[EMAIL PROTECTED]> wrote:
> > There are 66 non-characters as of Unicode 3.1, there
> > were 34 non-characters
> > before.
>
>I understand now.. the non characters in 16 higher
>planes were defined first, then the ones in the arabic
>presentation forms block. In this case it is as I
>suspected, just a documentation problem. The book says
>"None of these surrogate pairs has been ASSIGNED in
>this version of the standard" (emphasis mine).

There are three types of things that can be stated for
a code point (code point, not character)
- allocation
- designation
- assignment
Allocation refers to whether the code point is part of
the standard - allocation changed once in the life of
Unicode to include planes 0x1-0x10 (planes 1 through 16).

Designation refers to the status as character, non-
character, surrogate, private use character, etc.
Designation changed twice in Unicode, once to
designate the surrogates, and once to designate
32 code points on the BMP as non-characters.

Assignment refers to assigning a character to a
code point. New assignments are made all the time,
as new characters are added to the standard.
In the early history of Unicode, assignments changed
twice, once to reflect the merger with 10646, and
once to add the Korean Hangul. Future assignment
changes are restricted to adding new assignments.

Because people easily confuse code points and characters,
few people make the distinction between allocation,
designation, and assignment. New text being
drafted for Unicode 4.0 will clarify these terms.

>It
>would merely be misleading to not mention 32 non
>characters in the section called "non characters" and
>to state that there are no characters in the higher
>planes as of Unicode 3.0; but I think we have a bona
>fide incorrect statement to say that no surrogate pair
>has been ASSIGNED when in fact 32 surrogate pairs were
>assigned the status of non characters.

As you can see from the above, they were "designated"
and not "assigned".
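
The counts mentioned in this thread can be checked with a few lines. The sketch below assumes the usual description of the non-characters: the last two code points of each of the 17 planes, plus the 32 code points U+FDD0..U+FDEF in the Arabic Presentation Forms block. The function name is hypothetical.

```python
import unicodedata


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF on every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


all_nonchars = [cp for cp in range(0x110000) if is_noncharacter(cp)]
bmp_block = [cp for cp in all_nonchars if 0xFDD0 <= cp <= 0xFDEF]
plane_ends = [cp for cp in all_nonchars if cp not in bmp_block]
supplementary = [cp for cp in all_nonchars if cp > 0xFFFF]

print(len(all_nonchars), len(plane_ends), len(bmp_block), len(supplementary))

# No character is assigned to these code points: their general category is Cn.
print(unicodedata.category(chr(0xFDD0)))
```

That gives 34 plane-end non-characters (32 of them on the supplementary planes) plus the 32 in the BMP block, 66 in total.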

> > The reason to put the additional (defined in 3.1)
> > non-characters into the BMP is to allow them to
> > have single codes for UTF-16 implementation -
> > something that doesn't
> > work so well if they are on the higher planes.
>
>I don't understand this, the "arabic" non characters
>are supposed to REPRESENT the "hidden" non characters?

No, implementors in the UTC simply demonstrated a need
to have 32 non-character code points - code points that
they would be free to use internally because they would
never be a legal part of any interchanged data.

For UTF-16 implementations, using the 32 supplementary
non-characters would have forced them to use surrogate
pairs, which is awkward for the kinds of use intended
for internal-use code points. That's why 32 code points
in the BMP were re-designated from 'reserved' to
'non-character'.
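
The UTF-16 cost difference is easy to see in a small sketch (the helper name is hypothetical):

```python
def utf16_unit_count(cp):
    """Number of 16-bit code units needed to hold code point `cp` in UTF-16."""
    return len(chr(cp).encode("utf-16-be")) // 2


# A BMP non-character fits in one unit; a supplementary one needs a surrogate pair.
print(utf16_unit_count(0xFDD0), utf16_unit_count(0x1FFFE))
```

An implementation that wants cheap internal sentinel values in 16-bit storage can use the first kind directly, but not the second.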

A./




Re: Special Type Sorts Tray 2001

2001-10-02 Thread John H. Jenkins

At 11:43 AM -0400 10/2/01, [EMAIL PROTECTED] wrote:
>
>You might start by checking existing fonts, especially those shipped with
>major operating systems, to see what PUA code points are commonly used
>internally for glyphs not associated with a standard Unicode character.  I
>know that several Windows fonts have privately assigned glyphs, and I assume
>the same is true for Macintosh fonts.

The current generation of font tools does not generally allow the 
creation of a glyph in a font without assigning it a code-point of 
some sort.  As a result, there are a number of fonts out there that 
have PUA code points assigned to them, but *not* as a means of 
promoting interchange of these glyphs in plain text, but as a means 
of easing the font production process.

>Also, maybe the various font makers
>who haunt this list could contribute any guidelines they know of for
>quasi-standardizing these code points.

Adobe has a list somewhere at its site of how it uses the PUA.  Apple 
also publishes its PUA use. 

>Obviously, you are hoping that
>standardizing the code points could lead to some measure of interoperability;
>otherwise there would be no discussion.  If all you want is to encode the ct
>ligature in a font, you can use any old PUA character you wish, conformantly.
>
>OTOH, private creation of quasi-standards on the part of vendors is not
>necessarily a good thing.  It is the sort of thing that the public tends to
>vilify Microsoft for doing.

The purpose for both Adobe and Apple, at least, in making their PUA 
use public is to avoid collision more than to promote interchange. 
There is near-universal agreement that the way to get MS Word to 
handle ligatures correctly is for it to beef up its OT/AAT support.

>If you want to interchange the ct ligature and the long-s ligatures, you can
>do that right now.  Just encode  or . 
>Then, rendering engines that have a glyph for the desired ligature can render
>it, and those that don't will fall back to the individual characters
>(assuming they are conformant).  This approach has at least three major
>advantages:
>
>(1)  It is already supported by the Unicode Standard.
>(2)  It provides a standard interchange mechanism without requiring font
>vendors to agree on the code point used for the precomposed glyph.
>(3)  It provides a sensible fallback mechanism for the great majority of
>fonts that, let's admit it, will not have these specialized glyphs.

BTW, I'm not aware that anybody is revising their fonts to handle ZWJ this way.
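
For concreteness, the ZWJ approach under discussion might look like the sketch below. It assumes a sequence of the form c, ZERO WIDTH JOINER, t; that form is an assumption for illustration, and the helper names are hypothetical.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def request_ligature(first, second):
    """Build a plain-text sequence asking for a ligature between two letters."""
    return first + ZWJ + second


def fallback_text(text):
    """What a renderer without the ligature effectively shows: the letters alone."""
    return text.replace(ZWJ, "")


seq = request_ligature("c", "t")
print([f"U+{ord(ch):04X}" for ch in seq], fallback_text(seq))
```

Whether a given font or rendering engine honours such a request is exactly the open question in this thread.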

Anyway, there is a long-standing argument on this subject, and 
unless I misremember the official position of the UTC, this approach 
--specifying ligation control in plain text -- is not considered the 
best mechanism in Latin typography.

The problem is that ligation control is *very* font-specific in Latin 
type.  Different fonts will have different sets of ligatures 
available to them -- you can compare the set of ligatures in a font 
like Courier (which has fi and fl only because MacRoman forced them 
to be present and should, typographically, have no ligatures at all), 
with the set in a font like Adobe Garamond Pro, with the set in 
Hoefler Text, with the set in Zapfino.

On the whole, one cannot assume that the user can even anticipate the 
set of ligatures that the type designer will consider appropriate for 
their typeface.  It's only when you have the typeface specified that 
you can meaningfully begin to specify the set of ligatures to use.

The consistent approach of font vendors towards the problem of 
ligation is not to include the request for them in plain text, and 
definitely *not* to use distinct code points to represent them.
-- 

John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/




RE: Shape of the US Dollar Sign

2001-10-02 Thread Michael Everson

I find the double-barred dollar sign a bit old-fashioned looking. 
Reminds me of money clips and monopoly games.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)




Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael Everson

At 11:43 -0400 2001-10-02, [EMAIL PROTECTED] wrote:
>[EMAIL PROTECTED] writes:
>
>>>  You might want to take a look at the ConScript Unicode Registry, which was
>>>  originally intended for "constructed" and artificial scripts, but which
>>>  could also be used for this purpose.
>>
>>   No, it couldn't. It's for constructed and artificial scripts, not for
>>   precomposed Latin glyphs.
>
>I stand corrected. But there is no reason William couldn't initiate his own
>registry, along the lines of CSUR, for the purpose of assigning PUA code
>points to precomposed Latin glyphs. Just don't expect the characters thus
>added to "graduate" somehow into Unicode.

The CSUR is maintained to support scripts of various kinds. Some of 
those (Shavian, Deseret, Tengwar, Cirth) are expected to "graduate" 
into Unicode. But those are legitimate scripts with legitimate users, 
and they can't be represented in Unicode otherwise.

William, I have a number of papers about using the ZWJ to force 
ligation. I am interested in the problem; perhaps those papers may be 
of interest to you.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)




Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Chara...

2001-10-02 Thread DougEwell2

Oops, I forgot something.

In a message dated 2001-10-02 4:50:03 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>  if such a refusal has been declared then people who wish
>  to have these characters encoded may act knowing that the Unicode Consortium
>  will have legally estopped itself from making any future complaint that it
>  has some right to set the standards in such a matter 

The Unicode Consortium is a private, not-for-profit organization.  ISO/IEC 
JTC1/SC2/WG2 is an international standards working group.  I don't believe 
either is subject to the legal principle of estoppel.  Essentially, if they 
want to they can play Calvinball with the standard they are creating, 
although we all hope that does not happen.

-Doug Ewell
 Fullerton, California

("Calvinball" comes from the American comic strip "Calvin and Hobbes," in 
which a young boy plays a game with his stuffed tiger who comes to life, the 
main rule of which game is that the boy, Calvin, can change the rules at any 
time.)




Re: Special Type Sorts Tray 2001

2001-10-02 Thread DougEwell2

In a message dated 2001-10-02 4:50:03 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>  Is there an official Unicode Consortium statement that states, for the
>  record, that the Unicode Consortium refuses to encode more ligatures and
>  precomposed characters please?

I'm pretty sure there is, since it has been brought up so often by UTC 
members on this list.  If there is no such statement, then one should be 
drafted.

>  I feel that this is a matter that needs to be formally resolved one way or
>  the other, so that, if such a refusal has been declared then people who wish
>  to have these characters encoded may act knowing that the Unicode Consortium
>  will have legally estopped itself from making any future complaint that it
>  has some right to set the standards in such a matter and that those people
>  who would like to see the problem solved and ligatured characters encoded as
>  single characters so that a font can be produced may proceed accordingly,
>  perhaps approaching the international standards body directly if the Unicode
>  Consortium refuses to do so without a process of even considering individual
>  submissions on their individual merits.  On the other hand, if no such
>  formal statement has been issued, then those people who would like to see
>  the problem solved and ligatured characters encoded as single characters so
>  that a font can be produced for use with software such as Microsoft Word may
>  proceed to define characters in the private use area in a manner compatible
>  with their possible promotion to being regular unicode characters in the
>  presentation forms section.

Was that only two sentences?  Wow.

Regarding the "refusal" to encode more ligatures and precomposed presentation 
forms: It is not arbitrary.  There is a reason why Unicode will not encode 
these things.  They would interfere with the established standard for 
decomposition.  Now that Unicode has reached its present level of popularity, 
some vendors and implementations (and standards) require a stable set of 
decomposable code points.  That set is Unicode 3.0.  If new precomposed 
characters were added, engines and standards that were built to the new 
standard would decompose them differently from those built to the old 
standard, and this is not acceptable to those who need decomposition to work 
at all.
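
The existing decomposition mappings can be inspected with Python's standard
`unicodedata` module.  What follows is only an illustrative sketch: the
results depend on the Unicode data bundled with the Python build, and it
assumes U+FB07 is still unassigned in that data.

```python
import unicodedata

# U+FB01 LATIN SMALL LIGATURE FI is one of the ligatures already encoded.
fi_ligature = "\ufb01"

# Its compatibility decomposition maps it back to the letters f + i.
print(unicodedata.decomposition(fi_ligature))
print(unicodedata.normalize("NFKD", fi_ligature))

# U+FB07, the slot proposed for a ct ligature, has no assignment, so every
# implementation agrees that it has nothing to decompose.
print(unicodedata.category("\ufb07"))
print(repr(unicodedata.decomposition("\ufb07")))
```

If a later version gave U+FB07 a decomposition, software built on older data
would leave the character alone while newer software would split it, which is
the mismatch described above.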

Precomposed characters and ligatures won't be considered "on their individual 
merits," and they won't be "promoted" from a private standard to true Unicode 
character status, because the decomposition problem is bigger than the 
individual merits.  Note that I personally like the ct ligature and think it 
would be a great thing to have in a font.  If this were 1993, perhaps it 
might have been encoded.

Regarding fonts: Nothing is stopping you or anyone else from making a font 
with these precomposed glyphs and associating them with Unicode PUA (Private 
Use Area) code points.  That is an excellent illustration of a possible use 
of the PUA, and many, many font vendors do just that.  
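
As a rough sketch of what "associating them with PUA code points" means in
practice, the Python below checks whether a code point lies in the BMP
Private Use Area (U+E000 through U+F8FF).  The helper name
`is_bmp_private_use` is made up for this illustration, and U+E707 is simply
the code point suggested elsewhere in this thread, not an agreed assignment.

```python
import unicodedata

def is_bmp_private_use(ch):
    """True if ch lies in the BMP Private Use Area, U+E000..U+F8FF."""
    return 0xE000 <= ord(ch) <= 0xF8FF

# U+E707 is only the value proposed in this thread for a ct ligature.
candidate = "\ue707"

print(is_bmp_private_use(candidate))
# The Unicode general category for private-use code points is "Co".
print(unicodedata.category(candidate))
# An encoded ligature such as U+FB06 is outside the PUA.
print(is_bmp_private_use("\ufb06"))
```

Nothing in the character data says what glyph a PUA code point carries; that
agreement lives entirely between the font and its users.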

>  I feel that it would be quite wrong to pull up the ladder on the possibility
>  of adding characters such as the ct ligature as U+FB07 without the
>  possibility of consideration of each case on its merits at the time that a
>  possibility arises.  A situation would then exist that several ligatures
>  have been defined as U+FB00 through to U+FB06 including one long s ligature,
>  yet that U+FB07 through to U+FB12 must remain unused even though they could
>  be quite reasonably used for ct and various long s ligatures so as to
>  produce a set of characters that could be used, if desired, for transcribing
>  the typography of an 18th Century printed book.  Yet, if the ladder has been
>  pulled up, perhaps U+FB07 can be defined as the ct ligature directly by the
>  international standards organization and the international standards
>  organization could decide directly about including the long s ligatures.

The organization you are talking about is ISO/IEC JTC1/SC2/WG2.  They are 
firmly committed to maintaining compatibility between Unicode and ISO/IEC 
10646.  Sorry, but this is a good thing.

>  If the possibility of fair consideration is, however, still open, then the
>  ct ligature could be defined as U+E707 within the private use area and
>  published as part of an independent private initiative amongst those members
>  of the unicode user community that would like to be able to use that
>  character in a document by the character being encoded as a character in an
>  ordinary font file.  That would enable font makers to add in the ct
>  character if they so choose.

You might start by checking existing fonts, especially those shipped with 
major operating systems, to see what PUA code points are commonly used 
internally for glyphs not associated with a standard Unicode character.  I 
know that several Windows fonts have privately assigned glyphs, and I assume 
the same is true for Macint

RE: [OT] Roman numeral arithmetic

2001-10-02 Thread Ayers, Mike


> From: Edward Cherlin [mailto:[EMAIL PROTECTED]] 
> Sent: Saturday, September 29, 2001 05:55 PM
> 
> If we omit the later use of subtractive notation (iv=4, xc=90 
> etc.), the original Roman numerals are exactly equivalent to 
> the Chinese abacus where each wire holds four beads below the 
> bar (value I, X, C, M) and one above (value V, L, D, U+2181). 
> It is well known that practiced abacists could beat users of 
> mechanical adding machines in multi-column addition and 
> subtraction. The same technique is taught (under the Korean 
> name Chisanpeop) for two-column finger arithmetic, using the 
> thumbs for the five beads and the other fingers for the one beads.

Maybe, but I think you may be missing a point: subtractive notation
was an improvement (or so I believe).  I will use the same examples and my
finest ASCII graphics. Note: fixed width font required.

> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED]]On
> > Behalf Of James Kass
> > Sent: Sat, September 22, 2001 4:52 PM
> > 
> > Doug Ewell wrote:
> > 
> > > >> I would be fascinated to see some sort of evidence 
> > that addition and
> > > >> subtraction is easier in Roman numerals than in 
> > Hindu-Arabic ("European")
> > > >> numerals.
> > > >
> > > >  I + I = II
> > > >  X + X = XX
> > > >  X + X + X = XXX
> > > >  C + X = CX
> > > >  CX - X = C
> > >
> > > For these carefully chosen examples, sure, but what about:
> > >
> > > III + IX = XII
> 
> III+IX = III + V = VIII = XII

First, subtractive notation gives us the opportunity to perform
preoperations on each final higher order place, so that some calculations
occur solely during the simplification:

   +Cancel from the right
   V
  +---+
  |   |+---+
  |   ||   |<---Keep
III + IX = XII
||  ||
|+--+|
++
   ^
   +-Keep

> > > XXIV + XXVII = LI
> 
> XXIV + XXVII = XX + XXVII = VII = XI = LI

Next, we see that like types are handled individually for the first
part of the operation, followed by combination:

++-++--
||+||-+<------Cancel
||||| |
XXIV + XXVII = VVI = XI = LI
   | | |   |||
   +-+-|---++|
   +-+

> > > C - I = XCIX
> 
> C - I = LX - I = LVI - I = LV = XCIX

Now, a bit that seems quite tricky at first.  Each symbol can
(during computation only) be expanded into a subtractive-additive form,
using the next lower level symbol (ignoring groupings of five):

C - I = XCX - I = XCIXI - I = XCIX
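
A minimal Python sketch of subtraction by expansion follows.  Note that it
swaps in a simpler variant of the idea: instead of the subtractive-additive
form shown above (C = XCX), it breaks a symbol into purely additive lower
symbols (C = LL, L = XXXXX, and so on) until the symbol being subtracted can
be cancelled.  The function and table names are made up for this example, and
it assumes well-formed numerals with the first larger than the second.

```python
ORDER = "MDCLXVI"  # symbols from largest to smallest

# Subtractive pairs and their purely additive equivalents.
EXPAND = [("CM", "DCCCC"), ("CD", "CCCC"), ("XC", "LXXXX"),
          ("XL", "XXXX"), ("IX", "VIIII"), ("IV", "IIII")]

# Rewrites from additive form back to subtractive notation.
CONTRACT = [(additive, pair) for pair, additive in EXPAND]

# How each symbol breaks into the next lower one when borrowing.
BREAK = {"M": "DD", "D": "CCCCC", "C": "LL",
         "L": "XXXXX", "X": "VV", "V": "IIIII"}

def to_additive(numeral):
    """Rewrite IV, IX, XL, ... so the numeral uses addition only."""
    for pair, additive in EXPAND:
        numeral = numeral.replace(pair, additive)
    return numeral

def sub_roman(a, b):
    """Return a - b, assuming a > b and both are well-formed numerals."""
    pool = list(to_additive(a))
    for symbol in to_additive(b):
        while symbol not in pool:
            # Borrow: break the smallest available larger symbol one level.
            larger = [s for s in pool if ORDER.index(s) < ORDER.index(symbol)]
            donor = max(larger, key=ORDER.index)
            pool.remove(donor)
            pool.extend(BREAK[donor])
        pool.remove(symbol)  # cancel the symbol against the pool
    total = "".join(sorted(pool, key=ORDER.index))
    for additive, pair in CONTRACT:
        total = total.replace(additive, pair)
    return total

print(sub_roman("C", "I"))
print(sub_roman("CX", "X"))
```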

> 
> Let's get serious. Try 1984 + 1066.
> 
> MCMLXXXIV + MLXVI = MDLXXX + MLXVI = MM DLL VI
> = MMML = 3050

Ugh.  That was an anticase.  Try:

MCMLXXXIV + MLXVI = MMCMLLVV = MMML
  | ||
  | ++
Cancel--->|  |
  |  C
  |  |
  +--+
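
For comparison, here is a small Python sketch of addition done the "pool the
symbols, then simplify" way.  It is an assumption-laden illustration rather
than a transcription of the diagrams above: it first rewrites subtractive
pairs additively (IV as IIII), pools and sorts the symbols, carries groups
upward (IIIII to V, VV to X, and so on), and only then restores subtractive
notation.  All names are invented for the example.

```python
ORDER = "MDCLXVI"  # symbols from largest to smallest

# Subtractive pairs and their purely additive equivalents.
EXPAND = [("CM", "DCCCC"), ("CD", "CCCC"), ("XC", "LXXXX"),
          ("XL", "XXXX"), ("IX", "VIIII"), ("IV", "IIII")]

# Carry rules, applied from the smallest symbol upward.
CARRY = [("IIIII", "V"), ("VV", "X"), ("XXXXX", "L"),
         ("LL", "C"), ("CCCCC", "D"), ("DD", "M")]

# Rewrites from additive form back to subtractive notation.
CONTRACT = [(additive, pair) for pair, additive in EXPAND]

def to_additive(numeral):
    """Rewrite IV, IX, XL, ... so the numeral uses addition only."""
    for pair, additive in EXPAND:
        numeral = numeral.replace(pair, additive)
    return numeral

def add_roman(a, b):
    """Add two well-formed Roman numerals without converting to decimal."""
    # Pool every symbol from both operands, largest first.
    pooled = sorted(to_additive(a) + to_additive(b), key=ORDER.index)
    total = "".join(pooled)
    for group, carried in CARRY:
        while group in total:
            total = total.replace(group, carried, 1)
            total = "".join(sorted(total, key=ORDER.index))
    for additive, pair in CONTRACT:
        total = total.replace(additive, pair)
    return total

print(add_roman("III", "IX"))
print(add_roman("XXIV", "XXVII"))
print(add_roman("MCMLXXXIV", "MLXVI"))
```

The three calls use the examples from this thread; the last is 1984 + 1066.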

> 
> > > etc.  This is no better than European digits, and it 
> > feels a little like
> > > doing math with pounds, shillings, and pence.

Actually, get a little used to it and you'll find it easier than
decimal addition and subtraction.  This is the force of habit at play.
Decimal mathematics are done purely by rote memorization of the tables, then
using combining techniques.  It is those combining techniques that give it
power and flexibility, especially where higher order operations are
concerned, and make up for the poor results of the memorized tables.

> Lsd is a simple mixed base.

Says you and Timothy Leary.

;-)


/|/|ike




RE: Shape of the US Dollar Sign

2001-10-02 Thread Ayers, Mike


> From: G. Adam Stanislav [mailto:[EMAIL PROTECTED]] 
> Sent: Monday, October 01, 2001 12:07 PM

> Send him a check instead. Every single US check I have ever seen had
> a dollar sign printed to the left of the field where the numeric amount
> is to be entered. They all use the same glyph regardless of the rest
> of the design of the check.

Not necessarily.  I don't recall looking, but any commonality here
is bound to be coincidence.

> That glyph is the S with a single vertical bar. That does not make it
> the official legal glyph (I doubt we have one), though.

There is no "official" dollar sign, unless it's a really well kept
secret.  In fact the dollar sign rarely appears in governmental publications
(it probably shows up a bit these days, but previously has been very rare).

> I grew up in Slovakia, and we were taught to draw the US dollar sign
> with two vertical bars. I recall my surprise when I came to the US
> and saw the single-bar dollar sign. I asked my American born friend
> about it, and he insisted that in America the dollar sign is always
> drawn with a single vertical bar.

He was either putting you on or misinformed.  One bar, two bars - as
we say in America, "it's all good".

> Heh, then the computer revolution started, and suddenly I started
> noticing dollar signs that looked like S with just a tiny scratch
> above and below but not all the way across. Go figure. :)

That has nothing to do with the computer revolution.  The "S with
vertical bars above and below" has been around a while.  Let me digress a
bit...

As I understand it, the original dollar sign did have two bars.  The
single bar version came into play because it was easier to make movable type
with the single bar glyph, the double bar glyph requiring very hard metals
and the appropriate tools, and therefore requiring more expensive type.

Likewise, even the single bar was a bit too much for cheap rubber
type, so the bar was removed from inside the "S" curves of the single bar
glyph to accommodate that (otherwise the ink would run into the blank
semicircles and you would just have a blob).  If you go to a minimart that
still uses a price gun, you may have the treat of seeing this glyph, along
with its sister, the "c with vertical bars above and below".


/|/|ike




Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)

2001-10-02 Thread Michael \(michka\) Kaplan

From: "William Overington" <[EMAIL PROTECTED]>

> Is there an official Unicode Consortium statement that states, for the
> record, that the Unicode Consortium refuses to encode more ligatures and
> precomposed characters please?

I think it is quite clearly stated that the ones that ARE present are there
for backwards compatibility with pre-existing standards. Not sure why you
feel that it is important to do more than this? Perhaps the standard is not
applying as much verbiage to it as you would like it to -- but the point is
just as valid in a sentence as in a chapter.

If you like, you can propose such characters -- even a completely
preposterous proposal (which this is not!) would not be ignored. If it is
refused, then you can understand that the people here are trying to guide
your noble (but in my humble opinion misplaced) effort to use Unicode in
some way (any way) that is not in fact how its customers need to use it.

> It is unfortunate that an attempt to quite
> happily seek to use the private use area as set out in the specification,
> where the word "published" is used, seems to become controversialized.

I think you are misunderstanding the intentions of the people who have been
commenting. Your ideas are not "bad" or "wrong" or "controversial". Some of
them simply do not mesh with the intentions of Unicode in every case. People
who comment are not claiming "controversy" since these decisions have
already been made and do not need to be made again.

I think I stated a long time ago that there is much useful work that COULD
be done, long before anyone will be bored enough to want to invent new
standards such as STST2001 which really do not mesh with the present goals
of Unicode. Will you not put some of the boundless energy that you give to
STST into some of those items?

Obviously Unicode is not a place to go for fame or glory, or to be
remembered for all time as the person who invented __ (fill in the blank
here). But it is still useful work that many people will use. And people
appreciate Unicode best when they do not notice it. :-)


MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/







Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael Everson

At 14:22 -0400 2001-09-30, [EMAIL PROTECTED] wrote:

>You might want to take a look at the ConScript Unicode Registry, which was
>originally intended for "constructed" and artificial scripts, but which could
>also be used for this purpose.

No, it couldn't. It's for constructed and artificial scripts, not for 
precomposed Latin glyphs.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Telephone +353 86 807 9169 *** Fax +353 1 478 2597 (by arrangement)




Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)

2001-10-02 Thread William Overington

>> Maybe someday some of the characters might be promoted to become regular
>> unicode characters by the Unicode Consortium, maybe not.
>
>Not likely. Unicode refuses to encode more ligatures and precomposed
>characters.
>

Is there an official Unicode Consortium statement that states, for the
record, that the Unicode Consortium refuses to encode more ligatures and
precomposed characters please?

I feel that this is a matter that needs to be formally resolved one way or
the other, so that, if such a refusal has been declared then people who wish
to have these characters encoded may act knowing that the Unicode Consortium
will have legally estopped itself from making any future complaint that it
has some right to set the standards in such a matter and that those people
who would like to see the problem solved and ligatured characters encoded as
single characters so that a font can be produced may proceed accordingly,
perhaps approaching the international standards body directly if the Unicode
Consortium refuses to do so without a process of even considering individual
submissions on their individual merits.  On the other hand, if no such
formal statement has been issued, then those people who would like to see
the problem solved and ligatured characters encoded as single characters so
that a font can be produced for use with software such as Microsoft Word may
proceed to define characters in the private use area in a manner compatible
with their possible promotion to being regular unicode characters in the
presentation forms section.  The absence of a formal statement coupled to an
informal nudge nudge wink wink everybody knows what is meant but it will not
be set out as a formal statement is not, in my own opinion, an acceptable
situation, so I ask please for formal clarification of the claimed refusal
one way or the other.

I feel that it would be quite wrong to pull up the ladder on the possibility
of adding characters such as the ct ligature as U+FB07 without the
possibility of consideration of each case on its merits at the time that a
possibility arises.  A situation would then exist that several ligatures
have been defined as U+FB00 through to U+FB06 including one long s ligature,
yet that U+FB07 through to U+FB12 must remain unused even though they could
be quite reasonably used for ct and various long s ligatures so as to
produce a set of characters that could be used, if desired, for transcribing
the typography of an 18th Century printed book.  Yet, if the ladder has been
pulled up, perhaps U+FB07 can be defined as the ct ligature directly by the
international standards organization and the international standards
organization could decide directly about including the long s ligatures.

If the possibility of fair consideration is, however, still open, then the
ct ligature could be defined as U+E707 within the private use area and
published as part of an independent private initiative amongst those members
of the unicode user community that would like to be able to use that
character in a document by the character being encoded as a character in an
ordinary font file.  That would enable font makers to add in the ct
character if they so choose.

My point is that the specification purports to lay down the rules, yet there
seem to be many other pieces of information that seem to be "understood" on
a nudge nudge basis and that words that are in the specification about the
private use area such as "published" seem to be overlooked in discussions of
using the private use area.  It is unfortunate that an attempt to quite
happily seek to use the private use area as set out in the specification,
where the word "published" is used, seems to become controversialized.

William Overington

2 October 2001









RE: Currency symbols (was RE: Shape of the US Dollar Sign)

2001-10-02 Thread Marco Cimarosti

Yves Arrouye wrote:
> > About "£" (L with two bars = "Italian lira" or 
> "Egypt/Cyprus pound") and
> > "£"
> > (L with one bar = "Pound Sterling" or "Irish punt"), I 
> think that the
> > Unicode distinction is not valid because:
> > 
> > [...]
> >
> > For these reasons, I suggest that font designers ignore the distinction
> > between U+00A3 (POUND SIGN) and U+20A4 (LIRA SIGN) and use the same glyph
> > for both.  The glyphs should have one or two bars depending on the font
> > style and on the choice made for other currency symbols.
> 
> Interesting comment. Isn't the Unicode distinction simply one 
> of characters,

Sure.  In fact, I did not discuss the existence of these two different
versions of "£" in Unicode. There may be lots of reasons for Unicode to have
defined two duplicates for the same symbol; a frequently seen reason is
compatibility with existing standards.

What I say is that I see no reason to keep them visually distinguished in
fonts.

But I also dispute the correctness of the annotations on U+00A3 (POUND SIGN)
and U+20A4 (LIRA SIGN):

00A3    POUND SIGN
        = pound sterling, Irish punt
        x (lira sign - 20A4)
...
20A4    LIRA SIGN
        * Italy, Turkey
        x (pound sign - 00A3)
I'd find the entries more correct like this:

00A3    POUND SIGN
        * Britain, Egypt, Ireland, Italy, etc.
        x (number sign - 0023)
        x (lira sign - 20A4)
        x (l b bar symbol - 2114)
        x (square pondo - 3340)
        x (square rira - 3352)
        x (fullwidth pound sign - FFE1)
...
20A4    LIRA SIGN
        * Italy, Turkey, etc.
        x (number sign - 0023)
        x (pound sign - 00A3)
        x (l b bar symbol - 2114)
        x (square pondo - 3340)
        x (square rira - 3352)
        x (fullwidth pound sign - FFE1)

I know that this is probably impossible, but I'd also add a compatibility
mapping:

20A4    LIRA SIGN
        ...
        #  00A3
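
The current state of these mappings can be checked with Python's standard
`unicodedata` module.  The sketch below is only illustrative: the folding
table `FOLD_LIRA` and the function `fold_pound_signs` are hypothetical
application-level helpers, not anything defined by Unicode, and they show what
treating U+20A4 as equivalent to U+00A3 might look like in software.

```python
import unicodedata

POUND, LIRA, FULLWIDTH_POUND = "\u00a3", "\u20a4", "\uffe1"

# Show each character's name and its decomposition mapping, if any.
for ch in (POUND, LIRA, FULLWIDTH_POUND):
    mapping = unicodedata.decomposition(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {mapping!r}")

# Hypothetical private folding: LIRA SIGN is treated as POUND SIGN.
FOLD_LIRA = {ord(LIRA): POUND}

def fold_pound_signs(text):
    """Apply NFKC (which folds U+FFE1), then the private lira folding."""
    return unicodedata.normalize("NFKC", text).translate(FOLD_LIRA)

print(fold_pound_signs("\u20a4 5 \uffe1 3"))
```

Only the fullwidth form carries a compatibility mapping to U+00A3 in the
character data; the lira folding here happens outside the standard.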

> and the difference in glyphs shown in the standard simply a 
> reflection of
> the preferences of the designer of the fonts used to print 
> the character
> tables? I'd think so.

Yes, and no.  I think that the choice of fonts for the charts reflects many
editorial needs.  One of these criteria was clearly to choose fonts which
are quite "classic" and neutral (e.g. a roman type for Western scripts).
Another criterion was probably to deliberately show some little difference
between similar characters, in order to distinguish them in indexes (e.g., I
know that this was the reason for choosing a sans-serif font for the KangXi
radicals, as opposed to a more classical font for other Han characters).

But, in some cases, I think that the representative glyph on the charts is
intended as a precise (although not mandatory) indication to type designers.
In this sense, I find the glyphs shown for U+00A3 (POUND SIGN) and U+20A4
(LIRA SIGN) wrong: I'd suggest both glyphs for both characters, separated by
a "|".

_ Marco