date:20020702

Encode THIS in the PUA

2002-07-02 Thread Doug Ewell


To promote the new "Men in Black II" movie, Burger King is handing out
kids' toys with "secret messages" displayed in these glyphs:

http://www.burgerking.com/mibdecoder/

It's a straight cipher for the Latin alphabet, so don't bother
suggesting it for ConScript.  They have a policy against ciphers, even
historic ones like the Utopian "alphabet" originally printed in 1516:

http://www.adh.brighton.ac.uk/schoolofdesign/MA.COURSE/05/LL47.html

ConScript is also not the place to propose ciphers invented for other
recent movies, such as the Mara "alphabet" from "Indiana Jones and the
Temple of Doom":

http://www.mouseplanet.com/al/docs/indy.htm

or the 29-letter Atlantean script from "Atlantis: The Lost Empire":

http://omniglot.com/writing/atlantean.htm

(Note: Unicode hobbyists who go to the Disney site and choose "Character
Gallery" may not find what they expect.)

But of course someone could still encode them in the PUA.  Is anyone
planning to start up that separate PUA mailing list?

-Doug Ewell
 Fullerton, California

Re: Can browsers show text? I don't think so!

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇


>
> > http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html
>
I loaded the beginning of that document, and it looks like just a bunch of
characters from the start of a list of characters in "aiueo-jun" (Japanese
"alphabetical order"). Not a real "document".

Is what you want something like what you can find at www.shodouka.com? Like 
if you are trying to view your message board on an American library 
computer and all you get is mojibake instead of a Japanese message. 
Shodouka will display images for text.

There is a Web site that can do furigana, kind of. (Its mistakes are 
sometimes funny, but if you are a student of Japanese trying to read 
Japanese Web pages, it can be a lot of help.) If you do a search on 
kids.goo.ne.jp, and choose "furigana ari", it will give you your furigana. 
I wonder if there is a romaji version we could use?




Re: RE: Can browsers show text? I don't think so!

2002-07-02 Thread starner


>My point is that it would be great if browsers supported all languages, no
>matter how complicated the language is. Still, even with languages that do
>not require shaping, you have problems. For example, a typical Western
>Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite
>language here) font on his machine. This is a problem that is solved with
>CSS 2. Still, there is not any wide spread support for web fonts in modern
>browsers. I wonder why?

Most people have fonts to display their own language - they came with their
operating system. I'm a Unicode geek, and it doesn't really matter if I can't 
see whether it displays correctly or not. My friends couldn't care less.

>The link below will take you to a web page that shows 500 Japanese
>characters (courtesy of Morisawa Co Ltd) and a fairly large point size
>(18px). 

And I can't change the point size, which sucks. I can install a language
pack, which will let me change the point size, and work for all pages,
whether or not they share the font resources. 

>This scales up very well as well, because pages may share font resources. A
>font with 2000 characters would be 80kb in this case, and would perhaps work
>for hundreds of pages.

And would fail the instant someone added a new character.

>Also, what is it with people and the lack of interest in using fonts. Do
>people actually think that you only need one font, possibly in bold, italic
>and regular style? Do they think that other languages, e.g. Chinese, do not
>use styles? Text should be beautiful to look at too!

But text should be readable first. Typographers will probably flame me for 
this, but for English, there's only two or three distinct readable fonts (with 
a thousand minor variations on the form.) I'd usually prefer to see my serif 
font, instead of some bitmap font someone else chose, as mine will be scalable 
and anti-aliased. Pictures work better than fonts for fancy titles, and are 
already used for that.

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John Cowan


Michael Everson scripsit:

> I have to confess I don't understand what you are talking about at 
> all. Get me them tools, John!

Ligature tables at a high level tell you things like "The glyph 'a'
and the glyph 'acute accent' should be merged to form the glyph
'aacute'."  Internally, though, it reads more like "A #502 followed
by a #397 should be replaced by a #929", where the numbers (or
names, in some contexts) *represent* the actual glyph outlines.
You could write "#202 followed by #999 becomes SHAVIAN PEEP glyph"
without there being any actual outlines for #202 or #999, but as
John says, if something actually called for a #202 to be imaged,
the rendering software would go belly-up.
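
The description above can be sketched in a few lines of Python. This is only an illustration, assuming a toy font in which the ligature table is a dict keyed by pairs of glyph indexes and the outlines live in a separate dict. The index numbers are the ones used in the message; the index 1234 for the ligature glyph, the names LIGATURES and OUTLINES, and both helper functions are hypothetical, and none of this reflects the real binary layout of a GSUB or morx table.

```python
# Toy model only: real GSUB/morx tables are binary structures, not dicts.
# Ligature table: (first glyph index, second glyph index) -> ligature glyph index.
LIGATURES = {
    (502, 397): 929,   # 'a' + 'acute accent' -> 'aacute'
    (202, 999): 1234,  # placeholder indexes -> hypothetical SHAVIAN PEEP glyph
}

# Outlines exist only for some indexes; 202 and 999 deliberately have none.
OUTLINES = {
    502: "outline of a",
    397: "outline of acute",
    929: "outline of aacute",
    1234: "outline of SHAVIAN PEEP",
}


def apply_ligatures(glyphs):
    """Replace each adjacent pair listed in LIGATURES with its ligature glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def render(glyphs):
    """Look up an outline for every glyph index.

    A KeyError here stands in for the renderer failing when asked to
    image an index that has no outline.
    """
    return [OUTLINES[g] for g in glyphs]
```

As long as #202 and #999 only ever appear as a pair, `render(apply_ligatures([202, 999]))` succeeds; `render([202])` raises, which is the failure mode being debated.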

I hope this helps.

-- 
John Cowan  [EMAIL PROTECTED]
At times of peril or dubitation,    http://www.ccil.org/~cowan
Perform swift circular ambulation,  http://www.reutershealth.com
With loud and high-pitched ululation.

Re: (long) Re: Chromatic font research

2002-07-02 Thread Kenneth Whistler


[*groans in the audience*]

I know, I know -- another contribution in the endless thread...

In re:
 
> The Respectfully Experiment 

> I used it as evidence that ideas about what should not be
> included in Unicode can change over a period of time as new scientific
> evidence is discovered.

Having been intimately involved in nearly all the decisions made
about what was included in Unicode over the last 13 years, and also
being formally trained as a scientist, I think I may be qualified
to dispute this conclusion.

Most of the change in ideas about what can be included in Unicode
has been the result of two types of influence:

  A. The encountering of legacy practice in preexisting character
     encodings which had to be accommodated for interoperability
     reasons. This accounts for many, if not all, of the hinky little
     edge cases where Unicode appears to depart from its general
     principles for how to encode characters.

  B. The development of new processing requirements that required
     special kinds of encoded characters. This accounted for strange
     animals such as the bidi format controls, the BOM, the object
     replacement character, and the like.

There is a very narrow window of opportunity for *scientific*
evidence contributing to this -- namely, the result of graphological
analysis of previously poorly studied ancient or minority scripts,
which conceivably could turn up some obscure new principle of writing 
systems that would require Unicode to consider adding a new type of
character to accommodate it. But at this point, with Unicode having managed
to encode everything from Arabic to Mongolian to Han to Khmer..., I
consider it rather unlikely that scientific graphological study is going
to turn up many new fundamental principles here. As a scientific
*hypothesis* I think this surmise is proving to hold up rather well,
as our premier encoder of historic and minority scripts, Michael
Everson, has managed to successfully pull together encoding proposals,
based on current principles in Unicode, for dozens more scripts,
with little difficulty except for that inherent in extracting
information about rather poorly documented writing systems.

> it just seems to me that some
> extra ligature characters in the U+FB.. block would be useful.

Best practice, and near unanimous consensus in the Unicode Technical
Committee and among the correspondents on this list, would be
aligned with exactly the opposite opinion.

> In the
> light of this new evidence, I am wondering whether the decision not to
> encode any new ligatures in regular Unicode could possibly be looked at
> again.

As others have pointed out, "The Respectfully Experiment" did not
constitute new *evidence* of anything in this regard.

In any case, the UTC is quite unlikely to look at that decision again.

The exception that the UTC *has* considered recently was the Arabic
bismillah ligature, and the reason for doing so again was the result
of considering legacy practice. This thing exists in implemented
character encodings as a single encoded character. And furthermore,
it is used as a unitary symbol, in such a way that substituting out
an actual (long) string of Arabic letters and expecting the software
to ligate it correctly precisely in the contexts where it was being
used as a symbol, would place an unnecessary burden on both users and
on software implementations. That is *quite* different from the position
that claims that one, two, or dozens more Latin ligatures of two letters
need to be given standard Unicode encodings.

>if it cannot be done or would cause great anguish and
> arguments, well, that is that, forget it.

Good idea.

--Ken

Re: Can browsers show text? I don't think so!

2002-07-02 Thread Stefan Persson

- Original Message -
From: "Michael Jansson" <[EMAIL PROTECTED]>
To: "'David Starner'" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, July 02, 2002 11:16 PM
Subject: RE: Can browsers show text? I don't think so!

> http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

Let me see... And if you'd like to copy some text from that page and paste
it into some document...?

Stefan


RE: Can browsers show text? I don't think so!

2002-07-02 Thread Murray Sargent


Michael Jansson says:

> There are no technical reasons for why css/html4/xhtml can not produce
> every bit as high quality as any other page layout format.


Sadly this is currently far from the case. HTML/CSS even including CSS3
is far from a professional document publishing format. It doesn't even
have center/right/decimal tabs and tab leaders, which virtually all WP
systems have. The list of DTP omissions goes on and on. Defining their
own XMLs is the direction that WP systems are going in for interchange.
XSLT can be used to translate between these XMLs to the extent that the
features are translatable. XHTML/CSS is only used as a fallback for
browsers.

Which isn't to say that XHTML/CSS isn't cool. It is. But currently it's
a weak DTP format at best.

Murray

RE: Can browsers show text? I don't think so!

2002-07-02 Thread Michael Jansson


See below.

> -Original Message-
> From: David Starner [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 02, 2002 11:50 PM
> To: Michael Jansson; '[EMAIL PROTECTED]'
> Subject: Re: Can browsers show text? I don't think so!
>
> This is text. Changing fonts is just flash. And frankly, no modern browser has
> any trouble with anything from Latin, Cyrillic, Greek, or CJK; Hebrew, Thai,
> Arabic and any script that doesn't shape or combine is usually supported too.
> I'd say they show text just fine.

Are you saying that browsers do support all languages, as long as you exclude
all languages that are not supported? Well ;-)

My point is that it would be great if browsers supported all languages, no
matter how complicated the language is. Still, even with languages that do
not require shaping, you have problems. For example, a typical Western
Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite
language here) font on his machine. This is a problem that is solved with
CSS 2. Still, there is not any wide spread support for web fonts in modern
browsers. I wonder why?

Being able to manually install fonts is not helpful in many cases either.
Mere mortals (you know, "ordinary" people that just want to surf the net)
don't know how to install fonts, nor would they know where to find fonts, or
even know what they are anyway.

> 
> >(I do not consider solutions where you have
> >to download a 10MB+ language package to see a page in a foreign language.
> >It's not a viable solution.)
>
> So you'd rather download the fonts every time you want to view a page,
> rather than just once? It's not like any one can't afford 10MB of space
> anymore.

You would not need 10MB to show a single web page, or even a full web site. 

The link below will take you to a web page that shows 500 Japanese
characters (courtesy of Morisawa Co Ltd) and a fairly large point size
(18px). The file size of the pages and the font data is roughly 20kb (font:
<20kb, page: ~1kb) if you are using a popular browser on Mac OS9 or Windows.
It would take a modem users ~3-7s to load this page. Size is not an issue
w.r.t. web fonts.
http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

This scales up very well as well, because pages may share font resources. A
font with 2000 characters would be 80kb in this case, and would perhaps work
for hundreds of pages. Note also that I would only need to download this data
_once_ (it would stay in the browser cache), and it would be done
without any user interaction (i.e. no manual labour). You click on the link
and you see the text. It should not have to be more complicated than that.
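
The size and load-time figures above can be checked with a little arithmetic. The sketch below is only a rough estimate under stated assumptions: "kb" is taken to mean kilobytes of 1024 bytes, font size is assumed to grow linearly with the number of characters, and the modem is assumed to run at a nominal 28.8 or 56 kbit/s with no protocol overhead. None of these assumptions is stated in the message, and the names KB_PER_GLYPH, font_kilobytes and transfer_seconds are made up for illustration.

```python
KB_PER_GLYPH = 20 / 500  # from the message: a 500-character font is about 20 kB


def font_kilobytes(glyph_count):
    """Estimated font size, assuming size grows linearly with glyph count."""
    return glyph_count * KB_PER_GLYPH


def transfer_seconds(size_kilobytes, link_kbit_per_s):
    """Idealized transfer time: payload bits divided by the nominal line rate."""
    bits = size_kilobytes * 1024 * 8
    return bits / (link_kbit_per_s * 1000)


page_and_font_kb = 21  # ~20 kB of font data plus ~1 kB of page, per the message
for rate in (56, 28.8):  # assumed modem speeds in kbit/s
    print(f"{rate} kbit/s: {transfer_seconds(page_and_font_kb, rate):.1f} s")

print(f"2000 characters: about {font_kilobytes(2000):.0f} kB")
```

Under these assumptions both modem speeds land inside the quoted 3-7 s range, and the linear estimate reproduces the 80kb figure for 2000 characters.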

> 
> >So what we have today are applications called "web browsers" that are very
> >good at showing images, and animations. They are not very good at showing
> >text, other than unformatted English text.
>
> If you want to nitpick, they aren't that good at showing images;
> look at how modern browsers fail the PNG transparency test one of these days.
> And for most animations, you have to download 10MB+ plugins.

OK, so they are good for nothing then... (just kidding ;-)

> 
> Every web browser since the beginning of time has supported at least bold,
> italics and headings. And HTML has become a very common medium for formatted
> text, and not just for English. Yes, they have failures in complex situations
> that haven't had much work in them; no, not every font has or will have every
> language in it. And if you want Adobe Acrobat, you know where to find it; web
> browsing was never intended to give full control over fonts and display to
> the creator of the documents; it was intended to give control over _meaning_.

I think that pretty much sums up people's expectations of web browsers, which
is a shame. There are no technical reasons for why css/html4/xhtml can not
produce every bit as high quality as any other page layout format.

Also, what is it with people and the lack of interest in using fonts. Do
people actually think that you only need one font, possibly in bold, italic
and regular style? Do they think that other languages, e.g. Chinese, do not
use styles? Text should be beautiful to look at too!



Regards,
em2 Solutions
Michael Jansson

Q: Online multilingual text projects and handling missing chars./variants

2002-07-02 Thread Deborah W. Anderson


I am assembling a list of online multilingual text projects, including
online foreign language instruction projects. My current interest is in
projects created at or for the university, but is not limited to this
category.

I was wondering how such real-life projects (if indeed their creators
read this list) currently handle (a) missing Unicode characters, and (b)
being able to specify needed variants of characters. 

I'd be very grateful for any input. 

With many thanks,
Deborah Anderson
Researcher, Dept. of Linguistics
UC Berkeley

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins

On Tuesday, July 2, 2002, at 12:51 PM, Marco Cimarosti wrote:

> The next step could be standardizing the values of the glyph indexes, so
> that the entire "GSUB"/"morx" table can be copied in from a template, and
> type designers can concentrate on drawing the outlines.
>

The typical approach these days is for the tools that provide advanced 
layout table support to be keyed to glyph name.  Apple's tools allow glyph 
name, glyph number, or Unicode code point as glyph identifiers.  As you
say, it makes it possible to cut-and-paste source files and is very handy.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

Re: Can browsers show text? I don't think so!

2002-07-02 Thread David Starner

At 03:39 PM 7/2/02 +0200, Michael Jansson wrote:
>Modern browsers know how to show the characters 'A'-'Z' and a few other
>characters as long as you don't expect to format the text with a specific
>font. You will get into trouble as soon as you want to use a font or
>characters from other languages. You may find a solution for some languages
>and some fonts on some platforms. Yet again, this is far from claiming that
>modern browsers can show text.

This is text. Changing fonts is just flash. And frankly, no modern browser has
any trouble with anything from Latin, Cyrillic, Greek, or CJK; Hebrew, Thai,
Arabic and any script that doesn't shape or combine is usually supported too.
I'd say they show text just fine.

>(I do not consider solutions where you have
>to download a 10MB+ language package to see a page in a foreign language.
>It's not a viable solution.)

So you'd rather download the fonts every time you want to view a page,
rather than just once? It's not like any one can't afford 10MB of space
anymore.

>So what we have today are applications called "web browsers" that are very
>good at showing images, and animations. They are not very good at showing
>text, other than unformatted English text.

If you want to nitpick, they aren't that good at showing images;
look at how modern browsers fail the PNG transparency test one of these days.
And for most animations, you have to download 10MB+ plugins.

Every web browser since the beginning of time has supported at least bold,
italics and headings. And HTML has become a very common medium for formatted
text, and not just for English. Yes, they have failures in complex situations
that haven't had much work in them; no, not every font has or will have every
language in it. And if you want Adobe Acrobat, you know where to find it; web
browsing was never intended to give full control over fonts and display to
the creator of the documents; it was intended to give control over _meaning_.

Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson


At 12:15 -0600 2002-07-02, John H. Jenkins wrote:
>On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote:
>
>>
>>>1) If you map directly from multiple characters to a single glyph, you
>>>don't have to include glyphs in your font for all the "pieces" if they're
>>>never supposed to appear by themselves.  As an extreme example, if I
>>>implemented astral character support via ligating surrogate pairs, I'd
>>>need to include glyphs for the unpaired surrogates.
>>
>>More precisely, you need to have glyph *indexes* that are never mapped
>>to glyphs.  The actual outlines themselves don't need to exist, AFAIK.
>>
>
>True.  I tend to avoid that, because if something goes wrong and the 
>system attempts to actually *display* one of these virtual glyphs, 
>disaster would ensue.  (Dave Opstad and I have had long debates on 
>the safety of doing this.)

I have to confess I don't understand what you are talking about at 
all. Get me them tools, John!
-- 

Michael Everson *** Everson Typography *** http://www.evertype.com

RE: ZWJ and Latin Ligatures

2002-07-02 Thread Marco Cimarosti

John Cowan wrote:
> More precisely, you need to have glyph *indexes* that are never mapped
> to glyphs.  The actual outlines themselves don't need to exist, AFAIK.

Yes, of course. E.g., I guess that the ZWJ "glyph" can be a pseudo-index
which doesn't actually index anything.

The next step could be standardizing the values of the glyph indexes, so
that the entire "GSUB"/"morx" table can be copied in from a template, and
type designers can concentrate on drawing the outlines.
:-)

_ Marco

Re: Inappropriate Proposals FAQ

2002-07-02 Thread Barry Caplan

At 10:01 AM 7/2/2002 -0400, Suzanne M. Topping wrote:
>I have a few ideas for fictional proposals to use as examples (my room
>layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
>use another one or two if anyone feels creative. The closer to being
>believable, the better, I suppose. (An alternative would be to use
>real-life proposals, and state why they were not accepted, but I thought
>it more politic to keep it fictional...)

There was a discussion last year about a symbol to represent pi/2 or pi/4 or something 
like that. If you want to fictionalize that to some other fraction of a mathematical 
constant, that might work (e/2 perhaps?)

Barry Caplan
www.i18n.com

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins

On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote:

>
>> 1) If you map directly from multiple characters to a single glyph, you
>> don't have to include glyphs in your font for all the "pieces" if they're
>> never supposed to appear by themselves.  As an extreme example, if I
>> implemented astral character support via ligating surrogate pairs, I'd
>> need to include glyphs for the unpaired surrogates.
>
> More precisely, you need to have glyph *indexes* that are never mapped
> to glyphs.  The actual outlines themselves don't need to exist, AFAIK.
>

True.  I tend to avoid that, because if something goes wrong and the 
system attempts to actually *display* one of these virtual glyphs, 
disaster would ensue.  (Dave Opstad and I have had long debates on the 
safety of doing this.)

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

Re: Inappropriate Proposals FAQ

2002-07-02 Thread Wm Seán Glen




How about symbols from electronics and hydraulics? Schematic symbols.

Wm Seán Glen

  - Original Message -
  From: Suzanne M. Topping
  To: Unicode (E-mail)
  Sent: Tuesday, 02 July, 2002 7:01
  Subject: Inappropriate Proposals FAQ

  I have a few ideas for fictional proposals to use as examples (my room
  layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
  use another one or two if anyone feels creative.

  Thanks in advance for your input,
  Suzanne Topping
  BizWonk Inc.
  [EMAIL PROTECTED]

Re: Inappropriate Proposals FAQ

2002-07-02 Thread Michael Everson


At 12:38 -0400 2002-07-02, ろ〇〇〇〇 ろ〇〇〇 wrote:
>I have a few ideas:
>
>Fictional scripts that would probably be rejected, such as the 
>script of the Codex Seraphinianus

Certainly not. Tengwar and Cirth are certain to be encoded. The
Codex script would probably not be encoded because it occurs in only 
one manuscript and is undeciphered.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John Cowan


John H. Jenkins scripsit:

> 1) If you map directly from multiple characters to a single glyph, you
> don't have to include glyphs in your font for all the "pieces" if they're
> never supposed to appear by themselves.  As an extreme example, if I 
> implemented astral character support via ligating surrogate pairs, I'd 
> need to include glyphs for the unpaired surrogates.  

More precisely, you need to have glyph *indexes* that are never mapped
to glyphs.  The actual outlines themselves don't need to exist, AFAIK.

-- 
John Cowan   http://www.ccil.org/~cowan   [EMAIL PROTECTED]
To say that Bilbo's breath was taken away is no description at all.  There are
no words left to express his staggerment, since Men changed the language that
they learned of elves in the days when all the world was wonderful. --The Hobbit

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins

On Tuesday, July 2, 2002, at 10:55 AM, Marco Cimarosti wrote:

> I mean: isn't this two-step mapping:
>
>   code point -> glyph ID
>   component glyph ID's -> ligature glyph ID
>
> functionally equivalent to an hypothetical one-step mapping?
>
>   component code points -> ligature glyph ID
>
> Am I missing something?
>

Functionally, the two are equivalent.  There are, however, two subtle 
differences:

1) If you map directly from multiple characters to a single glyph, you
don't have to include glyphs in your font for all the "pieces" if they're
never supposed to appear by themselves.  As an extreme example, if I 
implemented astral character support via ligating surrogate pairs, I'd 
need to include glyphs for the unpaired surrogates.  As it is, Windows and 
the Mac *do* support mapping paired surrogates directly to glyphs, so you 
don't need these extra glyphs which are never seen.

2) A mapping directly from multiple characters to single glyphs expressly 
makes the process something not to percolate up to the UI.  The indirect 
process means that there are some actions in glyph space which *are* 
optional and which the user can turn on and off, and others which aren't.
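
For readers unfamiliar with the surrogate example in point 1: a supplementary-plane ("astral") character is stored in UTF-16 as two code units, a high and a low surrogate. The following is a minimal Python sketch of that arithmetic, using U+10450 purely as an arbitrary supplementary-plane code point; the function name to_surrogates is made up for illustration.

```python
def to_surrogates(code_point):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogates")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogates(0x10450)
print(f"{high:04X} {low:04X}")  # D801 DC50
```

A font that handled such characters by ligating the two halves would need a glyph index for each possible half, which is the cost described in point 1; mapping the pair directly to one glyph avoids it.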

In OpenType, this is less of an issue since this was always the case and 
applications are expected to do the UI work themselves.  In AAT, we 
originally assumed (back in the days of the Technology That Must Not Be 
Named) that all layout features are optional and can be turned on and off,
and that the UI would always reflect the entire suite of available
features.  We had to rewrite our tools to allow for required actions which 
cannot be turned off.

Poor Michael is saddled with older versions of our tools which are hard to 
use and don't let him do this.  We're working on getting newer and better 
ones to him.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

RE: ZWJ and Latin Ligatures

2002-07-02 Thread Marco Cimarosti

Michael Everson wrote:
> At 09:41 -0600 2002-07-02, John H. Jenkins wrote:
> 
> >Alas, but that's technically impossible.  Both OT and AAT (I'm not 
> >sure about Graphite) require that single characters map to single 
> >glyphs, which are then processed.

I am confused by this statement; perhaps some expert in fonts can help me
check my understanding.

The OpenType spec published on the Adobe site states that table GSUB has a
subtable to handle ligatures ("LookupType 4: Ligature Substitution
Subtable": http://partners.adobe.com/asn/developer/opentype/gsub.html#LSF1).

It says that "A Ligature Substitution (LigatureSubst) subtable identifies
ligature substitutions where a single glyph replaces multiple glyphs"
(multiple *glyphs*, not multiple characters).

OK: literally speaking, it is true that OT maps single characters to single
glyphs, but then it maps multiple glyphs to ligature glyphs, so what's the
difference?

I mean: isn't this two-step mapping:

code point -> glyph ID
component glyph ID's -> ligature glyph ID

functionally equivalent to an hypothetical one-step mapping?

component code points -> ligature glyph ID

Am I missing something?
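
The question can be made concrete with a small Python sketch. It assumes a toy font whose character-to-glyph map is one-to-one, so that the ligature rules can be rewritten in terms of characters. The glyph IDs and the names CMAP, GSUB_LIGATURES and ONE_STEP are hypothetical, and the dicts only stand in for the real cmap and GSUB tables.

```python
# Hypothetical glyph IDs for a toy font; real fonts store these in cmap and GSUB.
CMAP = {"f": 10, "i": 11, "l": 12}              # code point -> glyph ID
GSUB_LIGATURES = {(10, 11): 50, (10, 12): 51}   # component glyph IDs -> ligature glyph ID

# The hypothetical one-step table, derived by composing the two tables above.
GLYPH_TO_CHAR = {gid: ch for ch, gid in CMAP.items()}
ONE_STEP = {
    "".join(GLYPH_TO_CHAR[g] for g in components): lig
    for components, lig in GSUB_LIGATURES.items()
}


def two_step(text):
    """cmap lookup per character, then ligature substitution over glyph IDs."""
    glyphs = [CMAP[ch] for ch in text]
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in GSUB_LIGATURES:
            out.append(GSUB_LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def one_step(text):
    """Ligature lookup directly on characters, falling back to cmap."""
    out, i = [], 0
    while i < len(text):
        if text[i:i + 2] in ONE_STEP:
            out.append(ONE_STEP[text[i:i + 2]])
            i += 2
        else:
            out.append(CMAP[text[i]])
            i += 1
    return out
```

For this toy font the two functions return the same glyph run for any input, for example `two_step("fil") == one_step("fil") == [50, 12]`. The sketch says nothing about which components need glyph entries of their own or about which substitutions a user may switch off.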

_ Marco

Re: Inappropriate Proposals FAQ

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇

I have a few ideas:

Fictional scripts that would probably be rejected, such as the script of 
the Codex Seraphinianus

A "fictional" Hanzi (specifically, a Hanzi made up of the "woman" radical 
plus the character for "walk"), which I am attaching a crude image of. The 
proposer either (1) used this character in a novel once (or has seen it 
used in a novel), or (2) he wants to use it as a symbol for the length unit 
of the new system of measurement he invented.



Here is the attachment

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇

Here is an image of the "fake Hanzi" I described in my last E-mail.



fakehanzi.bmp
Description: Windows bitmap

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins

On Tuesday, July 2, 2002, at 09:49 AM, Michael Everson wrote:

> At 09:41 -0600 2002-07-02, John H. Jenkins wrote:
>
>> Alas, but that's technically impossible.  Both OT and AAT (I'm not sure 
>> about Graphite) require that single characters map to single glyphs, 
>> which are then processed.
>
> Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? Surely 
> that is a sequence of characters mapping to a single glyph.
>

Same process.  In OT, of course, you could count on the glyph being 
prenormalized (but this only works for stuff already in Unicode), or you 
could use the GPOS table to properly form the accented form on-the-fly.

But neither technology allows the decomposed sequence to be mapped 
directly to a single glyph.
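
The "prenormalized" case can be illustrated in character space with Python's standard unicodedata module. This is only a sketch of the normalization step itself, not of what any OpenType or AAT layout engine actually runs.

```python
import unicodedata

decomposed = "A\u0301"  # LATIN CAPITAL LETTER A + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))   # 2 1
print(f"U+{ord(composed):04X}")         # U+00C1

# A sequence with no precomposed form in Unicode stays decomposed after NFC,
# so the font still has to handle it (via GSUB or GPOS).
no_precomposed = unicodedata.normalize("NFC", "q\u0301")
print(len(no_precomposed))              # 2
```

The second case is the limitation mentioned above: prenormalization only helps for combinations that already exist as single characters in Unicode.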

> Just goes to show that I don't make proper Unicode fonts yet because the 
> tools just aren't up to snuff.
>

We're working on it.  :-)

>> (In OT, of course, you are also supposed to do some preprocessing in 
>> character space, but that doesn't solve this problem.)  It would be nice 
>> to have a cmap format which maps multiple characters to single glyphs 
>> initially.
>
> I always thought there was. Now I'm really confused as to how I would 
> make a complex Indic syllable.
>

Same sort of thing.  You put the glyph in the font and the instructions 
for what sequence forms it in the GSUB or morx table.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins

On Tuesday, July 2, 2002, at 06:51 AM, Michael Everson wrote:

> That is absolutely true. I have never argued that the only way to turn 
> ligatures on or off is in plain text. I saw that there were difficult 
> edge cases and sought blessing for the ZWJ/ZWNJ mechanism to handle them,
> and won the day. But it would certainly be my view that those should
> only be used where predictable ligation does not occur. A Runic font 
> which had an AAT/OpenType/Graphite ligatures-on mechanism would, in my 
> view, be inappropriate, because ligation is unusual in Runic, never the 
> norm, and should only be used on a case-by-case basis. Runic fonts should 
> have the ZWJ pairs encoded in the glyph tables.
>
>>

Alas, but that's technically impossible.  Both OT and AAT (I'm not sure 
about Graphite) require that single characters map to single glyphs, which 
are then processed.  (In OT, of course, you are also supposed to do some 
preprocessing in character space, but that doesn't solve this problem.)  
It would be nice to have a cmap format which maps multiple characters to 
single glyphs initially.

The way we deal with this is to have the ligatures with the ZWJ inserted 
as part of a ligature table which is on by default and which isn't 
revealed to the UI so that the user can't turn them off.
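
A rough Python sketch of that arrangement follows. It assumes a toy font in which each layout feature carries a "required" flag, and in which only non-required features can be toggled by the user. The glyph IDs, feature names and dict layout are all hypothetical and are not the AAT or OpenType format.

```python
ZWJ = "\u200D"

# Hypothetical glyph IDs and feature layout for a toy font.
CMAP = {"f": 10, "i": 11, ZWJ: 3}
FEATURES = {
    # required: always applied, never shown in the UI
    "zwj_ligatures": {"required": True, "rules": {(10, 3, 11): 50}},
    # optional: the user may toggle it
    "common_ligatures": {"required": False, "rules": {(10, 11): 50}},
}


def substitute(glyphs, components, ligature):
    """Replace every occurrence of the component run with the ligature glyph."""
    out, i, n = [], 0, len(components)
    while i < len(glyphs):
        if tuple(glyphs[i:i + n]) == components:
            out.append(ligature)
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, enabled=()):
    """Map characters to glyphs, then apply required and user-enabled features."""
    glyphs = [CMAP[ch] for ch in text]
    for name, feature in FEATURES.items():
        if not (feature["required"] or name in enabled):
            continue
        for components, ligature in feature["rules"].items():
            glyphs = substitute(glyphs, components, ligature)
    return glyphs
```

With this setup `shape("f" + ZWJ + "i")` always yields the ligature glyph, while plain `shape("fi")` ligates only when "common_ligatures" is passed in `enabled`.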

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/

Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson


At 09:41 -0600 2002-07-02, John H. Jenkins wrote:

>Alas, but that's technically impossible.  Both OT and AAT (I'm not 
>sure about Graphite) require that single characters map to single 
>glyphs, which are then processed.

Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? 
Surely that is a sequence of characters mapping to a single glyph.

Just goes to show that I don't make proper Unicode fonts yet because 
the tools just aren't up to snuff.

>(In OT, of course, you are also supposed to do some preprocessing in 
>character space, but that doesn't solve this problem.)  It would be 
>nice to have a cmap format which maps multiple characters to single 
>glyphs initially.

I always thought there was. Now I'm really confused as to how I would 
make a complex Indic syllable.

>The way we deal with this is to have the ligatures with the ZWJ 
>inserted as part of a ligature table which is on by default and 
>which isn't revealed to the UI so that the user can't turn them off.

I am not sure I understand, but then I haven't been able to make use 
of the AAT ligature tables yet. ;-)
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

RE: Inappropriate Proposals FAQ

2002-07-02 Thread Marco Cimarosti


Suzanne M. Topping wrote:
> I have a few ideas for fictional proposals to use as examples (my room
> layout idea, and Mark's 3-D Mr. Potato Head representation), 
> but I could use another one or two if anyone feels creative.

Today I don't feel very creative, perhaps because deliberately inventing bad
ideas does not appeal too much to my creativeness. :-)

But perhaps I have some suggestions for the less creative part of the FAQ,
which is: listing the existing policies for excluding some classes of
proposals.

In my understanding, a few such policies are:

- No precomposed ligatures which can be encoded using a sequence of existing
characters (possibly joined by ZWJs);

- No precomposed "accented characters" which can be composed using an
existing character and one or more existing combining diacritics;

- No clones of existing characters whose sole purpose is making a *logical*
differentiation from some existing characters (e.g., hex digits looking
identical to existing characters "0..9" and "A...F"; or a symbol for "meter"
looking identical to Latin "m");

- No clones of existing characters whose sole purpose is making a
*graphical* differentiation from some existing characters (e.g., a Serbian
letter "t", disunified from Russian on the basis that italics looks
different in the two languages);

- No presentation glyphs for shapes that can already be obtained using
regular characters in conjunction with ZWJ or ZWNJ.

_ Marco

Re: Inappropriate Proposals FAQ

2002-07-02 Thread john . colby


But would not using rejected proposals (as well as the fictional ones) be closer to 
the truth and therefore more accurate?

John

>  from:"Suzanne M. Topping" <[EMAIL PROTECTED]>
>  date:Tue, 02 Jul 2002 15:01:16
>  to:  [EMAIL PROTECTED]
>  subject: Re: Inappropriate Proposals FAQ
> 
> (An alternative would be to use
> real-life proposals, and state why they were not accepted, but I thought
> it more politic to keep it fictional...)
>

Inappropriate Proposals FAQ

2002-07-02 Thread Suzanne M. Topping


As no good deed goes unpunished, my suggestion re. an FAQ entry
regarding inappropriate candidates for encoding resulted in my being
asked to begin a draft.

I see the need for perhaps two entries: one which states clearly what
Unicode is NOT, and another which lists a few examples of inappropriate
proposals and why they would not be considered. This section would
probably refer to the "what Unicode isn't" entry for support of the
"why"s.

I have a few ideas for fictional proposals to use as examples (my room
layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
use another one or two if anyone feels creative. The closer to being
believable, the better, I suppose. (An alternative would be to use
real-life proposals, and state why they were not accepted, but I thought
it more politic to keep it fictional...)

I'm also looking for key points to include in the "what Unicode isn't"
section, and would appreciate input. I'm particularly looking for issues
that have created ongoing repetitive arguments, since the goal of the
FAQ entries is to help eliminate them. 

Thanks in advance for your input,

Suzanne Topping
BizWonk Inc.

[EMAIL PROTECTED]

Can browsers show text? I don't think so!

2002-07-02 Thread Michael Jansson


Postings on this list have recently touched on the topic of using various
languages in web pages. Comments have been made on the use of embedded fonts
(eot and pfr), as well as the lack of support for these font formats in
popular browsers. This is a topic which I am very enthusiastic about, so I
cannot help but add a few comments myself. 

Let me start by posing a question: 
"Can modern browsers show text?"
Specifically, can they show text of any language and formatting on all
platforms? I have to say: no, they cannot (possibly with the exception of
the browser Nophus). 

The problem with browsers today is that although they may support Unicode
encoding schemes (e.g. UTF-8), they typically rely on the platform/OS they
run on to show text. Platforms without complete Unicode 3.x support will thus
not be able to show text correctly. For example, IE6 (or any other modern
browser) supports UTF-8 but Win98 does not support Unicode 3.x. IE6 is thus
not able to show Unicode text on Win98. You may of course be able to show
some Unicode text on some platforms. This is far from claiming that a
browser supports Unicode though. At most, you may claim that a browser on a
particular platform supports some part of Unicode. 

Furthermore, even if a browser knew how to render text (e.g. knew about
the nitty-gritty details of glyph ordering, positioning and shaping that are
language specific), you need something called a font to show text. Fonts can
be provided as web resources through CSS 2, through a construct known as
@font-face rules. However, there is no browser that fully supports CSS 2
today, and in particular @font-face rules. There are browsers that support
@font-face on some platforms (e.g. for eot files on Windows). Again, this
is far from claiming that a browser supports fonts on the web.

Modern browsers know how to show the characters 'A'-'Z' and a few other
characters as long as you don't expect to format the text with a specific
font. You will get into trouble as soon as you want to use a font or
characters from other languages. You may find a solution for some languages
and some fonts on some platforms. Yet again, this is far from claiming that
modern browsers can show text. (I do not consider solutions where you have
to download a 10MB+ language package to see a page in a foreign language.
It's not a viable solution.)

So what we have today are applications called "web browsers" that are very
good at showing images, and animations. They are not very good at showing
text, other than unformatted English text.

Fortunately, there are third-party solutions to work around some of the
problems I mention above: Bitstream's "FontPlayer" (for pfr fonts for IE 5.x
and Nav 4.x on Windows), MS Typography's WEFT tools (for eot fonts in IE 5.x
on Windows), and our own FAIRY server solution (for eot fonts and language
support in IE 5.x, Nav 4.x, Nav 6.x and Opera 5.x on Mac and Win). 

I do admire the work that people have done in creating quite outstanding web
browsers through the years, sometimes with no other reward than people's
appreciation. I only wish that time were spent on supporting text, and not
just flashy content.


Regards,
em2 Solutions
Michael Jansson

Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson

At 11:00 -0600 2002-07-01, John H. Jenkins wrote:

>I guess one thing that's frustrating for me personally in this 
>perennial discussion is the creation of this false dichotomy, that 
>ligation control either *must* be in plain text or *must* be 
>expressly forbidden in plain text.  I would agree, Michael, that 
>your arguments that some degree of ligation control belongs in plain 
>text were unanswerable.  You did a good job there.  But at the same 
>time, I've never heard you argue that the only way to turn ligatures 
>on or off is in plain text.

That is absolutely true. I have never argued that the only way to 
turn ligatures on or off is in plain text. I saw that there were 
difficult edge cases and sought blessing for the ZWJ/ZWNJ mechanism 
to handle them, and won the day. But it would certainly be my view 
that those should only be used where predictable ligation does not 
occur. A Runic font which had an AAT/OpenType/Graphite ligatures-on 
mechanism would, in my view, be inappropriate, because ligation is 
unusual in Runic, never the norm, and should only be used on a 
case-by-case basis. Runic fonts should have the ZWJ pairs encoded in 
the glyph tables.

>And under no circumstances should new Latin ligatures be added to Unicode.

I agree.

I wonder if it wouldn't be useful at some stage for me to pick the 
best bits out of my papers and do them up as a Unicode Technical Note.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Radicals in CNS 11643-1992, Plane 1, Rows 7,8,9

2002-07-02 Thread Torsten Mohrin


"John H. Jenkins" <[EMAIL PROTECTED]> wrote:

>Use the KangXi radicals in the KangXi radical block (U+2Fxx).

Hmm, that is pretty obvious. I should have noted that myself. Thanks!
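
For reference, a small sketch of what the KangXi radical block looks like from Python's standard `unicodedata` module. The variable names are only illustrative, and mapping the CNS rows to these code points is left to the reader.

```python
import unicodedata

radical = "\u2f00"  # first code point of the KangXi Radicals block

# Each radical has its own character name ...
name = unicodedata.name(radical)  # "KANGXI RADICAL ONE"

# ... and a compatibility mapping to the corresponding unified ideograph,
# so NFKC normalization replaces the radical with U+4E00.
unified = unicodedata.normalize("NFKC", radical)

# The block runs from U+2F00 to U+2FD5, one code point per radical.
radical_count = 0x2FD5 - 0x2F00 + 1  # 214
```

Text that must keep the radicals distinct from the ideographs should therefore avoid compatibility normalization (NFKC/NFKD).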

--Torsten
