Re: Radicals in CNS 11643-1992, Plane 1, Rows 7,8,9

2002-07-02 Thread Torsten Mohrin

John H. Jenkins [EMAIL PROTECTED] wrote:

Use the KangXi radicals in the KangXi radical block (U+2Fxx).

Hmm, that is pretty obvious. I should have noted that myself. Thanks!

--Torsten





Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson

At 11:00 -0600 2002-07-01, John H. Jenkins wrote:

I guess one thing that's frustrating for me personally in this 
perennial discussion is the creation of this false dichotomy, that 
ligation control either *must* be in plain text or *must* be 
expressly forbidden in plain text.  I would agree, Michael, that 
your arguments that some degree of ligation control belongs in plain 
text were unanswerable.  You did a good job there.  But at the same 
time, I've never heard you argue that the only way to turn ligatures 
on or off is in plain text.

That is absolutely true. I have never argued that the only way to 
turn ligatures on or off is in plain text. I saw that there were 
difficult edge cases and sought blessing for the ZWJ/ZWNJ mechanism 
to handle them, and won the day. But it would certainly be my view 
that those should only be used where predictable ligation does not 
occur. A Runic font which had an AAT/OpenType/Graphite ligatures-on 
mechanism would, in my view, be inappropriate, because ligation is 
unusual in Runic, never the norm, and should only be used on a 
case-by-case basis. Runic fonts should have the ZWJ pairs encoded in 
the glyph tables.

And under no circumstances should new Latin ligatures be added to Unicode.

I agree.

I wonder if it wouldn't be useful at some stage for me to pick the 
best bits out of my papers and do them up as a Unicode Technical Note.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Can browsers show text? I don't think so!

2002-07-02 Thread Michael Jansson

Postings on this list have recently touched on the topic of using various
languages in web pages. Comments have been made on the use of embedded fonts
(eot and pfr), as well as the lack of support for these font formats in
popular browsers. This is a topic which I am very enthusiastic about, so I
cannot help but add a few comments myself. 

Let me start by posing a question: 
Can modern browsers show text?
Specifically, can they show text of any language and formatting on all
platforms? I have to say: no, they cannot (possibly with the exception of
the browser Nophus). 

The problem with browsers today is that although they may support Unicode
encoding schemes (e.g. UTF8), they typically rely on the platform/OS they
run on to show text. Platforms without complete Unicode 3.x support will thus
not be able to show text correctly. For example, IE6 (or any other modern
browser) supports UTF8 but Win98 does not support Unicode 3.x. IE6 is thus
not able to show all Unicode text on Win98. You may of course be able to show
some Unicode text on some platforms. This is far from claiming that a
browser supports Unicode though. At most, you may claim that a browser on a
particular platform supports some part of Unicode. 

Furthermore, even if a browser knew how to render text (e.g. knew about
the nitty-gritty details of glyph ordering, positioning and shaping that are
language specific), you need something called a font to show text. Fonts can
be provided as web resources through CSS 2, through a construct known as
@font-face rules. However, there is no browser that fully supports CSS 2
today, and in particular @font-face rules. There are browsers that support
@font-face on some platforms (e.g. for eot files on Windows). Again, this
is far from claiming that a browser supports fonts on the web.

Modern browsers know how to show the characters 'A'-'Z' and a few other
characters as long as you don't expect to format the text with a specific
font. You will get into trouble as soon as you want to use a font or
characters from other languages. You may find a solution for some languages
and some fonts on some platforms. Yet again, this is far from claiming that
modern browsers can show text. (I do not consider solutions where you have
to download a 10MB+ language package to see a page in a foreign language.
It's not a viable solution.)

So what we have today are applications called web browsers that are very
good at showing images, and animations. They are not very good at showing
text, other than unformatted English text.

Fortunately, there are third party solutions to work around some of the
problems I mention above: Bitstream's FontPlayer (for pfr fonts for IE 5.x
and Nav 4.x on Windows), MS Typography's WEFT tools (for eot fonts in IE 5.x
on Windows), and our own FAIRY server solution (for eot fonts and language
support in IE 5.x, Nav 4.x, Nav 6.x and Opera 5.x on Mac and Win). 

I do admire the work that people have done in creating quite outstanding web
browsers through the years, sometimes with no other reward than people's
appreciation. I only wish that time were spent on supporting text, and not
just flashy content.


Regards,
em2 Solutions
Michael Jansson




Inappropriate Proposals FAQ

2002-07-02 Thread Suzanne M. Topping

As no good deed goes unpunished, my suggestion re. an FAQ entry
regarding inappropriate candidates for encoding resulted in my being
asked to begin a draft.

I see the need for perhaps two entries: one which states clearly what
Unicode is NOT, and another which lists a few examples of inappropriate
proposals and why they would not be considered. This section would
probably refer to the "what Unicode isn't" entry for support of the
whys.

I have a few ideas for fictional proposals to use as examples (my room
layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
use another one or two if anyone feels creative. The closer to being
believable, the better, I suppose. (An alternative would be to use
real-life proposals, and state why they were not accepted, but I thought
it more politic to keep it fictional...)

I'm also looking for key points to include in the "what Unicode isn't"
section, and would appreciate input. I'm particularly looking for issues
that have created ongoing repetitive arguments, since the goal of the
FAQ entries is to help eliminate them. 

Thanks in advance for your input,

Suzanne Topping
BizWonk Inc.

[EMAIL PROTECTED]




Re: Inappropriate Proposals FAQ

2002-07-02 Thread john . colby

But would not using rejected proposals (as well as the fictional ones) be closer to 
the truth and therefore more accurate?

John

  from:Suzanne M. Topping [EMAIL PROTECTED]
  date:Tue, 02 Jul 2002 15:01:16
  to:  [EMAIL PROTECTED]
  subject: Re: Inappropriate Proposals FAQ
 
 (An alternative would be to use
 real-life proposals, and state why they were not accepted, but I thought
 it more politic to keep it fictional...)
 





RE: Inappropriate Proposals FAQ

2002-07-02 Thread Marco Cimarosti

Suzanne M. Topping wrote:
 I have a few ideas for fictional proposals to use as examples (my room
 layout idea, and Mark's 3-D Mr. Potato Head representation), 
 but I could use another one or two if anyone feels creative.

Today I don't feel very creative, perhaps because deliberately inventing bad
ideas does not appeal too much to my creativeness. :-)

But perhaps I have some suggestions for the less creative part of the FAQ,
which is: listing the existing policies for excluding some classes of
proposals.

In my understanding, a few such policies are:

- No precomposed ligatures which can be encoded using a sequence of existing
characters (possibly joined by ZWJs);

- No precomposed accented characters which can be composed using an
existing character and one or more existing combining diacritics;

- No clones of existing characters whose sole purpose is making a *logical*
differentiation from some existing characters (e.g., hex digits looking
identical to existing characters 0..9 and A...F; or a symbol for meter
looking identical to Latin m);

- No clones of existing characters whose sole purpose is making a
*graphical* differentiation from some existing characters (e.g., a Serbian
letter t, disunified from Russian on the basis that italics looks
different in the two languages);

- No presentation glyphs for shapes that can already be obtained using
regular characters in conjunction with ZWJ or ZWNJ.

_ Marco
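
The second policy in the list above rests on canonical equivalence, which can be checked with Python's standard `unicodedata` module. This is only a small sketch; the sample letters are chosen arbitrarily and are not taken from any actual proposal.

```python
import unicodedata

# A base letter followed by a combining diacritic: two code points.
decomposed = "A\u0301"     # LATIN CAPITAL LETTER A + COMBINING ACUTE ACCENT
precomposed = "\u00C1"     # LATIN CAPITAL LETTER A WITH ACUTE (already encoded)

# The two spellings are canonically equivalent, so normalization maps
# each onto the other.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed, nfd == decomposed)

# A combination with no precomposed code point simply stays a sequence;
# under the policy, this sequence is the encoding of the accented letter.
q_acute = unicodedata.normalize("NFC", "q\u0301")
print([unicodedata.name(ch) for ch in q_acute])
```

The last line prints two character names, because no single code point exists for q with acute and none is needed.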




Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson

At 09:41 -0600 2002-07-02, John H. Jenkins wrote:

Alas, but that's technically impossible.  Both OT and AAT (I'm not 
sure about Graphite) require that single characters map to single 
glyphs, which are then processed.

Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? 
Surely that is a sequence of characters mapping to a single glyph.

Just goes to show that I don't make proper Unicode fonts yet because 
the tools just aren't up to snuff.

(In OT, of course, you are also supposed to do some preprocessing in 
character space, but that doesn't solve this problem.)  It would be 
nice to have a cmap format which maps multiple characters to single 
glyphs initially.

I always thought there was. Now I'm really confused as to how I would 
make a complex Indic syllable.

The way we deal with this is to have the ligatures with the ZWJ 
inserted as part of a ligature table which is on by default and 
which isn't revealed to the UI so that the user can't turn them off.

I am not sure I understand, but then I haven't been able to make use 
of the AAT ligature tables yet. ;-)
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins


On Tuesday, July 2, 2002, at 06:51 AM, Michael Everson wrote:

 That is absolutely true. I have never argued that the only way to turn 
 ligatures on or off is in plain text. I saw that there were difficult 
 edge cases and sought blessing for the ZWJ/ZWNJ mechanism to handle them,
  and won the day. But it would certainly be my view that those should 
 only be used where predictable ligation does not occur. A Runic font 
 which had an AAT/OpenType/Graphite ligatures-on mechanism would, in my 
 view, be inappropriate, because ligation is unusual in Runic, never the 
 norm, and should only be used on a case-by-case basis. Runic fonts should 
 have the ZWJ pairs encoded in the glyph tables.



Alas, but that's technically impossible.  Both OT and AAT (I'm not sure 
about Graphite) require that single characters map to single glyphs, which 
are then processed.  (In OT, of course, you are also supposed to do some 
preprocessing in character space, but that doesn't solve this problem.)  
It would be nice to have a cmap format which maps multiple characters to 
single glyphs initially.

The way we deal with this is to have the ligatures with the ZWJ inserted 
as part of a ligature table which is on by default and which isn't 
revealed to the UI so that the user can't turn them off.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/





Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins


On Tuesday, July 2, 2002, at 09:49 AM, Michael Everson wrote:

 At 09:41 -0600 2002-07-02, John H. Jenkins wrote:

 Alas, but that's technically impossible.  Both OT and AAT (I'm not sure 
 about Graphite) require that single characters map to single glyphs, 
 which are then processed.

 Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? Surely 
 that is a sequence of characters mapping to a single glyph.


Same process.  In OT, of course, you could count on the glyph being 
prenormalized (but this only works for stuff already in Unicode), or you 
could use the GPOS table to properly form the accented form on-the-fly.

But neither technology allows the decomposed sequence to be mapped 
directly to a single glyph.

 Just goes to show that I don't make proper Unicode fonts yet because the 
 tools just aren't up to snuff.


We're working on it.  :-)

 (In OT, of course, you are also supposed to do some preprocessing in 
 character space, but that doesn't solve this problem.)  It would be nice 
 to have a cmap format which maps multiple characters to single glyphs 
 initially.

 I always thought there was. Now I'm really confused as to how I would 
 make a complex Indic syllable.


Same sort of thing.  You put the glyph in the font and the instructions 
for what sequence forms it in the GSUB or morx table.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/





Here is the attachment

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇

Here is an image of the "fake Hanzi" I described in my last E-mail.


fakehanzi.bmp
Description: Windows bitmap


Re: Inappropriate Proposals FAQ

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇

I have a few ideas:

Fictional scripts that would probably be rejected, such as the script of 
the Codex Seraphinianus

A "fictional" Hanzi (specifically, a Hanzi made up of the "woman" radical 
plus the character for "walk"), which I am attaching a crude image of. The 
proposer either (1) used this character in a novel once (or has seen it 
used in a novel), or (2) he wants to use it as a symbol for the length unit 
of the new system of measurement he invented.




RE: ZWJ and Latin Ligatures

2002-07-02 Thread Marco Cimarosti

Michael Everson wrote:
 At 09:41 -0600 2002-07-02, John H. Jenkins wrote:
 
 Alas, but that's technically impossible.  Both OT and AAT (I'm not 
 sure about Graphite) require that single characters map to single 
 glyphs, which are then processed.

I am confused by this statement; perhaps some expert in fonts can help me
check my understanding.

The OpenType spec published on the Adobe site states that the GSUB table has a
subtable to handle ligatures (LookupType 4: Ligature Substitution
Subtable: http://partners.adobe.com/asn/developer/opentype/gsub.html#LSF1).

It says that "A Ligature Substitution (LigatureSubst) subtable identifies
ligature substitutions where a single glyph replaces multiple glyphs"
(multiple *glyphs*, not multiple characters).

OK: literally speaking, it is true that OT maps single characters to single
glyphs, but then it maps multiple glyphs to ligature glyphs, so what's the
difference?

I mean: isn't this two-step mapping:

code point -> glyph ID
component glyph ID's -> ligature glyph ID

functionally equivalent to a hypothetical one-step mapping?

component code points -> ligature glyph ID

Am I missing something?

_ Marco
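
The two-step and one-step mappings asked about above can be compared in a toy model. This is a sketch only: the glyph IDs, the tables `CMAP` and `LIGATURES`, and both function names are invented for illustration and do not reflect the actual binary layout of the cmap, GSUB or morx tables.

```python
# Hypothetical toy font data (all values invented for illustration).
CMAP = {0x0066: 10, 0x0069: 11, 0x200D: 12}   # 'f', 'i', ZWJ -> glyph IDs
LIGATURES = {(10, 12, 11): 20}                # f + ZWJ + i glyphs -> ligature glyph


def substitute(items, table, fallback):
    """Greedy left-to-right replacement of sequences found in `table`."""
    out, i = [], 0
    while i < len(items):
        for seq, ligature in table.items():
            if tuple(items[i:i + len(seq)]) == seq:
                out.append(ligature)
                i += len(seq)
                break
        else:
            out.append(fallback(items[i]))
            i += 1
    return out


def shape_two_step(text):
    """Code point -> glyph ID, then component glyph IDs -> ligature glyph ID."""
    glyphs = [CMAP[ord(ch)] for ch in text]
    return substitute(glyphs, LIGATURES, lambda glyph: glyph)


# The hypothetical one-step table, derived by inverting the toy cmap.
GLYPH_TO_CODE_POINT = {glyph: cp for cp, glyph in CMAP.items()}
DIRECT = {
    tuple(GLYPH_TO_CODE_POINT[g] for g in components): ligature
    for components, ligature in LIGATURES.items()
}


def shape_one_step(text):
    """Component code points -> ligature glyph ID, in a single lookup."""
    code_points = [ord(ch) for ch in text]
    return substitute(code_points, DIRECT, lambda cp: CMAP[cp])


for sample in ("f\u200di", "fi", "if\u200dii"):
    print(shape_two_step(sample), shape_one_step(sample))
```

In this toy model both functions return the same glyph run for every input, which matches the functional equivalence suggested in the question. Note that the two-step version only works because `CMAP` contains an entry for every component, including ZWJ.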




Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins


On Tuesday, July 2, 2002, at 10:55 AM, Marco Cimarosti wrote:

 I mean: isn't this two-step mapping:

   code point -> glyph ID
   component glyph ID's -> ligature glyph ID

 functionally equivalent to a hypothetical one-step mapping?

   component code points -> ligature glyph ID

 Am I missing something?


Functionally, the two are equivalent.  There are, however, two subtle 
differences:

1) If you map directly from multiple characters to a single glyph, you
don't have to include glyphs in your font for all the pieces if they're 
never supposed to appear by themselves.  As an extreme example, if I 
implemented astral character support via ligating surrogate pairs, I'd 
need to include glyphs for the unpaired surrogates.  As it is, Windows and 
the Mac *do* support mapping paired surrogates directly to glyphs, so you 
don't need these extra glyphs which are never seen.

2) A mapping directly from multiple characters to single glyphs expressly 
makes the process something not to percolate up to the UI.  The indirect 
process means that there are some actions in glyph space which *are* 
optional and which the user can turn on and off, and others which aren't.

In OpenType, this is less of an issue since this was always the case and 
applications are expected to do the UI work themselves.  In AAT, we 
originally assumed (back in the days of the Technology That Must Not Be 
Named) that all layout features are optional and can be turned on and off,
  and that the UI would always reflect the entire suite of available 
features.  We had to rewrite our tools to allow for required actions which 
cannot be turned off.

Poor Michael is saddled with older versions of our tools which are hard to 
use and don't let him do this.  We're working on getting newer and better 
ones to him.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
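
Point 1 in the message above can be illustrated with the UTF-16 surrogate arithmetic. The sketch below assumes a toy `CMAP` keyed by whole code points; that table, the glyph ID 5 and the function names are hypothetical, and real fonts store this mapping in binary cmap subtables.

```python
def to_surrogates(code_point):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogates(high, low):
    """Recombine a surrogate pair into the code point it encodes."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Hypothetical cmap keyed by whole code points. U+10330 GOTHIC LETTER AHSA
# has a glyph; the lone surrogates need no entries of their own.
CMAP = {0x10330: 5}
NOTDEF = 0


def glyphs_for_utf16(units):
    """Pair up surrogates first, then look up one glyph per code point."""
    out, i = [], 0
    while i < len(units):
        unit = units[i]
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if 0xD800 <= unit <= 0xDBFF and has_low:
            out.append(CMAP.get(from_surrogates(unit, units[i + 1]), NOTDEF))
            i += 2
        else:
            out.append(CMAP.get(unit, NOTDEF))
            i += 1
    return out


high, low = to_surrogates(0x10330)
print(hex(high), hex(low))              # 0xd800 0xdf30
print(glyphs_for_utf16([high, low]))    # [5]
```

Because the pair is recombined before the lookup, the toy font never needs glyphs for unpaired surrogates, which is the saving described in point 1.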





Re: ZWJ and Latin Ligatures

2002-07-02 Thread John Cowan

John H. Jenkins scripsit:

 1) If you map directly from multiple characters to a single glyph, you
 don't have to include glyphs in your font for all the pieces if they're 
 never supposed to appear by themselves.  As an extreme example, if I 
 implemented astral character support via ligating surrogate pairs, I'd 
 need to include glyphs for the unpaired surrogates.  

More precisely, you need to have glyph *indexes* that are never mapped
to glyphs.  The actual outlines themselves don't need to exist, AFAIK.

-- 
John Cowan   http://www.ccil.org/~cowan   [EMAIL PROTECTED]
To say that Bilbo's breath was taken away is no description at all.  There are
no words left to express his staggerment, since Men changed the language that
they learned of elves in the days when all the world was wonderful. --The Hobbit




Re: Inappropriate Proposals FAQ

2002-07-02 Thread Michael Everson

At 12:38 -0400 2002-07-02, ろ〇〇〇〇 ろ〇〇〇 wrote:
I have a few ideas:

Fictional scripts that would probably be rejected, such as the 
script of the Codex Seraphinianus

Certainly not. Tengwar and Cirth  are certain to be encoded. The 
Codex script would probably not be encoded because it occurs in only 
one manuscript and is undeciphered.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Inappropriate Proposals FAQ

2002-07-02 Thread Wm Seán Glen



How about symbols from electronics and hydraulics? Schematic 
symbols.
Wm Seán Glen

  - Original Message -
  From: Suzanne M. Topping
  To: Unicode (E-mail)
  Sent: Tuesday, 02 July, 2002 7:01
  Subject: Inappropriate Proposals FAQ

  I have a few ideas for fictional proposals to use as examples (my room
  layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
  use another one or two if anyone feels creative.

  Thanks in advance for your input,

  Suzanne Topping
  BizWonk Inc.
  [EMAIL PROTECTED]


Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins


On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote:


 1) If you map directly from multiple characters to a single glyph, you
 don't have to include glyphs in your font for all the pieces if they're
 never supposed to appear by themselves.  As an extreme example, if I
 implemented astral character support via ligating surrogate pairs, I'd
 need to include glyphs for the unpaired surrogates.

 More precisely, you need to have glyph *indexes* that are never mapped
 to glyphs.  The actual outlines themselves don't need to exist, AFAIK.


True.  I tend to avoid that, because if something goes wrong and the 
system attempts to actually *display* one of these virtual glyphs, 
disaster would ensue.  (Dave Opstad and I have had long debates on the 
safety of doing this.)

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/





Re: Inappropriate Proposals FAQ

2002-07-02 Thread Barry Caplan

At 10:01 AM 7/2/2002 -0400, Suzanne M. Topping wrote:
I have a few ideas for fictional proposals to use as examples (my room
layout idea, and Mark's 3-D Mr. Potato Head representation), but I could
use another one or two if anyone feels creative. The closer to being
believable, the better, I suppose. (An alternative would be to use
real-life proposals, and state why they were not accepted, but I thought
it more politic to keep it fictional...)


There was a discussion last year about a symbol to represent pi/2 or pi/4 or something 
like that. If you want to fictionalize that to some other fraction of a mathematical 
constant, that might work (e/2 perhaps?)

Barry Caplan
www.i18n.com





RE: ZWJ and Latin Ligatures

2002-07-02 Thread Marco Cimarosti

John Cowan wrote:
 More precisely, you need to have glyph *indexes* that are never mapped
 to glyphs.  The actual outlines themselves don't need to exist, AFAIK.

Yes, of course. E.g., I guess that the ZWJ glyph can be a pseudo-index
which doesn't actually index anything.

The next step could be standardizing the values of the glyph indexes, so
that the entire GSUB/morx table can be copied in from a template, and
type designers can concentrate on drawing the outlines.
:-)

_ Marco




Re: ZWJ and Latin Ligatures

2002-07-02 Thread Michael Everson

At 12:15 -0600 2002-07-02, John H. Jenkins wrote:
On Tuesday, July 2, 2002, at 11:39 AM, John Cowan wrote:


1) If you map directly from multiple characters to a single glyph, you
don't have to include glyphs in your font for all the pieces if they're
never supposed to appear by themselves.  As an extreme example, if I
implemented astral character support via ligating surrogate pairs, I'd
need to include glyphs for the unpaired surrogates.

More precisely, you need to have glyph *indexes* that are never mapped
to glyphs.  The actual outlines themselves don't need to exist, AFAIK.


True.  I tend to avoid that, because if something goes wrong and the 
system attempts to actually *display* one of these virtual glyphs, 
disaster would ensue.  (Dave Opstad and I have had long debates on 
the safety of doing this.)

I have to confess I don't understand what you are talking about at 
all. Get me them tools, John!
-- 

Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Can browsers show text? I don't think so!

2002-07-02 Thread David Starner

At 03:39 PM 7/2/02 +0200, Michael Jansson wrote:
Modern browsers know how to show the characters 'A'-'Z' and a few other
characters as long as you don't expect to format the text with a specific
font. You will get into trouble as soon as you want to use a font or
characters from other languages. You may find a solution for some languages
and some fonts on some platforms. Yet again, this is far from claiming that
modern browsers can show text.

This is text. Changing fonts is just flash. And frankly, no modern browser has
any trouble with anything from Latin, Cyrillic, Greek, or CJK; Hebrew, Thai,
Arabic and any script that doesn't shape or combine is usually supported too.
I'd say they show text just fine.

(I do not consider solutions where you have
to download a 10MB+ language package to see a page in a foreign language.
It's not a viable solution.)

So you'd rather download the fonts every time you want to view a page,
rather than just once? It's not like any one can't afford 10MB of space
anymore.

So what we have today are applications called web browsers that are very
good at showing images, and animations. They are not very good at showing
text, other than unformatted English text.

If you want to nitpick, they aren't that good at showing images;
look at how modern browsers fail the PNG transparency test one of these days.
And for most animations, you have to download 10MB+ plugins.

Every web browser since the beginning of time has supported at least bold,
italics and headings. And HTML has become a very common medium for formatted
text, and not just for English. Yes, they have failures in complex situations
that haven't had much work in them; no, not every font has or will have every
language in it. And if you want Adobe Acrobat, you know where to find it; web
browsing was never intended to give full control over fonts and display to
the creator of the documents; it was intended to give control over _meaning_.





Re: ZWJ and Latin Ligatures

2002-07-02 Thread John H. Jenkins


On Tuesday, July 2, 2002, at 12:51 PM, Marco Cimarosti wrote:

 The next step could be standardizing the values of the glyph indexes, so
 that the entire GSUB/morx table can be copied in from a template, and
 type designers can concentrate on drawing the outlines.


The typical approach these days is for the tools that provide advanced 
layout table support to be keyed to glyph name.  Apple's tools allow glyph 
name, glyph number, or Unicode code point as glyph identifiers.  As you 
say, it makes it possible to cut-and-paste source files and is very handy.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
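
As a rough illustration of why keying layout rules to glyph names makes source files reusable across fonts, consider the sketch below. The glyph orders, glyph names and the `compile_rules` helper are all hypothetical; this is not the source format of Apple's tools or of any other tool.

```python
# Two hypothetical fonts that contain the same glyphs in different orders.
FONT_A_ORDER = [".notdef", "f", "i", "zwj", "f_i"]
FONT_B_ORDER = [".notdef", "zwj", "i", "f_i", "f"]

# One shared rule source, keyed by glyph name rather than glyph number.
RULES_BY_NAME = [(("f", "zwj", "i"), "f_i")]


def compile_rules(glyph_order, rules):
    """Resolve name-keyed ligature rules to this font's own glyph IDs."""
    index = {name: glyph_id for glyph_id, name in enumerate(glyph_order)}
    return {
        tuple(index[name] for name in components): index[ligature]
        for components, ligature in rules
    }


print(compile_rules(FONT_A_ORDER, RULES_BY_NAME))   # {(1, 3, 2): 4}
print(compile_rules(FONT_B_ORDER, RULES_BY_NAME))   # {(4, 1, 2): 3}
```

The same rule text compiles to different numeric tables for each font, so the glyph indexes themselves never need to be standardized.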





Q: Online multilingual text projects and handling missing chars./variants

2002-07-02 Thread Deborah W. Anderson

I am assembling a list of online multilingual text projects, including
online foreign language instruction projects. My current interest is in
projects created at or for the university, but is not limited to this
category.

I was wondering how such real-life projects (if indeed their creators
read this list) currently handle (a) missing Unicode characters, and (b)
being able to specify needed variants of characters. 

I'd be very grateful for any input. 

With many thanks,
Deborah Anderson
Researcher, Dept. of Linguistics
UC Berkeley







RE: Can browsers show text? I don't think so!

2002-07-02 Thread Michael Jansson

See below.

 -Original Message-
 From: David Starner [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, July 02, 2002 11:50 PM
 To: Michael Jansson; '[EMAIL PROTECTED]'
 Subject: Re: Can browsers show text? I don't think so!
 This is text. Changing fonts is just flash. And frankly, no 
 modern browser has
 any trouble with anything from Latin, Cyrillic, Greek, or 
 CJK; Hebrew, Thai,
 Arabic and any script that doesn't shape or combine is 
 usually supported too.
 I'd say they show text just fine.

Are you saying that browsers do support all languages, as long as you exclude
all languages that are not supported? Well ;-)

My point is that it would be great if browsers supported all languages, no
matter how complicated the language is. Still, even with languages that do
not require shaping, you have problems. For example, a typical Western
Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite
language here) font on his machine. This is a problem that is solved with
CSS 2. Still, there is not any widespread support for web fonts in modern
browsers. I wonder why?

Being able to manually install fonts is not helpful in many cases either.
Mere mortals (you know, ordinary people that just want to surf the net)
don't know how to install fonts, nor would they know where to find fonts, or
even know that they exist. 

 
 (I do not consider solutions where you have
 to download a 10MB+ language package to see a page in a 
 foreign language.
 It's not a viable solution.)
 
 So you'd rather download the fonts every time you want to view a page,
 rather than just once? It's not like any one can't afford 
 10MB of space
 anymore.

You would not need 10MB to show a single web page, or even a full web site. 

The link below will take you to a web page that shows 500 Japanese
characters (courtesy of Morisawa Co Ltd) at a fairly large point size
(18px). The file size of the page and the font data is roughly 20kb (font:
20kb, page: ~1kb) if you are using a popular browser on Mac OS9 or Windows.
It would take a modem user ~3-7s to load this page. Size is not an issue
w.r.t. web fonts.
http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

This scales up very well too, because pages may share font resources. A
font with 2000 characters would be 80kb in this case, and would perhaps work
for hundreds of pages. Note also that I would only need to download this data
_once_ (it would stay in the browser cache), and it would be done
without any user interaction (i.e. no manual labour). You click on the link
and you see the text. It should not have to be more complicated than that. 
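
The load-time estimate above can be checked with a little arithmetic. The sketch below assumes that "kb" means kilobytes, that the line rates are the nominal modem speeds of the time, and that latency, protocol overhead and modem compression can be ignored; none of these assumptions come from the message itself.

```python
def transfer_seconds(size_kilobytes, line_rate_kbps):
    """Ideal transfer time for a payload at a nominal line rate."""
    bits = size_kilobytes * 1024 * 8
    return bits / (line_rate_kbps * 1000)


# Assumed nominal modem rates in kbit/s.
for rate in (28.8, 33.6, 56.0):
    page = transfer_seconds(21, rate)    # ~20 KB font + ~1 KB page
    font = transfer_seconds(80, rate)    # the 2000-character font
    print(f"{rate:>4} kbit/s: page {page:.1f} s, 80 KB font {font:.1f} s")
```

Under these assumptions the 21 KB page takes roughly 3 to 6 seconds, which agrees with the ~3-7s quoted above, while the 80 KB font takes somewhat over ten seconds on its first (and only) download.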

 
 So what we have today are applications called web browsers 
 that are very
 good at showing images, and animations. They are not very 
 good at showing
 text, other than unformatted English text.
 
 If you want to nitpick, they aren't that good at showing images;
 look at how modern browsers fail the PNG transparency test 
 one of these days.
 And for most animations, you have to download 10MB+ plugins.

OK, so they are good for nothing then... (just kidding ;-)

 
 Every web browser since the beginning of time has supported 
 at least bold,
 italics and headings. And HTML has become a very common 
 medium for formatted
 text, and not just for English. Yes, they have failures in 
 complex situations
 that haven't had much work in them; no, not every font has or 
 will have every
 language in it. And if you want Adobe Acrobat, you know where 
 to find it; web
 browsing was never intended to give full control over fonts 
 and display to
 the creator of the documents; it was intended to give control 
 over _meaning_.

I think that pretty much sums up people's expectations of web browsers, which
is a shame. There are no technical reasons why css/html4/xhtml cannot
produce every bit as high quality as any other page layout format.

Also, what is it with people and the lack of interest in using fonts? Do
people actually think that you only need one font, possibly in bold, italic
and regular style? Do they think that other languages, e.g. Chinese, do not
use styles? Text should be beautiful to look at too!



Regards,
em2 Solutions
Michael Jansson




RE: Can browsers show text? I don't think so!

2002-07-02 Thread Murray Sargent

Michael Jansson says:

 There are no technical reasons for why css/html4/xhtml can not produce
every bit as high quality
 as any other page layout format.


Sadly this is currently far from the case. HTML/CSS even including CSS3
is far from a professional document publishing format. It doesn't even
have center/right/decimal tabs and tab leaders, which virtually all WP
systems have. The list of DTP omissions goes on and on. Defining their
own XMLs is the direction that WP systems are going in for interchange.
XSLT can be used to translate between these XMLs to the extent that the
features are translatable. XHTML/CSS is only used as a fallback for
browsers.

Which isn't to say that XHTML/CSS isn't cool. It is. But currently it's
a weak DTP format at best.

Murray




Re: Can browsers show text? I don't think so!

2002-07-02 Thread Stefan Persson

- Original Message -
From: Michael Jansson [EMAIL PROTECTED]
To: 'David Starner' [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, July 02, 2002 11:16 PM
Subject: RE: Can browsers show text? I don't think so!

 http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

Let me see... And if you'd like to copy some text from that page and paste
it into some document...?

Stefan







Re: (long) Re: Chromatic font research

2002-07-02 Thread Kenneth Whistler

[*groans in the audience*]

I know, I know -- another contribution in the endless thread...

In re:
 
 The Respectfully Experiment 

 I used it as evidence that ideas about what should not be
 included in Unicode can change over a period of time as new scientific
 evidence is discovered.

Having been intimately involved in nearly all the decisions made
about what was included in Unicode over the last 13 years, and also
being formally trained as a scientist, I think I may be qualified
to dispute this conclusion.

Most of the changes in ideas about what can be included in Unicode
have been the result of two types of influence:

  A. The encountering of legacy practice in preexisting character
     encodings which had to be accommodated for interoperability
     reasons. This accounts for many, if not all, of the hinky little
     edge cases where Unicode appears to depart from its general
     principles for how to encode characters.

  B. The development of new processing requirements that required
     special kinds of encoded characters. This accounted for strange
     animals such as the bidi format controls, the BOM, the object
     replacement character, and the like.

There is a very narrow window of opportunity for *scientific*
evidence contributing to this -- namely, the result of graphological
analysis of previously poorly studied ancient or minority scripts,
which conceivably could turn up some obscure new principle of writing 
systems that would require Unicode to consider adding a new type of
character to accommodate it. But at this point, with Unicode having managed
to encode everything from Arabic to Mongolian to Han to Khmer..., I
consider it rather unlikely that scientific graphological study is going
to turn up many new fundamental principles here. As a scientific
*hypothesis* I think this surmise is proving to hold up rather well,
as our premier encoder of historic and minority scripts, Michael
Everson, has managed to successfully pull together encoding proposals,
based on current principles in Unicode, for dozens more scripts,
with little difficulty except for that inherent in extracting
information about rather poorly documented writing systems.

 it just seems to me that some
 extra ligature characters in the U+FB.. block would be useful.

Best practice, and near unanimous consensus in the Unicode Technical
Committee and among the correspondents on this list, would be
aligned with exactly the opposite opinion.

 In the
 light of this new evidence, I am wondering whether the decision not to
 encode any new ligatures in regular Unicode could possibly be looked at
 again.

As others have pointed out, The Respectfully Experiment did not
constitute new *evidence* of anything in this regard.

In any case, the UTC is quite unlikely to look at that decision again.

The exception that the UTC *has* considered recently was the Arabic
bismillah ligature, and the reason for doing so again was the result
of considering legacy practice. This thing exists in implemented
character encodings as a single encoded character. And furthermore,
it is used as a unitary symbol, in such a way that substituting out
an actual (long) string of Arabic letters and expecting the software
to ligate it correctly precisely in the contexts where it was being
used as a symbol, would place an unnecessary burden on both users and
on software implementations. That is *quite* different from the position
that claims that one, two, or dozens more Latin ligatures of two letters
need to be given standard Unicode encodings.

 if it cannot be done or would cause great anguish and
 arguments, well, that is that, forget it.

Good idea.

--Ken





Re: ZWJ and Latin Ligatures

2002-07-02 Thread John Cowan

Michael Everson scripsit:

 I have to confess I don't understand what you are talking about at 
 all. Get me them tools, John!

Ligature tables at a high level tell you things like "The glyph 'a'
and the glyph 'acute accent' should be merged to form the glyph
'aacute'."  Internally, though, it reads more like "A #502 followed
by a #397 should be replaced by a #929", where the numbers (or
names, in some contexts) *represent* the actual glyph outlines.
You could write "#202 followed by #999 becomes SHAVIAN PEEP glyph"
without there being any actual outlines for #202 or #999, but as
John says, if something actually called for a #202 to be imaged,
the rendering software would go belly-up.
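
As a rough illustration of the paragraph above, here is a minimal
Python sketch of a ligature table keyed on glyph numbers. It is not
how any real font format or rendering engine stores its tables; the
names LIGATURES, OUTLINES, apply_ligatures and render are invented
for the example, and glyph #1500 is a made-up number standing in for
the SHAVIAN PEEP glyph.

```python
# Pairs of glyph numbers mapped to the glyph number that replaces them.
# 502/397/929 and 202/999 follow the numbers in the message above;
# 1500 is a hypothetical ID for the SHAVIAN PEEP glyph.
LIGATURES = {
    (502, 397): 929,   # 'a' + 'acute accent' -> 'aacute'
    (202, 999): 1500,  # placeholder pair -> SHAVIAN PEEP
}

# Outlines exist only for some glyph numbers; 202 and 999 have none.
OUTLINES = {
    502: "outline of a",
    397: "outline of acute accent",
    929: "outline of aacute",
    1500: "outline of SHAVIAN PEEP",
}


def apply_ligatures(glyphs):
    """Replace each adjacent pair found in LIGATURES with its ligature glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def render(glyphs):
    """Look up an outline for every glyph; a missing outline raises KeyError."""
    return [OUTLINES[g] for g in glyphs]
```

With these tables, render(apply_ligatures([202, 999])) works even
though #202 and #999 have no outlines, because the pair is replaced
before anything is drawn. Calling render([202]) on its own raises
KeyError, which is the sketch's stand-in for the renderer going
belly-up.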

I hope this helps.

-- 
John Cowan  [EMAIL PROTECTED]
At times of peril or dubitation,  http://www.ccil.org/~cowan
Perform swift circular ambulation,  http://www.reutershealth.com
With loud and high-pitched ululation.




Re: RE: Can browsers show text? I don't think so!

2002-07-02 Thread starner

My point is that it would be great if browsers supported all languages, no
matter how complicated the language is. Still, even with languages that do
not require shaping, you have problems. For example, a typical Western
Mac/Win/Unix user may not have a Georgian/Chinese/(insert your favorite
language here) font on his machine. This is a problem that is solved with
CSS 2. Still, there is not any widespread support for web fonts in modern
browsers. I wonder why?

Most people have fonts to display their own language - they came with their
operating system. I'm a Unicode geek, and it doesn't really matter if I can't 
see whether it displays correctly or not. My friends couldn't care less.

The link below will take you to a web page that shows 500 Japanese
characters (courtesy of Morisawa Co Ltd) at a fairly large point size
(18px). 

And I can't change the point size, which sucks. I can install a language
pack, which will let me change the point size, and work for all pages,
whether or not they share the font resources. 

This scales up very well as well, because pages may share font resources. A
font with 2000 characters would be 80kb in this case, and would perhaps work
for hundreds of pages. 

And would fail the instant someone added a new character.
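
To make that failure concrete, here is a small Python sketch, under
the assumption that a shared subset font can be modelled as a plain
set of covered characters. The function name missing_characters and
the sample strings are invented for illustration; real web font
formats are not being described here.

```python
def missing_characters(text, font_coverage):
    """Return the non-space characters in text that the subset font lacks."""
    return sorted(
        {ch for ch in text if not ch.isspace() and ch not in font_coverage}
    )


# Hypothetical shared subset font, built to cover exactly these characters.
shared_font = set("日本語のテキスト")

# A page written with the original character set is fully covered.
print(missing_characters("日本語のテキスト", shared_font))

# A page that adds one new character is no longer fully covered.
print(missing_characters("日本語の新テキスト", shared_font))
```

The first call reports nothing missing; the second reports the one
added character, so every page sharing the font would need it rebuilt
or would show a gap.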

Also, what is it with people and the lack of interest in using fonts? Do
people actually think that you only need one font, possibly in bold, italic
and regular style? Do they think that other languages, e.g. Chinese, do not
use styles? Text should be beautiful to look at too!

But text should be readable first. Typographers will probably flame me for 
this, but for English, there are only two or three distinct readable fonts (with 
a thousand minor variations on the form.) I'd usually prefer to see my serif 
font, instead of some bitmap font someone else chose, as mine will be scalable 
and anti-aliased. Pictures work better than fonts for fancy titles, and are 
already used for that.





Re: Can browsers show text? I don't think so!

2002-07-02 Thread ろ〇〇〇〇 ろ〇〇〇


  http://fairy.em2-solutions.com/userfiles/morisawa/rll500.html

I loaded the beginning of that document, and it looks like just a bunch of 
characters from the start of a list of characters in "aiueo-jun" (Japanese 
"alphabetical order"). Not a real "document".

Is what you want something like what you can find at www.shodouka.com? Like 
if you are trying to view your message board on an American library 
computer and all you get is mojibake instead of a Japanese message. 
Shodouka will display images for text.

There is a Web site that can do furigana, kind of. (Its mistakes are 
sometimes funny, but if you are a student of Japanese trying to read 
Japanese Web pages, it can be a lot of help.) If you do a search on 
kids.goo.ne.jp, and choose "furigana ari", it will give you your furigana. 
I wonder if there is a romaji version we could use?



十一ちゃん真の愛は存在しないの？
