On Fri, Aug 04, 2006 at 09:04:43AM +0200, Werner LEMBERG wrote:
> > With my proposed context system it doesn't save but a few bytes
> > total in the font file since the context rules can be shared by all
> > the characters that need them.
> 
> Details, please.

I have an email with more details that I was preparing to send. I got
interrupted earlier, and then my screen session crashed, but I
recovered it and expect to send it soon.

> > > Having a string of input character codes, you apply the first
> > > lookup table, then you start again and process the next one, and
> > > so on until all lookup tables have been applied.
> >
> > Wow, what a horribly bad design. No wonder including arabic
> > initial/medial/final information would make the font so big.
> 
> Why do you think that it is bad design?

Let's say you have 40 characters that all need to change glyph when
they are preceded by any one of 30 other characters. This makes 30*40
substitution rules! If instead you could express the "preceded by one
of those 30 characters" as a single context definition, then you would
just need one context definition and 40 rules, one for each character
to tell the alternate glyph to use under this context.
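
To make the counting concrete, here is a rough sketch in Python. It is
not the actual font format or its encoding: the code points, the
alt_glyph helper and the select_* functions are all invented for
illustration. It only shows that both schemes pick the same glyph
while needing very different numbers of rules.

```python
# Hypothetical sketch: every name and code point below is made up for
# illustration and does not come from any real font format.

# 30 "trigger" characters and 40 characters that change shape after them.
triggers = [0x1000 + i for i in range(30)]
targets = [0x2000 + i for i in range(40)]

def alt_glyph(ch):
    # Stand-in for "the alternate glyph of ch".
    return ch + 0x100

# Pairwise scheme: one rule per (preceding character, character) pair.
pair_rules = {(p, c): alt_glyph(c) for p in triggers for c in targets}

# Context scheme: one shared context definition plus one rule per character.
context = frozenset(triggers)
context_rules = {c: alt_glyph(c) for c in targets}

def select_pairwise(prev, ch):
    # Look up the exact (previous, current) pair; fall back to the plain glyph.
    return pair_rules.get((prev, ch), ch)

def select_context(prev, ch):
    # Test the shared context once, then look up the per-character rule.
    if prev in context and ch in context_rules:
        return context_rules[ch]
    return ch

print(len(pair_rules))         # 1200 pairwise rules
print(1 + len(context_rules))  # 41: one context definition plus 40 rules
```

The gap only grows if several groups of characters share the same
context, since the context definition is stored once.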

> How would you activate and
> deactivate typographical features?  This goes far beyond `console
> fonts', so it is of course more `complicated' than you expect.

Yes it's outside the scope of what I'm considering; however it might
be worth making a spec that could be used for high quality scalable
fonts too.

> However, it works reasonably well, and no one asks you to use more than
> a single lookup.

What do you mean by this "single lookup"?

BTW another issue of the substitution rules is that, as far as I can
tell, they can delete or insert extra glyphs arbitrarily. From a
character cell perspective that's very bad, since it makes it possible
that the font represents things that cannot be displayed consistently
in the cells.
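
A small Python sketch of the concern, under the simplifying assumption
that every glyph occupies exactly one cell (wide characters are
ignored). The two rule shapes and the function names are hypothetical
and not taken from any real font format.

```python
# Hypothetical sketch: rule shapes and names are illustrative only.
# Simplifying assumption: one glyph occupies exactly one character cell.

def apply_general(glyphs, pattern, replacement):
    """Replace each occurrence of the list `pattern` with the list `replacement`.

    The output may be shorter or longer than the input, so the number of
    cells needed is no longer determined by the character string alone.
    """
    out, i, n = [], 0, len(pattern)
    while i < len(glyphs):
        if n and glyphs[i:i + n] == pattern:
            out.extend(replacement)
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out

def apply_one_to_one(glyphs, mapping):
    """Replace glyphs individually; the output always has the input's length."""
    return [mapping.get(g, g) for g in glyphs]

line = ["f", "i", "x"]
ligated = apply_general(line, ["f", "i"], ["fi"])  # 3 glyphs become 2
swapped = apply_one_to_one(line, {"i": "i.alt"})   # still 3 glyphs

print(len(line), len(ligated), len(swapped))
```

A rule form that can only swap one glyph for another cannot break the
cell count, which is the property a character cell display needs.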

> You might compare this with AAT from Apple (the `morx' table as
> documented in the URL I posted in a previous mail).  This is something
> similar but far more complicated to debug since it uses automatons
> which can have almost infinite states.

Yes I read a little bit about it and I agree. In a way it's more like
what I want, but overly complex. My proposed bytecode (or rather
vlc-code) system intentionally lacks all constructs that can lead to
loops and near-infinite state spaces.

> Aah, you will receive the Nobel prize for this.  What you've done is
> apparently better than man-years of works done by font experts.

Font experts are not machine experts.

> Be serious!  I want to see not only ideas but a complete
> specification.  THEN I believe you.

You probably won't be satisfied yet but I'll post anyway so you can
see how it's progressing.

> And there is still the question
> who is going to implement this.

Anyone can, since the spec is trivial to implement. With everything
but the context-matcher implemented so far, my implementation compiles
to 528 bytes of i386 code.

> > GPOS is undesirable for character cell glyphs.
> 
> Not at all.  It handles accent stacking, and it can be even used for
> fixed-width fonts (which simplifies the tables enormously).

Well, it does reduce the number of glyph variants needed for accent
marks, but at the expense of allowing the font to specify something
that cannot be represented in the character cells, and of complicating
the rendering implementation.

> > > > Mongolian can be and is written horizontally as well.
> > >
> > > Using Cyrillic, yes, but not the traditional script, AFAIK.
> >
> > No, in the traditional script.
> 
> I stand corrected, I haven't known that.  Can you give a URL?

I didn't find any better references searching Google than you would.
It seems to be a new invention, and the glyphs are rotated 90 degrees
from their vertical presentation in order to combine nicely. I don't
know what the people's attitude towards this style is.

> > Ask yourself this: what would a speaker of Urdu do if they needed to
> > write a message and the only paper they had was barely tall enough
> > for one handwritten letter. If your answer is "write it all on one
> > line" or even "write it essentially on one line with each word
> > slanted slightly diagonal" then there's absolutely no reason the
> > same can't be done on a computer terminal.
> 
> Uh, oh, I can also write German with uppercase letters only if there
> isn't enough room for descenders.  Is this a solution?

Very different issue. The Urdu system requires unbounded height for a
single line of text. Surely you can shrink your text by a few percent
to fit descenders, but can you shrink it by an unbounded factor? :)

> > [...] I do believe very strongly that all people (including English
> > speakers) should tolerate having their language displayed in a form
> > that respects both the intended look of the script and the nature of
> > the medium on which it's displayed.
> 
> Of course!  I've asked you already *which scripts* you want to
> support, and you still haven't answered.

In terms of what I "want" (for my own use): Latin, Tibetan, Japanese,
and mathematical notations.

In terms of what I demand that my system support, the answer should be
"everything", modulo strange line flow conventions that are
incompatible with the notion of a character cell terminal. At this
time I don't know enough to make a good bidi terminal implementation,
so we'll say the initial release of uuterm will probably only support
l2r scripts properly. However this will not be a limitation of the
glyph system, just a limitation of my terminal emulation which will be
remedied either as I learn more about bidi, or as someone else
volunteers to write the support. :)

> Just say that the normal
> Urdu style of writing isn't going to be supported; they have to use
> standard Arabic writing (using their extended character set, of
> course).

OK, we're just arguing over definitions. I say Urdu script is
supported, but Urdu line layout style is not. You can say this however
you like.

Rich


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
