
Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-05 Thread DougEwell2


I hasten to add:

> UTF-8 and UTF-32, at least, already have the architecture
> to represent 2^31 and 2^32 code points, respectively.  The definitions
> would simply have to be changed to make the additional code points legal.
>
> Only UTF-16 would truly need to be redesigned, and that has already been 
> proposed.

None of this is actually going to happen, of course.  Unicode and 10646 are 
committed to staying with 17 planes.  I was just pointing out that certain 
individuals had made informal proposals to extend the code space.

-Doug Ewell
 Fullerton, California

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-05 Thread DougEwell2

In a message dated 2002-01-02 5:05:23 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> There are worse things than this: what if someone discovers a script with
> more than 1,114,111 characters? Back to the drawing board to redesign all
> the UTF's!

Not all of them.  UTF-8 and UTF-32, at least, already have the architecture 
to represent 2^31 and 2^32 code points, respectively.  The definitions would 
simply have to be changed to make the additional code points legal.

Only UTF-16 would truly need to be redesigned, and that has already been 
proposed.  For example, Masahiko Maedera once proposed a "UTF-16x" in which 
code points in the U+EExxx block were designated as "super surrogates."  
Three of these "super surrogates," or six 16-bit words, would be combined to 
represent code points beyond plane 17.  (This was back in the days when some 
people felt that a great and crippling schism existed between Unicode and ISO 
10646 because the former disallowed such code points and the latter allowed 
them.)
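
The figures in this exchange can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not anything from the original message: the variable names are illustrative, and the UTF-8 figure assumes the original six-byte design (a lead byte carrying 1 payload bit plus five continuation bytes carrying 6 bits each), which is what the 2^31 claim refers to.

```python
# Code space sizes discussed in this thread (illustrative names only).
UNICODE_MAX = 0x10FFFF           # highest code point with 17 planes (0..16)

planes = 17
code_points = planes * 0x10000   # code points 0..UNICODE_MAX, surrogates included

# UTF-16: the BMP plus everything reachable through one surrogate pair
# (1024 high surrogates x 1024 low surrogates).
utf16_reach = 0x10000 + 1024 * 1024

# Original six-byte UTF-8 (assumption: lead byte 1111110x + 5 continuation bytes).
utf8_payload_bits = 1 + 5 * 6
utf8_reach = 2 ** utf8_payload_bits

# UTF-32: one 32-bit code unit per code point.
utf32_reach = 2 ** 32

print(f"17 planes       : {code_points:,} code points, max U+{UNICODE_MAX:X}")
print(f"UTF-16 reach    : {utf16_reach:,}")
print(f"UTF-8 (6 bytes) : {utf8_reach:,}")
print(f"UTF-32          : {utf32_reach:,}")
```

The UTF-16 reach equals the 17-plane code space exactly, which is why it is the one form that would need a redesign to go further.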

-Doug Ewell
 Fullerton, California

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread Michael Everson


At 09:38 -0500 2002-01-03, John Cowan wrote:

>This leads to an interesting, if so far theoretical, Unicode question:
>how to encode abjads and abugidas that have vowel signs which are
>pronounced *before* the base consonant.  Two Unicode principles,
>logical order and base-before-combining, are thus put into conflict.
>
>In (Feanorian) Tengwar itself, the reading order is actually
>language-dependent: thus "Quenya" (a Quenya word) is written
>QU-e-N-y-a (where caps are base, smalls are combining), but
>"Sindarin" (a Sindarin word) would be "S-N-i-D-R-a-N-i", if written with
>base-before-combining, or "S-i-N-D-a-R-i-N" if written with logical order,
>in which case the default grapheme clusters have to be broken up using
>complex rendering code in order to get i over N and a over R.

Did you not read my draft paper proposing the solution for this 
feature of this script?
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread John Cowan

I wrote:

> Vowel marks appearing to the left of the
> consonants are pronounced before them; those to the right, after them.

This leads to an interesting, if so far theoretical, Unicode question:
how to encode abjads and abugidas that have vowel signs which are
pronounced *before* the base consonant.  Two Unicode principles,
logical order and base-before-combining, are thus put into conflict.

In (Feanorian) Tengwar itself, the reading order is actually
language-dependent: thus "Quenya" (a Quenya word) is written
QU-e-N-y-a (where caps are base, smalls are combining), but
"Sindarin" (a Sindarin word) would be "S-N-i-D-R-a-N-i", if written with
base-before-combining, or "S-i-N-D-a-R-i-N" if written with logical order,
in which case the default grapheme clusters have to be broken up using
complex rendering code in order to get i over N and a over R.
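
The two orderings for "Sindarin" can be derived from each other mechanically. Below is a minimal Python sketch of that conversion, using the same notation as above (capitals are base letters, lowercase letters are combining vowel signs). The function name and the `vowels_precede_base` flag are hypothetical, introduced only for illustration; this is not a proposed encoding or rendering algorithm.

```python
def to_base_first(tokens, vowels_precede_base):
    """Convert logical (reading) order to base-before-combining order.

    tokens: list of strings; uppercase = base letter, lowercase = vowel sign.
    vowels_precede_base: True for a mode like Sindarin, where a vowel sign is
    read before the consonant that carries it; False for a mode like Quenya,
    where logical order is already base-before-combining.
    """
    if not vowels_precede_base:
        return list(tokens)
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.islower() and nxt is not None and nxt.isupper():
            # The vowel is carried by the following consonant: emit base first.
            out.extend([nxt, tok])
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


sindarin = "S-i-N-D-a-R-i-N".split("-")
quenya = "QU-e-N-y-a".split("-")
print("-".join(to_base_first(sindarin, vowels_precede_base=True)))
print("-".join(to_base_first(quenya, vowels_precede_base=False)))
```

The sketch makes the cost visible: with base-before-combining storage, the stored order for Sindarin no longer matches the reading order, and something has to know the language to undo it.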

The problem could be sidestepped with a grapheme-cluster encoding such as
is used for Ethiopic, but the feel is very different: Ethiopic vowel
signs are normally treated as part of the letter, whereas Tengwar
vowel signs are more like typical abjad signs: partly optional
indications of "colorings" to the fundamental consonant structure.

Unicode tribal elders are invited to mention which of the two conflicting
principles they reckon to be the more important.

-- 
John Cowan   http://www.ccil.org/~cowan  [EMAIL PROTECTED]
Please leave your values|   Check your assumptions.  In fact,
   at the front desk.   |  check your assumptions at the door.
 --sign in Paris hotel  |--Miles Vorkosigan

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread John Cowan


Patrick Andries scripsit:

> This is the time for an aspiring J. R. R. Tolkien to leave his mark in 
> the Unicode saga by adopting a new strictly vertical script...à la Tengwar.

JRRT actually did create such a vertical script, which was used in
the Blessed Realm before Feanor got around to creating the Tengwar
as we know them today: the Sarati of Ruumil.  This is a TTB LTR
abjad, like Mongolian.  Vowel marks appearing to the left of the
consonants are pronounced before them; those to the right, after them.

http://user.tninet.se/~xof995c/sarati.htm

-- 
John Cowan   http://www.ccil.org/~cowan  [EMAIL PROTECTED]
Please leave your values|   Check your assumptions.  In fact,
   at the front desk.   |  check your assumptions at the door.
 --sign in Paris hotel  |--Miles Vorkosigan

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread Michael Everson


At 16:51 -0800 2002-01-02, Kenneth Whistler wrote:
>John Wilcock wrote:
>
>> All *known* vertical scripts! What happens if someone discovers a
>> hitherto-unknown vertical script that is never written horizontally?

It would be unthinkable that merchants using such a script wouldn't 
have horizontal and vertical variants for shop signs and neon.

And crossword puzzles.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-03 Thread Michael Everson


At 20:14 -0500 2002-01-02, Patrick Andries wrote:

>This is the time for an aspiring J. R. R. Tolkien to leave his mark 
>in the Unicode saga by adopting a new strictly vertical script...à 
>la Tengwar.

That would be Sarati. Which I have already proposed for addition to 
the SMP, though for now it is waiting in the wings.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Patrick Andries

Kenneth Whistler wrote:

>
>Also, you'd have to go pretty far out to find a "hitherto-unknown
>vertical script" that has escaped the eagle eyes of the Unicode
>Roadmap committee. See, for example:
>
>http://www.unicode.org/roadmaps/smp-3-1.html
>

This is the time for an aspiring J. R. R. Tolkien to leave his mark in 
the Unicode saga by adopting a new strictly vertical script...à la Tengwar.

He will, of course, first have to convince his editor...

Best wishes for 2002

Unicode en français
http://hapax.iquebec.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Kenneth Whistler

John Wilcock wrote:

> All *known* vertical scripts! What happens if someone discovers a
> hitherto-unknown vertical script that is never written horizontally?

I predict that the people who want to write about it will quickly
render it LTR horizontally, to match the metadirectionality of
the script they use to write about it.

Scholars already regularly turn RTL epigraphy into LTR when they
want to cite it in text (other than in facsimiles), to avoid the
bidi problem.

Also, you'd have to go pretty far out to find a "hitherto-unknown
vertical script" that has escaped the eagle eyes of the Unicode
Roadmap committee. See, for example:

http://www.unicode.org/roadmaps/smp-3-1.html

--Ken

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Kenneth Whistler


Sampo Syreeni wrote a fine FAQ answer about rendering directionality
and then asked:

> BTW, something akin to the above should really go in a FAQ. Is there
> anything resembling a Unicode FAQ in existence, anywhere?

Well, you could start on the Unicode home page http://www.unicode.org/
and click on the "FAQ" link. ;-) There's even a section in the FAQ
on "Writing Directions", to which a distilled-down version of
some of this discussion might be a fine addition.

--Ken

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Tex Texin


Forget about ancient and dead scripts. What about the future, when we
start communicating with extra-terrestrials and have to start encoding
all the other scripts in the galaxy! utf-googoolplex!

;-)

(And don't nobody bring up klingon...)

And so the new year begins on the Unicode list
tex


Marco Cimarosti wrote:
> 
> John Wilcock wrote:
> > All *known* vertical scripts! What happens if someone discovers a
> > hitherto-unknown vertical script that is never written horizontally?
> 
> There are worse things than this: what if someone discovers a script with
> more than 1,114,111 characters? Back to the drawing board to redesign all
> the UTF's!
> 
> :-)
> _ Marco

-- 
-
Tex Texin                    Director, International Business
mailto:[EMAIL PROTECTED]     Tel: +1-781-280-4271
the Progress Company         Fax: +1-781-280-4655
-
For a compelling demonstration for Unicode:
http://www.geocities.com/i18nguy/unicode-example.html

RE: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Marco Cimarosti

John Wilcock wrote:
> All *known* vertical scripts! What happens if someone discovers a
> hitherto-unknown vertical script that is never written horizontally?

There are worse things than this: what if someone discovers a script with
more than 1,114,111 characters? Back to the drawing board to redesign all
the UTF's!

:-)
_ Marco

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread John Wilcock


On Wed, 2 Jan 2002 11:27:02 +0100, Marco Cimarosti wrote:
> Because all vertical scripts (CJK and Mongolian) can also be written
> horizontally, whereas modern right-to-left scripts cannot be written
> left-to-right.

All *known* vertical scripts! What happens if someone discovers a
hitherto-unknown vertical script that is never written horizontally?

John.

-- 
-- Over 1600 webcams from ski resorts around the world - http://www.snoweye.com/
-- Translate your technical documents and web pages- http://www.tradoc.fr/

RE: Vertical scripts (was: Tategaki (was: Re: Updated...))

2002-01-02 Thread Marco Cimarosti

Doug Ewell wrote:
> TUS 3.0 states (p. 24): "In contrast to the bidirectional 
> case, the choice to lay out text either vertically or
> horizontally is treated as a formatting style."
> [...] why should overrides of default horizontal
> directionality be a plain-text issue but overrides of
> default vertical directionality be a higher-level 
> "formatting style" issue?

Because all vertical scripts (CJK and Mongolian) can also be written
horizontally, whereas modern right-to-left scripts cannot be written
left-to-right.

Also, all horizontal scripts, when embedded in Far East text, may be written
vertically by rotating them 90 degrees (clockwise for LTR scripts,
counterclockwise for RTL scripts).

So you can happily define a system-level vertical/horizontal preference, and
use it blindly for plain text in any kind of script.

_ Marco

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-31 Thread Asmus Freytag


At 12:22 PM 12/31/01 -0500, Tex Texin wrote:
>I was fooled by that earlier in the year as well. The links to the other
>pages should be at the top of the web page to highlight that the page is
>a partial list and to make it easy to reference the other pages. Most
>people will not scroll to the bottom of the page to find the other
>links.

That's in the plan for the 3.2 upgrade I'm working on.

A./

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-31 Thread Tex Texin


I was fooled by that earlier in the year as well. The links to the other
pages should be at the top of the web page to highlight that the page is
a partial list and to make it easy to reference the other pages. Most
people will not scroll to the bottom of the page to find the other
links.


Michael Everson wrote:
stuff deleted...
> >Did you miss these?
> 
> Bloody hell. Yes, I missed them, because I assumed that the
> charindex.html indexed all the characters. It does NOT! It indexes
> A-D. Now I assumed that when it loaded I could just command-F and
> find the text. So I did not scroll down the list. Therefore:
> 
> I suggest that the Title of this document be changed to:
> 
> Unicode 3.0.0 Character Name Index A-D
> 
> and the other two (charindex2.html and charindex3.html) to
> 
> Unicode 3.0.0 Character Name Index E-N
> and
> Unicode 3.0.0 Character Name Index O-Z
> 
> --
> Michael Everson *** Everson Typography *** http://www.evertype.com

-- 
-
Tex Texin                    Director, International Business
mailto:[EMAIL PROTECTED]     Tel: +1-781-280-4271
the Progress Company         Fax: +1-781-280-4655
-
For a compelling demonstration for Unicode:
http://www.geocities.com/i18nguy/unicode-example.html

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-31 Thread Michael Everson


At 20:31 -0800 2001-12-30, Asmus Freytag wrote:
>At 12:50 PM 12/30/01 +, Michael Everson wrote:
>>At 18:31 -0800 2001-12-29, Asmus Freytag wrote:
>>>
>>>Please see
>>>
>>>http://www.unicode.org/charts/charindex.html
>>
>>That's not very helpful, Asmus. I went there and tried searching 
>>"override", "left-to-right", and "left to right" and nothing was 
>>found.
>
>Quoting right from the file:
>
>these entries should have been what you were looking for:
>
>LEFT-TO-RIGHT OVERRIDE 202D
>OVERRIDE, LEFT-TO-RIGHT 202D
>
>and even:
>OVERRIDE, RIGHT-TO-LEFT 202E
>
>Did you miss these?

Bloody hell. Yes, I missed them, because I assumed that the 
charindex.html indexed all the characters. It does NOT! It indexes 
A-D. Now I assumed that when it loaded I could just command-F and 
find the text. So I did not scroll down the list. Therefore:

I suggest that the Title of this document be changed to:

Unicode 3.0.0 Character Name Index A-D

and the other two (charindex2.html and charindex3.html) to

Unicode 3.0.0 Character Name Index E-N
and
Unicode 3.0.0 Character Name Index O-Z


-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Tex Texin


Thanks for the explanation Asmus.
tex

Asmus Freytag wrote:
> 
> At 02:33 PM 12/30/01 -0500, Tex Texin wrote:
> >It is a bit inconsistent and therefore confusing.
> >
> >I searched for "bidirectional" which immediately pointed me at the
> >general punctuation pages in a pdf file.
> >Searching for "bidirectional" in that file turns up empty.
> 
> This is one of the few cases of an index entry that has no corresponding
> line in the nameslist file. Usually the index entry is derived directly
> from the character names and aliases, or the text of the block names and
> sub headers. That's the reason you couldn't find "bidirectional" in the pdf
> file. The subheader in this case is just "Formatting characters" and that's
> not very specific.
> 
> >If you search
> >for left-to-right, right-to-left, override, or embed, there you do get
> >to the characters. However a saving grace is that when you are first
> >pointed at the general punctuation file, the character code 202A is
> >mentioned, so if you notice that you can go right to the character
> >range.
> 
> I'll make sure that is clearly worded in the instructions.
> 
> >Maybe the initial index needs to be more comprehensive. It is usually a
> >difficult task for any large book to get right. However, tracking the
> >web queries might help improve it over time...
> 
> The problem you encountered was one where the index is already more
> comprehensive and detailed than the nameslist. ;-)
> 
> One could monkey with the nameslist, adding the subheader for the
> bidirectional controls, but then we would pick up a number of one-character
> ranges with subheaders, which becomes awkward in itself.
> 
> A./
> 
> PS: I'm in the process of updating the HTML files for the index to match
> the contents of the Index-3.2.0dnn.txt file in the BETA directory. That
> file covers the new 3.2 character names etc. but does not pick up new or
> revised aliases and subheaders in the existing repertoire...

-- 
-
Tex Texin                    Director, International Business
mailto:[EMAIL PROTECTED]     Tel: +1-781-280-4271
the Progress Company         Fax: +1-781-280-4655
-
For a compelling demonstration for Unicode:
http://www.geocities.com/i18nguy/unicode-example.html

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Asmus Freytag

At 02:33 PM 12/30/01 -0500, Tex Texin wrote:
>It is a bit inconsistent and therefore confusing.
>
>I searched for "bidirectional" which immediately pointed me at the
>general punctuation pages in a pdf file.
>Searching for "bidirectional" in that file turns up empty.

This is one of the few cases of an index entry that has no corresponding 
line in the nameslist file. Usually the index entry is derived directly 
from the character names and aliases, or the text of the block names and 
sub headers. That's the reason you couldn't find "bidirectional" in the pdf 
file. The subheader in this case is just "Formatting characters" and that's 
not very specific.

>If you search
>for left-to-right, right-to-left, override, or embed, there you do get
>to the characters. However a saving grace is that when you are first
>pointed at the general punctuation file, the character code 202A is
>mentioned, so if you notice that you can go right to the character
>range.

I'll make sure that is clearly worded in the instructions.

>Maybe the initial index needs to be more comprehensive. It is usually a
>difficult task for any large book to get right. However, tracking the
>web queries might help improve it over time...

The problem you encountered was one where the index is already more 
comprehensive and detailed than the nameslist. ;-)

One could monkey with the nameslist, adding the subheader for the 
bidirectional controls, but then we would pick up a number of one-character 
ranges with subheaders, which becomes awkward in itself.

A./

PS: I'm in the process of updating the HTML files for the index to match 
the contents of the Index-3.2.0dnn.txt file in the BETA directory. That 
file covers the new 3.2 character names etc. but does not pick up new or 
revised aliases and subheaders in the existing repertoire...

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Asmus Freytag


At 12:50 PM 12/30/01 +, Michael Everson wrote:
>At 18:31 -0800 2001-12-29, Asmus Freytag wrote:
>>
>>Please see
>>
>>http://www.unicode.org/charts/charindex.html
>
>That's not very helpful, Asmus. I went there and tried searching 
>"override", "left-to-right", and "left to right" and nothing was found.

Quoting right from the file:

these entries should have been what you were looking for:

LEFT-TO-RIGHT OVERRIDE 202D
OVERRIDE, LEFT-TO-RIGHT 202D

and even:
OVERRIDE, RIGHT-TO-LEFT 202E

Did you miss these?
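
The same index lookup can be reproduced offline. This is a minimal sketch, assuming a Python interpreter whose standard `unicodedata` module is available (it ships the Unicode Character Database); the dictionary name `overrides` is illustrative only.

```python
import unicodedata

# Look up the two override characters by their character names,
# the same direction of lookup as the character name index.
names = ("LEFT-TO-RIGHT OVERRIDE", "RIGHT-TO-LEFT OVERRIDE")
overrides = {name: unicodedata.lookup(name) for name in names}

for name, char in overrides.items():
    print(f"{name} {ord(char):04X}")
```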

A./

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Tex Texin

It is a bit inconsistent and therefore confusing.

I searched for "bidirectional" which immediately pointed me at the
general punctuation pages in a pdf file.
Searching for "bidirectional" in that file turns up empty. If you search
for left-to-right, right-to-left, override, or embed, there you do get
to the characters. However a saving grace is that when you are first
pointed at the general punctuation file, the character code 202A is
mentioned, so if you notice that you can go right to the character
range.

Maybe the initial index needs to be more comprehensive. It is usually a
difficult task for any large book to get right. However, tracking the
web queries might help improve it over time...

tex

Michael Everson wrote:
> 
> At 18:31 -0800 2001-12-29, Asmus Freytag wrote:
> >At 12:07 PM 12/29/01 +0100, Stefan Persson wrote:
> >> > Seeing that Unicode already has left-to-right and right-to-left override
> >> > characters, I wonder if a top-to-bottom override character might also be
> >> > reasonable.
> >>
> >>Which are the code points for these characters?
> >
> >Please see
> >
> >http://www.unicode.org/charts/charindex.html
> 
> That's not very helpful, Asmus. I went there and tried searching
> "override", "left-to-right", and "left to right" and nothing was
> found.
> --
> Michael Everson *** Everson Typography *** http://www.evertype.com

-- 
-
Tex Texin                    Director, International Business
mailto:[EMAIL PROTECTED]     Tel: +1-781-280-4271
the Progress Company         Fax: +1-781-280-4655
-
For a compelling demonstration for Unicode:
http://www.geocities.com/i18nguy/unicode-example.html

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Michael Everson


At 18:31 -0800 2001-12-29, Asmus Freytag wrote:
>At 12:07 PM 12/29/01 +0100, Stefan Persson wrote:
>> > Seeing that Unicode already has left-to-right and right-to-left override
>> > characters, I wonder if a top-to-bottom override character might also be
>> > reasonable.
>>
>>Which are the code points for these characters?
>
>Please see
>
>http://www.unicode.org/charts/charindex.html

That's not very helpful, Asmus. I went there and tried searching 
"override", "left-to-right", and "left to right" and nothing was 
found.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Asmus Freytag


At 12:07 PM 12/29/01 +0100, Stefan Persson wrote:
> > Seeing that Unicode already has left-to-right and right-to-left override
> > characters, I wonder if a top-to-bottom override character might also be
> > reasonable.
>
>Which are the code points for these characters?

Please see

http://www.unicode.org/charts/charindex.html

A./

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Curtis Clark


At 04:00 AM 12/29/01, Michael Everson wrote:
>When written in manuscripts and on computers, Ogham is written as Latin 
>is. When inscribed on stone, it is written bottom-to-top, along the top of 
>the stone, and then down to the bottom on the other side. I don't believe 
>that there are any examples of multiple-line Ogham lapidary text.

One could well argue, too, that when computer-controlled devices for 
cutting ogham stones become common, higher-level protocols will be 
necessary for proper placement of the glyphs.


-- 
Curtis Clark  http://www.csupomona.edu/~jcclark/
Mockingbird Font Works  http://www.mockfont.com/

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Sampo Syreeni

On Sat, 29 Dec 2001 [EMAIL PROTECTED] wrote:

>Tex's example may or may not be realistic -- I have no way of knowing --
>but in suggesting a top-to-bottom directional override, I had hoped it
>would be possible to represent a run of text such as Tex describes
>without resorting to the infamous "higher protocol."

But it is. Unicode just does not take a stand on how it should be
formatted. See below.

>This may seem arbitrary to some; why should overrides of default
>horizontal directionality be a plain-text issue but overrides of default
>vertical directionality be a higher-level "formatting style" issue? I
>hope this discussion can shed some light on this question, and possibly
>help me see what I may be missing.

I think this has to do with the way people conceive the term "plaintext"
-- anything beyond a simple line (or column) based flow layout will likely
be thought of as "rich" instead. The reason is both historical and
practical. Text is laid out like this in most cultures, and early
printing/computer/typewriter technology followed suit. The matter of mixed
writing directions is a relatively new one, and so isn't really covered by
the concept of "plaintext".

The practical reason is that comprehensive layout of fully free direction
text is really difficult, if not impossible, whereas writing systems with
identical line progression directions are more or less compatible, using a
simplish algorithm (Unicode BiDi). If you look at the way text is normally
displayed on 2D media, it's printed in a unidirectional stream and then
chopped into lines at sheet edge. As long as the lines progress in the
same direction, you can always manipulate the order of the symbols within
the stream to get more or less correct display of mixed script
directionalities. (Yes, line breaking and deeply nested BiDi levels are
still troublesome.) This way, lr-tb is sorta compatible with rl-tb.
There are of course three more pairs, not counting boustrophedon and the
likes, but AFAIK this is the most common combination.

It's also where the ease stops. If you try to mix opposite line
progression directions, you will end up with something like the Unicode
BiDi algo, only applied at the paragraph level. That soon becomes
unreadable, and makes for really lousy APIs. (Even BiDi is difficult, as
one usually needs to render entire paragraphs at a time.) Mixing vertical
and horizontal writing modes is even more complicated since you cannot
think of the text as a directional, chopped-into-lines stream, anymore.
You *can* use all sorts of funky heuristics, but keeping the text both
readable and "plain" is pretty much impossible. (If you don't believe
that, think about how you would format a string of 1000 lr-tb, 100 tb-lr,
100 rl-bt and 1000 bt-lr characters. This is not a realistic example, of
course, but illustrates the general point.)

Now, there are many ways to cope with simplified variations of the theme.
One is to rotate nested characters of foreign directionality so that the
character progression direction for all the scripts present remains the
same, no matter what the script. E.g. XSL-FO documentation gives a number
of examples of this approach. Another is to force the character
progression direction to agree between scripts, without rotation. This
only works when characters are graphically separate, like they are in the
Latin script or scripts based on Han ideographs. Top-to-bottom Latin
within Japanese is a good example. (It also illustrates the effects on
readability of messing with the natural directionality of text.) You can
also print short spans of foreign text in its natural direction, within a
line of text of differing native directionality. Metric units, printed in
Latin within tb-rl traditional Japanese, are probably the most common
case. I'm sure that people on this list could cite countless weirder
examples.

The point is, all such solutions are for special cases. They do not solve
the problem of how to fit longer, nested spans with arbitrary
directionality on a page without in some cases making the text as a whole
illegible and/or unaesthetic. Hence, it's better to handle the special
cases as what they are, instead of bringing them all into Unicode and
forcing every Unicode compatible application to incorporate a full page
layout engine. I think this is the ultimate reason why TUS 3.0 leaves this
stuff to those "higher level protocols".

We might in fact say that the Unicode Standard has two completely separate
parts. The first is the logical encoding of any character based script as
a stream of character codes, the second is an actual 2D, line based
rendering of the encoding for the very special case where two scripts of
identical line progression direction are mixed. Anything beyond this could
well be said to be beyond the scope of TUS. We might indeed go as far as
to say that certain combinations of scripts which *can* be encoded in
Unicode, *cannot* actually be consistently rendered on 2D graphical media.

(Afte

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Michael Everson

At 02:47 -0500 2001-12-29, [EMAIL PROTECTED] wrote:

>Actually, there is a more serious problem involved with vertical directional
>overrides: They would force the Unicode plain-text mechanism to become aware
>of both vertical directionality and directional priority.  This sounds
>obvious, but in fact there are not two, but THREE issues involved with text
>directionality:
>
>1.  Horizontal, that is, left-to-right (LTR) versus right-to-left (RTL).
>2.  Vertical, that is, top-to-bottom (TTB) versus bottom-to-top (BTT).
>3.  Priority of direction (e.g. (LTR, TTB) versus (TTB, LTR)).

There are more complex aspects of layout that might apply to Egyptian 
and Mayan.

> Ogham is either (LTR, TTB) or (BTT, ???).

When written in manuscripts and on computers, Ogham is written as 
Latin is. When inscribed on stone, it is written bottom-to-top, along 
the top of the stone, and then down to the bottom on the other side. 
I don't believe that there are any examples of multiple-line Ogham 
lapidary text. By analogy with the manuscript tradition, I would 
recommend (BTT, LTR) for Ogham vertical columnar display.

>Unicode characters have a default directionality, but both this and the
>override mechanism cover only the horizontal aspect, not the vertical aspect
>or the priority of one over the other.  Thus, Mongolian characters are
>assigned the same directionality code as Latin ("L") even though the TTB
>directionality takes precedence over the LTR, the opposite of Latin.

Not in mixed Latin/Mongolian text. Mongolians do interesting things 
too with Latin words in predominantly Mongolian text. But it seems 
that the whole thing is done by rotating the whole text field.
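
The point about directionality codes is easy to verify against the character database. A minimal sketch, assuming Python's standard `unicodedata` module; the sample characters are arbitrary picks (one letter per script), and Hebrew is included only as a contrast case that does not appear in the message above.

```python
import unicodedata

# One representative letter per script (arbitrary choices for illustration).
samples = {
    "Latin": 0x0041,      # LATIN CAPITAL LETTER A
    "Mongolian": 0x1820,  # a Mongolian letter
    "Ogham": 0x1681,      # an Ogham letter
    "Hebrew": 0x05D0,     # contrast case, not discussed above
}

# Bidirectional class as recorded in the Unicode Character Database.
bidi_class = {
    script: unicodedata.bidirectional(chr(cp)) for script, cp in samples.items()
}

for script, cls in bidi_class.items():
    print(f"{script:10} U+{samples[script]:04X} bidi class {cls}")
```

The property only distinguishes horizontal behaviour: Mongolian and Ogham report the same class as Latin, and nothing in it records a vertical default.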

>And there is no plain-text way to indicate the alternative directionality of
>Ogham or Han.

I think it is a question of DTP layout for Ogham, at least.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Michael Everson

At 12:07 +0100 2001-12-29, Stefan Persson wrote:

>Someone said that Unicode contains switches for LTR & RTL. By adding
>switches for TTB and BTT this problem could be solved. It would also be
>necessary to define a priority order (i.e. which of them should come
>first).
>
>As an alternative solution, the current switches could be considered LTR,
>TTB and RTL, TTB. Then 6 other code points would be necessary for the other
>directions.

I can't imagine this working for Egyptian or Mayan, or indeed 
Mongolian or Ogham.

Mongolian and Ogham, when mixed with Latin text, are traditionally 
written LTR (sometimes Mongolian is RTL). It isn't normal or natural 
to do otherwise.

Egyptian LTR or RTL is not problematic. But for columnar display, in 
current applications, markup is used.

Mayan writes LTR or RTL in repeated columns of two, as I recall. I 
strongly suspect markup is required for this behaviour.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Stefan Persson

- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: 29 December 2001 08:47
Subject: Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

> 1.  Horizontal, that is, left-to-right (LTR) versus right-to-left (RTL).
> 2.  Vertical, that is, top-to-bottom (TTB) versus bottom-to-top (BTT).
> 3.  Priority of direction (e.g. (LTR, TTB) versus (TTB, LTR)).
> [...]
> An elaboration of the directional override mechanism to handle vertical
> directionality would have to take priority into account as well.  Instead of
> two directionalities, LTR and RTL, the Unicode Standard would have to
> consider eight.  The Bidirectional Algorithm might have to become
> Octodirectional, with a commensurate increase in complexity.  Perhaps this is
> the problem that is avoided by declaring vertical directionality to be a
> higher-level "formatting style" issue.  But it still seems arbitrary.

Someone said that Unicode contains switches for LTR & RTL. By adding
switches for TTB and BTT this problem could be solved. It would also be
necessary to define a priority order (i.e. which of them should come
first).

As an alternative solution, the current switches could be taken to mean
(LTR, TTB) and (RTL, TTB). Six more code points would then be needed for the
other directions.


Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Stefan Persson


- Original Message - 
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: den 26 december 2001 06:48
Subject: Re: Vertical scripts (was: Tategaki (was: Re: Updated...))


> Seeing that Unicode already has left-to-right and right-to-left override 
> characters, I wonder if a top-to-bottom override character might also be 
> reasonable.

Which are the code points for these characters?
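
For reference, the two override characters are U+202D (LEFT-TO-RIGHT OVERRIDE) and U+202E (RIGHT-TO-LEFT OVERRIDE); they sit next to the embedding controls and the POP DIRECTIONAL FORMATTING character in the General Punctuation block. A minimal Python sketch, assuming only the standard unicodedata module, that prints each control with its name and bidi class (the helper name describe is made up for illustration):

```python
import unicodedata

# Directional marks, embeddings, pop, and the two overrides.
BIDI_CONTROLS = [0x200E, 0x200F, 0x202A, 0x202B, 0x202C, 0x202D, 0x202E]

def describe(cp):
    """Return 'U+XXXX NAME (bidi class C)' for one code point."""
    ch = chr(cp)
    name = unicodedata.name(ch)
    bidi = unicodedata.bidirectional(ch)
    return f"U+{cp:04X} {name} (bidi class {bidi})"

for cp in BIDI_CONTROLS:
    print(describe(cp))
```

None of these has a vertical counterpart, which is the gap the thread is discussing.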



Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread DougEwell2


Tex Texin replied to Marco Cimarosti:

>> Right-to-left vs. left-to-right are attributes of arbitrary *spans* of text,
>> which can easily be mixed within the same paragraph.
>>
>> On the other hand, horizontal vs. vertical are attributes that can be only
>> be applied to a whole paragraph or section.
>
> Marco, is that true? I thought that sometimes numbers for example "123."
> might be written horizontally in the middle of a vertical run.

Marco responded:

> But that would be a limited case for horizontal text embedded in vertical text:
> I cannot imagine a real-world situation for a vertical text embedded in
> horizontal text.

And Sampo Syreeni weighed in:

> I think this is something better handled by special-casing in rendering
> software -- the numbers (and whatnot) could be rendered as rotated or
> straight top-to-bottom as well. Considering this, it seems like a
> stylistic variation better controlled by an upper level protocol, if at
> all.

Tex's example may or may not be realistic -- I have no way of knowing -- but 
in suggesting a top-to-bottom directional override, I had hoped it would be 
possible to represent a run of text such as Tex describes without resorting 
to the infamous "higher protocol."

TUS 3.0 states (p. 24): "In contrast to the bidirectional case, the choice to 
lay out text either vertically or horizontally is treated as a formatting 
style.  Therefore, the Unicode Standard does not provide directionality 
controls to specify that choice."  This may seem arbitrary to some; why 
should overrides of default horizontal directionality be a plain-text issue 
but overrides of default vertical directionality be a higher-level 
"formatting style" issue?  I hope this discussion can shed some light on this 
question, and possibly help me see what I may be missing.

Actually, there is a more serious problem involved with vertical directional 
overrides: They would force the Unicode plain-text mechanism to become aware 
of both vertical directionality and directional priority.  This sounds 
obvious, but in fact there are not two, but THREE issues involved with text 
directionality:

1.  Horizontal, that is, left-to-right (LTR) versus right-to-left (RTL).
2.  Vertical, that is, top-to-bottom (TTB) versus bottom-to-top (BTT).
3.  Priority of direction (e.g. (LTR, TTB) versus (TTB, LTR)).

If you think about it, all text of non-trivial length has both horizontal and 
vertical directionality, and also a priority to the directionality.  
Horizontal and vertical directionalities are not opposites, they are 
complements.  The Latin script is written (LTR, TTB) which means not only 
that there is a horizontal directionality of left-to-right and a vertical 
directionality of top-to-bottom, but also that the horizontal directionality 
takes precedence over the vertical.  That is, we complete a horizontal (LTR) 
line before moving down the page (TTB) to start another line.

According to TUS 3.0,
Latin and most other European scripts are (LTR, TTB).
Arabic and most other Middle Eastern scripts are (RTL, TTB).
Ogham is either (LTR, TTB) or (BTT, ???).
Han is traditionally written (TTB, RTL) and more recently (LTR, TTB).
Mongolian is written (TTB, LTR).
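
To make the (primary, secondary) notation concrete, here is a rough Python sketch that places characters on a grid for any such pair. The function name lay_out and its parameters are invented for illustration, and it assumes one grid cell per character, which ignores real glyph widths, rotation, and line breaking.

```python
def lay_out(text, line_len, primary, secondary):
    """Place characters on a grid and return the rows as strings.

    primary   -- direction in which characters advance within a line
    secondary -- direction in which successive lines stack
    """
    horizontal = {"LTR", "RTL"}
    if (primary in horizontal) == (secondary in horizontal):
        raise ValueError("need one horizontal and one vertical direction")
    lines = [text[i:i + line_len] for i in range(0, len(text), line_len)]
    if primary in horizontal:
        width, height = line_len, len(lines)
    else:
        width, height = len(lines), line_len
    grid = [[" "] * width for _ in range(height)]
    for li, line in enumerate(lines):
        for ci, ch in enumerate(line):
            if primary in horizontal:
                # Lines are rows; characters advance along x.
                x = ci if primary == "LTR" else width - 1 - ci
                y = li if secondary == "TTB" else height - 1 - li
            else:
                # Lines are columns; characters advance along y.
                y = ci if primary == "TTB" else height - 1 - ci
                x = li if secondary == "LTR" else width - 1 - li
            grid[y][x] = ch
    return ["".join(row) for row in grid]

# Latin-style (LTR, TTB) versus traditional Han-style (TTB, RTL):
print("\n".join(lay_out("abcdef", 3, "LTR", "TTB")))
print("\n".join(lay_out("abcdef", 3, "TTB", "RTL")))
```

The same six characters come out as two rows in the first call and as two columns, read from the right, in the second.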

Unicode characters have a default directionality, but both this and the 
override mechanism cover only the horizontal aspect, not the vertical aspect 
or the priority of one over the other.  Thus, Mongolian characters are 
assigned the same directionality code as Latin ("L") even though the TTB 
directionality takes precedence over the LTR, the opposite of Latin.  And 
there is no plain-text way to indicate the alternative directionality of 
Ogham or Han.
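
The point about directionality codes can be checked against the Unicode Character Database. A small sketch using Python's standard unicodedata module; the sample characters are my own picks, one per script named above:

```python
import unicodedata

# One representative character per script (illustrative choices).
SAMPLES = {
    "Latin": "A",
    "Arabic": "\u0627",     # ARABIC LETTER ALEF
    "Ogham": "\u1681",      # OGHAM LETTER BEITH
    "Han": "\u4E2D",        # a CJK unified ideograph
    "Mongolian": "\u1820",  # MONGOLIAN LETTER A
}

# Bidi class of each sample; only the horizontal aspect is encoded.
bidi_classes = {script: unicodedata.bidirectional(ch)
                for script, ch in SAMPLES.items()}

for script, cls in bidi_classes.items():
    print(f"{script:10} {cls}")
```

Nothing in the property distinguishes a script whose lines run downward from one whose lines run across.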

An elaboration of the directional override mechanism to handle vertical 
directionality would have to take priority into account as well.  Instead of 
two directionalities, LTR and RTL, the Unicode Standard would have to 
consider eight.  The Bidirectional Algorithm might have to become 
Octodirectional, with a commensurate increase in complexity.  Perhaps this is 
the problem that is avoided by declaring vertical directionality to be a 
higher-level "formatting style" issue.  But it still seems arbitrary.
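
The count of eight follows from two horizontal choices, two vertical choices, and two priority orders. A short Python enumeration, with names invented for illustration:

```python
from itertools import product

HORIZONTAL = ("LTR", "RTL")
VERTICAL = ("TTB", "BTT")

def all_directionalities():
    """Return every (primary, secondary) pair of directions."""
    pairs = []
    for h, v in product(HORIZONTAL, VERTICAL):
        pairs.append((h, v))  # horizontal direction takes priority
        pairs.append((v, h))  # vertical direction takes priority
    return pairs

for primary, secondary in all_directionalities():
    print(f"({primary}, {secondary})")
```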

-Doug Ewell
 Fullerton, California

RE: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-28 Thread Sampo Syreeni

On Fri, 28 Dec 2001, Marco Cimarosti wrote:

>>I thought that sometimes numbers for example "123." might be written
>>horizontally in the middle of a vertical run.
>>    y
>>    a
>>    d
>>    d
>>    a
>>   123.
>
>That's true: an extra complication! However, I have only seen that for
>one- or two-digit numbers.

I think this is something better handled by special-casing in rendering
software -- the numbers (and whatnot) could be rendered as rotated or
straight top-to-bottom as well. Considering this, it seems like a
stylistic variation better controlled by an upper level protocol, if at
all.

>But that would be a limited case for horizontal text embedded in vertical
>text: I cannot imagine a real-world situation for a vertical text
>embedded in horizontal text.

If you think about the history of this particular rendering, it's about
Western/Arabic numbers intruding into the East Asian writing system. If
cyberpunk is anything to go by, the tide might well turn one day. I'm
not quite sure we couldn't one day have residual English embedded
with native Japanese terms. ;)

Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

RE: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-28 Thread Marco Cimarosti

Tex Texin wrote:
> > 
> > On the other hand, horizontal vs. vertical are attributes 
> that can be only
> > be applied to a whole paragraph or section.
> 
> Marco, is that true? I thought that sometimes numbers for 
> example "123."
> might be written horizontally in the middle of a vertical run.
>    y
>    a
>    d
>    d
>    a
>   123.

That's true: an extra complication! However, I have only seen that for one-
or two-digit numbers.

This is also used for single letters or two-letter abbreviations (such as
"Km"). Probably this is the reason for the "squared letters" in range
U+3380..U+33DD (some of which are three letters long, BTW).
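
The range can be inspected with Python's standard unicodedata module. This sketch maps each squared character to its compatibility (NFKC) expansion and picks out the ones that expand to three or more characters; the function name is made up for illustration.

```python
import unicodedata

def squared_abbreviations(first=0x3380, last=0x33DD):
    """Map each character in the range to its NFKC expansion."""
    return {chr(cp): unicodedata.normalize("NFKC", chr(cp))
            for cp in range(first, last + 1)}

expansions = squared_abbreviations()

# Squared characters whose expansion is three characters or longer.
longer = {sq: text for sq, text in expansions.items() if len(text) >= 3}

for sq, text in expansions.items():
    print(f"U+{ord(sq):04X} {unicodedata.name(sq)} -> {text}")
```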

But that would be a limited case for horizontal text embedded in vertical text:
I cannot imagine a real-world situation for a vertical text embedded in
horizontal text.

_ Marco

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-28 Thread Tex Texin


Marco Cimarosti wrote:
> I see a big difference between the two cases.
> 
> Right-to-left vs. left-to-right are attributes of arbitrary *spans* of text,
> which can easily be mixed within the same paragraph.
> 
> On the other hand, horizontal vs. vertical are attributes that can be only
> be applied to a whole paragraph or section.

Marco, is that true? I thought that sometimes numbers for example "123."
might be written horizontally in the middle of a vertical run.
   y
   a
   d
   d
   a
  123.
   y
   a
   ...

tex





-- 
-
Tex Texin                   Director, International Business
mailto:[EMAIL PROTECTED]    Tel: +1-781-280-4271
the Progress Company        Fax: +1-781-280-4655
-
For a compelling demonstration for Unicode:
http://www.geocities.com/i18nguy/unicode-example.html

RE: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-28 Thread Marco Cimarosti

Doug Ewell wrote:
> > Unicode doesn't have some way to indicate vertical writing. 
> I think the
> > only consideration for it is vertical presentation forms of some
> > characters. Anything more is left for other software layers to deal
> > with.
> 
> Seeing that Unicode already has left-to-right and 
> right-to-left override 
> characters, I wonder if a top-to-bottom override character 
> might also be reasonable.

I see a big difference between the two cases.

Right-to-left vs. left-to-right are attributes of arbitrary *spans* of text,
which can easily be mixed within the same paragraph.

On the other hand, horizontal vs. vertical are attributes that can be only
be applied to a whole paragraph or section.

So, a hypothetical pair (start/end) of top-to-bottom override characters
should probably also act as paragraph separators.

I wish a decent 2002 to everybody (as wishing more than "decent" would be
quite unrealistic).

_ Marco

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-25 Thread DougEwell2

In a message dated 2001-12-25 16:57:39 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> Unicode doesn't have some way to indicate vertical writing. I think the
> only consideration for it is vertical presentation forms of some
> characters. Anything more is left for other software layers to deal
> with.

Seeing that Unicode already has left-to-right and right-to-left override 
characters, I wonder if a top-to-bottom override character might also be 
reasonable.

-Doug Ewell
 Fullerton, California

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-25 Thread Michael \(michka\) Kaplan


From: "Αλέξανδρος Διαμαντίδης" <[EMAIL PROTECTED]>

> By the way, does any browser in common use
> support the Ruby extensions to HTML?

Well, looking at links like:

http://msdn.microsoft.com/workshop/author/dhtml/reference/objects/rt.asp

(all on one line) and just doing a random search on
http://msdn.microsoft.com/ for keywords like "HTML Ruby" makes me think it's
supported in IE5 and later.


MichKa

Michael Kaplan
Trigeminal Software, Inc.  -- http://www.trigeminal.com/

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-25 Thread Αλέξανδρος Διαμαντίδης


* Stefan Persson <[EMAIL PROTECTED]> [2001-12-26 00:02]:
> Is there some way to indicate vertical writing (in columns from right to
> left) for Japanese and Chinese? Is there a Unicode code point assigned for
> this, a HTML command, or just a special option in some word processors?

Well, some word processors and typesetting systems do support vertical
writing. It's probably more common in software oriented towards
Chinese and Japanese, but I can't help you there. I do know that
the Omega typesetting system supports vertical writing. Omega is
based on TeX but with many extensions and some changes, and uses
Unicode as its internal text encoding.

Unicode doesn't have some way to indicate vertical writing. I think the
only consideration for it is vertical presentation forms of some
characters. Anything more is left for other software layers to deal
with.

As for HTML, I don't know (I'm sure someone will fill us in) but even if
some mechanism for vertical writing is defined, I don't think any
current browser supports it. By the way, does any browser in common use
support the Ruby extensions to HTML?

While doing a web search for the word "tategaki", looking for its
meaning, I found a Java program that formats Japanese text for vertical
display using HTML tables with a cell for each character. It's here:

http://homepage.mac.com/kkonaka/TategakiProg.html

This is kind of a kludge, but it may be useful in some circumstances.
The author warns though:

> (this generates far many cells in a table commonly observed in normal
> web pages). - many browser cannot display text layout this way of more
> than a few pages... (they'd run out of memory).
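
The one-cell-per-character idea is easy to sketch. The following Python is not the program linked above, only a rough illustration of the same approach: the function name tategaki_table and its rows parameter are invented here, and rows must be a positive integer.

```python
from html import escape

def tategaki_table(text, rows):
    """Return an HTML table showing text in vertical columns, right to left."""
    # Cut the text into columns of `rows` characters each.
    columns = [text[i:i + rows] for i in range(0, len(text), rows)]
    lines = ["<table>"]
    for r in range(rows):
        cells = []
        # The first column of text goes in the rightmost table cell.
        for col in reversed(columns):
            ch = col[r] if r < len(col) else ""
            cells.append(f"<td>{escape(ch)}</td>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)

print(tategaki_table("abcde", 3))
```

Every character costs a table cell, which is consistent with the author's warning about memory use on long texts.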


-- 
Αλέξανδρος Διαμαντίδης * [EMAIL PROTECTED]
