Re: Character identities

2002-10-23 Thread Doug Ewell
David Starner  wrote:

> Likewise, ä is printed as a with e above in old texts.* Would it be
> acceptable to make a font with a a^e glyph for ä? It's not even
> changing the meaning of the character in any way.

Indeed, that is exactly what Sütterlin fonts do.  (Then again, Sütterlin
fonts assign the long-s glyph to U+0073 and make you type $ to get a
round s, so they may not be the best example.)

Stefan Persson  replied:

> Unicode defines "a^e" as U+0061 U+0364 (though it's exactly the same
> character as "ä"). Why?

They're not exactly the same, except in this particular German example.

Combining superscript e was encoded along with combining superscript a,
i, o, u, c, d, h, m, r, t, v, and x, none of which evolved into a "real"
diacritical mark the way e did.  Combining e had non-German uses as
well, as in early modern English "Yͤ" (which did not become "Ÿ").
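
A minimal sketch of the distinction, assuming Python 3 and its standard unicodedata module (the constant and variable names are illustrative only). It checks that a + COMBINING DIAERESIS is canonically equivalent to the precomposed ä, while a + U+0364 is left alone by normalization and is therefore a different encoded text.

```python
import unicodedata

A_DIAERESIS = "\u00e4"        # precomposed a with diaeresis
A_COMB_DIAERESIS = "a\u0308"  # a + COMBINING DIAERESIS
A_COMB_E = "a\u0364"          # a + COMBINING LATIN SMALL LETTER E


def nfc(text):
    """Return the NFC (canonically composed) form of text."""
    return unicodedata.normalize("NFC", text)


# a + U+0308 composes to the precomposed character U+00E4.
diaeresis_composes = nfc(A_COMB_DIAERESIS) == A_DIAERESIS

# a + U+0364 has no precomposed form, so NFC leaves it as two code points.
e_above_unchanged = nfc(A_COMB_E) == A_COMB_E

# Consequently the e-above sequence never compares equal to U+00E4.
e_above_equals_umlaut = nfc(A_COMB_E) == A_DIAERESIS

print(diaeresis_composes, e_above_unchanged, e_above_equals_umlaut)
```

Whether a font draws ä with an e above is a separate glyph question; the sketch only shows what the encoding itself treats as equivalent.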

As for the diaeresis, its use in French, English ("coöperate"), and
other languages often has no relationship to the letter e.  Indeed, in
the sequence "güe" in Spanish, the diaeresis serves as a sort of anti-e,
ensuring the separate pronunciation of the u when the e would otherwise
prevent it!

Historically speaking, I and J were once equivalent, and U and V were
once equivalent, but they are all encoded today.

-Doug Ewell
 Fullerton, California





Re: Character identities

2002-10-23 Thread David Starner
On Wed, Oct 23, 2002 at 06:49:38PM -0400, David J. Perry wrote:
> > First, is it compliant with Unicode for an Antiqua font to use an s
> > glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur
> > fonts possible, and it is arguably the glyph given to the middle s
> > in modern Antiqua fonts. 
>
> If you are sure that the font will only be used for printing German
> this might be OK as a stopgap.

Why? Yes, if you want to use a true long s, you're going to need a
different font. But I can see this paired with an old Antiqua font, too,
if you want to use it for an exact copy of the American Constitution or
something.

> However, even with German, here's the
> problem: if a user searched for a word containing -s at the end, and
> typed it using the s key, then it would not be matched (unless the
> search engine already knew that long s and s are equivalent).  

You've got the long s and s reversed. In old printing, the s is the
letter that appears at the end. I don't see it as a problem; if you
typed in the long s, search for the long s. It might get confusing if
more general purpose fonts started doing this, but unless you have a
need to exactly reproduce the original document, you probably shouldn't
use the long s anyway.
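
For readers who do want a search to treat long s and round s alike, here is a hedged sketch, assuming Python 3 and that compatibility normalization is acceptable for the search key. The helper names search_key and loose_find are hypothetical, not taken from any search engine.

```python
import unicodedata

LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def search_key(text):
    """Fold text for loose matching: NFKC maps long s to s, casefold drops case."""
    return unicodedata.normalize("NFKC", text).casefold()


def loose_find(needle, haystack):
    """Return True if needle occurs in haystack once both are folded."""
    return search_key(needle) in search_key(haystack)


stored = "Wa" + LONG_S + LONG_S + "er"    # text typed with long s

exact_hit = "Wasser" in stored            # plain comparison misses it
loose_hit = loose_find("Wasser", stored)  # folded comparison finds it

print(exact_hit, loose_hit)
```

The folding is lossy by design, so it belongs in the search index or query path, not in the stored text.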

> An
> OpenType font that is smart enough to substitute a long s glyph at the
> right spots is the much superior long-term solution.

There are two problems with this; one, German has had a number of
orthography changes, each time changing slightly when you're supposed to
use the long s (IIRC). Secondly, no matter what the convention, it
requires a dictionary lookup for various cases; I'm not sure you can do
that in an OpenType font, and it's not something I'm sure I want a
renderer doing in the first place.
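
To make the dictionary problem concrete, here is a deliberately naive sketch in Python 3. The function naive_long_s is hypothetical and encodes only one rule, "round s at the end of the word, long s elsewhere"; it is not a description of what any OpenType font or renderer actually does.

```python
LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def naive_long_s(word):
    """Replace every lowercase s except a word-final one with long s.

    This is a simplification: it knows nothing about morpheme boundaries.
    """
    if not word:
        return word
    body, last = word[:-1], word[-1]
    return body.replace("s", LONG_S) + last


print(naive_long_s("Wasser"))   # both s become long s
print(naive_long_s("das"))      # final s stays round
print(naive_long_s("Haustür"))  # the s of "Haus" is changed as well
```

As I understand the traditional convention, the s that ends "Haus" inside the compound "Haustür" should stay round, and a purely positional rule cannot know that without a word list. That is the case the paragraph above is worried about.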

-- 
David Starner - [EMAIL PROTECTED]
Great is the battle-god, great, and his kingdom--
A field where a thousand corpses lie. 
  -- Stephen Crane, "War is Kind"




Unicode Display

2002-10-23 Thread nandu patil
Hi,
Can you tell me how to display Unicode in a RichEdit control in a VB application? I am trying to develop a multilingual application. I would be very grateful if you could help me with displaying Unicode.
Thanx a lot
Bye for now

Re: Character identities

2002-10-23 Thread Markus Scherer
David Starner wrote:

> First, is it compliant with Unicode for an Antiqua font to use an s
> glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur
> fonts possible, and it is arguably the glyph given to the middle s in
> modern Antiqua fonts.
>
> Likewise, ä is printed as a with e above in old texts.* Would it be
> acceptable to make a font with a a^e glyph for ä? It's not even changing
> the meaning of the character in any way.

In my opinion, this is all reasonable and should be allowed.
Viel Erfolg!

> As a third case, I looked briefly at information and advocacy of the
> duodecimal system. Chi and epsilon have been used as glyphs for 10 and  ...


I assume that the answer will be that these things are just alternate uses of existing characters.

markus

--
Opinions expressed here may not reflect my company's positions unless otherwise noted.





Re: Character identities

2002-10-23 Thread Stefan Persson
- Original Message -
From: "David Starner" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, October 23, 2002 7:00 PM
Subject: Character identities

> Likewise, ä is printed as a with e above in old texts.* Would it be
> acceptable to make a font with a a^e glyph for ä? It's not even changing
> the meaning of the character in any way.

Unicode defines "a^e" as U+0061 U+0364 (though it's exactly the same
character as "ä"). Why?

Stefan






Character identities

2002-10-23 Thread David Starner
I have several questions about character identities.

First, is it compliant with Unicode for an Antiqua font to use an s
glyph for ſ (U+017F)? It makes switching between Antiqua and Fraktur
fonts possible, and it is arguably the glyph given to the middle s in
modern Antiqua fonts. 

Likewise, ä is printed as a with e above in old texts.* Would it be
acceptable to make a font with a a^e glyph for ä? It's not even changing
the meaning of the character in any way.

(I suspect the answer is it's not technically compliant, but nobody
cares.)

(To my surprise, I came across a text from 1920 that used the e-above
instead of a diaeresis. The only other texts I've seen with this date
before 1810. It was "Islands Kultur zur Wikingerzeit" by Felix Niedner,
in the series (?) "Thule: Altnordische Dichtung und Prosa", which leads
me to believe, based on my limited German, that it's a deliberate
anachronism. Right?)

As a third case, I looked briefly at information and advocacy of the
duodecimal system. Chi and epsilon have been used as glyphs for 10 and
11, as well as an upside-down 2 and 3, a chi and reversed pound symbol
(? I'd need to look at that one again . . .) and * and #. Unified, they might
merit a proposal here, if someone still cares enough to make it. Would it be
unreasonable to unify them? There's quite a disparity in glyphs, but not
much argument against them all being the same character, and I don't
think there's anyone wanting to make the distinction.
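
A small sketch of why unification looks plausible, assuming Python 3. The function to_dozenal is hypothetical, and the choice of the Greek small letters chi and epsilon as defaults is my assumption. The arithmetic is the same whichever pair of glyphs stands for ten and eleven; only the two characters passed in change.

```python
def to_dozenal(n, ten="\u03c7", eleven="\u03b5"):
    """Format a non-negative integer in base 12.

    ten and eleven are the single characters used for the digits 10 and 11;
    the defaults (Greek small chi and epsilon) are an assumption.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = "0123456789" + ten + eleven
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 12)
        out.append(digits[remainder])
    return "".join(reversed(out))


# 131 = 10 * 12 + 11, so it is written with the "ten" digit then the "eleven" digit.
print(to_dozenal(131))            # chi and epsilon convention
print(to_dozenal(131, "*", "#"))  # star and hash convention
```

Both calls describe the same number; the conventions differ only in which characters fill the two extra digit slots.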

-- 
David Starner - [EMAIL PROTECTED]
Great is the battle-god, great, and his kingdom--
A field where a thousand corpses lie. 
  -- Stephen Crane, "War is Kind"




Taiwanese proposal

2002-10-23 Thread Doug Ewell
The WG2 home page was updated today to add a link to document N2507,
"Draft of Proposal to add Latin characters required by Latinized
Taiwanese Holo language to ISO/IEC 10646" [1], by a group called the
Department of Language Education of National Taitung Teachers College.
The document is dated either 2002-03-11 or 2002-03-31, depending on what
part of the title page you look at.

This document proposes a COMBINING RIGHT DOT ABOVE for use in a popular
Latin-script orthography of the Taiwanese Holo language.  Some time ago
(I can't look up exactly when because the unicode.org archives are
unavailable), I wrote that this combining character should be added in
lieu of a largish collection of precomposed characters.  Ken Whistler
responded that the issue had already been debated, and a solution
already presented to use U+0307 COMBINING DOT ABOVE (possibly
incorporating a Taiwanese font-specific glyph variation to move the dot
to the right).

Evidently the Taiwanese teachers did not consider this satisfactory, as
they have responded with this new proposal to encode a separate
COMBINING RIGHT DOT ABOVE.

Whether or not this new combining character makes sense, the rest of
the proposal clearly does not.  The group has proposed no fewer than 42
precomposed Latin characters, all of which can be formed using existing
Latin letters and combining marks (together with the proposed RIGHT DOT
ABOVE).

The 42 precomposed letters are proposed "to be added to Latin
Extended-B," which is a puzzle to me since that block has only 25
available code positions as of Unicode 4.0.

Much more troubling, however, is the fact that this group has apparently
ignored or disregarded the Unicode/10646 policy against standardizing
new precomposed letters that can be composed with existing characters.
The document says:

"The precomposed characters are proposed to ensure compatibility with
the existing font "HoloWin" in the word-processing software HOTSYS
widely employed in the user community.  We have been promised composing
characters in major (Microsoft etc.) implementations since 1997.  Now, 5
years later, we still have nothing."

Compatibility with 8-bit legacy fonts and software is *not* sufficient
cause for encoding new precomposed characters.  The WG2 "Principles and
Procedures" document [2] specifically states that a precomposed
character should not be encoded "if solely intended to overcome
short-term deficiency of rendering technology."  The Taiwanese document
does not say which "major (Microsoft etc.) implementation" fails to
support composition using combining marks, but as a previous thread on
this list has shown, there is at least some support in Internet Explorer
for such characters.

Try this experiment:  One of the precomposed characters proposed by the
Taiwanese teachers is LATIN SMALL LETTER N WITH CIRCUMFLEX.  Here it is,
encoded properly as U+006E U+0302:

n̂

Some of you will be able to see this character, others will not.
Rendering technology is not perfect yet.  But this is the correct way to
create new accented letters in Unicode/10646, not by adding more
precomposed characters.
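
The experiment can also be run in code. This is a minimal sketch, assuming Python 3 and its standard unicodedata module; the variable names are illustrative. It shows that n + U+0302 stays a two-code-point sequence under NFC because no precomposed form is encoded, whereas e + U+0302 composes to the existing U+00EA.

```python
import unicodedata

n_circumflex = "n\u0302"  # n + COMBINING CIRCUMFLEX ACCENT
e_circumflex = "e\u0302"  # e + COMBINING CIRCUMFLEX ACCENT

composed_n = unicodedata.normalize("NFC", n_circumflex)
composed_e = unicodedata.normalize("NFC", e_circumflex)

# No precomposed n with circumflex is encoded, so the sequence is unchanged.
print([hex(ord(ch)) for ch in composed_n])

# A precomposed e with circumflex exists, so NFC yields one code point.
print([hex(ord(ch)) for ch in composed_e])
```

Either way the text is valid Unicode; whether it displays well depends on the font and renderer, not on the encoding.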

The proposal for a new COMBINING RIGHT DOT ABOVE may or may not have
merit -- I'm not going to commit firmly to the idea that it does, like I
did last time -- but the 42 precomposed letters have no business being
encoded and should not be debated further.

-Doug Ewell
 Fullerton, California



[1] http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2507.pdf
[2] http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2352r.pdf





Another candidate for the squared precomposed Latin block

2002-10-23 Thread Michael Everson
Ken Whistler especially may be pleased to learn that the following 
was discussed on the TYPO-L list today:

At 09:06 -0400 2002-10-23, Richard Kegler wrote:
 > From: Joe Clark <[EMAIL PROTECTED]>
 > Reply-To: Discussion of Type and Typographic Design
 > Date: Sun, 20 Oct 2002 20:33:12 -0400
 > To: [EMAIL PROTECTED]
 > Subject: Obsessive typography in Building Accessible Websites


 We used every ligature you can name


wouldn't a ligature for "http://" be nice and handy.

  :^)

Richard Kegler



--
Michael Everson * * Everson Typography *  * http://www.evertype.com
48B Gleann na Carraige; Cill Fhionntain; Baile Átha Cliath 13; Éire
Telephone +353 86 807 9169 * * Fax +353 1 832 2189 (by arrangement)