Re: Are these characters encoded?

2001-12-08 Thread Michael Everson


In the gif that Doug sent out, the third of the ampersands was not an 
ampersand. It was a plus sign.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-04 Thread DougEwell2


In a message dated 2001-12-04 2:48:55 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> The overbar being a flat form of tilde, which in medieval hands was
> used to indicate an omitted m or n following.

Ah.  So it is "cum" after all.  Thank you.

-Doug Ewell
 Fullerton, California

Re: Are these characters encoded?

2001-12-04 Thread Michael Everson


At 07:33 -0700 04/12/2001, Tom Gewecke wrote:


>I believe there is also a (medical) s-overbar abbreviation for "without"
>(Latin sin, no doubt) and an ss-overbar abbreviation for "one-half."
>Presumably these are only used in handwriting by specially trained people.

The first would be "sine" 'without'.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-04 Thread Tom Gewecke

>At 00:31 -0500 04/12/2001, [EMAIL PROTECTED] wrote:
>
>>Yes, you are all right: the character used in (as it turns out)
>>the medical field to mean "with" is, in fact, c-overbar and not c-underbar.
>>In Unicode we would say U+0063 U+0305.
>
>The overbar being a flat form of tilde, which in medieval hands was
>used to indicate an omitted m or n following.

I believe there is also a (medical) s-overbar abbreviation for "without"
(Latin sin, no doubt) and an ss-overbar abbreviation for "one-half."
Presumably these are only used in handwriting by specially trained people.

Re: Are these characters encoded?

2001-12-04 Thread Michael Everson


At 00:31 -0500 04/12/2001, [EMAIL PROTECTED] wrote:

>Yes, you are all right: the character used in (as it turns out)
>the medical field to mean "with" is, in fact, c-overbar and not c-underbar.
>In Unicode we would say U+0063 U+0305.

The overbar being a flat form of tilde, which in medieval hands was
used to indicate an omitted m or n following.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-03 Thread DougEwell2

In a message dated 2001-12-03 12:20:46 Pacific Standard Time, [EMAIL PROTECTED] 
writes:

>  Perhaps a corruption of "c-overbar," which is a medical abbreviation for
>  "with," sometimes used by nurses, doctors, and pharmacies?

Thanks to everyone who, directly or indirectly, corrected me on this 
character.  Yes, you are all right: the character used in (as it turns out) 
the medical field to mean "with" is, in fact, c-overbar and not c-underbar.  
In Unicode we would say U+0063 U+0305.

So to get back to my original questions about this thing, (a) is it a 
character in its own right, (b) if so, is there any justification in encoding 
it separately rather than using a combining sequence, and (c) is this not 
*exactly* the same set of issues as the question of encoding the Swedish 
o-underbar?
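
As a rough illustration of the combining-sequence option raised in question (b), the sketch below uses Python's standard `unicodedata` module to spell out the sequence U+0063 U+0305. The helper name `describe` and the constant `C_OVERBAR` are made up for this note; the code only reports what the Unicode Character Database bundled with Python says about each code point.

```python
import unicodedata

def describe(text):
    """Return (code point, character name) pairs for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# c-overbar written as a base letter followed by a combining mark
C_OVERBAR = "\u0063\u0305"

for code_point, name in describe(C_OVERBAR):
    print(code_point, name)
```

The same helper can be pointed at any other candidate sequence discussed in this thread.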

-Doug Ewell
 Fullerton, California

Re: Are these characters encoded?

2001-12-03 Thread Tom Gewecke


>When I've seen the "c-underbar" in print, it has always meant "circa", as
>in "circa 1800".
>Jim
>
>At 10:14 PM 2001-12-01 + Saturday, Michael Everson wrote:
>>>(As a side note, this "o-underbar" form reminds me of the "c-underbar" which
>>>is sometimes used in handwritten English to mean "with."  Does anyone know
>>>the origin of this symbol?  Is it possibly derived from the Latin word cum,
>>>meaning "with"?  Does it have any claim to being a character in its own
>>>right?)

Perhaps a corruption of "c-overbar," which is a medical abbreviation for
"with," sometimes used by nurses, doctors, and pharmacies?

Re: Are these characters encoded?

2001-12-03 Thread Stefan Persson


- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: 3 December 2001 02:35
Subject: Re: Are these characters encoded?


> Perhaps they should be.

Er... So 3 and 三 are the same character...?

> I wonder: When transcribing a foreign name (like a business name) that
> includes the ampersand, would a Swede use the "och" sign?

Sometimes yes, sometimes no.

> In other words, does there exist a case where the ampersand and the "och"
> sign are not interchangeable?

No. At least not if the text is in Swedish.

Stefan



Re: Are these characters encoded?

2001-12-03 Thread Jim Melton


When I've seen the "c-underbar" in print, it has always meant "circa", as 
in "circa 1800".
Jim

At 10:14 PM 2001-12-01 + Saturday, Michael Everson wrote:
>>(As a side note, this "o-underbar" form reminds me of the "c-underbar" which
>>is sometimes used in handwritten English to mean "with."  Does anyone know
>>the origin of this symbol?  Is it possibly derived from the Latin word cum,
>>meaning "with"?  Does it have any claim to being a character in its own
>>right?)
>
>I've never seen this in handwritten English. Cappelli's Dizionario di 
>Abbreviature latine ed italiane shows several abbreviations for cum, none 
>of which are a c with underbar.


Jim Melton --- Editor of ISO/IEC 9075-* (SQL) Phone: +1.801.942.0144
Oracle Corporation        Oracle Email: mailto:[EMAIL PROTECTED]
1930 Viscounti Drive      Standards email: mailto:[EMAIL PROTECTED]
Sandy, UT 84093-1063      Personal email: mailto:[EMAIL PROTECTED]
USA                       Fax : +1.801.942.3345

=  Facts are facts.  However, any opinions expressed are the opinions  =
=  only of myself and may or may not reflect the opinions of anybody   =
=  else with whom I may or may not have discussed the issues at hand.  =

RE: Are these characters encoded?

2001-12-03 Thread Kent Karlsson




Summary answer to the question in the subject line: yes.

As I tried to express as succinctly as possible before:

1) & and o̲ (underlined o, sometimes used as an abbreviation for 'och', as
is 'o.' (dictionaries) and 'o', and even 'å') is definitely not a glyph
variant issue; they are not interchangeable, even though the meaning is the
same. Asmus gave an example.  Further, one can use & without spaces around
it (since the ligature is so highly ligated), but for o̲ there should always
be spaces around it.  B.t.w. & is called et-tecken in Swedish.  Getting
et-tecken rendered as o̲ (underlined o) would be surprising indeed.

2) o̲ (underlined o; it even displays fair, but not good, in the font I'm
using right now) is already perfectly well available in Unicode.  There is
no need to encode it again. Raising it a little bit (not much) over the
baseline (that some do in handwriting) would be fine tuning that is not
appropriate for a character encoding, but might be for a handwriting-
imitating font, or for typographic fine-tuning markup.

3) The following ones are all inappropriate:

00B0;DEGREE SIGN;So;0;ET;N;
00BA;MASCULINE ORDINAL INDICATOR;Ll;0;L;<super> 006F;N;
2070;SUPERSCRIPT ZERO;No;0;EN;<super> 0030;0;0;0;N;SUPERSCRIPT DIGIT ZERO

The first and last are obviously(?) wrong.  Why not 00BA?  There are two
reasons: the glyph for 00BA is not always underlined (even though a plain o
can be used for 'och' in sloppy handwriting or (rare) "spell as you speak"
texts), and the glyph for 00BA is (always) raised too much for the o̲
(underlined o for 'och') usage.  (But, but for "numero", which is also used
here, I would use Nº (<004E, 00BA>) rather than № (2116) or No̲ (<004E,
006F, 0332>).)
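
For readers who want to check the decomposition fields quoted in point 3, the following is a minimal sketch using Python's standard `unicodedata` module. The names `CANDIDATES`, `NUMERO_FORMS` and `compat_mapping` are invented for this note, and the values come from whatever Unicode Character Database version ships with the Python in use, which is newer than the one current in 2001.

```python
import unicodedata

# The three characters called inappropriate in point 3
CANDIDATES = ["\u00B0", "\u00BA", "\u2070"]

def compat_mapping(ch):
    """Raw decomposition field from the character database ('' if none)."""
    return unicodedata.decomposition(ch)

for ch in CANDIDATES:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), repr(compat_mapping(ch)))

# The three spellings of "numero" mentioned at the end of the message
NUMERO_FORMS = {
    "N + U+00BA": "N\u00BA",
    "U+2116": "\u2116",
    "N + o + U+0332": "No\u0332",
}
for label, text in NUMERO_FORMS.items():
    folded = unicodedata.normalize("NFKC", text)
    print(label, [f"U+{ord(c):04X}" for c in folded])
```

Under compatibility normalization the first two numero spellings collapse to plain "No", while the combining sequence keeps its low line; that is one way to see that the three are not interchangeable at the character level.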
    
Kind regards
    /kent k

RE: Are these characters encoded?

2001-12-03 Thread Marco Cimarosti

Asmus Freytag wrote:
> Overloading the existing 00BA º is tempting, but would likely result in
> incorrect output unless special purpose (read private use) fonts are used,
> or unless it became common to have Swedish glyph overrides in fonts and
> rendering engines that applied them. Since the usage and typographic
> convention for 'och' and the raised o for numbering are not related, this
> unification smells more of shoehorning than encoding.

Perhaps there is also a "logical" difference.

The Swedish "o" represents the *first* letter of a word (och), and can thus
be interpreted as <o.> (o *followed* by a dot); 00BA represents the *last*
letter of a word (it abbreviates ordinal adjectives like primero, segundo,
tercero... primo, secondo, terzo...), so it may logically be interpreted as
<.o> (o *preceded* by a dot).

_ Marco

Re: Are these characters encoded?

2001-12-03 Thread Roozbeh Pournader

On Sun, 2 Dec 2001 [EMAIL PROTECTED] wrote:

> [...] (cf. GREEK QUESTION MARK).
> 
> [...] This would be like using U+003B at the end of a Greek question.

Sorry, but U+037E GREEK QUESTION MARK is canonically equivalent to U+003B 
SEMICOLON. I guess it is there only because ISO 8859-7 wanted to disunify 
them.
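
The canonical equivalence mentioned here can be checked with a short sketch based on Python's standard `unicodedata` module. The function name `canonically_equivalent` is chosen for this note only; it compares NFD forms, which is the usual working definition of canonical equivalence.

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037E"
SEMICOLON = "\u003B"

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Decomposition field of U+037E in the character database
print(unicodedata.decomposition(GREEK_QUESTION_MARK))
print(canonically_equivalent(GREEK_QUESTION_MARK, SEMICOLON))
```

A practical consequence is that any normalization step replaces U+037E with U+003B, so the distinction does not survive normalized plain text.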

-- 
Note: If you want me to read a message, please make sure you include my
address in "To" or "CC" fields. I may not be able to follow all the
discussions on the mailing lists I subscribe. Sorry. (No, there's no
problem to receive duplicates.) --roozbeh

Re: Are these characters encoded?

2001-12-02 Thread John Hudson

At 21:33 12/1/2001, Asmus Freytag wrote:

>If the character can be shown to have as much justification for existence
>as a coded character as similar characters in the standard, i.e. if it's
>ever used in printed handwriting, etc., etc., then we will have a tough
>time coming up with a unification that's not (far) worse than just adding
>it by itself.

Indeed. If it is not suitable to treat the och sign as a variant form of 
the ampersand, it would be better to give it its own codepoint rather than 
try to unify it with some other character(s) that would require more 
convoluted rendering.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit,
das mit jeder Gegenwart zu verschwinden droht, die sich
nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the
present as one of its own concerns threatens to disappear
irretrievably.
   Walter Benjamin

Re: Are these characters encoded?

2001-12-02 Thread Asmus Freytag

At 05:29 PM 12/1/01 -0600, David Starner wrote:
> > It is certainly not a glyph variant of an ampersand. An ampersand is
> > a ligature of e and t. This is certainly an abbreviation of och. That
> > both mean "and" is NOT a reason for unifying different signs.
>
>But the fact that they never appear in the same text in the same font,
>and that one appears in handwritten text in the same places as the
>ampersand appears in machine written text means that it is a glyph
>variant. In any case, if it never appears in machine-written text, (if
>there's no font, as you point out for proposed ConScript additions),
>then there's no need to encode it.

Signs for faithful renderings of manuscript are - at least at the moment -
somewhat outside the scope of Unicode. Having said that, an exception for
current practice can certainly be considered, as instances of type-set
"handwriting" are not generally uncommon, even if we can't lay our hands on
them on demand. So, on this aspect of the character alone I would not like
to make a ruling one way or another, but getting a printed 'och' would
certainly make the counterargument moot.

I wish that Unicode encoding principles were as easy as "If entity A only
occurs in one context and entity B occurs only in another, they can be
unified". Well, taking this argument to extreme, we could unify a lot of
unrelated things. Unicode might have fit in 64K after all. ;-)

Michael's argument that "and" (Sw. 'och') and "et" are different words and
need to be distinguished on that score alone is interesting, because
semantics and usage are so close. For letters we have long held that if it
is the same letter, we don't disunify it across languages. Why this
necessarily breaks down for abbreviation of a near universal word as 'and',
is not necessarily clear.

However, the Swedish case is really that the handwriting uses o-underbar
*NOT* in place of the ampersand, but in places where the typeset text
presumably would have the word 'och' spelled out. In fact, I would guess
that a handwritten text referring to a company name, for example
Rabén&Sjögren might use the & and not the o-underbar in Swedish. I don't
know this for sure, but I strongly suspect that such differentiation of
usage exists that would make it awkward to convert printed handwriting
into printed text by a pure font change.

Overloading the existing 00BA º is tempting, but would likely result in
incorrect output unless special purpose (read private use) fonts are used,
or unless it became common to have Swedish glyph overrides in fonts and
rendering engines that applied them. Since the usage and typographic
convention for 'och' and the raised o for numbering are not related, this
unification smells more of shoehorning than encoding.

(BTW it's not B0 as someone noted, that's a raised digit 0).

The strongest surviving candidate is the composed sequence U+006F U+0332,
but 0332 is an underscore, and not something that sits on-line. Again,
it would take special-purpose or specifically Swedish aware fonts and/
or rendering engines that support them to get the right result. That would
argue against this particular unification - even though it would be quite
acceptable for rough plain text usage.
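
As a hedged aside on the composed sequence U+006F U+0332: the sketch below, using Python's standard `unicodedata` module, checks that normalization has no precomposed character to substitute for it, so the two code points stay as they are in plain text. The names `O_UNDERBAR` and `stays_decomposed` are assumptions of this note, and nothing here says anything about how a font will position the line.

```python
import unicodedata

O_UNDERBAR = "\u006F\u0332"  # o followed by the combining mark U+0332

def stays_decomposed(seq):
    """True if NFC leaves seq unchanged, i.e. no precomposed form replaces it."""
    return unicodedata.normalize("NFC", seq) == seq

print(unicodedata.name(O_UNDERBAR[1]))       # name of the combining mark
print(unicodedata.combining(O_UNDERBAR[1]))  # canonical combining class
print(stays_decomposed(O_UNDERBAR))
print(stays_decomposed("o\u0308"))           # contrast: o + diaeresis composes
```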

If the character can be shown to have as much justification for existence
as a coded character as similar characters in the standard, i.e. if it's
ever used in printed handwriting, etc., etc., then we will have a tough
time coming up with a unification that's not (far) worse than just adding
it by itself.

A./

Re: Are these characters encoded?

2001-12-02 Thread juuichiketajin


Perhaps they should be. I wonder: When transcribing a foreign name (like a business 
name) that includes the ampersand, would a Swede use the "och" sign?
I can't answer that.

In other words, does there exist a case where the ampersand and the "och" sign are not 
interchangeable?


-Original Message-
From: John Hudson <[EMAIL PROTECTED]>
Date: Sun, 02 Dec 2001 16:33:04 -0800
To: [EMAIL PROTECTED]
Subject: Re: Are these characters encoded?


> At 15:16 12/2/2001, [EMAIL PROTECTED] wrote:
> 
> >Then why not unify DIGIT THREE with HAN DIGIT THREE?
> 
> I don't know enough about the Han encoding to answer that. Because they are 
> distinguished in existing character sets? Because someone has a need to 
> distinguish them in plain text?
> 
> I'm not saying that the Swedish och sign should automatically be unified 
> with the ampersand. I'm simply pointing out that, as described to date on 
> this list, it is not clear that this sign needs to be separately encoded. 
> We know that it can be treated as a language-specific glyph variant because 
> Swedish readers apparently accept both forms to mean exactly the same 
> thing. Whether such treatment is sufficient depends on whether there is 
> also need to distinguish the two forms, and to do so in plain text. I think 
> Michael Everson made a strong case for separate encoding of the Tironian et 
> sign, and I think a similarly strong case would need to be made for 
> separately encoding the Swedish och sign.
> 
> I'm perfectly happy to include the och sign in my fonts, whether it is 
> encoded or not, and to provide mechanisms to access the glyph. At the 
> moment, though, I don't think it is clear whether it is best for this sign 
> to be encoded or not. What might be the impact on Swedish keyboard drivers? 
> Is the intention that a new och sign character should replace the ampersand 
> character in Swedish text processing, or should both be used? What is the 
> impact on existing documents?
> 
> John Hudson
> 
> Tiro Typeworks  www.tiro.com
> Vancouver, BC   [EMAIL PROTECTED]

Re: Are these characters encoded?

2001-12-02 Thread John Hudson


At 15:16 12/2/2001, [EMAIL PROTECTED] wrote:

>Then why not unify DIGIT THREE with HAN DIGIT THREE?

I don't know enough about the Han encoding to answer that. Because they are 
distinguished in existing character sets? Because someone has a need to 
distinguish them in plain text?
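
One way to see how the two "threes" differ in the character database is the sketch below, which uses Python's standard `unicodedata` module. The helper `profile` is a name invented for this note; the ideograph is written as an escape so the snippet does not depend on the file's encoding.

```python
import unicodedata

def profile(ch):
    """Collect a few character database properties for one character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "numeric": unicodedata.numeric(ch, None),
    }

print(profile("3"))
print(profile("\u4E09"))  # the ideograph for "three"
```

Both carry the numeric value three, but one is a decimal digit and the other is a letter-like ideograph, and no normalization form maps one onto the other.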

I'm not saying that the Swedish och sign should automatically be unified 
with the ampersand. I'm simply pointing out that, as described to date on 
this list, it is not clear that this sign needs to be separately encoded. 
We know that it can be treated as a language-specific glyph variant because 
Swedish readers apparently accept both forms to mean exactly the same 
thing. Whether such treatment is sufficient depends on whether there is 
also need to distinguish the two forms, and to do so in plain text. I think 
Michael Everson made a strong case for separate encoding of the Tironian et 
sign, and I think a similarly strong case would need to be made for 
separately encoding the Swedish och sign.

I'm perfectly happy to include the och sign in my fonts, whether it is 
encoded or not, and to provide mechanisms to access the glyph. At the 
moment, though, I don't think it is clear whether it is best for this sign 
to be encoded or not. What might be the impact on Swedish keyboard drivers? 
Is the intention that a new och sign character should replace the ampersand 
character in Swedish text processing, or should both be used? What is the 
impact on existing documents?

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]


Re: Are these characters encoded?

2001-12-02 Thread DougEwell2

In a message dated 2001-12-02 11:00:32 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> "o." and "o-with-underscore" are NOT glyph variants of a ligature of 
> e and t (at a character level), no matter what they mean.

I suggested that Stefan's o-underscore "and" might OR might not be a 
variation of the ampersand, in all its many existing glyph variants.

The "glyph variant" side is bolstered by the argument that it's a symbol, 
just like &, used to mean "and" without any translation necessarily taking 
place; that it's only used in Swedish; and that users consider it equivalent 
to & and use different forms depending on whether the text is handwritten or 
typed.

The "separate character" side can point to the fact that its derivation is 
completely different from that of &; that it looks nothing like any of the 
existing forms of & (like TIRONIAN SIGN ET); and that it's only used in 
Swedish (cf. GREEK QUESTION MARK).

I don't think there is one obvious answer to this.  I will say this, however: 
The majority of posts stating that some character or other is "not in 
Unicode" turn out to be bogus; the proposed character is really a glyph 
variant or presentation form.  Stefan's original post had the following three 
points:

1.  Swedish "o-underscore" -- maybe, maybe not
2.  Fraction slash -- already encoded
3.  Roman numerals -- overextension of compatibility forms; rendering issue

When two of three proposals can be quickly blown off, it is human nature that 
sometimes it is difficult to see the potential virtue in the third.

I also want to say that, although Michael is of course correct that & was 
originally a ligature of e and t, many, many of the & glyphs seen today do 
not even remotely resemble such a ligature.  Consider the top three glyphs in 
the attached GIF (only 290 bytes).  The first is obviously still an e-t 
ligature, the second is one with centuries of typographical evolution applied 
to it (and today more closely resembles a treble clef), the third is not at 
all.  If traceability to the original Latin "et" were what made these 
characters the same or different, then that might have spoken against the 
separate encoding of TIRONIAN SIGN ET.

I never think of & as meaning "et," even the glyph variants that do look like 
an e-t ligature.  I assume that practically all users of this symbol treat it 
as a logograph meaning "and" in the language of the surrounding text.  (I 
have, rarely, seen & used in Spanish text, which strikes me as funny since 
the Spanish words for "and" ("y" and "e") would not seem to need 
abbreviating.)

So the question might be posed, do Swedish users think of o-underscore as a 
logograph meaning "och" or as an abbreviation for the spelled-out word "och"?

In a message dated 2001-12-02 9:23:51 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

>>> Having said that, it seems to me that U+00B0 would represent Stefan's
>>> character easily enough.
>>
>> No. It's not a degree sign.  Nor is 00BA appropriate: the underlined o is
>> not superscripted/raised (much, if at all).
>
> Sorry, I did mean U+00BA, and subscription or superscription of the 
> glyph in that character is a matter of glyph choice.

I think, though, that use of U+00BA MASCULINE ORDINAL INDICATOR would be a 
classic example of hijacking a character for an unintended and inappropriate 
purpose simply because its glyph looks "close enough."  This would be like 
using U+003B at the end of a Greek question.  I stick to my original 
suggestion of U+006F U+0332, crossing my fingers that rendering engines will 
handle this correctly.

-Doug Ewell
 Fullerton, California

Re: Are these characters encoded?

2001-12-02 Thread juuichiketajin


Then why not unify DIGIT THREE with HAN DIGIT THREE?


-Original Message-
From: John Hudson <[EMAIL PROTECTED]>
Date: Sun, 02 Dec 2001 10:05:36 -0800
To: Michael Everson <[EMAIL PROTECTED]>
Subject: Re: Are these characters encoded?


> At 14:14 12/1/2001, Michael Everson wrote:
> 
> >It is certainly not a glyph variant of an ampersand. An ampersand is a 
> >ligature of e and t. This is certainly an abbreviation of och. That both 
> >mean "and" is NOT a reason for unifying different signs.
> 
> The fact that & is accepted by Swedish readers as a substitute for the 
> 'och' sign, and that the latter seems to be limited to manuscript, suggests 
> a glyph variant. I do not consider the fact that both mean 'and' to be a 
> reason for unifying different signs. I ponder whether two different signs 
> that are apparently used *interchangeably* might be unified?
> 
> John Hudson
> 
> Tiro Typeworks  www.tiro.com
> Vancouver, BC   [EMAIL PROTECTED]

Re: Are these characters encoded?

2001-12-02 Thread Michael Everson

At 10:05 -0800 2001-12-02, John Hudson wrote:
>At 14:14 12/1/2001, Michael Everson wrote:
>
>>It is certainly not a glyph variant of an ampersand. An ampersand 
>>is a ligature of e and t. This is certainly an abbreviation of och. 
>>That both mean "and" is NOT a reason for unifying different signs.
>
>The fact that & is accepted by Swedish readers as a substitute for 
>the 'och' sign, and that the latter seems to be limited to 
>manuscript, suggests a glyph variant. I do not consider the fact 
>that both mean 'and' to be a reason for unifying different signs. I 
>ponder whether two different signs that are apparently used 
>*interchangeably* might be unified?

Um, I accept "etc." and "&c." and "7c." (the last with a Tironian et, 
admittedly peculiar to most readers of English) as "meaning" the same 
thing but that doesn't mean that & and 7 are the same character. They 
have different origins which are well known. You don't unify that 
kind of thing.
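
A small sketch, using Python's standard `unicodedata` module, of what "not the same character" means in practice for the ampersand and the Tironian et. The function name `folds_together` is made up for this note; it asks whether even the loosest standard normalization (NFKC) would merge two strings.

```python
import unicodedata

AMPERSAND = "\u0026"
TIRONIAN_ET = "\u204A"

def folds_together(a, b):
    """True if compatibility normalization (NFKC) maps both strings to the same text."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

for ch in (AMPERSAND, TIRONIAN_ET):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# "&c." versus the same abbreviation written with the Tironian et
print(folds_together(AMPERSAND + "c.", TIRONIAN_ET + "c."))
```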

In Irish many people accept "srl" and "&rl" and "7rl" as meaning the 
same thing as well. The form with the actual & is considered peculiar.

"o." and "o-with-underscore" are NOT glyph variants of a ligature of 
e and t (at a character level), no matter what they mean.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-02 Thread John Hudson

At 14:14 12/1/2001, Michael Everson wrote:

>It is certainly not a glyph variant of an ampersand. An ampersand is a 
>ligature of e and t. This is certainly an abbreviation of och. That both 
>mean "and" is NOT a reason for unifying different signs.

The fact that & is accepted by Swedish readers as a substitute for the 
'och' sign, and that the latter seems to be limited to manuscript, suggests 
a glyph variant. I do not consider the fact that both mean 'and' to be a 
reason for unifying different signs. I ponder whether two different signs 
that are apparently used *interchangeably* might be unified?

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]


Re: Are these characters encoded?

2001-12-02 Thread John Hudson

At 06:17 12/2/2001, Stefan Persson wrote:

>Well, this character is *only* used in Swedish, while & is used in most
>(all?) languages using Roman letters, so it has a partially different usage!
>Using this character in, for example, an English text would be *wrong*!

Which is why I went on to suggest that the Swedish manuscript ampersand 
form (the 'och' abbreviation) might be substituted 'in Swedish text'. The 
OpenType glyph substitution model, for example, associates lookups with a 
particular script and language system combination, so it is possible to 
have something like this:

 Latin 
 Swedish 
 Stylistic Alternates 
 ampersand -> ampersand.swe

This substitution would only be applied in Swedish text. Now, this 
particular aspect of OpenType is not well supported yet, but it is a viable 
mechanism for the kind of substitution that the 'och' glyph requires.
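
The idea can be mimicked with a toy model in Python. This is only a sketch of the concept and not an OpenType API: the `LOOKUPS` table, the `substitute` function and the plain-English keys are all invented for illustration, and only the glyph names `ampersand` and `ampersand.swe` come from the example above.

```python
# Toy model of a language-specific glyph substitution; NOT a real OpenType API.
# Keys are (script, language system, feature); values map an input glyph name
# to its replacement glyph name.
LOOKUPS = {
    ("Latin", "Swedish", "Stylistic Alternates"): {"ampersand": "ampersand.swe"},
}

def substitute(glyphs, script, language, feature):
    """Apply the lookup registered for this script/language/feature, if any."""
    mapping = LOOKUPS.get((script, language, feature), {})
    return [mapping.get(glyph, glyph) for glyph in glyphs]

run = ["R", "ampersand", "S"]
print(substitute(run, "Latin", "Swedish", "Stylistic Alternates"))
print(substitute(run, "Latin", "English", "Stylistic Alternates"))
```

In this model the underlying text still contains the ordinary ampersand character; only the glyph chosen for display changes, and only when the text is tagged as Swedish.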

Please note that I am not saying that the 'och' should not be encoded, only 
that there may well be good reasons to consider this form as a glyph 
variant and existing technologies for dealing with it as such. In order to 
make a case for encoding the 'och' ampersand, I think you will need to 
demonstrate a need to distinguish it from the regular ampersand in plain 
text documents.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]


RE: Are these characters encoded?

2001-12-02 Thread Michael Everson


At 17:12 +0100 2001-12-02, Kent Karlsson wrote:

>Similarly, COMBINING OVERLINE and COMBINING LOW LINE
>should be used, together with ordinary I, V etc. (when possible)
>to get "lined" roman numerals.

What? Surely this is a font matter, and using combining characters is a 
hack here. In Quark one might just draw a line and align it with the 
font.

>  > It is certainly not a glyph variant of an ampersand. An ampersand is
>  > a ligature of e and t.
>
>True (both). ("ampersand" is somewhat of a misnomer.)

It derives from "and per se and", apparently.

>  > This is certainly an abbreviation of och. That
>  > both mean "and" is NOT a reason for unifying different signs.
>  >
>  > Having said that, it seems to me that U+00B0 would represent Stefan's
>  > character easily enough.
>
>No. It's not a degree sign.  Nor is 00BA appropriate: the underlined o is
>not superscripted/raised (much, if at all).

Sorry, I did mean U+00BA, and subscription or superscription of the 
glyph in that character is a matter of glyph choice.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

RE: Are these characters encoded?

2001-12-02 Thread Kent Karlsson



> >>  1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
> >>  with a line below. In handwritten text it is almost always used instead of
> >>  &, in machine-written text I don't think I've ever seen it.
> >
> >This might be a character in its own right, as different from the ampersand
> >as U+204A TIRONIAN SIGN ET.  Or it might be simply a glyph variant of the
> >ampersand.

No.

> >If you have never seen o-underbar in machine-written text, I
> >doubt that this will help your cause much.  You might try U+006F U+0332,

Yes. (But some write "o.", esp. in the rare event this is typed.)

Similarly, COMBINING OVERLINE and COMBINING LOW LINE
should be used, together with ordinary I, V etc. (when possible)
to get "lined" roman numerals.
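
A minimal sketch of that suggestion in Python, assuming the reader simply wants to build the code point sequence; whether the lines join up neatly across letters depends entirely on the font and rendering engine. The function `lined` and its parameters are hypothetical names for this note.

```python
import unicodedata

OVERLINE = "\u0305"  # COMBINING OVERLINE
LOW_LINE = "\u0332"  # COMBINING LOW LINE

def lined(numeral, over=True, under=True):
    """Attach combining lines to every letter of an ordinary roman numeral."""
    # The below-mark goes first so the result is already in canonical order.
    marks = (LOW_LINE if under else "") + (OVERLINE if over else "")
    return "".join(letter + marks for letter in numeral)

text = lined("XIV")
print([f"U+{ord(c):04X}" for c in text])
```

Calling `lined("V", under=False)` gives the overlined form alone.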

> >though this will probably not give you the vertical spacing you expect.
> 
> It is certainly not a glyph variant of an ampersand. An ampersand is 
> a ligature of e and t. 

True (both). ("ampersand" is somewhat of a misnomer.)

> This is certainly an abbreviation of och. That 
> both mean "and" is NOT a reason for unifying different signs.
> 
> Having said that, it seems to me that U+00B0 would represent Stefan's 
> character easily enough.

No. It's not a degree sign.  Nor is 00BA appropriate: the underlined o is
not superscripted/raised (much, if at all).

Kind regards
/kent k

Re: Are these characters encoded?

2001-12-02 Thread Michael Everson


Stefan, can you do up a web page or PDF file with samples of the 
"och" abbreviation in different manuscripts and in print? Or is it 
never found in print?
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-02 Thread Stefan Persson

- Original Message -
From: "John Hudson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: 1 December 2001 21:01
Subject: Re: Are these characters encoded?

> >1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
> >with a line below. In handwritten text it is almost always used instead of
> >&, in machine-written text I don't think I've ever seen it.
>
> This is, as your analysis suggests, a glyph variant, not a distinct
> character.

Well, this character is *only* used in Swedish, while & is used in most
(all?) languages using Roman letters, so it has a partially different usage!
Using this character in, for example, an English text would be *wrong*! Or
is "α" a glyph variant of "a" and "あ?" Or even better, what about "A" and
"Α?"

Stefan


Re: Are these characters encoded?

2001-12-01 Thread David Starner


On Sat, Dec 01, 2001 at 10:14:06PM +, Michael Everson wrote:
> At 16:02 -0500 2001-12-01, [EMAIL PROTECTED] wrote:
> >
> >> 1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
> >> with a line below. In handwritten text it is almost always used instead of
> >> &, in machine-written text I don't think I've ever seen it.
> >
> >This might be a character in its own right, as different from the ampersand
> >as U+204A TIRONIAN SIGN ET.  Or it might be simply a glyph variant of  the
> >ampersand.  If you have never seen o-underbar in machine-written text, I
> >doubt that this will help your cause much.  You might try U+006F U+0332,
> >though this will probably not give you the vertical spacing you expect.
> 
> It is certainly not a glyph variant of an ampersand. An ampersand is 
> a ligature of e and t. This is certainly an abbreviation of och. That 
> both mean "and" is NOT a reason for unifying different signs.

But the fact that they never appear in the same text in the same font,
and that one appears in handwritten text in the same places as the
ampersand appears in machine written text means that it is a glyph
variant. In any case, if it never appears in machine-written text, (if
there's no font, as you point out for proposed ConScript additions),
then there's no need to encode it.

-- 
David Starner - [EMAIL PROTECTED], ICQ #61271672
Pointless website: http://dvdeug.dhis.org
"I saw a daemon stare into my face, and an angel touch my breast; each 
one softly calls my name . . . the daemon scares me less."
- "Disciple", Stuart Davis

Re: Are these characters encoded?

2001-12-01 Thread G. Adam Stanislav


At 16:02 2001-12-01 EST, [EMAIL PROTECTED] wrote:
>(As a side note, this "o-underbar" form reminds me of the "c-underbar" which 
>is sometimes used in handwritten English to mean "with."  Does anyone know 
>the origin of this symbol?  Is it possibly derived from the Latin word cum, 
>meaning "with"?  Does it have any claim to being a character in its own 
>right?)

I don't know about c-underbar, but in medical documents (at least in the US)
the c with a bar above does indeed mean "with" and is derived from the Latin
word cum.

Adam
--- 
http://EasyDomain.com/
Domains for less

Re: Are these characters encoded?

2001-12-01 Thread Michael Everson


At 16:02 -0500 2001-12-01, [EMAIL PROTECTED] wrote:
>
>>  1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
>>  with a line below. In handwritten text it is almost always used instead of
>>  &, in machine-written text I don't think I've ever seen it.
>
>This might be a character in its own right, as different from the ampersand
>as U+204A TIRONIAN SIGN ET.  Or it might be simply a glyph variant of  the
>ampersand.  If you have never seen o-underbar in machine-written text, I
>doubt that this will help your cause much.  You might try U+006F U+0332,
>though this will probably not give you the vertical spacing you expect.

It is certainly not a glyph variant of an ampersand. An ampersand is 
a ligature of e and t. This is certainly an abbreviation of och. That 
both mean "and" is NOT a reason for unifying different signs.

Having said that, it seems to me that U+00B0 would represent Stefan's 
character easily enough.

>(As a side note, this "o-underbar" form reminds me of the "c-underbar" which
>is sometimes used in handwritten English to mean "with."  Does anyone know
>the origin of this symbol?  Is it possibly derived from the Latin word cum,
>meaning "with"?  Does it have any claim to being a character in its own
>right?)

I've never seen this in handwritten English. Cappelli's Dizionario di 
Abbreviature latine ed italiane shows several abbreviations for cum, 
none of which are a c with underbar.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Are these characters encoded?

2001-12-01 Thread DougEwell2

At 2001-12-01 11:24:04 Pacific Standard Time, 
[EMAIL PROTECTED] (Stefan Persson) wrote:

> I was thinking if this was encoded:
>
> 1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
> with a line below. In handwritten text it is almost always used instead of
> &, in machine-written text I don't think I've ever seen it.

This might be a character in its own right, as different from the ampersand 
as U+204A TIRONIAN SIGN ET.  Or it might be simply a glyph variant of  the 
ampersand.  If you have never seen o-underbar in machine-written text, I 
doubt that this will help your cause much.  You might try U+006F U+0332, 
though this will probably not give you the vertical spacing you expect.

(As a side note, this "o-underbar" form reminds me of the "c-underbar" which 
is sometimes used in handwritten English to mean "with."  Does anyone know 
the origin of this symbol?  Is it possibly derived from the Latin word cum, 
meaning "with"?  Does it have any claim to being a character in its own 
right?)

> 2.) Fractions with any number, see "bråk.bmp."

U+2044 FRACTION SLASH is exactly what you are looking for.  Whether your 
browser or other rendering engine will display it the way you want is another 
matter.

On page 154 of TUS 3.0, there is a two-paragraph description of the use of 
U+2044.  Note particularly the sentence:

"The standard form of a fraction built using the fraction slash is defined as 
follows: Any sequence of one or more decimal digits, followed by the fraction 
slash, followed by any sequence of one or more decimal digits."

This would give you the results you expect for "123/456" but not for "x/y" or 
even "14658.48/13789".  However, it is not clear to me that this "standard 
form" is normative, and it is conceivable that a fraction-slash-aware 
renderer could generalize this to "one or more non-space characters, fraction 
slash, one or more non-space characters."
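
The quoted "standard form" can be checked mechanically. The following is a minimal sketch, assuming Python's `re` module; the function name `is_standard_fraction` is hypothetical and only illustrates the rule as quoted (digits, U+2044, digits), not how any particular renderer behaves.

```python
import re

# U+2044 FRACTION SLASH, as opposed to U+002F SOLIDUS ("/").
FRACTION_SLASH = "\u2044"

# In a Python str pattern, \d matches any Unicode decimal digit (category Nd),
# which is taken here as the meaning of "decimal digits" in the quoted text.
STANDARD_FRACTION = re.compile(r"\d+" + FRACTION_SLASH + r"\d+")


def is_standard_fraction(text: str) -> bool:
    """Return True if text is digits, FRACTION SLASH, digits and nothing else."""
    return STANDARD_FRACTION.fullmatch(text) is not None
```

Under this reading, "123" + U+2044 + "456" qualifies, while "x" + U+2044 + "y" and "14658.48" + U+2044 + "13789" do not, which matches the cases mentioned above.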

> 3.) Roman numerals. I know I-XII are encoded, but what if you want to use
> higher numbers? Typing "XX," you might suggest.

The set of Roman numerals, at least through 4999, can be completely specified 
with the characters U+2160 "I", U+2164 "V", U+2169 "X", U+216C "L", U+216D 
"C", U+216E "D", and U+216F "M" (or, of course, with the equivalent Latin 
letters).  According to TUS 3.0, page 299, "Upper- and lowercase variants of 
the Roman numerals through 12, plus L, C, D, and M, have been encoded for 
compatibility with East Asian standards."  Requests for additional 
precomposed Roman numerals will almost certainly be denied.
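
As an illustration of building larger numerals from those seven characters, here is a small sketch in Python. The function name `to_roman` and the `dedicated` flag are hypothetical; the subtractive forms (IV, IX, XL and so on) and the use of MMMM for 4000 are assumptions about the notation, not something the standard prescribes.

```python
# The seven dedicated Roman numeral characters listed above.
NUMERAL = {
    "I": "\u2160", "V": "\u2164", "X": "\u2169", "L": "\u216C",
    "C": "\u216D", "D": "\u216E", "M": "\u216F",
}

# Value table using the conventional subtractive pairs (assumed notation).
VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int, dedicated: bool = True) -> str:
    """Spell n (1..4999) as a Roman numeral.

    With dedicated=True the result uses U+2160..U+216F characters;
    otherwise it uses the plain Latin letters.
    """
    if not 1 <= n <= 4999:
        raise ValueError("only 1 through 4999 are supported")
    parts = []
    for value, letters in VALUES:
        count, n = divmod(n, value)
        parts.append(letters * count)
    latin = "".join(parts)
    if not dedicated:
        return latin
    return "".join(NUMERAL[ch] for ch in latin)
```

For example, `to_roman(20)` yields two U+2169 characters, the sequence suggested for "XX".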

> This is not always
> sufficient; in Sweden we often put a line under and one above the numbers,
> see "Roma.bmp."

Sounds like a glyph-variant issue.  Font designers might want to ensure that 
the glyphs for the Roman numeral forms do have the over- and underlines.  
Then, if a user doesn't want them, she can always use the plain Latin letters 
instead.

> And what about ten thousands? Neither "X̅" nor "X̄" are
> displayed properly!

They should be; that's what the combining characters are there for.  (Hint: 
you want U+0305 COMBINING OVERLINE, not U+0304 COMBINING MACRON.)

To be fair to Stefan, most rendering engines have a long way to go to catch 
up with the Unicode ideal of being able to attach arbitrary combining marks 
(like U+0305) to arbitrary base characters (like U+2169).  Many renderers 
simply replace the sequence with a precomposed glyph.  This approach looks 
really sharp IF such a glyph is available, but breaks down otherwise.
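
The combining sequences mentioned in this message can be inspected with a short Python sketch using the standard `unicodedata` module. The helper names `describe` and `has_precomposed_form` are hypothetical, and the NFC check is only a rough stand-in for "a precomposed character exists"; it says nothing about what glyphs a given font or renderer actually provides.

```python
import unicodedata

# Sequences discussed in the thread: a base character plus one combining mark.
O_UNDERBAR = "\u006F\u0332"  # o + COMBINING LOW LINE
X_OVERLINE = "\u2169\u0305"  # ROMAN NUMERAL TEN + COMBINING OVERLINE
X_MACRON = "\u2169\u0304"    # ROMAN NUMERAL TEN + COMBINING MACRON


def describe(text: str) -> list[str]:
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def has_precomposed_form(sequence: str) -> bool:
    """True if NFC normalization collapses the sequence to one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1
```

Calling `describe(X_OVERLINE)` makes the overline/macron distinction visible by name, and `has_precomposed_form` returns False for all three sequences above, so a renderer has to position the mark itself rather than swap in a single precomposed character.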

-Doug Ewell
 Fullerton, California

Re: Are these characters encoded?

2001-12-01 Thread John Hudson

At 05:52 12/1/2001, Stefan Persson wrote:

>1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
>with a line below. In handwritten text it is almost always used instead of
>&, in machine-written text I don't think I've ever seen it.

This is, as your analysis suggests, a glyph variant, not a distinct 
character. If the same text would have this Swedish form in manuscript, but 
the regular ampersand form in print, this is something that needs to be 
handled, if at all, at the font level. The logical implementation would be 
to substitute the Swedish manuscript ampersand form in Swedish text set in 
handwriting and calligraphic fonts.

>2.) Fractions with any number, see "bråk.bmp."

This is a layout issue, not an encoding issue. Arbitrary fraction forming 
can be handled in selected runs with contextual lookups (I devised the 
system now used in most OpenType fonts and can send you a more detailed 
explanation if you would like).

>3.) Roman numerals. I know Ⅰ-Ⅻ are encoded, but what if you want to use
>higher numbers? Typing "XX," you might suggest. This is not always
>sufficient; in Sweden we often put a line under and one above the numbers,
>see "Roma.bmp." And what about ten thousands? Neither "X̅" nor "X̄" are
>displayed properly!

The lines above and below are stylistic variant roman numerals and, like 
the Swedish ampersand, they can be handled at the font level.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit,
das mit jeder Gegenwart zu verschwinden droht, die sich
nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the
present as one of its own concerns threatens to disappear
irretrievably.
   Walter Benjamin

Are these characters encoded?

2001-12-01 Thread Stefan Persson


Hi!

I was wondering whether the following are encoded:

1.) Swedish ampersand (see "&.bmp"). It's an "o" (for "och", i.e. "and")
with a line below. In handwritten text it is almost always used instead of
&, in machine-written text I don't think I've ever seen it.

2.) Fractions with any number, see "bråk.bmp."

3.) Roman numerals. I know Ⅰ-Ⅻ are encoded, but what if you want to use
higher numbers? Typing "XX," you might suggest. This is not always
sufficient; in Sweden we often put a line under and one above the numbers,
see "Roma.bmp." And what about ten thousands? Neither "X̅" nor "X̄" are
displayed properly!

Is anything of what I mentioned here encoded? I don't think so; if not, I
suggest that it should be added.

Stefan



Roma.bmp
Description: Windows bitmap


&.bmp
Description: Windows bitmap


bråk.bmp
Description: Windows bitmap
