Re: Small Latin Letter m with Macron

2003-01-16 Thread John Hudson
At 12:29 PM 1/16/2003, Timothy Partridge wrote:


Charles Trice Martin wrote "The Record Interpreter" which lists words in
record type and their expansion. The 2nd Edition (1910) has been reprinted
many times. The 1999 reprint is a facsimile of the 1910 edition, rather than
being re-typeset.


The other standard text, which has the added benefit of being more
international than _The Record Interpreter_, is Cappelli's _Lexicon
abbreviaturarum_. Other useful titles are Miller's _Abbreviations in
Latin_ and Pelzer's supplement to Cappelli, _Abréviations latines médiévales_.

John Hudson

Tiro Typeworks		www.tiro.com
Vancouver, BC		[EMAIL PROTECTED]

A book is a visitor whose visits may be rare,
or frequent, or so continual that it haunts you
like your shadow and becomes a part of you.
   - al-Jahiz, The Book of Animals




Re: Small Latin Letter m with Macron

2003-01-16 Thread John Hudson
At 01:59 AM 1/16/2003, Otto Stolz wrote:


Kenneth Whistler wrote:

Handwritten forms and arbitrary manuscript abbreviations
should not be encoded as characters. The text should just
be represented as "m" + "m". Then, if you wish to *render*
such text in a font which mimics this style of handwriting
and uses such abbreviations, then you would need the font
to ligate "mm" sequences into a *glyph* showing an "m" with
an overbar.


This will not work, as not all 'mm' occurrences are written as
m-overbar. E. g., G. Keller's "Die drei gerechten Kammacher"

could not be written with m-overbar, as the two "m" characters
belong to different syllables; in modern orthography, you would
write "Kammmacher", or -- if you wish so -- Kamacher.


Ken's suggestion works fine, but only on discretely selected runs of text.
In other words, it would be up to the user *not* to apply the glyph 
substitution layout feature in the circumstances Otto describes. I drafted 
an OpenType Layout feature description last year for a Scribal Contractions 
feature to do exactly this sort of thing, but I recommended to MS and Adobe 
that it not be included in version 1.4 of the OT spec because I think the 
issues need to be better understood before publishing a general solution. 
Obviously this is not a plain text solution: markup is required.

John Hudson

Tiro Typeworks		www.tiro.com
Vancouver, BC		[EMAIL PROTECTED]





[elfling] John Cowan is all right

2003-01-16 Thread John Cowan
Apologies for the cross-post.

I and my family are all fine and safe at home, about 3 km from ground zero.
There is no problem here except a touch of air pollution.


-- 
John Cowan   http://www.ccil.org/~cowan  [EMAIL PROTECTED]
Please leave your values|   Check your assumptions.  In fact,
   at the front desk.   |  check your assumptions at the door.
 --sign in Paris hotel  |--Miles Vorkosigan







Re: Small Latin Letter m with Macron

2003-01-16 Thread John H. Jenkins

On Thursday, January 16, 2003, at 01:29 PM, Timothy Partridge wrote:


Yes, especially early printing of Latin documents. See for example
Gutenberg's bibles.



Well, for that matter, even current editions of Spenser's _Faerie 
Queene_ will use the occasional "õ" for "on," and so on.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.tejat.net/




Re: Small Latin Letter m with Macron

2003-01-16 Thread Timothy Partridge
Christoph Päper recently said:

> Kenneth Whistler:
> > Christoph Päper asked:
> >
> >> writing "mm"  as only one "m" with a macron above.
> >
> > Handwritten forms and arbitrary manuscript abbreviations
> > should not be encoded as characters.
>
> Although I've got no proof for it, I was told that it has also been used in
> print.

Yes, especially early printing of Latin documents. See for example
Gutenberg's bibles.

In the nineteenth century, in England, many old handwritten records were
printed in record type. This is like ordinary type but contains extra
characters for the abbreviation marks. (It is in a typical serif font, not a
handwriting style font.) I think the reason for reproducing the condensed
form rather than expanding the abbreviations was that some abbreviations
have more than one interpretation. For legal records an incorrect expansion
can have a significant effect. The literal transcription reduces this risk.
(It still requires someone to read the old handwriting correctly.)

Charles Trice Martin wrote "The Record Interpreter" which lists words in
record type and their expansion. The 2nd Edition (1910) has been reprinted
many times. The 1999 reprint is a facsimile of the 1910 edition, rather than
being re-typeset.

 Tim

-- 
Tim Partridge. Any opinions expressed are mine only and not those of my employer





Re: newbie 18030 font question

2003-01-16 Thread Stefan Persson
John H. Jenkins wrote:


Well, not from Apple's, anyway.  Several GB18030 fonts come with Mac 
OS X 10.2, but we don't have a license to make them freely downloadable.

I see.  BTW, what exactly does the law require?  I have understood that 
software has to support displaying and inputting characters but that the 
characters don't have to be readable.  What about inputting, is support 
for ALT+number sufficient, or are keyboard drivers and/or IMEs required?

Stefan




Re: newbie 18030 font question

2003-01-16 Thread John H. Jenkins

On Thursday, January 16, 2003, at 12:25 PM, Stefan Persson wrote:


I assume that you mean GB18030, right?  Due to a change in Chinese 
laws, Apple and Microsoft had to make fonts supporting all those 
characters available.  You may download those fonts from the 
companies' respective home pages.


Well, not from Apple's, anyway.  Several GB18030 fonts come with Mac OS 
X 10.2, but we don't have a license to make them freely downloadable.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.tejat.net/




Re: newbie 18030 font question

2003-01-16 Thread Stefan Persson
[EMAIL PROTECTED] wrote:


Hello, all.

I'm new to 18030 and was hoping that someone could verify this.
We're implementing a browser-delivered database application and would
like to support 18030.

One fairly straightforward way of implementing this
seems to be to accept 18030 at the browser
and then transcode to Unicode when the
data first reaches the server.
When sending data back to the browser,
we'd transcode back to 18030.

OK so far, right?

Unicode fonts don't support all characters in 18030, correct?
Let's assume our client makes use of 18030 characters not in Unicode fonts.

What font could we use for a 3rd party reporting tool
that reads data straight from the Unicode db, bypassing our transcoding layer?

Thank you for your time; I've learned a lot reading through
the archives of this mailing list.

--Erik Ostermueller
[EMAIL PROTECTED]

I assume that you mean GB18030, right?  Due to a change in Chinese laws, 
Apple and Microsoft had to make fonts supporting all those characters 
available.  You may download those fonts from the companies' respective 
home pages.

Stefan




Re: newbie 18030 font question

2003-01-16 Thread Markus Scherer
GB 18030 is defined with a 1:1 mapping table to Unicode. It has large code spaces for user-defined 
characters, but the standard repertoire is the same as Unicode's.

In practice, all modern browsers work internally with Unicode no matter what page charset is 
received. They all convert from the page charset to Unicode, and for form submissions convert from 
Unicode back to the appropriate charset.

GB 18030 is supported by some newer browsers, but Unicode will be more reliable. (Usually UTF-8 for 
HTML, but UTF-16 is also possible.)

In other words, your browser will display Unicode text even if you send GB 18030. Unicode fonts are 
all you need. You might have to look for fonts with large repertoires though. Others on this list 
will be able to point you to such fonts.

For GB 18030 support it is sufficient to support the repertoire (Unicode allows this) and to be able 
to input and output GB 18030 via conversion - most vendors do it this way.
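
As a rough illustration of the "convert at the boundary" approach described above, here is a minimal Python sketch. It assumes Python's built-in "gb18030" codec; the function names are hypothetical and only stand in for whatever the server's transcoding layer would be called.

```python
# Sketch only: from_browser/to_browser are hypothetical names;
# "gb18030" is the codec name in Python's standard library.

def from_browser(raw: bytes) -> str:
    """Decode GB 18030 bytes received from the browser into a Unicode string."""
    return raw.decode("gb18030")


def to_browser(text: str) -> bytes:
    """Encode a Unicode string as GB 18030 bytes for the response."""
    return text.encode("gb18030")


# Round trip: a BMP ideograph, a combining sequence,
# and a supplementary-plane ideograph.
sample = "\u4e2d m\u0304 \U00020000"
wire = to_browser(sample)
round_tripped = from_browser(wire)
print(round_tripped == sample)
```

Everything between the two calls (database, reporting tool) would then see only Unicode, so the font question reduces to finding Unicode fonts with a large enough repertoire.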

Best regards,
markus

http://oss.software.ibm.com/icu/docs/papers/gb18030.html

--
Opinions expressed here may not reflect my company's positions unless otherwise noted.




newbie 18030 font question

2003-01-16 Thread Erik.Ostermueller
Hello, all.

I'm new to 18030 and was hoping that someone could verify this.
We're implementing a browser-delivered database application and would
like to support 18030.

One fairly straightforward way of implementing this
seems to be to accept 18030 at the browser
and then transcode to Unicode when the
data first reaches the server.
When sending data back to the browser,
we'd transcode back to 18030.

OK so far, right?

Unicode fonts don't support all characters in 18030, correct?
Let's assume our client makes use of 18030 characters not in Unicode fonts.

What font could we use for a 3rd party reporting tool
that reads data straight from the Unicode db, bypassing our transcoding layer?

Thank you for your time; I've learned a lot reading through
the archives of this mailing list.

--Erik Ostermueller
[EMAIL PROTECTED]




Web Form: Old Russian characters

2003-01-16 Thread Magda Danish (Unicode)
Can anyone on the Unicode list help?
Thanks,
Magda


> -----Original Message-----
> 
> Date/Time:Thu Jan 16 13:11:06 EST 2003
> Contact:  [EMAIL PROTECTED]
> Report Type:  Other Question, Problem, or Feedback
> 
> I am looking for a way to use old Russian characters that are
> no longer used in the modern Russian language.
> 
> Can you help or provide a direction in which to look?
> 
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> (End of Report)
 




Re: Small Latin Letter m with Macron

2003-01-16 Thread Otto Stolz
Dominikus Scherkl wrote:


"i. e." is a Latin abbreviation for "in exemplum" meaning "for example"
not "that is".



"i. e." = "id est" = "that is"
"e. g." = "exempli gratia" = "for example"

Cassell's English-German Dictionary, ISBN 0-02-522920-6, also says so.

Best wishes,
  Otto Stolz








Re: Small Latin Letter m with Macron

2003-01-16 Thread Doug Ewell
I've got a lot less to write since everybody else got there first.

Christoph Päper wrote:

> I recently learned in  that there has
> been a tradition (in handwritten text more than in print) of writing
> "mm"  as only one "m" with a macron above. I can't find any such
> character in Unicode, just  U+1E3F and U+1E41.

Assuming that you want to encode the m-macron directly--rather than
encoding "mm" and letting a German-handwriting-specific rendering system
convert this to m-macron, as Ken suggested--the correct solution would
be to use a combining sequence, "m" followed by U+0304 COMBINING MACRON.

I suppose you could use U+0305 COMBINING OVERLINE instead, but the
decision of which mark to use should be based on whether the mark really
is a macron or an overline, not on the width of the glyph.  U+0304
already has to adjust its width depending on whether it appears over an
"i" or an "a".

> You could of course build something similar with "m"+U+0305 to
> resemble the look, but that won't become "mm" (just "m" or "m¯") after
> a conversion to e.g. ISO-8859-1.

Two important points here.  First, a combining sequence doesn't simply
"resemble the look" of a precomposed character; it is *completely
equivalent* to the precomposed character.  If you wanted to represent an
"a" with macron, which does exist in a precomposed form, you would be
just as correct using either U+0101 or a combination of U+0061 and
U+0304 (though normalization might require you to choose one or the
other; see Unicode Standard Annex #15).
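
The equivalence can be checked with any normalization library. The following is a small sketch using Python's standard `unicodedata` module; the variable names are only for illustration.

```python
import unicodedata

precomposed = "\u0101"   # LATIN SMALL LETTER A WITH MACRON
sequence = "a\u0304"     # U+0061 followed by U+0304 COMBINING MACRON

# The two spellings are canonically equivalent: normalizing either
# form yields the other.
nfc_matches = unicodedata.normalize("NFC", sequence) == precomposed
nfd_matches = unicodedata.normalize("NFD", precomposed) == sequence

# "m" + U+0304 has no precomposed counterpart, so NFC leaves it
# as two code points.
m_macron = "m\u0304"
m_nfc = unicodedata.normalize("NFC", m_macron)

print(nfc_matches, nfd_matches, len(m_nfc))
```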

Second, no Unicode character that is not already in (e.g.) ISO 8859-1 is
ever "automatically" converted to an 8859-1 character.  You will always
have to have some explicit mapping table or logic to perform such a
conversion.  This is just as true for a precomposed character as it is
for a combining sequence.  If you wanted to build a conversion layer to
convert between "m̄" and "mm" you could certainly do so.
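
Such a conversion layer could be quite small. Here is a minimal Python sketch of the expanding direction only, assuming the abbreviation is stored as "m" + U+0304; the function names are hypothetical. The reverse direction ("mm" to m-macron) is deliberately left out, since it cannot be done mechanically for every word.

```python
M_MACRON = "m\u0304"  # "m" followed by U+0304 COMBINING MACRON


def expand_m_macron(text: str) -> str:
    """Expand each m-macron sequence to a plain "mm"."""
    return text.replace(M_MACRON, "mm")


def to_latin1(text: str) -> bytes:
    """Expand the abbreviation, then encode as ISO 8859-1.

    Raises UnicodeEncodeError if other characters outside
    ISO 8859-1 remain after the expansion.
    """
    return expand_m_macron(text).encode("iso-8859-1")


abbreviated = "im\u0304er"
print(to_latin1(abbreviated))
```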

-Doug Ewell
 Fullerton, California





Re: Small Latin Letter m with Macron

2003-01-16 Thread John H. Jenkins

On Wednesday, January 15, 2003, at 01:35 PM, Kenneth Whistler wrote:


Handwritten forms and arbitrary manuscript abbreviations
should not be encoded as characters. The text should just
be represented as "m" + "m". Then, if you wish to *render*
such text in a font which mimics this style of handwriting
and uses such abbreviations, then you would need the font
to ligate "mm" sequences into a *glyph* showing an "m" with
an overbar.



Remembering, of course, to use ZWNJ to mark places where this ligature 
may not be used.
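
To make the idea concrete, here is a toy Python simulation. It is only a sketch: real ligation happens on glyphs inside the font and layout engine, not on strings, and the function name is hypothetical. The output uses "m" + U+0304 merely to stand in for the ligature glyph.

```python
ZWNJ = "\u200c"           # ZERO WIDTH NON-JOINER
LIGATURE = "m\u0304"      # stand-in for the m-with-overbar glyph


def simulate_mm_ligature(text: str) -> str:
    """Replace each "mm" pair with the stand-in ligature.

    A ZWNJ between the two letters breaks the pair, so it is not
    ligated; the invisible ZWNJ is then dropped from the display form.
    """
    ligated = text.replace("mm", LIGATURE)
    return ligated.replace(ZWNJ, "")


print(simulate_mm_ligature("immer"))
print(simulate_mm_ligature("Kam" + ZWNJ + "macher"))
```

The stored text stays "m" + "m" in both cases; only the display form differs.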

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.tejat.net/




RE: Small Latin Letter m with Macron

2003-01-16 Thread Frank da Cruz
> The convention of using a horizontal line to mark an abbreviation, often
> the omission of m or n, goes back to the Middle Ages (if not earlier)
> and was often used in early printed books; apparently it has lived on in
> some handwriting, to judge from your post.
>
It was used in English too, see:

  http://www.columbia.edu/kermit/st-erkenwald.html

> I think that U+0305, the combining overscore, is the right thing to use
> for marking such abbreviations.  I would like to get confirmation of
> this from others on the list just to be sure.  The only alternative
> would be the combining macron, U+0304, which in many fonts would look
> too short.
>
See the above-referenced page.  For putting a line over a single letter,
the macron looks better (the overline is too wide), but you need the
overline for making a line over a series of letters because the macron
is not guaranteed to join.
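
In code terms, the two cases might look like the following Python sketch. The function names are hypothetical, and the sketch assumes the input consists of plain base letters with no combining marks of their own.

```python
COMBINING_MACRON = "\u0304"
COMBINING_OVERLINE = "\u0305"


def with_macron(letter: str) -> str:
    """Put a macron over a single letter."""
    return letter + COMBINING_MACRON


def with_overline(run: str) -> str:
    """Put a line over a series of letters.

    Each letter gets its own U+0305; the overlines of adjacent
    letters are meant to connect into one continuous line.
    """
    return "".join(ch + COMBINING_OVERLINE for ch in run)


single = with_macron("m")
series = with_overline("dns")
print(single, series)
```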

- Frank




Re: Small Latin Letter m with Macron

2003-01-16 Thread John Cowan
Dominikus Scherkl scripsit:

> "i. e." is a Latin abbreviation for "in exemplum" meaning "for example"
> not "that is". (or am I not even average at English?!?)

It is a Latin abbreviation, but it stands for "id est", and therefore
corresponds to German "d. h."  The abbreviation for "for example" 
(German "z. B.") is "e. g." for "exempli gratia".

-- 
John Cowan   http://www.ccil.org/~cowan  [EMAIL PROTECTED]
Please leave your values|   Check your assumptions.  In fact,
   at the front desk.   |  check your assumptions at the door.
 --sign in Paris hotel  |--Cordelia Vorkosigan




RE: Small Latin Letter m with Macron

2003-01-16 Thread Dominikus Scherkl
> the spelling "i. e." would [not] distort the content of "that is"
?
"i. e." is a Latin abbreviation for "in exemplum" meaning "for example"
not "that is". (or am I not even average at English?!?)
-- 
Dominikus Scherkl
[EMAIL PROTECTED]




Re: Small Latin Letter m with Macron

2003-01-16 Thread Otto Stolz
Christoph Päper had asked:

there has been a
tradition (in handwritten text more than in print) of writing "mm"  as only
one "m" with a macron above. I can't find any such character in Unicode,



You could of course build something similar with "m"+U+0305 to resemble the
look, but that won't become "mm" (just "m" or "m¯") after a conversion to
e.g. ISO-8859-1.



This depends on the program used to do the conversion. When you want to
properly handle a particular writing tradition, you cannot rely on off-
the-shelf tools, unaware of the particular requirements.


Kenneth Whistler wrote:

Handwritten forms and arbitrary manuscript abbreviations
should not be encoded as characters. The text should just
be represented as "m" + "m". Then, if you wish to *render*
such text in a font which mimics this style of handwriting
and uses such abbreviations, then you would need the font
to ligate "mm" sequences into a *glyph* showing an "m" with
an overbar.



This will not work, as not all 'mm' occurrences are written as
m-overbar. E. g., G. Keller's "Die drei gerechten Kammacher"

could not be written with m-overbar, as the two "m" characters
belong to different syllables; in modern orthography, you would
write "Kammmacher", or -- if you wish so -- Kamacher.

So, if you want to render m-overbar, you would have to mark it
in the text, and the only way Unicode has to offer is U+006D U+0304.
(I would not use U+0305, as this is too high and too wide.
I reckon a good rendering engine should adapt U+0304's width to
the pertinent base character's width.)


To do otherwise, either representing the plain text content
as  or with a newly encoded m-macron
character, would just distort the *content* of the text,
which is what the character encoding should be about.



It would not distort the content of the text for readers that
are accustomed to this sort of abbreviation -- no more than
the spelling "i. e." would distort the content of "that is"
for an average English reader.


If and only if an m-macron became a part of the accepted,
general orthography of German



It used to be.

Markus Scherer wrote:


I can confirm the use of m+overline from my family, [...]
I always considered those personal variations, "font styles" if you wish.



Now I know that the m+overline was used elsewhere,


In German handwriting (Kurrent), the sequences of the letters "m",
"n", "u", and "i" look very confusing: an "ü" is written exactly
as "ii" would be (if it ever were) written; the "u" needs a hook
above (akin to U+0306, but it is an intrinsic part of the Kurrent
"u" glyph) to distinguish it from the "n", and "mm" cannot be
distinguished from "nnn" (which came into German orthography only
in 1996, so there was no ambiguity when Kurrent was widely used).
Try to read, e. g., the penultimate word of the first line of
H. Carossa's poem in :
it's "immer" (and the penultimate word of the 2nd line reads
"Brunnens").

This problem was even worse with the medieval Textura font.

Hence, medieval scribes developed a rich set of abbreviations,
including the overbar for an omitted "m" or "n". The latter has
survived into German handwriting, at least until the 1st half of
the 20th century.

Best wishes,
  Otto Stolz

PS. Never write "Hawaii" in German Kurrent ;-)





Re: Small Latin Letter m with Macron

2003-01-16 Thread Kenneth Whistler
Christoph Päper asked:

> I recently learned in  that there has been a
> tradition (in handwritten text more than in print) of writing "mm"  as only
> one "m" with a macron above. I can't find any such character in Unicode,
> just  U+1E3F and U+1E41.
> You could of course build something similar with "m"+U+0305 to resemble the
> look, but that won't become "mm" (just "m" or "m¯") after a conversion to
> e.g. ISO-8859-1.
> 
> Should such a character be added to Unicode (or did I miss it)?

Neither.

Handwritten forms and arbitrary manuscript abbreviations
should not be encoded as characters. The text should just
be represented as "m" + "m". Then, if you wish to *render*
such text in a font which mimics this style of handwriting
and uses such abbreviations, then you would need the font
to ligate "mm" sequences into a *glyph* showing an "m" with
an overbar.

To do otherwise, either representing the plain text content
as  or with a newly encoded m-macron
character, would just distort the *content* of the text,
which is what the character encoding should be about.

If and only if an m-macron became a part of the accepted,
general orthography of German would it make sense to start
representing textual content in terms of such a character.
And in such a hypothetical future, you would use
, because it already exists in
Unicode, and there is no point to encoding another
canonically equivalent precomposed character for that
sequence.

--Ken