Allen,
Thanks for this. Good points. The ideographic description approach is also a
good idea.
The table of Japanese counting characters was a bit off-topic for the
multicurrency page, and the footnotes that are there now for the Jo/Shi
character are a further distraction, but seem adequate enoug
Hi,
Any similarity between U+5b50 子 at 3 strokes, and U+4e88 予 at 4, is
superficial. For example, U+8FFA 迺 is a common variant for 4e43 乃 ("is,"
"then") but can't be confused with 逎 900e ("alcoholic beverage"). (Context
usually makes that clear pretty quickly!) Since 79ed and 25771 seem to be
both inter
>
>I do the same thing: I compress all blanks with an invariable and
>automatic search&replace... before actually formatting
Me too. Only a small minority of the papers I edit come with two spaces
anyway, so it's less work to replace the double with single spaces. As well,
the single space looks b
From: "Philippe Verdy" <[EMAIL PROTECTED]>
Unicode already defines with character properties those punctuations
that terminate sentences. Why would you need to recognize sequences of
two spaces as meaning an end of sentence???
Ambiguity remains. My colleague David Palmer did some testing of
va
At 08:51 07/07/2003, Ted Hopp wrote:
> > ... Given the small number of attested sequences that would be
> > adversely affected by normalisation re-ordering, I'm beginning to
> > favour the idea of encoding these sequences as individual characters.
> > We'd probably only need three or four, plus a
On Mon, 7 Jul 2003, Philippe Verdy wrote:
> On Monday, July 07, 2003 9:41 PM, Michael Everson <[EMAIL PROTECTED]> wrote:
> > At 15:03 -0400 2003-07-07, Tex Texin wrote:
> > > When is a character properly called a currency sign?
> >
> > Hunh? When you use it to represent currency. DM was two chara
On Tuesday, July 08, 2003 12:57 AM, Stefan Persson <[EMAIL PROTECTED]> wrote:
> Philippe Verdy wrote:
>
> > "XEU" (the past European Currency Unit replaced by the Euro in a
> > different area of countries excluding GB and DK,
>
> Also excluding SE.
Sorry, I should have named it. But has ever Sw
There are lots of ways to indicate a currency, but I wouldn't think of EUR or
the other three character codes listed in this note as signs. (Although the
ISO 4217 3-letter codes replace where signs were previously used, in most
cases.)
tex
Philippe Verdy wrote:
>
> On Monday, July 07, 2003 9:4
Right. I was only thinking that if U+202F wasn't available it might be a
better choice than NBSP.
tex
Jim Allan wrote:
>
> Tex Texin posted on my indication that only U+00A0 NO-BREAK SPACE and
> U+202F NARROW NO-BREAK SPACE are available in Unicode for a
> digit-grouping space in numbers:
>
> >
On 07/07/2003 2:51 PM, Peter Kirk wrote:
> ... Also a surprising number of languages have been
> written in Hebrew script at various times. ...
One doesn't have to look at exotic languages (or the Hebrew Bible) to find
strange uses of Hebrew characters. I have a modern Hebrew-English dictionary
th
At 07:41 PM 7/7/03 +0100, Michael Everson wrote:
The typing habit was designed to assist typesetters in reading the
manuscript as they were setting type. Traditionally, the typesetters never
set the extra space.
Apparently not universally true (see my other mail).
Also, when lines had to be s
John H. Jenkins wrote:
> On Monday, July 7, 2003, at 4:38 PM, Michael Everson wrote:
>
> > At 16:22 -0600 2003-07-07, John H. Jenkins wrote:
> >
> >> IIRC the English prefer to say "Mr Roberts."
> >
> > The, ahem, Irish too. ;-)
> >
>
> Well, to be frank, I'm sure that the Welsh, Scots, and Manx p
At 17:00 -0600 2003-07-07, John H. Jenkins wrote:
IIRC the English prefer to say "Mr Roberts."
The, ahem, Irish too. ;-)
Well, to be frank, I'm sure that the Welsh, Scots, and Manx probably
do, too. (Did I leave anybody out *this* time?)
The Cornish, of course. :-)
--
Michael Everson * * Everson
At 01:10 +0200 2003-07-08, Philippe Verdy wrote:
I forgot to ask something: is there a Unicode codepoint assigned to
the abbreviation dot (a narrower dot with less margins on left and
right than the standard dot), as it seems to be used in some
typeset texts to differentiate it from the punc
Philippe Verdy wrote:
"XEU" (the past European Currency Unit replaced by the Euro in a different
area of countries excluding GB and DK,
Also excluding SE.
Stefan
On Monday, July 7, 2003, at 4:38 PM, Michael Everson wrote:
At 16:22 -0600 2003-07-07, John H. Jenkins wrote:
IIRC the English prefer to say "Mr Roberts."
The, ahem, Irish too. ;-)
Well, to be frank, I'm sure that the Welsh, Scots, and Manx probably
do, too. (Did I leave anybody out *this* tim
At 16:22 -0600 2003-07-07, John H. Jenkins wrote:
IIRC the English prefer to say "Mr Roberts."
The, ahem, Irish too. ;-)
--
Michael Everson * * Everson Typography * * http://www.evertype.com
On Tuesday, July 08, 2003 12:08 AM, Frank da Cruz <[EMAIL PROTECTED]> wrote:
> > It is worth noting that what is described here is the default
> > running mode of Emacs for the English locale. There are a lot more
> > "modes" on Emacs to handle various languages (including programming
> > language
At 18:08 -0400 2003-07-07, Frank da Cruz wrote:
> It is worth noting that what is described here is the default running mode of
> Emacs for the English locale. There are a lot more "modes" on Emacs to
> handle various languages (including programming languages).
Of course. But without two spaces y
On Monday, July 7, 2003, at 4:08 PM, Frank da Cruz wrote:
Of course. But without two spaces you have greater ambiguity, at least in
English: In "Mr. Roberts", what is the function of the period?
Don't call me Mr. Roberts is my name.
Don't call me Mr. Roberts is my name.
IIRC the English
> It is worth noting that what is described here is the default running mode of
> Emacs for the English locale. There are a lot more "modes" on Emacs to
> handle various languages (including programming languages).
>
Of course. But without two spaces you have greater ambiguity, at least in
Englis
On Monday, July 07, 2003 10:03 PM, Frank da Cruz <[EMAIL PROTECTED]> wrote:
> Here, by the way, is the formal definition of a sentence in EMACS:
> http://www.gnu.org/manual/emacs-lisp-intro/html_node/sentence-end.html
>
> A great deal of other text processing software uses similar rules.
It is
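
For readers following the thread, here is a minimal sketch of the two-space rule being discussed. It is a simplified approximation written for illustration, not the actual Emacs sentence-end definition linked above: the pattern and the function name split_sentences are assumptions. A sentence is taken to end at ".", "?" or "!", optionally followed by closing quotes or brackets, and then either two or more spaces, a line break, or the end of the text.

```python
import re

# Simplified, assumed rule (not the real Emacs regexp): terminal punctuation,
# optional closing quotes/brackets, then 2+ spaces, newline(s), or end of text.
_SENTENCE_END = re.compile(r'([.?!][\]"\')}]*)(?: {2,}|\n+|$)')


def split_sentences(text):
    """Split plain text into sentences using the two-space convention."""
    sentences = []
    start = 0
    for match in _SENTENCE_END.finditer(text):
        # Keep the punctuation with the sentence, drop the separating spaces.
        sentence = text[start:match.end(1)].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    # A single space after "Mr." is not treated as a sentence end.
    print(split_sentences("I met Mr. Roberts today.  He was well."))
```

Under this rule a period followed by a single space, as in an abbreviation, never ends a sentence, which is the disambiguation the two-space habit buys in plain text.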
Michael Everson posted:
Typists were taught to do it generally, but the origin of the
practice is to assist the typesetters.
Not so. It predates typewriters and one can see this style in the
typography in many books of the Victorian era and the early decades of
the twentieth century.
From Rober
Kuro,
I guess the fact that yen sign, won sign, etc. are sometimes used as file
separators doesn't diminish their perception as currency signs (or because
they are functions picked up after they were established as signs). ;-)
OK. It makes sense to me that a character is a currency sign if tha
On Monday, July 07, 2003 9:41 PM, Michael Everson <[EMAIL PROTECTED]> wrote:
> At 15:03 -0400 2003-07-07, Tex Texin wrote:
>
> > When is a character properly called a currency sign?
>
> Hunh? When you use it to represent currency. DM was two characters
> used as a currency sign in Germany.
As
Tex Texin posted on my indication that only U+00A0 NO-BREAK SPACE and
U+202F NARROW NO-BREAK SPACE are available in Unicode for a
digit-grouping space in numbers:
Jim, Why do you leave out U+2007 figure space?
U+2007 FIGURE SPACE is also a non-breaking space.
But Philip Verdy claimed (and I ag
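
As an aside, the candidate spaces named in this thread can be inspected with Python's standard unicodedata module. This is only a sketch: the helper names describe and group_digits are made up for illustration, and the "<noBreak>" tag shown is the decomposition tag from UnicodeData, which is a hint rather than the full line-breaking behaviour (that is defined by the line-break property, not exposed by this module).

```python
import unicodedata

# Spaces mentioned in the thread as possible digit-grouping separators.
CANDIDATES = ["\u00A0", "\u2007", "\u2009", "\u202F"]


def describe(ch):
    """Return (code point, character name, decomposition) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))


def group_digits(n, sep="\u202F"):
    """Group an integer in threes, using NARROW NO-BREAK SPACE by default."""
    return f"{n:,}".replace(",", sep)


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))
    print(group_digits(1234567))
```

In this data U+00A0, U+2007 and U+202F carry a "<noBreak>" decomposition to U+0020, while U+2009 THIN SPACE carries "<compat>", which is consistent with the remark that U+2007 is also non-breaking.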
From Robert Bringhurst's Elements of Typographic Style, pp. 28-20:
"Use a single word space between sentences. In the nineteenth
century, which was a dark and inflationary age in typography and type
design, many compositors were encouraged to stuff extra space between
sentences. Generations of
Mon, 7 Jul 2003 19:41:21 +0100 Michael Everson wrote:
> At 14:27 -0400 2003-07-07, Frank da Cruz wrote:
>
> > EMACS aside, it's still an interesting question why -- in English at
> > least -- it was customary throughout the 20th century to put two spaces
> > after a period when typing. I expect it
At 15:12 -0400 2003-07-07, John Cowan wrote:
Michael Everson scripsit:
The typing habit was designed to assist typesetters in reading the
manuscript as they were setting type.
Either this says that double-spacing after a sentence improves the readability
of monospaced documents, or I misundersta
Forgot to copy to the list...
-----Original Message-----
From: Kurosaka, Teruhiko
Sent: Monday, July 07, 2003 12:44 PM
To: 'Tex Texin'
Subject: RE: When is a character a currency sign?
Hello Tex,
> When is a character properly called a currency sign?
If a character is used EXCLUSIVELY for the
At 15:03 -0400 2003-07-07, Tex Texin wrote:
When is a character properly called a currency sign?
Hunh? When you use it to represent currency. DM was two characters
used as a currency sign in Germany.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
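
One narrow, mechanical reading of "currency sign" is the Unicode general category Sc (Symbol, currency). The sketch below, with the hypothetical helper is_currency_symbol, shows how the characters in this thread come out under that reading; it says nothing about actual usage, which is the point being argued here.

```python
import unicodedata


def is_currency_symbol(ch):
    """True if the character's Unicode general category is Sc."""
    return unicodedata.category(ch) == "Sc"


if __name__ == "__main__":
    # U+00A5 YEN SIGN, U+5186 (the ideograph used for yen), U+20A9 WON SIGN, U+20AC EURO SIGN
    for ch in ["\u00A5", "\u5186", "\u20A9", "\u20AC"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_currency_symbol(ch))
```

By this property U+00A5 is a currency symbol and U+5186 is an ordinary letter-like ideograph (category Lo), even though both are written to mean yen. Multi-character strings such as "DM" or "EUR" have no category of their own at all.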
Michael Everson scripsit:
> The typing habit was designed to assist typesetters in reading the
> manuscript as they were setting type.
Either this says that double-spacing after a sentence improves the readability
of monospaced documents, or I misunderstand you entirely. After all, typists
are
I had a couple people comment on the currency page that U+5186 is not a yen
sign but a character.
I see it used regularly as a currency symbol instead of U+00A5.
Is there a distinction between the two?
When is a character properly called a currency sign?
The page has had a number of updates ove
Jim, Why do you leave out U+2007 figure space?
Jim Allan wrote:
>
> Philippe Verdy posted:
>
> > I can't make a recommendation on which space figure to use.
> > Ideally, it must just be *less wide* than a digit and *not justified*, it must
> > be *unbreakable*. The ideal space to use depends on
On Monday, July 07, 2003 8:27 PM, Frank da Cruz <[EMAIL PROTECTED]> wrote:
> I vaguely recall seeing this same discussion play out some years ago.
> EMACS aside, it's still an interesting question why -- in English at
> least -- it was customary throughout the 20th century to put two
> spaces after
On 07/07/2003 11:01, John Cowan wrote:
Well, that's true up to a point, but only up to a point. Tomorrow someone
may conceive a need to express Tibetan using Hebrew vowel points instead of
Tibetan vowel signs, whilst keeping the Tibetan consonants, but he
should not complain if neither rendering
At 14:27 -0400 2003-07-07, Frank da Cruz wrote:
EMACS aside, it's still an interesting question why -- in English at
least -- it was customary throughout the 20th century to put two
spaces after a period when typing. I expect it must have been an
aesthetic decision. What else
could it have been
From: "Michael Everson" <[EMAIL PROTECTED]>
> At 13:29 -0400 2003-07-07, Frank da Cruz wrote:
>
> >Nobody is springing to the defense of this so I'll only say that
> >it's a time-honored practice and we shouldn't be so quick to
> >disparage it, lest we be disparaged several years hence for the
> Unicode already defines with character properties those punctuations that
> terminate sentences. Why would you need to recognize sequences of two spaces
> as meaning an end of sentence??? This would be wrong to select sentences in
> a preformatted plain-text, even in English...
>
Because it has "
From: "Frank da Cruz" <[EMAIL PROTECTED]>
> In the world of plain text, two spaces after a sentence-ending period,
> exclamation mark, question mark, or other mark is actually rather handy to
> distinguish sentence enders from the same marks used in other ways,
> esp. periods in abbreviations. Th
John Burger scripsit:
> Really? TeX seems to "stretch" this space more than ordinary
> inter-word spaces in justified text - there are even special commands
> to tell TeX when a period really is (or isn't) end-of-sentence. I had
> always assumed that this came from established type-setting pr
Michael Everson scripsit:
> >In the world of plain text, two spaces after a sentence-ending
> >period, exclamation mark, question mark, or other mark is actually
> >rather handy to distinguish sentence enders from the same marks used
> >in other ways, esp. periods in abbreviations.
>
> Fie! Fi
Peter Kirk scripsit:
> Don't tell the Georgians you said their country was in Asia, you might
> get in trouble! They certainly consider themselves Europeans, in line
> with their culture and religion. As for the geography, atlases differ.
The U.N.'s Statistics Division, which has absolutely no
Ted Hopp scripsit:
> I think we need to keep Peter Constable's point in mind that current usage
> should not define the limits of Unicode functionality. Since the principle
> is that all sequences of character codes are permitted (2.10), it seems
> wrong to supply a fix for only "the small number
At 13:29 -0400 2003-07-07, Frank da Cruz wrote:
Nobody is springing to the defense of this so I'll only say that
it's a time-honored practice and we shouldn't be so quick to
disparage it, lest we be disparaged several years hence for the
things we do :-)
It's rotten, and when I typeset books
(
From: John Cowan <[EMAIL PROTECTED]>
It's a typewriter-based convention, and is suitable for monowidth fonts
only. The space after a sentence-ending full stop in justified contexts
is no bigger than any other space, in general.
Really? TeX seems to "stretch" this space more than ordinary
inter
At Mon, 7 Jul 2003 17:12:25 +0100, Michael Everson wrote:
> At 11:49 -0400 2003-07-07, John Cowan wrote:
>
> > It's a typewriter-based convention, and is suitable for monowidth
> > fonts only.
>
> It's a beastly practice held over from the time when it was useful
> (that is, when typesetters set
From: "Stefan Persson" <[EMAIL PROTECTED]>
To: "Philippe Verdy" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, July 07, 2003 5:22 PM
Subject: Re: 24th Unicode Conference - Atlanta, GA - September 3-5, 2003 [OT]
> Philippe Verdy wrote:
>
> > I do agree: indicating "Atlanta, GA, USA" is
Sorry for the intrusion, but... If anyone knows an off-line way to get a
hold of Barry Caplan, could they please give him a call or send him a
letter, or knock on his door? He has been doing some standards archaeology,
and has sent messages to several people in the past few weeks, but his
i
At 11:49 -0400 2003-07-07, John Cowan wrote:
It's a typewriter-based convention, and is suitable for monowidth fonts only.
It's a beastly practice held over from the time when it was useful
(that is, when typesetters set the type from the typescript), and I
wouldn't use it in monowidth typesetti
On Mon, 7 Jul 2003, Steve Vernon wrote:
> Hiya!
>
> Any help would be appreciated. Not sure if I should send to a MySQL
> list, or this one (I didn't want to cross post), so sorry if not
> applicable. If this is not ok to ask in this group, can someone tell me
> please!
>
> Because from what I u
On 07/07/2003 8:52 AM, Peter Kirk wrote:
> On 06/07/2003 17:22, John Hudson wrote:
> > ... Given the small number of attested sequences that would be
> > adversely affected by normalisation re-ordering, I'm beginning to
> > favour the idea of encoding these sequences as individual characters.
> > W
On 07/07/2003 08:22, Stefan Persson wrote:
Note that the country is abbreviated GE, not GA (and that it's in Asia).
Stefan
Don't tell the Georgians you said their country was in Asia, you might
get in trouble! They certainly consider themselves Europeans, in line
with their culture and relig
Philippe Verdy scripsit:
> Another convention used in English is the double space after a
> sentence-ending dot: this convention does not exist in French, and I do
> think that it exists in English as a way to represent a large (cadratin
> minimum width) space after this dot.
It's a typewriter
Philippe Verdy wrote:
I do agree: indicating "Atlanta, GA, USA" is not a big effort.
Note that Georgia is also a Central European country, and
saying "GA" or "Georgia" alone is not good...
Note that the country is abbreviated GE, not GA (and that it's in Asia).
Stefan
Hiya!
Any help would be appreciated. Not sure if I should send to a MySQL
list, or this one (I didn't want to cross post), so sorry if not
applicable. If this is not ok to ask in this group, can someone tell me
please!
Because from what I understand, MySQL supports unicode, but various
features
Hello, everyone.
JIS X 0213:2000 has the character in question
as 1-89-39 (plane/row/cell).
According to a recent review of JIS X 0213,
this character will be mapped to U+25771.
References (sorry, in Japanese):
(1) http://www.jsa.or.jp/domestic/instac/review/0213review.htm
http://www.jsa.o
On Monday, July 07, 2003 2:04 PM, Peter Kirk <[EMAIL PROTECTED]> wrote:
> On 07/07/2003 04:15, Philippe Verdy wrote:
> > The list separator in French is preferably the semicolon, rather
> > than a comma (which must then have a space):
> > => "123;456"
> > The is here also encoded according to the
On Monday, July 07, 2003 3:36 PM, Karl Pentzlin <[EMAIL PROTECTED]> wrote:
> Am Sonntag, 6. Juli 2003 um 22:24 schrieb Tex Texin:
> > Having said that, I will pass your comment along to the appropriate
> > people and suggest they consider adding "USA" and/or spell out
> > Georgia, in the notice and
Philippe Verdy posted:
I can't make a recommendation on which space figure to use.
Ideally, it must just be *less wide* than a digit and *not justified*, it must
be *unbreakable*. The ideal space to use depends on the available fonts,
and in practice most texts are coded with NBSP (sometimes a sta
On 07/07/2003 06:36, Karl Pentzlin wrote:
Am Sonntag, 6. Juli 2003 um 22:24 schrieb Tex Texin:
TT> Having said that, I will pass your comment along to the appropriate people and
TT> suggest they consider adding "USA" and/or spell out Georgia, in the notice and
TT> on the web site. (Or did you wan
Am Sonntag, 6. Juli 2003 um 22:24 schrieb Tex Texin:
TT> Having said that, I will pass your comment along to the appropriate people and
TT> suggest they consider adding "USA" and/or spell out Georgia, in the notice and
TT> on the web site. (Or did you want a different kind of change?)
All except
On 06/07/2003 17:22, John Hudson wrote:
Thanks for the thoughtful analysis, Peter. Eli Evans and I have been
documenting all of the unique mark sequences in the Michigan-Claremont
text and WTS morphology database that are potentially incorrectly
re-ordered in Unicode normalisation (I say potent
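
To make the re-ordering concrete, here is a small sketch using Python's standard unicodedata module. The sample sequence (lamed, patah, hiriq) is chosen purely for illustration and is not taken from the Michigan-Claremont data mentioned above; the helper name combining_classes is likewise an assumption.

```python
import unicodedata


def combining_classes(text):
    """List (code point, canonical combining class) for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.combining(c)) for c in text]


# Illustrative input: LAMED, then PATAH (class 17), then HIRIQ (class 14).
original = "\u05DC\u05B7\u05B4"
normalized = unicodedata.normalize("NFC", original)

if __name__ == "__main__":
    print(combining_classes(original))
    print(combining_classes(normalized))
```

Because hiriq has a lower canonical combining class than patah, normalisation moves it in front, so the typed order of the two points is lost. Marks with equal classes, or separated by a class-0 character, keep their relative order.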
On 07/07/2003 04:15, Philippe Verdy wrote:
The list separator in French is preferably the semicolon, rather than a comma
(which must then have a space):
=> "123;456"
The is here also encoded according to the character encoding
constraints and fonts (here also less wide than a digit, unbreakable a
On Monday, July 07, 2003 8:41 AM, Tex Texin <[EMAIL PROTECTED]> wrote:
> Stefan,
> Thanks for your comments.
> Philippe,
> Thanks for your comments. I may add some of the notes to the page.
> However, I want to question your recommendation of U+2009 as I believe
> that is a breaking space. Perhaps
Stefan,
Thanks for your comments.
My sense is that number format varies somewhat depending on the application or
vertical industry, so it can be hard to say what the most popular usage is in
any regional market. I try to ignore the question of which format is right for
each market and just point o