Re: All-kana documents

2002-03-04 Thread Michael (michka) Kaplan
One has naught to do but look. :-)

cf: http://www.unicode.org/unicode/reports/tr6/

MichKa

Michael Kaplan
Trigeminal Software, Inc.  -- http://www.trigeminal.com/

- Original Message -
From: "ろ〇〇〇〇 ろ〇〇〇" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, March 04, 2002 3:47 PM
Subject: All-kana documents


> If I have some all-kana documents (like, say, if I decide to encode some
> old women's literature, not that I will, but you might), is there an
> extension of UTF-8 that will allow me to strip off the redundant "this is
> kana" byte from most of the kana? After the first few thousand kana, it
> might be like, "Yeah, we get it already! It's kana! It's KANA!! You can
> stop reminding us now!!"
>
> This goes too for Hebrew, Greek, etc.
>
> $B==0l$A$c$s!!0&2CMvGO(B
>


RE: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Kenneth Whistler

Kent Karlsson's suggestion:

> I vaguely suggested adding
> an enclosing (in some sense) invisible combining character to
> solve this: .
> No character has been designated for such use, though.  And I
> haven't made a formal proposal yet.
>

(i.e. create a generic way to make a non-enclosing combining mark
apply to a grapheme cluster, by encoding an invisible enclosing
combining mark) 

or Doug Ewell's suggestion:

> Dan is looking for COMBINING DOUBLE MACRON and COMBINING DOUBLE BREVE,
> analogous to U+0360 through U+0362, and if I were Dan I'd be looking for
> them too.

(i.e. simply encode two more explicit double combining marks for the
missing items)

are the two contenders still standing in the ring.

In any case, there is no recognized way to represent these two
accents at the moment in Unicode, and some kind of encoding action
will need to be taken by the UTC to make it possible.

--Ken
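
A note for later readers: characters matching Doug Ewell's suggestion were subsequently encoded as U+035D COMBINING DOUBLE BREVE and U+035E COMBINING DOUBLE MACRON. The following is a minimal Python sketch, assuming an interpreter whose Unicode database includes them; the variable names are illustrative only and are not taken from the thread.

```python
import unicodedata

# Double diacritics are placed after the FIRST of the two base letters.
DOUBLE_BREVE = "\u035D"
DOUBLE_MACRON = "\u035E"

oo_breve = "o" + DOUBLE_BREVE + "o"
oo_macron = "o" + DOUBLE_MACRON + "o"

# Print the official character names from the Unicode database.
for mark in (DOUBLE_BREVE, DOUBLE_MACRON):
    print(f"U+{ord(mark):04X}", unicodedata.name(mark))

# Show the sequences in escaped form so the output is terminal-safe.
print(ascii(oo_breve), ascii(oo_macron))
```

Whether the mark is actually drawn across both letters depends on the font and rendering engine, not on the encoding.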




Re: All-kana documents

2002-03-04 Thread Kenneth Whistler


> If I have some all-kana documents ..., is there an 
> extension of UTF-8 that will allow me to strip off the redundant "this is 
> kana" byte from most of the kana? 

No.

> After the first few thousand kana, it 
> might be like, "Yeah, we get it already! It's kana! It's KANA!! You can 
> stop reminding us now!!"

If I decide to emulate the Buddha and fill text files with a million
DEVANAGARI OM symbols in a row, each instance is still U+0950, whether
represented in UTF-16 or UTF-8 (or UTF-32, for that matter).

Stop thinking in terms of bytes and start thinking in terms of
characters.

For that matter, say you were reading the genetic code:
ATG, Methionine; ATG, Methionine; ATG, Methionine; ATG, Methionine;
ATG, Methionine; ATG, Methionine; ATG, Methionine; ATG, Methionine;
ATG, Methionine; ATG, Methionine; ATG, Methionine; ATG, Methionine;...

Yeah, we get it already! It's methionine! It's METHIONINE!! You can
stop reminding us now!!

A code is what it is.

> 
> This goes too for Hebrew, Greek, etc.

What you are looking for are text compression algorithms. See UTS #6,
A Standard Compression Scheme for Unicode.

--Ken
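
To make the byte counts concrete, here is a small Python sketch. It assumes a made-up sample of repeated hiragana and uses zlib as a stand-in general-purpose compressor; zlib is not SCSU (Python's standard library has no SCSU codec), so the compressed size only illustrates that the redundancy is removable at the compression layer rather than in the encoding form.

```python
import zlib

# Hypothetical sample: 1000 repetitions of the four hiragana "hi ra ga na".
kana = "\u3072\u3089\u304c\u306a" * 1000

utf8 = kana.encode("utf-8")
utf16 = kana.encode("utf-16-be")

# Each hiragana code point takes three bytes in UTF-8, and the first of
# the three is always 0xE3. That is the byte the question calls redundant.
lead_bytes = set(utf8[0::3])

print(len(kana), "characters")
print(len(utf8), "bytes in UTF-8")
print(len(utf16), "bytes in UTF-16")
print("distinct lead bytes:", [hex(b) for b in sorted(lead_bytes)])

# A compressor removes the repetition without changing any character.
compressed = zlib.compress(utf8)
print(len(compressed), "bytes after zlib")
roundtrip = zlib.decompress(compressed).decode("utf-8")
```

The characters are identical before and after; only the byte serialization changes, which is the distinction Ken draws between characters and bytes.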





Re: All-kana documents

2002-03-04 Thread Markus Scherer
You could
- use SCSU (UTR 6)
- use BOCU-1 
(http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/conversion/bocu1/bocu1.html)
- invent your own...

markus


All-kana documents

2002-03-04 Thread ろ〇〇〇〇 ろ〇〇〇
If I have some all-kana documents (like, say, if I decide to encode some 
old women's literature, not that I will, but you might), is there an 
extension of UTF-8 that will allow me to strip off the redundant "this is 
kana" byte from most of the kana? After the first few thousand kana, it 
might be like, "Yeah, we get it already! It's kana! It's KANA!! You can 
stop reminding us now!!"

This goes too for Hebrew, Greek, etc.

$B==0l$A$c$s!!0&2CMvGO(B



Re: [OT beyond any repair] House numbers

2002-03-04 Thread John Cowan

Barry Caplan wrote:

 > Doesn't every address that USPS delivers to have a unique 9 digit
 > zip code, making house numbers a legacy?

In fact no.  As a trivial counterexample, P.O. Box Numbers
become ZIP+4 codes by adding the 5-digit ZIP code to the 4 low order
digits of the box number (as in my case), but I have seen P.O. Box
addresses with more than 4 digits.

For another example, the building in New York City that I live in
contains 10 apartments, equally divided among two ZIP+4 codes.

 > From the US, couldn't I
 > get a letter to you just by putting 12017-0042 on the envelope?

It happens to work for me because the 12017 post office has
comfortably fewer than 10,000 boxes -- in fact, it has a few
hundred at most.  In general, though, USPS will not deliver
without some kind of addressee's name, so that is required
in addition to the ZIP+4.

-- 
John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_





Re: [OT beyond any repair] House numbers

2002-03-04 Thread Barry Caplan

At 01:16 PM 3/1/2002 -0500, John Cowan wrote:
What about the "100 house numbers per block" convention?
>This does not hold in the older parts of older U.S. cities
>(New York does not obey it south of 8th St. or so),
>but is quite general in the U.S. as a whole.


It holds for the whole of Baltimore and extends on at least the major arteries into 
the suburbs. Some suburbs reset the count from their own city centers, and that may or 
may not include the main arteries. I am not aware of any exceptions at all in 
Baltimore city. Note that the main arteries radiate more or less like spokes from the 
center of downtown. All blocks are numbered from the hub (Baltimore (east/west) at 
Charles (north/south)). Thus all 2800 blocks are roughly equidistant from the center. 
It is less well known that even numbers are on the left as you head out of town in any 
direction and odd numbers on the right.

>Anyone who wants to reach me by snail (extremely snail)
>mail, can do so at:
>
>Cowan
>12017-0042
>U.S.A.

Doesn't every address that USPS delivers to have a unique 9 digit zip code, making 
house numbers a legacy? From the US, couldn't I get a letter to you just by putting 
12017-0042 on the envelope?


Barry Caplan
Publisher, www.i18n.com






Temporary Outage

2002-03-04 Thread Sarasvati

The mail list server will be shut down temporarily today for
maintenance. It should be back on line shortly. You might
not even notice the outage.
Regards,
-- Sarasvati




Re: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread David Hopwood

Michael Everson wrote:
> At 20:39 -0800 2002-03-03, Dan Wood wrote:
> >Hi,
> >
> >I'm not finding hints of this in any of the FAQ or "where's my
> >character" docs  I'm trying to create (or find) the "oo" pair
> >with a combining macron (0304) and combining breve (0306) over the
> >pair of them together, as in these images:
> >
> >
> >
> 
> I think the COMBINING GRAPHEME JOINER is supposed to be used here.
> You put it between the two o's and then follow that string with a
> combining macron and it is supposed to centre itself over the whole
> lot.

"oo" is canonically equivalent to "o".
So either both have to be displayed as "", or both have to
be displayed as "o".

My interpretation of PDUTR#28 is that CGJ only affects enclosing marks,
so this example should be displayed as "o".
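
The character sequences in the paragraph above appear to have been lost when the message was archived. As a hedged illustration of the kind of canonical equivalence being discussed, the Python sketch below assumes the comparison is between "o, CGJ, o, combining macron" and the same string with the final letter precomposed; that reading is an assumption, not a recovery of the original text.

```python
import unicodedata

CGJ = "\u034F"     # COMBINING GRAPHEME JOINER
MACRON = "\u0304"  # COMBINING MACRON

# Assumed sequence 1: o, CGJ, o, combining macron.
decomposed = "o" + CGJ + "o" + MACRON
# Assumed sequence 2: o, CGJ, precomposed U+014D (o with macron).
precomposed = "o" + CGJ + "\u014D"

nfc_a = unicodedata.normalize("NFC", decomposed)
nfc_b = unicodedata.normalize("NFC", precomposed)

# Two strings are canonically equivalent if they normalize identically.
same = nfc_a == nfc_b
print(ascii(nfc_a))
print("canonically equivalent:", same)
```

Because the two spellings normalize to the same string, a conformant renderer cannot draw one with a macron over both letters and the other with a macron over the last letter only.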

-- 
David Hopwood <[EMAIL PROTECTED]>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip






Re: Theban alphabet?

2002-03-04 Thread Kenneth Whistler

Curtis Clark said:

> At 12:27 AM 3/1/02, Philipp Reichmuth wrote:
> > How about a glyph variant of U+2721? ;-) 
> 
> U+2721 U+FE00 U+20DD, perhaps?
  ^

Not to be overly pedantic on a humorous thread, but...

Combinations involving the variation selector-1 are undefined
unless published in StandardizedVariants.html. I just don't
want people to start thinking (even in jest) that such
combinations are freely specifiable.

--Ken





RE: [OT] Broken monetary settings (was RE: [OT] Broken monetary settings in MS Works)

2002-03-04 Thread John McConnell

Marco Cimarosti wrote:

>There are two wrong design assumptions in the "monetary settings":
>
>1) The  should not be part of the locale at all. The fact
>that US dollars need two decimals does not depend on local
>convention of any country, but on the fact that there are coins as small
>as 1/100 of a dollar. There should be a locale-independent list of
>currencies that specifies this parameter (and other things, perhaps).
>
>2) The  is to be part of the locale, but each locale
>should have an *array* of currency symbols, with an entry for each existing
>currency. Having a single  per locale would be like
>having one single !

In Windows XP, the Regional and Language Options Control panel has much of this 
design. There is a list of currencies per locale. The default is the Euro for the 
countries in the Euro zone, but the previous currency symbol is available from the 
list. Moreover, as with most Windows locale values, the user can enter their own 
default value by typing it into the field. You can change the symbol to 'DEM' or 
whatever you want.

During development, we considered tying the number of decimals to the currency rather 
than the user. However we found it caused too many problems for existing applications, 
which assumed that there was only one setting in effect at a time. In .NET, the two 
values are grouped together in the RegionInfo class.

John McConnell
Windows Globalization Infrastructure Design & Development







Re: [OT] Automatic vowelizers?

2002-03-04 Thread Philipp Reichmuth

Hallo Marco,

Vowelizers for Arabic:

http://www.cs.ualberta.ca/~zaiane/htmldocs/dea.html
http://www.unesco.org/comnat/france/ali.htm (old)
http://users.kfupm.edu.sa/ICS/muhtaseb/Kacst/project_design/vowelization.htm
(unclear, whether they actually realized it)

Try Google for further results.

In addition, you may contact developers of Arabic or Hebrew
text-to-speech software.

For Hebrew, I know of a project in Munich in the early nineties where
they developed a system for morphological analysis of Hebrew text (W.
Richter, G. Specht, W. Eckardt), but I am not sure whether they
vowelized it.

Having spent about a year developing Arabic morphological analysis
software, I can assure you that the problem is not trivial at all and
requires both lexical lookup mechanisms and morphological and
syntactical analysis, and even then it's not going to be perfect -
ambiguities just remain.

  Philipp    mailto:[EMAIL PROTECTED]
___
Out of memory / We wish to hold the whole sky / But we never will





RE: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Kent Karlsson


...
> >The problem here is that the COMBINING GRAPHEME JOINER only affects
> >*enclosing* combining marks (and combining marks *following* an
> >enclosing one).
> 
> I do not know what you mean by this. The CGJ makes what it is joining 
> into a single entity. So if you add a diacritic to it, it should 
> centre itself above that entity.



See http://www.unicode.org/unicode/reports/tr28/, which says:

"On the other hand, where elements are linked by a Grapheme_Link,
non-enclosing combining marks only apply to the last base character."

That sentence is followed by an example with NUKTA.

So Grapheme_links do not affect marks that are neither enclosing
nor (Brahmic) dependent vowels (the latter is not properly covered
by the quoted sentence, but talked about elsewhere in the Unicode 3.0
book).

If you will, "non-enclosing" combining marks bind strongly to the
last base character, while "enclosing" combining marks (which
apparently are Me + Brahmic dependent vowels, though there is
no explicit definition) apply to the grapheme cluster also when
it is created via a Grapheme_link, e.g. CGJ.  (There is also a
special case for VIRAMA at the end of a word; but that is a
different detail, and I don't know how, or even if, that is
now resolved.)


/kent k





unicode in a web based application

2002-03-04 Thread James Griffiths

First of all, please forgive me if this message lacks clarity, I am 
struggling to express my question correctly.

I am developing a multilingual web application, using ColdFusion, an 
Apache web server, and SQL Server 7.  The languages used are English, 
French, German, Spanish, Italian, Polish, and Russian.

The application will be used to create, administer and take bookings for 
events.  When an event is set up, questions can be added that the person 
booking is required to answer.  The questions will be added in all of the 
languages, but will only appear in the language of the person booking.

I want to have a page to input the questions that consists of 7 text boxes, 
one for each language.  I want to be able to copy text from a Word document 
containing the question in a specific language into the respective text 
box, and when the page is submitted, have this text stored in the database 
in its original form (by this I mean not as a numeric character 
reference).

I am struggling to overcome the first hurdle:  I have copied Russian text 
from a web page into a Word document, but when I try to copy this directly 
into a field in SQL Server, it produces a string of question marks.

Could someone please tell me
a)  how to copy Russian (or anything not included in the ISO-8859-1 character 
set) directly into SQL Server
b)  whether using Unicode can solve my problems with input (if so, I assume I set the 
encoding of the web page to UTF-8, but how do I set up SQL Server, and do I 
need to do anything different with ColdFusion)
c)  whether, if the text is stored in the db correctly, it will be output (on a web 
page) correctly as long as the encoding is set to UTF-8
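
A string of question marks usually means the text passed through a character set that has no Cyrillic letters. The Python sketch below only illustrates that mechanism with an invented sample word; it makes no claim about how SQL Server or ColdFusion should be configured.

```python
# Hypothetical sample: the Russian word "Privet" written in Cyrillic.
russian = "\u041f\u0440\u0438\u0432\u0435\u0442"

# ISO-8859-1 has no Cyrillic letters, so with errors="replace" every
# character that cannot be represented is turned into "?".
lossy = russian.encode("iso-8859-1", errors="replace")
print(lossy)

# UTF-8 can represent every Unicode character, so the round trip is lossless.
utf8_bytes = russian.encode("utf-8")
restored = utf8_bytes.decode("utf-8")
print(len(utf8_bytes), "bytes in UTF-8; round trip ok:", restored == russian)
```

Once the characters have been replaced by question marks the original text cannot be recovered, so every stage between the form and the database column has to carry a Unicode-capable encoding.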

Any help would be very much appreciated,

Jamie Griffiths








Re: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Doug Ewell

John Cowan <[EMAIL PROTECTED]> wrote:

> In addition, U+0305 COMBINING OVERLINEs (as distinct from U+0304) have
> to join left and right, so "o+combining overline+o+combining overline"
> will produce the correct appearances with pre-3.2 implementations.
> Examining the GIF, it looks more like an overline than a macron
> anyway.

Isn't that an example of using the wrong character that happens to have
the right glyph?  That's a no-no, and in any case the breve problem
remains unsolved.

Dan is looking for COMBINING DOUBLE MACRON and COMBINING DOUBLE BREVE,
analogous to U+0360 through U+0362, and if I were Dan I'd be looking for
them too.

-Doug Ewell
 Fullerton, California






[OT] Re: Month names

2002-03-04 Thread Doug Ewell

> And when somebody (from Japan, Korea, China, etc)
> trying to use month numbers - he must understand,
> that traditional calendar system in this countries are _different_.
> What months you will try to number and from what you will start?

This is an inherent hazard of using an all-numeric system.  It must be
understood which calendar is in use.  If I write "January" or "Tishri"
or "Ramadan," the calendar in use is implicit.  (But I am also
language-specific, which is a disadvantage of names.)

When I write 2002-03-04 to mean today (Monday), I am using ISO 8601
notation, which is specified as indicating the Gregorian (not Hebrew or
Arabic/Hijri) calendar.  But you are correct that other numeric systems
could conceal the calendar (although the year, if present, usually gives
it away; this is the year 5762 Hebrew and 1422 Arabic/Hijri).
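
The ambiguity of all-numeric dates can be shown in a few lines of Python. This is a sketch only: the short string "3/4/02" is an invented example, and Python's datetime module always works in the (proleptic) Gregorian calendar, so it does not model the Hebrew or Hijri calendars mentioned above.

```python
from datetime import date, datetime

today = date(2002, 3, 4)

# ISO 8601 notation: year-month-day, Gregorian calendar.
iso = today.isoformat()
print(iso, "ISO weekday number:", today.isoweekday())  # 1 means Monday

# The same short numeric string read under two different conventions.
ambiguous = "3/4/02"
as_month_first = datetime.strptime(ambiguous, "%m/%d/%y").date()
as_day_first = datetime.strptime(ambiguous, "%d/%m/%y").date()
print(as_month_first.isoformat(), "vs", as_day_first.isoformat())
```

The ISO form carries its own field order, while the short form needs out-of-band knowledge of the convention in use.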


-Doug Ewell
 Fullerton, California






RE: Standard Conventions and euro

2002-03-04 Thread Suzanne M. Topping

Message being forwarded to the Locales group...

> -Original Message-
> From: Doug Ewell [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, March 02, 2002 12:55 PM
> To: Keld Jørn Simonsen
> Cc: Jungshik Shin; Marco Cimarosti; 'Michael Everson';
> [EMAIL PROTECTED]; David Starner
> Subject: Re: Standard Conventions and euro
> 
> 
> Keld Jørn Simonsen <[EMAIL PROTECTED]> wrote:
> 
> >> Locale systems that force you to pick one immutable set of conventions
> >> for a given country are broken in general.  I remember having to tell
> >> MS-DOS that I was in South Africa or someplace, just to get my directory
> >> listing the way I wanted it.  *nix systems that start with "fr_FR" and
> >> then allow you to define "fr_FR-EURO" or something really aren't much
> >> better; what if I want to deviate from the pre-defined locale in four or
> >> five ways instead of just one?
> >
> > You have to pick one, Doug. You cannot write
> > "On the 3/1/02 2002-03-01 1/3/02 1/3/2002 1.3.2002 I went to..."
> > Or you can write it but writing the same date in 5 different
> > formats in the same line is not customary and superfluous.
> 
> That's not what I meant, of course.  I meant, what if I want to use the
> en_US locale (e.g. to make sure a spelling checker uses the right
> dictionary), *but* I also want to use:
> 
> - UTF-8 instead of ISO 8859-1, and
> - yyyy-mm-dd instead of m/d/yy, and
> - 24-hour time (with seconds) instead of 12-hour, and
> - some "negative money" format different from the default, and
> - have the week start on Monday instead of Sunday
>   (I don't know if this last one is part of the *nix locale model)
> 
> Several people responded that I can go root and define my own private
> locale, with whatever settings I like.  That's just what I would want to
> do, so the problem would be solved, but then I'm not really sure why I
> would need to give the new locale a name, since it's not interoperable.
> 
> But this is all very OT and I'd better stop now, because I know how
> quickly this discussion can devolve into Operating System Wars.
> 
> -Doug Ewell
>  Fullerton, California
> 
> 
> 
> 




RE: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Michael Everson

At 15:19 +0100 2002-04-03, Kent Karlsson wrote:
>  > -Original Message-
>>  From: Michael Everson
>...
>>  At 20:39 -0800 2002-03-03, Dan Wood wrote:
>>  >Hi,
>>  >
>>  >I'm not finding hints of this in any of the FAQ or "where's my
>>  >character" docs  I'm trying to create (or find) the "oo" pair
>>  >with a combining macron (0304) and combining breve (0306) over the
>>  >pair of them together, as in these images:
>>  >
>>  >
>>  >
>>
>>  I think the COMBINING GRAPHEME JOINER is supposed to be used here.
>>  You put it between the two o's and then follow that string with a
>>  combining macron and it is supposed to centre itself over the whole
>>  lot.
>
>The problem here is that the COMBINING GRAPHEME JOINER only affects
>*enclosing* combining marks (and combining marks *following* an
>enclosing one).

I do not know what you mean by this. The CGJ makes what it is joining 
into a single entity. So if you add a diacritic to it, it should 
centre itself above that entity.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




RE: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Kent Karlsson



> -Original Message-
> From: Michael Everson
...
> At 20:39 -0800 2002-03-03, Dan Wood wrote:
> >Hi,
> >
> >I'm not finding hints of this in any of the FAQ or "where's my 
> >character" docs  I'm trying to create (or find) the "oo" pair 
> >with a combining macron (0304) and combining breve (0306) over the 
> >pair of them together, as in these images:
> >
> >
> >
> 
> I think the COMBINING GRAPHEME JOINER is supposed to be used here. 
> You put it between the two o's and then follow that string with a 
> combining macron and it is supposed to centre itself over the whole 
> lot.

The problem here is that the COMBINING GRAPHEME JOINER only affects
*enclosing* combining marks (and combining marks *following* an
enclosing one).  ("Enclosing" should here be interpreted as
of general category Me, but also dependent vowels are affected
the same way.)  Since combining macron/breve are not enclosing
(in either sense), this does not work: the breve (macron) only
applies to the last base letter.  I vaguely suggested adding
an enclosing (in some sense) invisible combining character to
solve this: .
No character has been designated for such use, though.  And I
haven't made a formal proposal yet.
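
The distinction Kent draws rests on the General_Category property, which can be inspected directly. A minimal Python sketch, assuming the interpreter's bundled Unicode database; the choice of U+20DD as the sample enclosing mark is ours, not Kent's.

```python
import unicodedata

# Sample marks: three from the thread plus one enclosing mark for contrast.
marks = {
    "COMBINING MACRON": "\u0304",
    "COMBINING BREVE": "\u0306",
    "COMBINING GRAPHEME JOINER": "\u034F",
    "COMBINING ENCLOSING CIRCLE": "\u20DD",
}

# Mn = nonspacing mark, Me = enclosing mark.
categories = {name: unicodedata.category(ch) for name, ch in marks.items()}

for name, cat in categories.items():
    print(f"{cat}  U+{ord(marks[name]):04X}  {name}")
```

Macron and breve report Mn rather than Me, which is why, on Kent's reading, joining the two letters with CGJ does not make them span both.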

/kent k





Re: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread John Cowan

Michael Everson wrote:

> At 20:39 -0800 2002-03-03, Dan Wood wrote:
> 
>> Hi,
>>
>> I'm not finding hints of this in any of the FAQ or "where's my 
>> character" docs  I'm trying to create (or find) the "oo" pair with 
>> a combining macron (0304) and combining breve (0306) over the pair of 
>> them together, as in these images:
>>
>> 
>> 
> 
> 
> I think the COMBINING GRAPHEME JOINER is supposed to be used here. You 
> put it between the two o's and then follow that string with a combining 
> macron and it is supposed to centre itself over the whole lot.

In addition, U+0305 COMBINING OVERLINEs (as distinct from U+0304) have
to join left and right, so "o+combining overline+o+combining overline"
will produce the correct appearances with pre-3.2 implementations.
Examining the GIF, it looks more like an overline than a macron anyway.

-- 
John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_





[OT] Broken monetary settings (was RE: [OT] Broken monetary settings in MS Works)

2002-03-04 Thread Marco Cimarosti

Otto Stolz wrote:
[...]
> No, I simply wish that they had based their design on correct sup-
> positions. A real-world value (physical, monetary, or whatever) is
> not a mere number, but rather the product of a number and a unit of
> measurement. Changing the unit while keeping the number will affect
> the value no less than changing the number while keeping the unit.
> Example:
> - when you change the width of a diskette from 90 mm to 2286 mm,
>it will no more fit in the drive,
> - likewise, when you change its width from 90 mm to 90",
>it will no more fit in the drive,
> - both 90 mm, and (approximately) 3½" wide diskettes will fit in
>the same drive.
> 
> A value in a spreadsheet should never inadvertently change.
> 
> Now, Works did exactly that: it multiplied every monetary value
> (in Germany by 1,95583) by replacing its unit with a different
> one, [...]

A perfect explanation of the nature of the problem, but a poor diagnosis of its
location.

Your description makes clear why, in the retail software industry, we
*never* use the two elements of "monetary settings" which define
the actual monetary value of a number.

But you fail to notice that the problem is not with Works, or with any
other application program, but rather with the system-level locale settings
themselves. And, unfortunately, all the operating systems and the
programming languages that I know suffer from this problem.

There are two wrong design assumptions in the "monetary settings":

1) The  should not be part of the locale at all. The
fact that US dollars need two decimals does not depend on local convention
of any country, but on the fact that there are coins as small as 1/100 of a
dollar. There should be a locale-independent list of currencies that
specifies this parameter (and other things, perhaps).

2) The  is to be part of the locale, but each locale should
have an *array* of currency symbols, with an entry for each existing
currency. Having a single  per locale would be like having
one single !

E.g.: a United States locale could have "$" as the symbol for dollars and
"Mex$" as the symbol for the Mexican peso, while the Mexican locale could
have "$EE.UU." for the former and "$" for the latter. However, US dollars
remain US dollars in both countries, just like "January" and "enero" remain
month #1 in both countries.

So, it is perfectly OK if 1234 dollars show as "$1,234.00" in the USA and as
"1.234,00 $EE.UU." in Mexico, but it is never OK that 1234 dollars should become
1234 pesos! (But the opposite is OK, if you do it on my bank account:-)

This implies, as you argued, that the monetary values should contain a
currency Id. Storing money as a simple number is like storing dates with no
year, or integers with no sign.

It also implies that all user interfaces should be changed to allow users to
specify the currency Id, in case that it is different from the default.

This also makes it desirable that most systems have some form of exchange()
API or service, which is capable of converting monetary amounts from one
currency to another. But this opens the problem that, unlike measurement
units, exchange rates fluctuate daily, so there should be some
mechanism to update them.

I don't agree, however, that the actual formatted string should be stored
with the amount. If you change the symbol of the German mark from "DEM" to
"DM", it is OK that amounts previously formatted as "DEM 10,50" become "DM
10,50".

But if you add a *new* currency Id, say , and set it as the system
default, an existing "DEM 10,50" amount should remain unchanged. The only
effect should be that, if you enter a new amount "10.5" without specifying
the currency, it will be "EUR 10,50".
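
The design argued for above can be sketched in Python. Everything here is hypothetical and for illustration only: the Money class, the two lookup tables, the format_money helper and the separator rule for es_MX are invented to mirror the examples in this message, not taken from any real locale API.

```python
from dataclasses import dataclass
from decimal import Decimal

# Locale-independent: the number of decimals belongs to the currency.
CURRENCY_DECIMALS = {"USD": 2, "MXN": 2, "DEM": 2, "EUR": 2}

# Locale-dependent: each locale holds a symbol for every currency.
CURRENCY_SYMBOLS = {
    "en_US": {"USD": "$", "MXN": "Mex$"},
    "es_MX": {"USD": "$EE.UU.", "MXN": "$"},
}


@dataclass(frozen=True)
class Money:
    """An amount that always carries its currency Id."""
    amount: Decimal
    currency: str


def format_money(value: Money, locale: str) -> str:
    """Format for display; the stored amount and currency never change."""
    digits = CURRENCY_DECIMALS[value.currency]
    symbol = CURRENCY_SYMBOLS[locale][value.currency]
    text = f"{value.amount:,.{digits}f}"
    if locale == "es_MX":
        # Follow the message's example: swap separators, symbol goes last.
        text = text.translate(str.maketrans(",.", ".,"))
        return f"{text} {symbol}"
    return f"{symbol}{text}"


dollars = Money(Decimal("1234"), "USD")
us_text = format_money(dollars, "en_US")
mx_text = format_money(dollars, "es_MX")
print(us_text)
print(mx_text)
```

Changing the locale changes only the rendering; because the currency Id travels with the amount, 1234 dollars can never silently turn into 1234 pesos.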

_ Marco




Re: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread John Cowan
ろ〇〇〇〇 ろ〇〇〇 wrote:

> This would be a most useful Unicode character for producers of English 
> dictionaries, though I personally would rather they all go the IPA 
> route. (By the way, is that "route" pronounced as "root" or as "rout"? I 
> use the latter for "... go that route", but the former for "Take Route 
> 66...")

As the Irishman said when asked whether ee-ther or eye-ther was
the correct pronunciation of "either":

Ayther will do, ayther will do.

In general, "rout" is the normal pronunciation in technical contexts such
as "Internet routing".  Otherwise, it depends on dialect, idiolect,
and context.

-- 
John Cowan <[EMAIL PROTECTED]> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_


RE: Re[2]: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Jonathan Rosenne

In Hebrew the names of the days of the week are ordinals for Sunday
(first) to Friday (sixth). European numbers are not used, but Hebrew
numerals (Alef to Vav) are.

Jony

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED]] On Behalf Of Serge Nesterovitch
> Sent: Monday, March 04, 2002 9:41 AM
> To: Doug Ewell
> Cc: [EMAIL PROTECTED]
> Subject: Re[2]: Month names (was: Re: Standard Conventions and euro)
> 
> 
> Hello Doug,
> 
> Monday, March 04, 2002, 3:07:44 AM, you wrote:
> 
> DE> In the Hebrew calendar, only Shabbat (Saturday) has a 
> name; the rest 
> DE> of the days are numbered.  In Russian and Portuguese, most of the 
> DE> day names are numeric.
> It's wrong information.
> In Russian NO one days of week name is numeric.
> 3 of it's names are based on the numbers,
> but vtornik =/= vtoroi
> chetverg =/= chetveryi
> pyatnitsa =/= pyatyi
> Using numbers of a days where names must be used is impossible.
> 
> And when somebody trying to use day _numbers_ he must 
> remember, that not in all cultures week begins from the same 
> day. There are some traditions, in which the first day of the 
> week is a sunday, and some - with the first day - monday.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Best regards,
>  Serge    mailto:[EMAIL PROTECTED]
> 
> 
> 
> 





RE: [OT] Automatic vowelizers?

2002-03-04 Thread Jonathan Rosenne

There is Nakdan, http://www.cet.ac.il/home/nakdan.htm

Jony

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED]] On Behalf Of Marco Cimarosti
> Sent: Monday, March 04, 2002 1:03 PM
> To: '[EMAIL PROTECTED]'; [EMAIL PROTECTED]
> Subject: [OT] Automatic vowelizers?
> 
> 
> I read the following question on a local NG:
> 
> Does there exist software which automatically adds vowel marks 
> to Arabic and Hebrew text?
> 
> I realize that such a thing would be very complex, and 
> probably requires dictionary look-up and some degree of 
> understanding of the grammatical context. Yet, it seems like 
> it is a prerequisite for implementing things such as automatic 
> transliteration or screen readers.
> 
> TIA.
> 
> _ Marco
> 
> 
> 





Re: How to make "oo" with combining breve/macron over pair?

2002-03-04 Thread Michael Everson

At 20:39 -0800 2002-03-03, Dan Wood wrote:
>Hi,
>
>I'm not finding hints of this in any of the FAQ or "where's my 
>character" docs  I'm trying to create (or find) the "oo" pair 
>with a combining macron (0304) and combining breve (0306) over the 
>pair of them together, as in these images:
>
>
>

I think the COMBINING GRAPHEME JOINER is supposed to be used here. 
You put it between the two o's and then follow that string with a 
combining macron and it is supposed to centre itself over the whole 
lot.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: [OT] Broken monetary settings in MS Works

2002-03-04 Thread Otto Stolz

I had written:

> When I enter 17,00 DM I do not want to read 17,00 €, the other day.

Lars Kristan wrote:
> I know that values haven't changed.

O yes, they have:  17,00 € = 33,25 DM, i. e. much more than 17,00 DM.

Lars Kristan wrote:

> My point was - if you would change your system (user) settings

> from DM to DEM, you would expect to see DEM everywhere, wouldn't you? 


Now I understand your point.

But no, I wouldn't expect that. Rather, I would expect that new
entries be labelled "DEM", but old entries left alone. When the
system would replace "DM" with "DEM", in every display, some tables
would no more align properly, and many amounts would not fit into
the available space.

> we were [...] discussing [...] MS Works. For home use. A cost
> effective and simple to use tool. You wish they would cut development

> cost at some other feature, but they didn't.


No, I simply wish that they had based their design on correct sup-
positions. A real-world value (physical, monetary, or whatever) is
not a mere number, but rather the product of a number and a unit of
measurement. Changing the unit while keeping the number will affect
the value no less than changing the number while keeping the unit.
Example:
- when you change the width of a diskette from 90 mm to 2286 mm,
   it will no more fit in the drive,
- likewise, when you change its width from 90 mm to 90",
   it will no more fit in the drive,
- both 90 mm, and (approximately) 3½" wide diskettes will fit in
   the same drive.

A value in a spreadsheet should never inadvertently change.

Now, Works did exactly that: it multiplied every monetary value
(in Germany by 1,95583) by replacing its unit with a different
one, without notifying, let alone asking, the user. Cost-effective,
or not: You should make things as simple as possible, but never
simpler than that!

A simple spreadsheet program would use only one monetary unit (DM,
€, or whatever) per data set (whole spreadsheet); but in contrast
to MS-Works, it would store that unit in a header field of the data
file, so it would display the correct values, regardless of the
locale the data set is displayed in. A more advanced spreadsheet
program would keep the unit with every entry (field), as MS-Excel
indeed does.
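
To make the "number times unit" idea concrete, here is a minimal Python
sketch. It is only an illustration: the class name Money and its method
to() are made up for this example and say nothing about how MS-Works or
MS-Excel store their data. The rate 1,95583 DM per euro is the one
mentioned above.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

# Fixed conversion rate: 1 EUR = 1.95583 DEM.
DEM_PER_EUR = Decimal("1.95583")


@dataclass(frozen=True)
class Money:
    """A monetary value: a number together with its unit (hypothetical type)."""
    amount: Decimal
    unit: str  # "DEM" or "EUR"; kept with the data, never taken from the locale

    def to(self, unit: str) -> "Money":
        """Convert explicitly: the number changes together with the unit."""
        if unit == self.unit:
            return self
        if (self.unit, unit) == ("EUR", "DEM"):
            converted = self.amount * DEM_PER_EUR
        elif (self.unit, unit) == ("DEM", "EUR"):
            converted = self.amount / DEM_PER_EUR
        else:
            raise ValueError(f"no rate known for {self.unit} -> {unit}")
        rounded = converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        return Money(rounded, unit)

    def __str__(self) -> str:
        return f"{self.amount:.2f} {self.unit}"


entry = Money(Decimal("17.00"), "DEM")
print(entry)            # the label never changes unless to() is called
print(entry.to("EUR"))  # an explicit conversion changes number and unit together
```

Because the unit travels with the amount, displaying the entry under a
different locale cannot silently turn 17,00 DM into 17,00 €.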

Best wishes,
   Otto Stolz





[OT] Automatic vowelizers?

2002-03-04 Thread Marco Cimarosti

I read the following question on a local NG:

Does software exist that automatically adds vowel marks to Arabic and
Hebrew text?

I realize that such a thing would be very complex, and probably requires
dictionary look-up and some degree of understanding of the grammatical
context. Yet, it seems like it is a prerequisite for implementing things such
as automatic transliteration or screen readers.

TIA.

_ Marco




RE: Re[2]: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Marco Cimarosti

Doug Ewell wrote:
> The word "numeric" was intended to convey that some day names are based
> on numbers instead of, say, Norse gods.  The Russian derivation of
> certain day names from numbers is not nearly as transparent as the
> Portuguese "segunda-feira."  Perhaps I should have made that
> distinction.

Moreover, the fact that names of calendar items are based on numbers is no
warranty that their modern numbering corresponds to the etymology.

E.g., I guess that most educated Portuguese speakers would say that Tuesday
is the second day of the week, even though it is called "terça" ("third
one", literally).

On the other hand, all children who speak Romance languages are very puzzled
when they discover that September, October, November and December are not
supposed to be the 7th, 8th, 9th and 10th months, as their names clearly
suggest.

_ Marco




Re: Re[2]: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Doug Ewell

Hello Sergey,

Serge Nesterovitch <[EMAIL PROTECTED]> wrote:

> DE> In the Hebrew calendar, only Shabbat (Saturday) has a name; the rest of
> DE> the days are numbered.  In Russian and Portuguese, most of the day names
> DE> are numeric.
> It's wrong information.
> In Russian NO one days of week name is numeric.
> 3 of it's names are based on the numbers,
> but vtornik =/= vtoroi
> chetverg =/= chetveryi
> pyatnitsa =/= pyatyi
> Using numbers of a days where names must be used is impossible.

The word "numeric" was intended to convey that some day names are based
on numbers instead of, say, Norse gods.  The Russian derivation of
certain day names from numbers is not nearly as transparent as the
Portuguese "segunda-feira."  Perhaps I should have made that
distinction.

-Doug Ewell
 Fullerton, California






Re[3]: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Serge Nesterovitch

SN> but vtornik =/= vtoroi
SN> chetverg =/= chetveryi
misspelling :(
I mean chetverg =/= chetvertyi
SN> pyatnitsa =/= pyatyi
SN> Using numbers of a days where names must be used is impossible.

SN> And when somebody trying to use day _numbers_ he must remember, that
SN> not in all cultures week begins from the same day. There are some
SN> traditions, in which the first day of the week is a sunday,
SN> and some - with the first day - monday.

And when somebody (from Japan, Korea, China, etc.)
tries to use month numbers, he must understand
that the traditional calendar systems in these countries are _different_.
Which months will you try to number, and from which one will you start?












-- 
Best regards,
 Serge  mailto:[EMAIL PROTECTED]





Re[2]: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Serge Nesterovitch

Hello Doug,

Monday, March 04, 2002, 3:07:44 AM, you wrote:

DE> In the Hebrew calendar, only Shabbat (Saturday) has a name; the rest of
DE> the days are numbered.  In Russian and Portuguese, most of the day names
DE> are numeric.
That information is wrong.
In Russian, NOT one day-of-week name is numeric.
Three of the names are based on numbers,
but vtornik =/= vtoroi
chetverg =/= chetveryi
pyatnitsa =/= pyatyi
Using the number of a day where its name must be used is impossible.

And when somebody tries to use day _numbers_, he must remember that
not all cultures begin the week on the same day. In some
traditions the first day of the week is Sunday,
and in others the first day is Monday.
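
A small Python sketch of this point (only an illustration; the function
name day_number is made up for this example). The same date gets a
different day number depending on which day the week is taken to start on:

```python
import datetime


def day_number(date: datetime.date, week_starts_on: int) -> int:
    """Return the 1-based number of the date's day within its week.

    week_starts_on follows Python's weekday() convention:
    0 = Monday, 1 = Tuesday, ..., 6 = Sunday.
    """
    return (date.weekday() - week_starts_on) % 7 + 1


MONDAY, SUNDAY = 0, 6
tuesday = datetime.date(2002, 3, 5)  # a Tuesday

print(day_number(tuesday, MONDAY))  # number when the week starts on Monday
print(day_number(tuesday, SUNDAY))  # number when the week starts on Sunday
```

So a bare day number means nothing unless the convention for the first
day of the week is stated along with it.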









-- 
Best regards,
 Serge  mailto:[EMAIL PROTECTED]





Re: Month names (was: Re: Standard Conventions and euro)

2002-03-04 Thread Ben Monroe

From: "Shigemichi Yazawa" <[EMAIL PROTECTED]>

>>ろ ろ〇〇〇 <[EMAIL PROTECTED]> wrote:
>> Month names. Month names. Who needs month names, anyway? Do we name the
>> hours of the day? Do we name the days *within* each month? Tell the
>> Chinese, Japanese, and Koreans that THEY need month names.

>Japan has (or had) month names. 睦月, 如月, etc.

Here's a more complete (but probably not exhaustive) list.
By the way, these are the lunar months, not the solar ones, so calling them
"January" "February" etc. isn't necessarily appropriate.

January    睦月 [mutuki "harmonious month"], 睦び月 [mutubiduki], and
           睦びの月 [mutubi no tuki]
February   如月 [kisaragi], 衣更着 [kinusaragi]
March      弥生 [yayoi, from iyaoi]
April      卯月 [uduki]
May        皐月・早月 [satuki "early month"] and 早苗月 [sanaeduki
           "young seedlings month"]
June       水無月 [minaduki "waterless month", from 水な月, corresponding
           to 水の月 "water month"]
July       文月 [fumiduki or fuduki "letter month?"]
August     葉月 [haduki "month of leaves"]
September  長月 [nagatuki "long month"], 菊月 [kikuduki "chrysanthemum
           month"], 色取月 [irodoriduki "month of color change"]
October    神無月 [kaminaduki "godless month", probably coming from
           神な月 -> 神の月 "month of god", as in June]
November   霜月 [simotuki "frost month"]
December   師走 [siwasu] and 臘月 [rougetu "month at end of year"; from a
           ceremony occurring after the winter solstice]

(/tu/ corresponds to つ, /du/ to づ, and /si/ to し. I'm aware that this
romanization system is not preferred by many.)

>Japan also had names of time. 子の刻, 丑の刻, etc.

The old time system was based on 12 "units" per day, each corresponding to
two modern hours. Add の刻 "hour of" to each.

子 [ne]      Hour of the Rat      11PM-1AM
丑 [usi]     Hour of the Ox       1AM-3AM
寅 [tora]    Hour of the Tiger    3AM-5AM
卯 [u]       Hour of the Hare     5AM-7AM
辰 [tatu]    Hour of the Dragon   7AM-9AM
巳 [mi]      Hour of the Snake    9AM-11AM
午 [uma]     Hour of the Horse    11AM-1PM
未 [hituzi]  Hour of the Sheep    1PM-3PM
申 [saru]    Hour of the Monkey   3PM-5PM
酉 [tori]    Hour of the Cock     5PM-7PM
戌 [inu]     Hour of the Dog      7PM-9PM
亥 [i]       Hour of the Boar     9PM-11PM
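
As a rough illustration of the table above, a modern hour can be mapped
to its two-hour unit like this in Python. This is just a sketch: the
names UNITS and hour_name are made up for the example, and it uses the
English names only.

```python
# English names of the twelve units, in order, starting with the
# unit that spans midnight (11PM-1AM).
UNITS = ["Rat", "Ox", "Tiger", "Hare", "Dragon", "Snake",
         "Horse", "Sheep", "Monkey", "Cock", "Dog", "Boar"]


def hour_name(hour: int) -> str:
    """Map a modern hour (0-23) to its traditional two-hour unit."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # The Hour of the Rat begins at 23:00, so shift by one hour
    # before dividing the day into two-hour blocks.
    return "Hour of the " + UNITS[((hour + 1) % 24) // 2]


print(hour_name(0))   # midnight falls in the unit that starts at 11PM
print(hour_name(12))  # noon falls in the 11AM-1PM unit
```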

Ben Monroe
門龍弁
(I write it with 龍 stuck inside of 門 but since it's only a personal
preference, I suppose I'll never be able to type it that way.)




