Re: What code point is assigned for the Newton unit?

2001-09-13 Thread Asmus Freytag

Your letter makes clear that Unicode needs to do a better job of 
identifying the preferred character code for many situations. The 
information is there to a large extent, but buried in the fine print or in 
data tables.

You will see that there is a canonical decomposition from U+212B to U+00C5.
This means that once people use Normalization in a widespread fashion, it 
will become practically impossible to maintain a distinction between these 
two codes.
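As a minimal sketch of the point above, Python's standard `unicodedata` module shows the two code points collapsing under normalization. The helper name `survives_normalization` is made up for this illustration and is not part of any library.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"  # ANGSTROM SIGN
A_WITH_RING = "\u00c5"    # LATIN CAPITAL LETTER A WITH RING ABOVE


def survives_normalization(ch: str, form: str = "NFC") -> bool:
    """Return True if ch is left unchanged by the given normalization form."""
    return unicodedata.normalize(form, ch) == ch


# The two characters are distinct code points before normalization...
print(ANGSTROM_SIGN == A_WITH_RING)

# ...but NFC maps the Angstrom sign onto the letter, so the distinction is lost.
print(unicodedata.normalize("NFC", ANGSTROM_SIGN) == A_WITH_RING)

# NFD takes both to the same base letter plus COMBINING RING ABOVE.
print(unicodedata.normalize("NFD", ANGSTROM_SIGN) == "A\u030a")
print(unicodedata.normalize("NFD", A_WITH_RING) == "A\u030a")
```

Any process that normalizes text in transit (storage, search indexing, comparison) would therefore hand back U+00C5 wherever U+212B went in.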

The inclusion of U+212B is due to historical reasons.

Many other characters have been included in Unicode over the years for 
legitimate purposes as compatibility characters (to allow round trip 
conversion to/from important legacy character sets).

These have all been given compatibility decompositions.

Unfortunately, many characters that have legitimate uses in a legacy-free 
environment have also been given compatibility mappings at some time. This 
makes it very hard to use this information in its current form to decide 
when a distinction between characters should be kept and when it should not.
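To make the difference between the two kinds of mapping concrete, here is a rough sketch that inspects the decomposition field of the Unicode Character Database through Python's `unicodedata` module. The function name `decomposition_kind` is hypothetical, and U+2115 (DOUBLE-STRUCK CAPITAL N) is only an illustrative pick of a character that is used in legacy-free text yet carries a compatibility mapping; it is not mentioned in the thread.

```python
import unicodedata


def decomposition_kind(ch: str) -> str:
    """Classify a character's decomposition mapping.

    In the Unicode Character Database a compatibility mapping carries a
    tag such as <square> or <font>; a canonical mapping has no tag.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    return "compatibility" if mapping.startswith("<") else "canonical"


samples = [
    "\u212b",  # ANGSTROM SIGN
    "\u3302",  # SQUARE ANPEA, one of the katakana unit signs
    "\u2115",  # DOUBLE-STRUCK CAPITAL N (illustrative choice)
    "N",       # plain LATIN CAPITAL LETTER N
]

for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {decomposition_kind(ch)}")
```

The tag alone does not say whether a character was added purely for round-trip conversion or also has independent uses, which is the difficulty described above.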

There is some very explicit guidance, however, in Unicode TR#20 (Unicode in 
XML and other Markup Languages). The information there is readily applicable 
to other environments, if you pay attention to the rationale for each 
recommendation and evaluate whether it applies in your specific case.

A./

PS:

> Ångström is spelled wrong on the code charts at Unicode's home page, BTW.

Can you cite the page number and approximate location on the page? (Please 
send this information to me and [EMAIL PROTECTED], not to the whole list.)





What code point is assigned for the Newton unit?

2001-09-12 Thread Stefan Persson

Hi!

I recently noticed that Unicode distinguishes between the Swedish
capital letter Å (U+00C5; Å) and the Ångström sign (U+212B; Å). So it
seems that every unit sign has its own code point, while the Latin
letters with exactly the same shape have other code points. For
example, the CJK Compatibility block contains some unit signs (in katakana):

㌂: anpea/Ampère
㌕: kiroguramu/kilogram
etc.

So, can someone tell me the code point for the Newton unit sign (which
looks exactly like an N)? And can someone tell me why it's necessary to
make this distinction?

Ångström is spelled wrong on the code charts at Unicode's home page, BTW.

Stefan







Re: What code point is assigned for the Newton unit?

2001-09-12 Thread Michael (michka) Kaplan

Actually, you are mistaken.

The decision to encode the Angstrom sign had more to do with the fact that
it was encoded in many legacy character sets. There is no specific rule that
every unit sign must also be encoded. If you can use Unicode to properly
store and render what you need, then there is no gap that would require new
characters.

MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/
