Re: is there any way to change already defined character codes?

2000-08-09 Thread Asmus Freytag

At 11:01 PM 8/7/00 -0800, Jianping Yang wrote:
Not really for Unicode in which we have relocated some codepoints for Hangul
between Unicode 1.1 and 2.0 :)

Regards,
Jianping.

"Christopher J. Fynn" wrote:

  Allowing changes like this would break
  existing implementations of these standards - and of course these standards
  would be useless as standards if they were subject to that kind of change.

Well, those were the early days when there were few implementations of 
Unicode and even fewer that used Unicode to support Korean. Finally, the 
Korean set is so large that we could not use our preferred method of 
'correcting' mistakes, i.e. by coding the 'corrected' characters as new 
characters.

Nowadays the number of implementations supporting Unicode has grown to the 
point that it's impossible to even get an accurate estimate of their number, 
and since Georgian does not require unusually complicated rendering, I 
would suspect that there is already a considerable amount of data and a 
considerable number of implementations.





Re: is there any way to change already defined character codes?

2000-08-08 Thread Jianping Yang

Not really for Unicode in which we have relocated some codepoints for Hangul
between Unicode 1.1 and 2.0 :)

Regards,
Jianping.

"Christopher J. Fynn" wrote:

 Sandro

 I'm sure someone official will give you an official answer, but I know the only
 answer you are going to get to your question is NO - there is no way to change
 the encoding point of a character (or to change a character name) once it is in
 the Unicode or ISO 10646 standards. Allowing changes like this would break
 existing implementations of these standards - and of course these standards
 would be useless as standards if they were subject to that kind of change.

 Proposals to encode new characters in the Unicode and ISO 10646 standards have
 to go through a lengthy process of consideration and there is ample opportunity
 to submit comments on any proposal during that process. However once characters
 are finally assigned code points in the Unicode and ISO 10646 standards that's
 it.

 May I ask what is the reason these people from the government of Georgia want
 to change the codepoints of some Georgian characters? There is probably another
 good solution (or solutions) for whatever problem they think would be solved by
 changing encoding points.

 Regards

 - Chris

 "Sandro Karumidze" [EMAIL PROTECTED] wrote:

  There are people from the government of Georgia interested in the possibility of
  altering the Unicode standard in terms of changing codes for some Georgian
  characters.

  Does this type of thing happen in the Consortium, and if yes, under what
  circumstances?

  If not, can you specify in which rules it is defined that these types of
  changes are not allowed?

  Thanks in advance for your support,

  Best regards,

  Sandro Karumidze




Re: is there any way to change already defined character codes?

2000-08-08 Thread Michael (michka) Kaplan

Sandro,

Are you basically wanting the ordering to be different?

Unicode does not have any expressed or implied warranty that the ordering of
characters will be anything like what a user would expect (how can it, when
even so many languages that use the same scripts have entirely different,
occasionally conflicting, collation rules?).

It is up to the software to make the necessary collation rules happen.

For example, in Windows 2000 there are two different sorts supported for
Georgian: "modern" and "traditional." The difference is that modern has four
letters (He, Hie, We, and Har, both Capital and Small) sort at the end of
the alphabet (which I presume corresponds to the sort that you do not
like?), while the traditional sort has:

* He appearing between Zen and Tan
* Hie appearing between Nar and On
* We appearing between Un and Phar
* Har appearing between Xan and Jhan

I presume the above "exceptions" more closely match the sort you would
expect? And if there are more, this would be very valuable information (as
the rule behind all new "sorts" like this is that a valid need to sort
text differently was identified).

As a rule, Unicode code point order is not intended to follow, and does not
claim to follow, the collation rules of any language.
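To make the difference between code point order and collation order concrete, here is a minimal Python sketch. It builds the "traditional" order only from the four placements listed above and relies on the Georgian letter names in Python's unicodedata module. The helper names (letter, make_sort_key) are made up for illustration, and this is not a description of how Windows implements its sorts.

```python
import unicodedata


def letter(name: str) -> str:
    """Look up a Georgian letter by the last part of its Unicode character name."""
    return unicodedata.lookup(f"GEORGIAN LETTER {name}")


# The 33 modern letters (AN through HAE) are contiguous in code point order.
MODERN = [chr(cp) for cp in range(ord(letter("AN")), ord(letter("HAE")) + 1)]

# "Modern" sort as described above: the four archaic letters go at the end.
MODERN_ORDER = MODERN + [letter(n) for n in ("HE", "HIE", "WE", "HAR")]

# "Traditional" sort: each archaic letter follows the letter named in the list above.
TRADITIONAL_ORDER = list(MODERN)
for archaic, predecessor in (("HE", "ZEN"), ("HIE", "NAR"), ("WE", "UN"), ("HAR", "XAN")):
    position = TRADITIONAL_ORDER.index(letter(predecessor)) + 1
    TRADITIONAL_ORDER.insert(position, letter(archaic))


def make_sort_key(order):
    """Return a key function that ranks strings by a given letter order."""
    rank = {ch: i for i, ch in enumerate(order)}
    # Characters outside the table sort after all listed letters, by code point.
    return lambda text: [rank.get(ch, len(rank) + ord(ch)) for ch in text]


sample = [letter("TAN"), letter("HE"), letter("ZEN")]
by_code_point = sorted(sample)                                    # Zen, Tan, He
by_traditional = sorted(sample, key=make_sort_key(TRADITIONAL_ORDER))  # Zen, He, Tan
by_modern = sorted(sample, key=make_sort_key(MODERN_ORDER))       # Zen, Tan, He
```

The code points never change; only the rank table does, which is why a different sort is a matter for software and not for the character codes.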

FWIW, the LCIDs behind these two sorts under Windows 2000 (used in the C
CompareString and the VB StrComp) are:

Traditional: 1079 (0x0437)
Modern: 66615 (0x10437)
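The two values differ only in the sort ID bits. As a small arithmetic sketch (Python used purely as a calculator; the function name split_lcid is made up), an LCID carries a 16-bit language ID in its low word and a 4-bit sort ID just above it, which is the layout the Windows MAKELCID, LANGIDFROMLCID and SORTIDFROMLCID macros work with:

```python
def split_lcid(lcid):
    """Split an LCID into (language ID, sort ID)."""
    lang_id = lcid & 0xFFFF        # low 16 bits: language identifier
    sort_id = (lcid >> 16) & 0xF   # next 4 bits: sort identifier
    return lang_id, sort_id


traditional = split_lcid(1079)    # (0x0437, 0)
modern = split_lcid(66615)        # (0x0437, 1)
```

Both LCIDs share language ID 0x0437 (Georgian); only the sort ID changes from 0 to 1.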

michka


- Original Message -
From: "Sandro Karumidze" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "Unicode List" [EMAIL PROTECTED]
Sent: Tuesday, August 08, 2000 3:26 AM
Subject: Re: is there any way to change already defined character codes?


 Dear Chris,

 Thank you for your answer.

  May I ask what is the reason these people from the government of Georgia want
  to change the codepoints of some Georgian characters? There is probably another
  good solution (or solutions) for whatever problem they think would be solved by
  changing encoding points.

 The issue is that in Unicode the sequence of Georgian characters is different
 from what these people think it should be.

 In modern Georgian there are 33 widely used characters. However, there used to
 be 38 characters. At the beginning of this century 5 characters were dropped,
 though they are still used in old texts and by language specialists.

 In Unicode these 5 characters follow the 33. There is a different point of view
 that those 5 should be interleaved among the others.

 This is the whole issue - there are no specific implementation difficulties or
 problems. The only point is that placing the 5 among the other 33 is more "correct".

 Best regards,

 Sandro Karumidze





 
  Regards
 
  - Chris
 
  "Sandro Karumidze" [EMAIL PROTECTED] wrote:
 
   There are people from the government of Georgia interested in the possibility of
   altering the Unicode standard in terms of changing codes for some Georgian
   characters.
 
   Does this type of thing happen in the Consortium, and if yes, under what
   circumstances?
 
   If not, can you specify in which rules it is defined that these types of
   changes are not allowed?
 
   Thanks in advance for your support,
 
   Best regards,
 
   Sandro Karumidze






Re: is there any way to change already defined character codes?

2000-08-08 Thread John Cowan

On Mon, 7 Aug 2000, Jianping Yang wrote:

 Not really for Unicode in which we have relocated some codepoints for Hangul
 between Unicode 1.1 and 2.0 :)

Yes, but NEVER AGAIN.

-- 
John Cowan   [EMAIL PROTECTED]
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant
le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux,
de rapport nyait pas.   -- Jacques Lacan, "L'Etourdit"





Re: is there any way to change already defined character codes?

2000-08-08 Thread John Cowan

On Tue, 8 Aug 2000, Sandro Karumidze wrote:

 The issue is that in Unicode the sequence of Georgian characters is different
 from what these people think it should be.
 
 In modern Georgian there are 33 widely used characters. However, there used to
 be 38 characters. At the beginning of this century 5 characters were dropped,
 though they are still used in old texts and by language specialists.
 
 In Unicode these 5 characters follow the 33. There is a different point of view
 that those 5 should be interleaved among the others.
 
 This is the whole issue - there are no specific implementation difficulties or
 problems. The only point is that placing the 5 among the other 33 is more "correct".

Ah, OK.  The order of characters in the Unicode Standard is *not*
meant to be the proper sort order for any language (even English),
nor to be relied on for that purpose.  If any changes are needed, they belong
in the Unicode default collating sequence (which I have not checked) and not in
the codes for the characters themselves.
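The "even English" point is easy to check. A minimal Python sketch, in which str.casefold is only a crude stand-in for a real collation and not the Unicode default collating sequence:

```python
words = ["apple", "Zebra", "banana"]

# Plain sorted() compares code points, so every uppercase letter
# (U+0041..U+005A) sorts before every lowercase one (U+0061..U+007A).
by_code_point = sorted(words)

# A case-insensitive key gives the order an English reader would expect.
by_casefold = sorted(words, key=str.casefold)
```

Here by_code_point comes out as Zebra, apple, banana, while by_casefold comes out as apple, banana, Zebra.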

-- 
John Cowan   [EMAIL PROTECTED]





RE: is there any way to change already defined character codes?

2000-08-08 Thread Peter_Constable


On 08/08/2000 06:40:17 AM Marco.Cimarosti wrote:

(You definitely need an official reply, but let's go on with some more
informal chatting.)

All the "officials" are busy meeting this week, but the statement, "Can't
be done" is just as true whether it comes from the lips (or... fingertips)
of a Ken Whistler or Mark Davis as from a Marco Cimarosti or a Chris Fynn.
There are enough of us on this list that have a solid understanding of the
standard and its development that a question like this can be answered
without waiting for an "official" answer (though this question really ought
to be answered somewhere on the Unicode web site); if somebody were to give
wrong information, there are several who wouldn't hesitate to
correct it.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]





Re: is there any way to change already defined character codes?

2000-08-08 Thread John H. Jenkins

At 11:01 PM -0800 8/7/00, Jianping Yang wrote:
Not really for Unicode in which we have relocated some codepoints for Hangul
between Unicode 1.1 and 2.0 :)


And have regretted it ever since. Moving the Hangul and renaming æ
have caused no end of problems.  It is the fact that it was so
disastrous when done once that makes everyone determined not to do it
again.

--
=
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.blueneptune.com/~tseng



Re: is there any way to change already defined character codes?

2000-08-08 Thread Michael (michka) Kaplan

From: [EMAIL PROTECTED]
  [EMAIL PROTECTED] wrote:
  E.g., if you look at the Latin part, you see that the 26 letters used in
  modern English are all contiguously ordered in two areas: U+0041 to U+005A
  (uppercase) and U+0061 to U+007A (lowercase).
 
 Yeah, but so what? All you gotta do is turn the 6th bit off and there you go!
  
  But that's the end of the story! All the other hundreds of Latin letters are
  scattered all over, using no consistent order.
  
 Too bad unicode values can't be fractions!!

Let's take this one offline, Robert.
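For anyone following the quoted exchange, the "6th bit" remark can be sketched in a few lines of Python. The function name clear_case_bit is made up for illustration; the trick works for the 26 basic letters only because of how those two ranges happen to be laid out, which is the quoted poster's point about the other Latin letters.

```python
def clear_case_bit(ch):
    """Clear bit 0x20 (the 6th bit, counting from 1) of a character's code point."""
    return chr(ord(ch) & ~0x20)


ascii_result = clear_case_bit("a")        # U+0061 -> U+0041, "A"
latin_result = clear_case_bit("\u00ff")   # U+00FF (y with diaeresis) -> U+00DF (sharp s)
real_upper = "\u00ff".upper()             # U+0178, nowhere near U+00FF
```

For U+00FF the bit trick lands on an unrelated letter, while the real uppercase form sits in a different block altogether.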

michka








Re: is there any way to change already defined character codes?

2000-08-07 Thread Christopher J. Fynn

Sandro

I'm sure someone official will give you an official answer, but I know the only
answer you are going to get to your question is NO - there is no way to change
the encoding point of a character (or to change a character name) once it is in
the Unicode or ISO 10646 standards. Allowing changes like this would break
existing implementations of these standards - and of course these standards
would be useless as standards if they were subject to that kind of change.

Proposals to encode new characters in the Unicode and ISO 10646 standards have
to go through a lengthy process of consideration and there is ample opportunity
to submit comments on any proposal during that process. However once characters
are finally assigned code points in the Unicode and ISO 10646 standards that's
it.

May I ask what is the reason these people from the government of Georgia want
to change the codepoints of some Georgian characters? There is probably another
good solution (or solutions) for whatever problem they think would be solved by
changing encoding points.

Regards

- Chris


"Sandro Karumidze" [EMAIL PROTECTED] wrote:

 There are people from the government of Georgia interested in the possibility of
 altering the Unicode standard in terms of changing codes for some Georgian
 characters.

 Does this type of thing happen in the Consortium, and if yes, under what
 circumstances?

 If not, can you specify in which rules it is defined that these types of
 changes are not allowed?

 Thanks in advance for your support,

 Best regards,

 Sandro Karumidze