Re: is there any way to change already defined character codes?
At 11:01 PM 8/7/00 -0800, Jianping Yang wrote:
> Not really for Unicode in which we have relocated some codepoints for Hangul between Unicode 1.1 and 2.0 :)
>
> Regards,
> Jianping.
>
> "Christopher J. Fynn" wrote:
>> Allowing changes like this would break existing implementations of these standards - and of course these standards would be useless as standards if they were subject to that kind of change.

Well, those were the early days, when there were few implementations of Unicode and even fewer that used Unicode to support Korean. Finally, the Korean set is so large that we could not use our preferred method of 'correcting' mistakes, i.e. coding the 'corrected' characters as new characters.

Nowadays the number of implementations supporting Unicode has grown to the point that it's impossible to even get an accurate estimate of their number, and since Georgian does not require unusually complicated rendering, I suspect that a considerable amount of data and a considerable number of implementations already exist.
Re: is there any way to change already defined character codes?
Not really for Unicode in which we have relocated some codepoints for Hangul between Unicode 1.1 and 2.0 :)

Regards,
Jianping.

"Christopher J. Fynn" wrote:
> Sandro
>
> I'm sure someone official will give you an official answer, but I know the only answer you are going to get to your question is NO - there is no way to change the encoding point of a character (or to change a character name) once it is in the Unicode or ISO 10646 standards.
>
> Allowing changes like this would break existing implementations of these standards - and of course these standards would be useless as standards if they were subject to that kind of change.
>
> Proposals to encode new characters in the Unicode and ISO 10646 standards have to go through a lengthy process of consideration and there is ample opportunity to submit comments on any proposal during that process. However, once characters are finally assigned code points in the Unicode and ISO 10646 standards, that's it.
>
> May I ask what is the reason these people from the government of Georgia want to change the codepoints of some Georgian characters? There is probably another good solution (or solutions) for whatever problem they think would be solved by changing encoding points.
>
> Regards - Chris
>
> "Sandro Karumidze" [EMAIL PROTECTED] wrote:
>> There are people from the government of Georgia interested in the possibility of altering the Unicode standard in terms of changing the codes for some of the Georgian characters. Does this type of thing happen in the Consortium, and if yes, under what circumstances? If not, can you specify in which rules it is defined that these types of changes are not allowed?
>>
>> Thanks in advance for your support,
>> Best regards,
>> Sandro Karumidze
Re: is there any way to change already defined character codes?
Sandro,

Are you basically wanting the ordering to be different? Unicode does not have any expressed or implied warranty that the ordering of characters will be anything like what a user would expect (how can it, when even so many languages that use the same scripts have entirely different, occasionally conflicting, collation rules?). It is up to the software to make the necessary collation rules happen.

For example, in Windows 2000 there are two different sorts supported for Georgian: "modern" and "traditional." The difference is that modern has four letters (He, Hie, We, and Har, both Capital and Small) sort at the end of the alphabet (which I presume corresponds to the sort that you do not like?), while the traditional sort has:

* He appearing between Zen and Tan
* Hie appearing between Nar and On
* We appearing between Un and Phar
* Har appearing between Xan and Jhan

I presume the above "exceptions" more closely match the sort you would expect? And if there are more, this would be very valuable information (as the rule behind all new "sorts" like this is that a valid need to sort text differently was identified).

As a rule, Unicode code point order is not intended to follow any kind of collation rules.

FWIW, the LCIDs behind these two sorts under Windows 2000 (used in the C CompareString and the VB StrComp) are:

Traditional: 1079 (0x0437)
Modern: 66615 (0x10437)

michka

----- Original Message -----
From: "Sandro Karumidze" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "Unicode List" [EMAIL PROTECTED]
Sent: Tuesday, August 08, 2000 3:26 AM
Subject: Re: is there any way to change already defined character codes?

Dear Chris,

Thank you for your answer.

> May I ask what is the reason these people from the government of Georgia want to change the codepoints of some Georgian characters? There is probably another good solution (or solutions) for whatever problem they think would be solved by changing encoding points.
The issue is that in Unicode the sequence of Georgian characters is different from what these people think it should be. In modern Georgian there are 33 widely used characters. However, before there were 38 characters. At the beginning of this century 5 characters were dropped, though they are still used in old texts and by language specialists. In Unicode these 5 characters follow the 33. There is a different point of view that those 5 should be included among the others. This is the whole issue - there are no specific implementation difficulties or problems. The only point is that placing the 5 among the other 33 is more "correct".

Best regards,
Sandro Karumidze

> Regards - Chris
>
> "Sandro Karumidze" [EMAIL PROTECTED] wrote:
>> There are people from the government of Georgia interested in the possibility of altering the Unicode standard in terms of changing the codes for some of the Georgian characters. Does this type of thing happen in the Consortium, and if yes, under what circumstances? If not, can you specify in which rules it is defined that these types of changes are not allowed?
>>
>> Thanks in advance for your support,
>> Best regards,
>> Sandro Karumidze
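
The two LCIDs and the traditional ordering described in this message can be illustrated with a short Python sketch. It assumes the usual Windows LCID layout (language ID in the low 16 bits, sort ID in the four bits above it, as packed by the MAKELCID macro) and relies on the Unicode character names of the Georgian letters in the U+10D0 block. The helper names split_lcid, letter, traditional_order and traditional_key are made up for this illustration, and the ordering encodes only the four exceptions listed above; it is not the Windows 2000 sort table or a complete collation.

```python
import unicodedata


def split_lcid(lcid):
    """Split a Windows LCID into (sort_id, language_id).

    Assumes the MAKELCID layout: low 16 bits = language ID,
    next 4 bits = sort ID.
    """
    return (lcid >> 16) & 0xF, lcid & 0xFFFF


def letter(short_name):
    """Look up a Georgian letter by the last word of its Unicode name."""
    return unicodedata.lookup("GEORGIAN LETTER " + short_name)


# The 33 modern letters AN..HAE occupy U+10D0..U+10F0 in code point order.
MODERN = [chr(cp) for cp in range(0x10D0, 0x10F1)]

# The four exceptions from the message: each archaic letter is placed
# directly after the named modern letter.
ARCHAIC_AFTER = {"ZEN": "HE", "NAR": "HIE", "UN": "WE", "XAN": "HAR"}


def traditional_order():
    """Return the letters with the four archaic letters interleaved."""
    order = []
    for ch in MODERN:
        order.append(ch)
        short_name = unicodedata.name(ch).rsplit(" ", 1)[-1]
        if short_name in ARCHAIC_AFTER:
            order.append(letter(ARCHAIC_AFTER[short_name]))
    return order


RANK = {ch: i for i, ch in enumerate(traditional_order())}


def traditional_key(text):
    """Sort key: rank of each letter; unknown characters sort after all."""
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in text]


if __name__ == "__main__":
    print(split_lcid(1079), split_lcid(66615))
    sample = [letter("TAN"), letter("HE"), letter("ZEN")]
    print([unicodedata.name(c) for c in sorted(sample)])
    print([unicodedata.name(c) for c in sorted(sample, key=traditional_key)])
```

Under these assumptions both LCIDs share the language ID 0x0437 and differ only in the sort ID (0 for traditional, 1 for modern), which matches the point of the message: the code points stay fixed, and the ordering is selected in software.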
Re: is there any way to change already defined character codes?
On Mon, 7 Aug 2000, Jianping Yang wrote:
> Not really for Unicode in which we have relocated some codepoints for Hangul between Unicode 1.1 and 2.0 :)

Yes, but NEVER AGAIN.

--
John Cowan [EMAIL PROTECTED]
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux, de rapport nyait pas. -- Jacques Lacan, "L'Etourdit"
Re: is there any way to change already defined character codes?
On Tue, 8 Aug 2000, Sandro Karumidze wrote:
> The issue is that in Unicode the sequence of Georgian characters is different from what these people think it should be. In modern Georgian there are 33 widely used characters. However, before there were 38 characters. At the beginning of this century 5 characters were dropped, though they are still used in old texts and by language specialists. In Unicode these 5 characters follow the 33. There is a different point of view that those 5 should be included among the others. This is the whole issue - there are no specific implementation difficulties or problems. The only point is that placing the 5 among the other 33 is more "correct".

Ah, OK. The order of characters in the Unicode Standard is *not* meant to be the proper sort order for any language (even English), and it should not be relied on for that purpose. If any changes are needed, they should be made to the Unicode default collating sequence (which I have not checked) and not to the codes for the characters themselves.

--
John Cowan [EMAIL PROTECTED]
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux, de rapport nyait pas. -- Jacques Lacan, "L'Etourdit"
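
The "even English" remark is easy to check with a small Python sketch. Python compares strings by code point, so a plain sort puts every uppercase word before every lowercase one. The dictionary_key function below is a hypothetical, deliberately simple stand-in for a real collation (it is not the Unicode default collating sequence); it only shows that the expected order comes from a separate sort key, not from the code points.

```python
def dictionary_key(word):
    """Hypothetical sort key: compare case-insensitively first,
    then fall back to the original spelling to break ties."""
    return (word.casefold(), word)


words = ["zebra", "Apple", "apple", "Zoo"]

# Plain sort: raw code point order, uppercase A-Z (U+0041..U+005A)
# sorts before lowercase a-z (U+0061..U+007A).
by_code_point = sorted(words)

# Same words, ordered by the separate key instead.
by_key = sorted(words, key=dictionary_key)

if __name__ == "__main__":
    print(by_code_point)  # ['Apple', 'Zoo', 'apple', 'zebra']
    print(by_key)         # ['Apple', 'apple', 'zebra', 'Zoo']
```

The strings and their code points are identical in both results; only the key changes, which is the kind of change that belongs in a collating sequence.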
RE: is there any way to change already defined character codes?
On 08/08/2000 06:40:17 AM Marco.Cimarosti wrote:
> (You definitely need an official reply, but let's go on with some more informal chatting.)

All the "officials" are busy meeting this week, but the statement "Can't be done" is just as true whether it comes from the lips (or... fingertips) of a Ken Whistler or Mark Davis as from a Marco Cimarosti or a Chris Fynn. There are enough of us on this list with a solid understanding of the standard and its development that a question like this can be answered without waiting for an "official" answer (though this question really ought to be answered somewhere on the Unicode web site); if somebody were to give wrong information, there are several who wouldn't hesitate to correct it.

- Peter

---
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]
Re: is there any way to change already defined character codes?
At 11:01 PM -0800 8/7/00, Jianping Yang wrote:
> Not really for Unicode in which we have relocated some codepoints for Hangul between Unicode 1.1 and 2.0 :)

And have regretted it ever since. Moving the Hangul and renaming æ have caused no end of problems. It was the fact that it was so disastrous when done once that makes everyone determined not to do it again.

--
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.blueneptune.com/~tseng
Re: is there any way to change already defined character codes?
From: [EMAIL PROTECTED]
> [EMAIL PROTECTED] wrote:
>> E.g., if you look at the Latin part, you see that the 26 letters used in modern English are all contiguously ordered in two areas: U+0041 to U+005A (uppercase) and U+0061 to U+007A (lowercase).
>
> Yeah, but so what? All you gotta do is turn the 6th bit off and there you go!
>
>> But that's the end of the story! All the other hundreds of Latin letters are scattered all over, using no consistent order.
>
> Too bad unicode values can't be fractions!!

Let's take this one offline, Robert.

michka
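
For readers unfamiliar with the "6th bit" remark: in the two ranges quoted above, each lowercase letter differs from its uppercase partner only in the bit with value 0x20 (the sixth bit counting from 1 at the least significant end). The Python sketch below shows the trick and why it stops at the 26 basic letters; ascii_upper and CASE_BIT are names invented for this illustration.

```python
CASE_BIT = 0x20  # the only bit that differs between 'a' (0x61) and 'A' (0x41)


def ascii_upper(ch):
    """Uppercase a single character by clearing CASE_BIT.

    Only valid for a-z; anything else is returned unchanged.
    """
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch


# Applying the same bit operation outside a-z gives the wrong letter:
# U+00FF (y with diaeresis) becomes U+00DF (sharp s), while its real
# uppercase form is U+0178.
naive = chr(ord("\u00ff") & ~CASE_BIT)
proper = "\u00ff".upper()

if __name__ == "__main__":
    print(ascii_upper("q"))                   # Q
    print(hex(ord(naive)), hex(ord(proper)))  # 0xdf 0x178
```

This is why case mapping outside the basic range is done with table lookups on character properties instead of arithmetic on code points.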
Re: is there any way to change already defined character codes?
Sandro

I'm sure someone official will give you an official answer, but I know the only answer you are going to get to your question is NO - there is no way to change the encoding point of a character (or to change a character name) once it is in the Unicode or ISO 10646 standards.

Allowing changes like this would break existing implementations of these standards - and of course these standards would be useless as standards if they were subject to that kind of change.

Proposals to encode new characters in the Unicode and ISO 10646 standards have to go through a lengthy process of consideration and there is ample opportunity to submit comments on any proposal during that process. However, once characters are finally assigned code points in the Unicode and ISO 10646 standards, that's it.

May I ask what is the reason these people from the government of Georgia want to change the codepoints of some Georgian characters? There is probably another good solution (or solutions) for whatever problem they think would be solved by changing encoding points.

Regards - Chris

"Sandro Karumidze" [EMAIL PROTECTED] wrote:
> There are people from the government of Georgia interested in the possibility of altering the Unicode standard in terms of changing the codes for some of the Georgian characters. Does this type of thing happen in the Consortium, and if yes, under what circumstances? If not, can you specify in which rules it is defined that these types of changes are not allowed?
>
> Thanks in advance for your support,
> Best regards,
> Sandro Karumidze