Re: [unicode] Telugu Unicode Encoding Review

2010-10-17 Thread mpsuzuki
I'm sorry that my comment addresses only one item in the
comments on the Telugu encoding. The other items are also
interesting (e.g. that the Telugu digits in Unicode are not
taught in schools).

On Sat, 16 Oct 2010 19:49:07 -0700
Asmus Freytag asm...@ix.netcom.com wrote:
  On 10/16/2010 10:38 AM, suzuki toshiya wrote:
 I've never heard any comments, positive or negative, about
 reserving codepoints to keep the code chart structure
 similar across multiple scripts.

The source for this arrangement is an Indian National Standard.

The important thing to remember is that when Unicode was first created, 
it was seen as very important to mimic the layout of 8-bit character 
sets for a given script - at least for those scripts that had fairly 
well established standards in the 80s.

Yes, I know. The layout of the reserved codepoints was
decided by ISCII, not by Unicode or ISO/IEC JTC1/SC2/WG2.
At that time it was reasonable to keep ISO/IEC 10646
similar to the existing ISCII, to ease interchange
between them.

Even a standardization expert who considers the sparse
insertion of reserved codepoints in ISO/IEC 10646 a bad
idea would agree that keeping the same structure as
ISCII is better than an incompatible structure. In fact,
the structures of the Indic scripts not encoded in ISCII
are incompatible with Devanagari (e.g. Lepcha, Limbu,
Meetei Mayek, Ol Chiki, ...).

Anyway, I've never heard any comments from end users
about the reserved codepoints in the Brahmic scripts.
I would like to hear how they are perceived and, if
they are perceived as useless, whether anybody wants
to use the reserved codepoints for other characters.
As you know, in CJK fonts, reserved codepoints are
often used for user-defined or extended characters,
which has caused much trouble in information interchange.
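
For readers who want to see which codepoints are in question, here is a
minimal sketch that lists the unassigned codepoints in the Telugu block
(U+0C00..U+0C7F). It relies only on Python's standard unicodedata module;
the helper name unassigned_in_block is hypothetical, and the exact list
depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Telugu block as defined by Unicode: U+0C00..U+0C7F.
TELUGU_BLOCK = range(0x0C00, 0x0C80)


def unassigned_in_block(block):
    """Return the code points in `block` whose General_Category is Cn.

    Cn means "unassigned" in the Unicode data shipped with this Python,
    so the result can shrink as newer Unicode versions assign characters.
    """
    return [cp for cp in block if unicodedata.category(chr(cp)) == "Cn"]


gaps = unassigned_in_block(TELUGU_BLOCK)
print(f"Unicode {unicodedata.unidata_version}: {len(gaps)} unassigned code points")
print(" ".join(f"U+{cp:04X}" for cp in gaps))
```

Gaps such as U+0C0D, U+0C11 and U+0C29 sit at the same offsets as
Devanagari letters that have no Telugu counterpart, which is the kind of
reservation discussed here.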

While this seems quaint now, it did make it easier for people to become 
comfortable with Unicode - and to be able to tell quickly and reliably 
whether important character sets were fully covered. Without that, 
Unicode might never have established itself - as unbelievable as that 
may sound to those who did not experience that transition period first hand.

I have no objection to this view.

Regards,
mpsuzuki

 Kiran Kumar Chava wrote (2010/10/17 2:06):
 Hi,


 At the link, http://geek.chavakiran.com/archives/55 , I tried to
 understand the Telugu Unicode encoding and then attempted an
 out-of-the-box review of it. Kindly let me know if I am missing
 something, and whether the items listed as missing in the article
 are really missing. Any other views are welcome.


 Thanks in advance,

 Kiran Kumar Chava

 http://chavakiran.com



Re: [unicode] Telugu Unicode Encoding Review

2010-10-16 Thread suzuki toshiya

Hi,

I've never heard any comments, positive or negative, about
reserving codepoints to keep the code chart structure
similar across multiple scripts.
So your comment is interesting. Could you tell me more
about what kind of disadvantages you're thinking of?

If Telugu users were using a 7-bit or 8-bit encoding
and wanted more codepoints for unencoded characters,
the disadvantage (fewer available codepoints) would be
clear. But... you're talking about Unicode.

Regards,
mpsuzuki

Kiran Kumar Chava wrote (2010/10/17 2:06):

Hi,


At the link, http://geek.chavakiran.com/archives/55 , I tried to
understand the Telugu Unicode encoding and then attempted an
out-of-the-box review of it. Kindly let me know if I am missing
something, and whether the items listed as missing in the article
are really missing. Any other views are welcome.


Thanks in advance,

Kiran Kumar Chava

http://chavakiran.com






Re: [unicode] Telugu Unicode Encoding Review

2010-10-16 Thread Asmus Freytag

 On 10/16/2010 10:38 AM, suzuki toshiya wrote:

Hi,

I've never heard any comments, positive or negative, about
reserving codepoints to keep the code chart structure
similar across multiple scripts.
So your comment is interesting. Could you tell me more
about what kind of disadvantages you're thinking of?


The source for this arrangement is an Indian National Standard.

As chapter 9 of TUS states in the introduction:

   They are all encoded according to a common plan, so that comparable
   characters are in the same order and relative location. This
   structural arrangement, which facilitates transliteration to some
   degree, is based on the Indian national standard (ISCII).
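
As an illustration of "same order and relative location", the following
minimal sketch shifts a Devanagari letter to the same offset in the Telugu
block and prints both character names. The block start values are the
published ones (U+0900 and U+0C00); the function names are hypothetical,
and the shift is only a demonstration of the parallel layout, not a
complete transliteration scheme.

```python
import unicodedata

DEVANAGARI_START = 0x0900  # first code point of the Devanagari block
TELUGU_START = 0x0C00      # first code point of the Telugu block


def devanagari_to_telugu(ch: str) -> str:
    """Map a Devanagari character to the same block offset in Telugu."""
    return chr(ord(ch) - DEVANAGARI_START + TELUGU_START)


def parallel_names(ch: str) -> tuple[str, str]:
    """Return the names of a Devanagari character and its Telugu counterpart.

    The placeholder "<unassigned>" is returned when the Telugu position
    is a reserved (unassigned) code point.
    """
    counterpart = devanagari_to_telugu(ch)
    return unicodedata.name(ch), unicodedata.name(counterpart, "<unassigned>")


# KA, GA and MA sit at offsets 0x15, 0x17 and 0x2E in both blocks.
for ch in "\u0915\u0917\u092E":
    print(parallel_names(ch))
```

Where Telugu has no counterpart for a Devanagari letter, the shifted
position is left unassigned, which is how the gaps in the Telugu chart
arise.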

The important thing to remember is that when Unicode was first created, 
it was seen as very important to mimic the layout of 8-bit character 
sets for a given script - at least for those scripts that had fairly 
well established standards in the 80s.


While this seems quaint now, it did make it easier for people to become 
comfortable with Unicode - and to be able to tell quickly and reliably 
whether important character sets were fully covered. Without that, 
Unicode might never have established itself - as unbelievable as that 
may sound to those who did not experience that transition period first hand.


A./




If Telugu users were using a 7-bit or 8-bit encoding
and wanted more codepoints for unencoded characters,
the disadvantage (fewer available codepoints) would be
clear. But... you're talking about Unicode.

Regards,
mpsuzuki

Kiran Kumar Chava wrote (2010/10/17 2:06):

Hi,


At the link, http://geek.chavakiran.com/archives/55 , I tried to
understand the Telugu Unicode encoding and then attempted an
out-of-the-box review of it. Kindly let me know if I am missing
something, and whether the items listed as missing in the article
are really missing. Any other views are welcome.


Thanks in advance,

Kiran Kumar Chava

http://chavakiran.com