Reply to Kent Karlsson

delex r Sat, 10 Sep 2011 11:25:53 -0700
> I figure out that Unicode has not addressed the sovereignty issues of a 
> language 
Which, I dare say, is irrelevant from a *character* encoding perspective. 
## Is language not the next thing in that perspective? “Unicode is the universal 
character encoding, maintained by the Unicode Consortium. This encoding 
standard provides the basis for processing, storage and interchange of text 
data in ANY LANGUAGE in all modern software and information technology 
protocols.”
> while trying to devise an ASCII like encoding system for almost all 
> the characters and symbols used on earth. I am continuing with my observation 
> of the glaring mistake done by Unicode by naming a South Asian Script as 
> “Bengali”. Here I would like to give certain information that I think will be 
> of some help for Unicode in its endeavour to faithfully represent a Universal 
> Character encoding standard truer to even micro-facts. 
> 
> India is believed to have at least 1652 mother tongues out of which only 22 
One list of languages in India is given in 
http://www.ethnologue.com/show_country.asp?name=IN 
(I did not count the number of entries) 
## Didn’t you find Assamese there? My argument was in the lines above, where I 
said Unicode should not have named the code range 0980 to 09FF as “Bengali 
script”, since that name is itself the point of ambiguity. You seem to have 
overlooked that.
> are recognized by the Indian Constitution as official languages for 
> administrative communication among local governments and to the citizens. And 
> the constitution has not explicitly recognized any official script. As 
> Unicode 
> has listed the languages and scripts, the Indian Constitution has also listed 
Unicode does not list any languages at all. Ok, the CLDR subproject copies a 
list of language codes from the IANA language subtag registry, which (in a 
complex manner) takes its language codes from (among others) the ISO 639-3 
registry, which largely is in sync with Ethnologue (as in the list above); 
but I guess that is not what you referred to. 
## I referred to 
http://unicode.org/cldr/data/charts/supplemental/languages_and_scripts.html . 
Is that not a Unicode publication? Here I would like to object to naming a 
script “Bengali” (coded Beng) and informing the world that this script is used 
for the Assamese language. If you say it is done just for the sake of naming a 
set of characters to put them in a block, then please tell me why the name 
“Bengali” was chosen and not “Assamese”.
> the official languages ( In its 8th schedule). The first entry in that list 
> is 
> the Assamese language. Assamese is a sovereign language with its own grammar 
Which I don't think is in dispute at all.
## Is there any dispute over the ownership of the “script” it uses? Assamese and 
Bengali are both sovereign languages and use scripts that are almost the same, 
but I have pointed out the differences at 09F0, 09F1 and 09B0. The Assamese 
script contains an extra, unique letter (09F1), so naming the block “Assamese” 
would be more appropriate, if Unicode wants to group the codes into blocks and 
name them at all.
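For reference, the names currently assigned to the three code points mentioned above can be looked up directly. This is a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the helper name `describe` is purely illustrative and not part of any library.

```python
import unicodedata

# Code points cited in this thread (hexadecimal values as given above).
CODE_POINTS = [0x09B0, 0x09F0, 0x09F1]


def describe(code_point: int) -> str:
    """Return 'U+XXXX  NAME' using the bundled Unicode character database."""
    return f"U+{code_point:04X}  {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    # Print one line per code point with its formal character name.
    for cp in CODE_POINTS:
        print(describe(cp))
```

The output shows only the formal character names recorded in the character database; it says nothing about which languages actually use each letter.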
> and “script” that contains some unique characters that you will not find in 
> any of the scripts so far discovered by Unicode. At least 30 million people 
Unicode (at this stage) does not do any "discovery". Unicode and ISO/IEC 
10646 is driven by applications (proposals) to encode characters (and define 
properties of characters). 
## If it is not “discovery”, then is it “invention” of characters that Unicode 
is doing? By now I see that Unicode has certainly “invented” a couple of 
letters, 09F0 and 09F1, and has also arbitrarily described them as “Bengali Ra” 
and so on. I fear what Unicode may be doing with other scripts and languages!
> call it the “Assamese Script” and if provided with computers and internet 
If you want to disunify the Bengali script (and characters) from Assamese, 
you need to show, in a proposal document, that they really are different 
scripts, and should not be unified as just different uses of the same 
script. 
## I guess Unicode has taken on the burden of investigating, classifying, 
documenting and codifying the world of scripts, and of unifying and disunifying 
them for that matter. It should be doing so factually, without any bias, and 
not arbitrarily. Unicode must first question the advocates of the name 
“Bengali Script” about their credibility.
> connection can bomb the Unicode e-mail address with confirmations. These 
Hmm, an email bombing threat... I'm sure Sarasvati can find a way to block 
those (or we may all simply file them away as spam). 
## Ha ha, the threat perception is very high, I see. Why? Believe me, that was 
not a threat at all, I swear. 
> characters are, I repeat, the one that is given a Hexcode 09F0 and the other 
> with 09F1 by this universal character encoding system but unfortunately 
> has described both as “Bengali” Ra etc. etc. I don’t know who has advised 
> Unicode to use the tag “Bengali” to name the block that includes these two 
> characters. 
> 
> If you are not an Indian then just google an image of an Indian Currency 
> note. 
> There on one side of the note you will find a box inside which the value of 
> the currency note is written in words in at least 15 scripts of official 
> Indian languages. (I don’t know why it is not 22.) At the top, the script is 
> Assamese as Assamese is the first officially recognized language (script?) . 
> Next below it you will find almost similar shapes. That is in Bengali. India 
> officially recognises the distinction between these two scripts which 
> although 
> shaped similar but sounds very different at many points. And the standard 
Minor font differences are not a reason for disunification. Different 
pronunciations of the same letters are not a reason for disunification 
either. Just think of how many different ways Latin letters (and letter 
combinations) are pronounced in different languages (x, j, h, v, w, f, ...; 
even "a" gets different pronunciation in British English vs. US English, 
and that is within the same language...; and most orthographies aren't 
very accurately phonetic anyway, with quite a bit of varying (contextual 
and dialectal) pronunciation for the letters). 
## Those were illustrations, not points of contention. You did not tell me 
about the advisors, though. OK.
> Assamese alphabet set has extra characters which are never Bengali just like 
> London is never in Germany. 
There are 8 Londons in the USA, two in Canada, one in Kiribati, ... ;-) 
(http://en.wikipedia.org/wiki/London_(disambiguation)) 
## Maybe. But you can’t create a London within Germany, as Unicode has done by 
creating or inventing a couple of letters, 09F0 and 09F1, within what it calls 
the Bengali range (the “Germany” here).
> Coming again to the Hexcodes 09F0 (Raw) and 09F1 (wabo). Both have nothing 
> Bengali in them and interestingly 09F1 ( sounds WO or WA when used within 
> words) has even nothing ‘Ra’ sound in it. Thus you know, with actual Bengali 
> alphabet set one can’t write anything to produce the sound “Watt” as in James 
> Watt and instead need to combine three alphabets but even then only to sound 
> like “OOYAT” in Bengali itself. 
Yes, English has a rather peculiar pronunciation for the letter W... ;-) 
Several languages will pronounce Watt (without changing the spelling) as 
Vatt, and regard that as a normal pronunciation of Watt. 
## I am happy that you agree that phonetic deficiencies exist in language 
scripts. But here the deficiency is due to the lack of a letter that produces 
the proper sound, and is not attributable to accent. So I repeat: the standard 
Bengali alphabet without 09F1 is phonetically deficient compared to an 
Assamese script that is named so as to include 09F1. 
> Therefore Unicode must consider terming the block range as “Assamese” which 
> will faithfully describe the block range with 09F0 and 09F1 in it and replace 
> all tags “Bengali” with “Assamese” in the code descriptions and vice versa . 
> London is in England and Berlin is in Germany. You just can’t bring London 
> into Germany and then say England is in Germany. You can’t live with a lie or 
> wrong too long. 
See above re. London. ;-) As for Berlin: see 
http://en.wikipedia.org/wiki/Berlin_(disambiguation)... 
(I still fail to see how this would be analogous in any way whatsoever to 
your quest.) 
## Please do not invent a London in Germany.
Yes, I have responded with a quite large dose of irony. Drier and to the 
point responses by others seem to have passed unnoticed. ## ???
Continue: Glaring mistake in the code list for South Asian Script//Reply to Kent Karlsson
