Re: Malayalam Half-U: how

2002-11-12 Thread Baiju M

In Malayalam (ISO 639-2 language code: mal) there are 37 'vyanchanangal'
(consonants). All these consonants are usually pronounced with the
support of the 'swaram' (vowel) sound A [U0D05]. The pure form of a
consonant is written with a 'chandrakkala' (virama [U0D4D]) above the
consonant. While pronouncing the pure forms of consonants there should be
clear sound of vowel U [U0D09].

Some consonants have another form, which is called 'chillu'. A 'chillu'
is a consonant which does not require any vowel support to be pronounced.
It is written with a vowel sign U [U0D41] and a 'chandrakkala' (virama
[U0D4D]) above that. In fact Malayalam has separate 'lipi' (script) for
the 7 chillu forms of consonants which are widely used in Malayalam.
Since we have separate scripts for most of the chillus, in the writing
system we have almost stopped writing the chillu forms of other
consonants (which rarely occur) as explained above. Even so, you can
still see some texts written in this style.

Antoine said this is the half form of u, that is, the 'samvrutokaram' of
U [U0D09]. (In fact 'samvrutokaram' has a sound of A and U, so the
virama, the vowel sign U and the combination of the two are used in
different places and texts; some linguists say that 'samvrutokaram' has
a vowel value.) Now many are writing consonants with virama for the
chillu forms of other consonants. One example is the one Antoine gave:
U0D15 + U0D41 + U0D4D (ka, u, virama).

So internally a chillu can be represented with a Unicode character
sequence like this:
consonant + vowel sign U [U0D41] + virama [U0D4D].
Then you can render the 7 chillu forms with the correct script. I will
explain how to do this below.

To make input easy you can use the Inscript keyboard layout standardised
by the Kerala govt. (They just added the chillus to the original Inscript
keyboard layout at appropriate positions, considering the frequency of
occurrence of these chillu forms. I will explain the drawback of this
keyboard layout below.)
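
As a rough illustration of the sequence described above, here is a
minimal Python sketch that builds consonant + vowel sign U + virama and
lists the code points. The function names are made up for this example;
only the code points come from the text.

```python
import unicodedata

VOWEL_SIGN_U = "\u0D41"  # MALAYALAM VOWEL SIGN U
VIRAMA = "\u0D4D"        # MALAYALAM SIGN VIRAMA ('chandrakkala')


def chillu_sequence(consonant):
    """Return consonant + vowel sign U + virama (hypothetical helper)."""
    return consonant + VOWEL_SIGN_U + VIRAMA


def describe(text):
    """List each code point with its Unicode character name."""
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch)) for ch in text]


KA = "\u0D15"
seq = chillu_sequence(KA)  # the (ka, u, virama) example from the text
for line in describe(seq):
    print(line)
```

This only shows how the text would be stored; how a font or renderer
displays the sequence is a separate matter.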

The proposal to include the scripts of the chillu forms of consonants as
basic characters should not be accepted by the Unicode Consortium. (It is
going to be submitted (or has already been submitted?) by the Ministry of
Information Technology (Govt. of India), a member of the Unicode
Consortium.) The proposal includes some other things; in my opinion those
changes should be accepted.

Now I will explain how to represent the chillu forms of consonants as
Unicode sequences. An important thing to notice is that two (or more)
consonants may have the same script for their chillu forms, and the
pronunciation is also the same. Even so, each should be represented by
its own correct Unicode sequence. The script for the chillu forms of RA
[U0D30] and RRA [U0D31] is the same. Similarly, the script for the chillu
forms of LLA [U0D33] and LLLA [U0D34] is the same. The other consonants,
whose chillu forms have unique scripts, are NNA [U0D23], NA [U0D28] and
LA [U0D32].
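
The many-to-one relation between consonants and chillu scripts described
above could be written down as a small table. This is only a sketch: the
shape labels such as "chillu-R" are names invented for this example, not
glyph names from any font or standard.

```python
from collections import defaultdict

# Consonant -> label of the chillu script it uses (labels are hypothetical).
CHILLU_SHAPE = {
    "\u0D23": "chillu-NN",  # NNA, unique script
    "\u0D28": "chillu-N",   # NA, unique script
    "\u0D32": "chillu-L",   # LA, unique script
    "\u0D30": "chillu-R",   # RA
    "\u0D31": "chillu-R",   # RRA, same script as RA
    "\u0D33": "chillu-LL",  # LLA
    "\u0D34": "chillu-LL",  # LLLA, same script as LLA
}


def consonants_by_shape(table):
    """Group consonant code points by the chillu script they share."""
    groups = defaultdict(list)
    for consonant, shape in table.items():
        groups[shape].append("U+%04X" % ord(consonant))
    return dict(groups)


groups = consonants_by_shape(CHILLU_SHAPE)
for shape, consonants in groups.items():
    print(shape, consonants)
```

Seven consonants map onto five scripts, so the script alone does not say
which consonant was meant.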

Why should the 5 scripts of the 7 chillu forms of consonants not be
included in Unicode?
-------------------------------------------------------------------

* The basic reason is that those 5 'lipi' (scripts) are not part of the
  Malayalam 'Aksharamala' (character set). Instead they are chillus only.
  (Note that a chillu is not a 'koottaksharam' (consonant conjunct).)

Supporting reasons :-

  + As I explained above, two (or more) consonants use the same script
    for their chillu forms. So if these 'simple shapes' become part of
    Unicode, hard encoding of chillus will be impossible. If someone
    inputs the correct Unicode sequence, the renderer should still render
    those characters, and this will create more problems.

  + Sorting rules cannot be implemented effectively.
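
One possible reading of the sorting point, as a sketch only: plain
code-point order stands in for a real sorting rule here, and the
private-use code point U+E000 is a placeholder for an imagined
separately encoded chillu shape, not a real assignment.

```python
VOWEL_SIGN_U = "\u0D41"
VIRAMA = "\u0D4D"
RA, RRA, LA = "\u0D30", "\u0D31", "\u0D32"

# Placeholder for an imagined stand-alone chillu character shared by RA
# and RRA. U+E000 is a private-use code point chosen only for this sketch.
HYPOTHETICAL_ATOMIC_R = "\uE000"


def as_sequence(consonant):
    """Chillu written as consonant + vowel sign U + virama."""
    return consonant + VOWEL_SIGN_U + VIRAMA


# With sequences, each chillu sorts right after its own base consonant.
sequence_items = [LA, as_sequence(RRA), RRA, as_sequence(RA), RA]
sorted_sequences = sorted(sequence_items)

# With a stand-alone shape, the item carries no base consonant at all, so
# the order cannot place it next to RA or next to RRA.
atomic_items = [LA, HYPOTHETICAL_ATOMIC_R, RRA, RA]
sorted_atomic = sorted(atomic_items)

print([["U+%04X" % ord(c) for c in item] for item in sorted_sequences])
print([["U+%04X" % ord(c) for c in item] for item in sorted_atomic])
```

A real collation rule would be more involved than comparing code points;
the sketch only shows what information each representation keeps.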

Inscript keyboard layout problems :-
------------------------------------

I think the drawback of the new Inscript keyboard layout standardised by
the Kerala govt. will be clear from the above discussion. Even so, the
layout can be accepted on practical grounds. Since we are only using
those scripts, we can assign any character sequence to the keys allocated
to them. Here the choice comes between the RA [U0D30] and RRA [U0D31]
chillus, and between the LLA [U0D33] and LLLA [U0D34] chillus.
Considering the accent of pronunciation and the frequency of occurrence
of these chillus, you can choose RRA [U0D31] and LLA [U0D33]. In fact the
right one can only be decided by considering the word.
For example :-
RA [U0D30] + vowel sign U [U0D41] + virama [U0D4D] is correct in words :
neer - neere (water), avar - avare (they),  aar - aare (who) etc.

and RRA [U0D31] + vowel sign U [U0D41] + virama [U0D4D] is correct in words :
car - caRe (car), kiNar - kiNaRe (well), sir - saRe (sir) etc.

So if someone inputs the other correct sequences (without using those
keys), they should still render properly.
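
A minimal sketch of that last requirement, assuming a renderer that
looks for consonant + vowel sign U + virama and picks a script from the
consonant. The shape labels and the KEY_DEFAULT table are invented for
this example; they are not taken from the Inscript layout or any font.

```python
import re

VOWEL_SIGN_U = "\u0D41"
VIRAMA = "\u0D4D"
KA, RA, RRA, LLA = "\u0D15", "\u0D30", "\u0D31", "\u0D33"

# Consonant -> hypothetical label of the chillu script to draw.
SHAPE_FOR = {
    "\u0D23": "chillu-NN", "\u0D28": "chillu-N", "\u0D32": "chillu-L",
    "\u0D30": "chillu-R", "\u0D31": "chillu-R",
    "\u0D33": "chillu-LL", "\u0D34": "chillu-LL",
}

# What a chillu key might emit by default (RRA and LLA, as suggested above).
KEY_DEFAULT = {
    "chillu-R": RRA + VOWEL_SIGN_U + VIRAMA,
    "chillu-LL": LLA + VOWEL_SIGN_U + VIRAMA,
}

# NNA, NA, and the range RA..LLLA, followed by vowel sign U and virama.
CHILLU_RE = re.compile("([\u0D23\u0D28\u0D30-\u0D34])" + VOWEL_SIGN_U + VIRAMA)


def find_chillus(text):
    """Return (index, shape label) for every chillu sequence in text."""
    return [(m.start(), SHAPE_FOR[m.group(1)])
            for m in CHILLU_RE.finditer(text)]


typed_with_key = KA + KEY_DEFAULT["chillu-R"]   # key emits the RRA sequence
typed_by_hand = KA + RA + VOWEL_SIGN_U + VIRAMA  # RA sequence, no chillu key
print(find_chillus(typed_with_key))
print(find_chillus(typed_by_hand))
```

Both inputs select the same script, while the stored text still records
whether RA or RRA was meant.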

P.S : please reply to [EMAIL PROTECTED]

Regards,
Baiju M
--
http://baijum81.tripod.com


--- In [EMAIL PROTECTED], Antoine LECA [EMAIL PROTECTED] wrote:
 Hi folks,

 A problem was signaled in the Microsoft VOLT mailing list (this list
 should be dedicated

Re: [smc-devel] Re: Malayalam Half-U: how

2002-11-12 Thread Baiju M
 While pronouncing the pure forms of consonants
 there should be
 clear sound of vowel U [U0D09]

One correction here: it should read "there should not be".

Regards,
Baiju M






Re: glyph selection for Unicode in browsers

2002-09-28 Thread Baiju M

Can anyone clarify this one?
The Microsoft page at
http://www.microsoft.com/typography/OTSPEC/indicot/default.htm
says Malayalam chillu glyphs are formed when inputting
(consonant)+(virama). Can I use another formation for chillus? I
want to use (consonant)+(virama)+(ZWJ); is there any problem with that?
And is there any problem if I give the ligature formation this way
in the OpenType tables?
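
For reference, the two candidate formations can be written out as code
points. This is only a sketch with made-up helper names; it shows the
stored sequences and says nothing about which one a given font or
shaping engine will turn into a chillu glyph.

```python
import unicodedata

VIRAMA = "\u0D4D"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
NA = "\u0D28"      # MALAYALAM LETTER NA, used here as a sample consonant


def with_virama(consonant):
    """(consonant)+(virama), the formation described on the Microsoft page."""
    return consonant + VIRAMA


def with_virama_zwj(consonant):
    """(consonant)+(virama)+(ZWJ), the alternative asked about here."""
    return consonant + VIRAMA + ZWJ


for label, seq in (("virama", with_virama(NA)),
                   ("virama+ZWJ", with_virama_zwj(NA))):
    print(label, [unicodedata.name(ch) for ch in seq])
```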

Regards,
Baiju M

--- [EMAIL PROTECTED] wrote:
 Quoting [EMAIL PROTECTED]:
 
  Actually, my point was specifically that *part* of the infrastructure
  is already present, at least in OpenType, but not *all*, either in
  OpenType (meaning of language in the OT spec needs to be clarified,
  and relationships between these tags and the language tags used for
  data e.g. RFC 3066, need to be resolved)...
 
 'Language system' (not 'language') in the OpenType specification
 actually means *writing* system, i.e. a particular set of
 orthographic/typographic conventions associated with the use of a
 particular script. 'Language system' is a misnomer -- an historical
 artifact of the incomplete understanding of the format's original
 designers --, and it has caused all sorts of confusion, especially
 among people who assume that the OT 'language system' tags must have
 some relationship to things like NLS tags. There is no necessary
 relationship and, indeed, it is possible to conceive of a user wanting
 to apply, for instance, the typographic conventions of German to a
 language other than German.
 
 I've suggested to Microsoft and Adobe that the term used in the spec
 should be changed, or at least annotated.
 
 John Hudson
 

