On Mon, 15 Aug 2011 07:21:20 +0530 Shriramana Sharma <[email protected]> wrote:
> On 08/15/2011 01:48 AM, Richard Wordingham wrote:
> > The issue is on the relative ordering of candrabindu and virama.
> > For a C1-conjoining form (i.e. C2 relatively unmodified), <la virama
> > candrabindu la> is easier to handle. For a C2-conjoining form, <la
> > candrabindu virama la> is easier to work with.
>
> Hmm -- perhaps you mean this is so because it would be possible to
> easily map Virama + LA to the C2-conjoining form?

That's my motivation.

> This is true
> enough, but it is advisable to have a single uniform representation
> across Indic scripts and that is LA + Virama + Candrabindu + LA
> (because of the reasons outlined by Peter and me in the previous
> mails I have linked to from the archives).

I can't think of any characters that can be viewed as decomposing in
some sense to consonant + virama. There are quite a few characters that
are functional equivalents to virama + consonant, and some of these
should be folded with virama + consonant in some applications.

> > <snip>
> I know that and that is why I distinguish "Indian" Indic scripts and
> "non-Indian" (i.e. South East Asian [SEA]) Indic (i.e. Brahmic)
> scripts, especially in Unicode. It seems that at least in Khmer (I
> didn't check the other charts/chapters) one vocalic R/L vowel is
> represented by the independent vowel presented as a sub-base (which
> you call C2,...

This is not what I was talking about. The best relevant examples in
TUS 6.0 Section 11.4 are the words for "both" and "already". The former
actually has nikahit + coeng!

> Hmmm -- I'm not sure I entirely grok the SEA situation with Thai/Tai
> Tham/Khmer etc, but I'm sure the handling of vowelless consonants and
> conjoining forms in those scripts does deviate from the *Indic*
> model. For example, see that stuff about the Balinese Surang and how
> it is handled...

Consider it a generalisation of anusvara! The Limbu and Lepcha scripts
have an array of final consonants, formally divorced from initial
consonants. Kharoshthi apparently used conjoining forms for final
consonants, though examples are few and TUS 5.0 says virama cannot
follow a vowel. In the Kharoshthi script, the difference between a
subscript MA and ANUSVARA is slight to vanishing.

> > I've seen a claim that vowels within Tibetan consonant stacks can be
> > handled sensibly within the confines of Unicode - I didn't
> > investigate it.

> I don't understand what you mean by "vowels within Tibetan consonant
> stacks".

All I've got to go on is the penultimate sentence in TUS 6.0 Section
10.2 - 'Rarely, stacks are seen that contain more than one such
consonant-vowel combination in a vertical arrangement'.

> I also don't know whether Tibetan language written in
> Tibetan script requires the conjoining forms of vowels but I do know
> (to an extent) that Sanskrit written in Tibetan doesn't require
> "conjoining" forms of vowels per se generated by a virama-like
> character.

The Tibetan script doesn't have a combining virama. I would expect the
natural coding to be something like letter-vowel-subjoined letter-vowel,
e.g. <U+0F40 TIBETAN LETTER KA, U+0F74 TIBETAN VOWEL SIGN U, U+0FB2
TIBETAN SUBJOINED LETTER RA, U+0F74 TIBETAN VOWEL SIGN U>. A formal
analogue would be the Thai word <U+1A2F TAI THAM LETTER DA, U+1A50 TAI
THAM LETTER UU, U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA, U+1A63 TAI
THAM VOWEL SIGN AA>, but it doesn't match visually - its second vowel
goes to the right of the consonants.

Richard.
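
For anyone who wants to inspect the two sequences named above, here is a
minimal Python sketch that builds them from their code points and lists
the character names using the standard unicodedata module. The variable
and function names are illustrative only, and the code says nothing
about how a font or shaping engine would render either sequence.

```python
import unicodedata

# Code points exactly as listed in the message; the names are illustrative.
TIBETAN_STACK = "\u0F40\u0F74\u0FB2\u0F74"   # letter, vowel, subjoined letter, vowel
TAI_THAM_WORD = "\u1A2F\u1A50\u1A55\u1A63"   # the formal analogue cited above

def describe(text):
    """Return (U+XXXX, character name) pairs for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for label, seq in (("Tibetan", TIBETAN_STACK), ("Tai Tham", TAI_THAM_WORD)):
    print(label)
    for code, name in describe(seq):
        print(f"  {code} {name}")
```

Listing the names this way is only a check that the code points and
names in the text agree with the character database shipped with the
interpreter.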

