On Oct 27, 2011, at 12:36 p.m., Andrew Moschou wrote:

> On 27 October 2011 16:52, Shrisha Rao <sh...@nyx.net> wrote:
> Being able to use UTF-8 codings in such scripts to produce outputs in other
> scripts would require n × n mappings, as against 1 × n if the input is only
> in ITRANS.
>
> Actually, 1×n is all that is required, as long as the mappings are
> bijectional, but this requires two passes. Firstly to convert to ITRANS, then
> secondly to the desired script. I can identify one instance where the mapping
> is not bijectional, as ব in Bengali stands for both ब and व in Devanagari,
> and this has already been mentioned here, I believe, but even so, a set of
> n×n mappings doesn't help this situation.
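A minimal sketch of the two-pass idea quoted above, in Python. The tables are hypothetical and cover only three consonants (each with its inherent "a"); they are not taken from xetex-itrans or any real ITRANS implementation. Each script gets a single ITRANS-to-script table and pass 1 uses its inverse, so n scripts need n tables rather than n × n. The Bengali table shows where the inversion stops being well defined.

```python
# Hypothetical, deliberately tiny ITRANS-to-script tables (one per script).
FROM_ITRANS = {
    "devanagari": {"ka": "क", "ba": "ब", "va": "व"},
    "kannada": {"ka": "ಕ", "ba": "ಬ", "va": "ವ"},
    # Not a bijection: "ba" and "va" both land on the same Bengali letter.
    "bengali": {"ka": "ক", "ba": "ব", "va": "ব"},
}


def invert(table):
    """Build a script-to-ITRANS table from an ITRANS-to-script table.

    If two tokens share a letter, the first one listed wins. That choice is
    arbitrary, which is exactly the information a non-bijective table loses.
    """
    inverse = {}
    for token, letter in table.items():
        inverse.setdefault(letter, token)
    return inverse


def to_itrans(text, script):
    """Pass 1: source script -> list of ITRANS tokens."""
    table = invert(FROM_ITRANS[script])
    return [table[ch] for ch in text]


def from_itrans(tokens, script):
    """Pass 2: list of ITRANS tokens -> target script."""
    table = FROM_ITRANS[script]
    return "".join(table[tok] for tok in tokens)


def convert(text, source, target):
    """Two passes through the ITRANS pivot."""
    return from_itrans(to_itrans(text, source), target)


if __name__ == "__main__":
    print(convert("कब", "devanagari", "kannada"))   # ಕಬ
    there = convert("व", "devanagari", "bengali")   # ব
    back = convert(there, "bengali", "devanagari")
    print(back == "व")                              # False: comes back as ब
```

Between Devanagari and Kannada the round trip is lossless in this sketch. Through Bengali, व comes back as ब, and adding direct script-to-script tables would not recover the distinction either, since the ambiguity is in the script and not in the pivot.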
The problem is much worse with Tamil, which does not have separate symbols/sounds for क, ख, ग, घ, or प, फ, ब, भ, etc. (which is also a reason that Tamil speakers are known to mispronounce Sanskrit words/names where these consonants are found).

What I meant is that 1 × n is preferable when Sanskrit texts (as suggested in the thread subject) are to be expressed in multiple fully Sanskrit-compatible scripts.

I don't know about Bengali, but I believe for Tamil there are special extended notations, something like க் with subscript 1 being क, with subscript 2 being ख, etc. These are non-classical typographical notations and have no Unicode representation, so there is no way to handle the situation with either 1 × n or n × n.

The xetex-itrans package offers a way to express Tamil using ITRANS input, and Sanskrit in Kannada/Roman/Telugu/Devanagari script using ITRANS input, but not a way to code Sanskrit in Tamil script using ITRANS input. As far as I know, there is no straightforward way to code Sanskrit in Tamil script using Unicode.

Regards,

Shrisha Rao

> Andrew

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex