Bug#476957: (fwd) Bug#476957: texlive-xetex: Sinhala language support

Jonathan Kew Mon, 21 Apr 2008 06:31:27 -0700

On 21 Apr 2008, at 11:55 am, Anuradha Ratnaweera wrote:

On Sun, Apr 20, 2008 at 11:08 PM, Norbert Preining<[EMAIL PROTECTED]> wrote:


 If you have the *smallest* doubts let me know...


Adding Harshula to the CC list.

If you are looking for doubtful areas in the patch, check the
following.  The rest of the patch only *adds* Sinhala-related
variables, functions and switch conditions.  Even this change just
makes sure ZWJ is not discarded, which shouldn't be a problem for
other scripts that don't use it.

--- layout/LEFontInstance.cpp
+++ layout/LEFontInstance.cpp
@@ -75,7 +75,7 @@
         return 0xFFFF;
     }

-    if (mappedChar == 0x200C || mappedChar == 0x200D) {
+    if (mappedChar == 0x200C) {
         return 1;
     }
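
To make the effect of that one-line change concrete, here is a minimal
standalone sketch of the condition before and after the patch.  It is
not the ICU function itself: the name takesEarlyReturn and the patched
flag are hypothetical, and what the early "return 1" means to the
caller is not stated in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for the check in the hunk above; not ICU code.
// Returns true when mappedChar would take the early `return 1` branch.
bool takesEarlyReturn(std::uint32_t mappedChar, bool patched) {
    const std::uint32_t ZWNJ = 0x200C;  // ZERO WIDTH NON-JOINER
    const std::uint32_t ZWJ  = 0x200D;  // ZERO WIDTH JOINER

    if (patched) {
        // After the patch only ZWNJ is special-cased.
        return mappedChar == ZWNJ;
    }
    // Before the patch both zero-width characters were special-cased.
    return mappedChar == ZWNJ || mappedChar == ZWJ;
}
```

With the patch, ZWJ (0x200D) falls through to whatever the function
does for ordinary characters, while ZWNJ (0x200C) is handled as before.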


        Anuradha
--
http://www.sayura.net/anuradha/

Yes, I realize the patch touches very little existing functionality,
as it is adding support for a new script rather than modifying an
existing one. One other part that might interact somehow would be the
change to the state table:

--- texlive-bin-2007.orig/build/source/libs/icu-xetex/layout/IndicReordering.cpp
+++ texlive-bin-2007/build/source/libs/icu-xetex/layout/IndicReordering.cpp
@@ -326,14 +346,15 @@
     { 1, 1, 1, 5, 8, 3, 2, 1, 5, 9, 5, 1, 1, 1}, // 0- ground state
     {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, // 1- exit state
     {-1, 6, 1, -1, -1, -1, -1, -1, 5, 9, 5, 5, 4, -1}, // 2- consonant with nukta
-    {-1, 6, 1, -1, -1, -1, -1, 2, 5, 9, 5, 5, 4, -1}, // 3- consonant
+    {-1, 6, 1, -1, -1, -1, -1, 2, 5, 9, 5, 5, 4, 11}, // 3- consonant
     {-1, -1, -1, -1, -1, 3, 2, -1, -1, -1, -1, -1, -1, 7}, // 4- consonant virama
     {-1, 6, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, // 5- dependent vowels
     {-1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, // 6- vowel mark
     {-1, -1, -1, -1, -1, 3, 2, -1, -1, -1, -1, -1, -1, -1}, // 7- ZWJ, ZWNJ
     {-1, 6, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 4, -1}, // 8- independent vowels that can take a virama
     {-1, 6, 1, -1, -1, -1, -1, -1, -1, -1, 10, 5, -1, -1}, // 9- first part of split vowel
-    {-1, 6, 1, -1, -1, -1, -1, -1, -1, -1, -1, 5, -1, -1} // 10- second part of split vowel
+    {-1, 6, 1, -1, -1, -1, -1, -1, -1, -1, -1, 5, -1, -1}, // 10- second part of split vowel
+    {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 7, -1} // 11- <ct> <zwj>
 };

This adds the new state 11, and also changes one of the transitions
in the existing state 3 (in order to use the new state). So it
presumably could affect the processing of certain sequences in any
Indic script. I'm not saying this is wrong, or even that it would
make any difference to the actual results for other scripts; it's
probably fine. I simply haven't studied it in order to understand
what is really happening here. I notice that it has similarities to
the newer version in ICU 3.8.1, but is not identical to that (3.8.1
has transitions from both states 2 and 3 to this new state, and also
inserts another new state related to vowels; it also extensively
revises the ground state row of the table). But to feel really
confident about all this, I'll need to understand the Indic shaping
engine better.
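
As a rough way to see which sequences the changed transition touches,
the patched table can be walked by hand. The sketch below is a
simplified model, not the ICU code: the function names and the patched
flag are hypothetical, and the column meanings (5 = consonant,
12 = virama, 13 = zero-width mark) are an assumption inferred from the
state comments in the hunk, not taken from the ICU headers. It starts
in the ground state and counts how many character classes are consumed
before the table yields a negative state.

```cpp
#include <cstddef>
#include <vector>

// Column indices are an ASSUMPTION inferred from the state comments in the
// patch (e.g. ground state, column 5 -> state 3 "consonant").
enum CharClass { CONSONANT = 5, VIRAMA = 12, ZERO_WIDTH = 13 };

const int NUM_CLASSES = 14;
const int NUM_STATES = 12;

// State table as it stands after the patch, copied from the hunk above.
static const int patchedTable[NUM_STATES][NUM_CLASSES] = {
    { 1,  1,  1,  5,  8,  3,  2,  1,  5,  9,  5,  1,  1,  1}, //  0- ground state
    {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, //  1- exit state
    {-1,  6,  1, -1, -1, -1, -1, -1,  5,  9,  5,  5,  4, -1}, //  2- consonant with nukta
    {-1,  6,  1, -1, -1, -1, -1,  2,  5,  9,  5,  5,  4, 11}, //  3- consonant
    {-1, -1, -1, -1, -1,  3,  2, -1, -1, -1, -1, -1, -1,  7}, //  4- consonant virama
    {-1,  6,  1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, //  5- dependent vowels
    {-1, -1,  1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, //  6- vowel mark
    {-1, -1, -1, -1, -1,  3,  2, -1, -1, -1, -1, -1, -1, -1}, //  7- ZWJ, ZWNJ
    {-1,  6,  1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  4, -1}, //  8- independent vowels that can take a virama
    {-1,  6,  1, -1, -1, -1, -1, -1, -1, -1, 10,  5, -1, -1}, //  9- first part of split vowel
    {-1,  6,  1, -1, -1, -1, -1, -1, -1, -1, -1,  5, -1, -1}, // 10- second part of split vowel
    {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  7, -1}  // 11- <ct> <zwj>
};

// Next state for (state, cls). With patched == false the single changed
// transition (state 3, zero-width mark) is put back to -1, which also makes
// the new state 11 unreachable, so this reproduces the original table.
int nextState(int state, int cls, bool patched) {
    if (!patched && state == 3 && cls == ZERO_WIDTH) {
        return -1;
    }
    return patchedTable[state][cls];
}

// Number of leading classes consumed before a negative state is reached.
std::size_t scan(const std::vector<int>& classes, bool patched) {
    int state = 0;  // ground state
    std::size_t consumed = 0;
    for (std::size_t i = 0; i < classes.size(); ++i) {
        state = nextState(state, classes[i], patched);
        if (state < 0) {
            break;
        }
        ++consumed;
    }
    return consumed;
}
```

Under those assumptions, the sequence consonant, zero-width mark,
virama, consonant is consumed whole with the patched table (states
3, 11, 7, 3) but stops after the first consonant with the original
one. The sequence consonant, virama, zero-width mark, consonant goes
through states 3, 4, 7, 3 in both versions, so it is not affected by
this hunk.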


Jonathan



