Tested it with EMACS_PRETEST_24_0_92-142-g559675b (roughly today's emacs git master head), and it produces *.cjk results identical to those from emacs 22 with the unpatched version, for all 4 files: CJKbabel.tex, muletest.tex, rubytest.tex and thai.tex. So I think the problem is fixed.
I checked the thai issue: emacs 24 behaves the same as emacs 23. For thai.tex, it shows tis620-2533 for the whole document (i.e. the ascii portions are tagged as thai too), whereas in CJKbabel.tex the thai parts are thai-tis620 while the ascii parts are nil. I think this is probably because thai.tex actually claims tis620 for the whole document. According to emacs's source code, tis620-2533 is a superset of thai-tis620 and ascii, so that is how get-text-property behaves; char-charset also behaves the same way for thai.tex, unless restricted. For CJKbabel.tex, 'FAQ' and 'textbf' are treated as ascii and distinct.

--- On Thu, 15/12/11, Hin-Tak Leung <hintak_le...@yahoo.co.uk> wrote:

> Finally!
>
> Here is a patch against your git-head (same as v4.8.2) of
> your cjk-enc.el. It includes your define-coding-system patch
> also. Tested okay for both emacs 22 and 23, and I should
> expect the same for 24, caveat the thai issue below.
>
> So, in the end, it is almost all unicode-related. The
> changes are:
>
> - define-coding-system (make-coding-system deprecated)
> - char-charset returns unicode (and also sensitive to
>   priority) in emacs 23.
>   switch over to use text-property:charset as charset, which
>   seems more reliable
> - there is a new charset/text-property called
>   'tis620-2533', which is a superset of ascii and thai-tis620,
>   this has the tendency of swallowing up every ascii
>   character to the end of file and make the code go into an
>   infinite loop... This is seen with thai.tex, which is just
>   thai and ascii. so back out of that and go back to
>   char-charset with restriction.
>
> - split-char also returns unicode plus code point and also
>   sensitive to priority, instead of charset + code point. so
>   set priority to text-property for it.
>
> Now that I have it working, it probably explain why I had
> an almost correct version earlier, then lost it. Then I had
> priority set to high for known ones, restrict search to
> known ones, then make the priority choice sticky. That
> differ in that the sticky choice could overflow into the
> next language change, whereas this "correct" solution, while
> priority is set for split-char to work and almost, it is
> reset back to that from text-property in the next round.
>
> I suspect an alternative 'correct' solution would be to set
> priority to a fixed known list before each char-charset
> pushing unicode to the end, then maybe even the split-char
> would work (I had it set before the whole loop, therefore it
> spill over to the next language section).
>
> Give it a go with emacs 24/bzr, and see if it works?

_______________________________________________
Cjk maillist - Cjk@ffii.org
https://lists.ffii.org/mailman/listinfo/cjk