I have looked a bit more at get-text-property, and I think that besides 
char-charset now being priority-based, we have another problem: the coding 
system is also priority-based, and can switch between 'emacs-mule and 
'utf-8-emacs quite freely.

With the CJKbabel example, if you look at the Vietnamese section and run 
C-u C-x = on the 3rd character (the first non-ASCII one), you will see a 
strange thing (at least with Emacs 23 on my system): it reports one value for 
"buffer code:" but a different one for "file code:".

How it gets into the infinite loop is this: get-text-property says the 
character is Vietnamese, but 'following-char' returns the Unicode value rather 
than the emacs-mule value. When that value gets to split-char, it returns 0 30 
instead of 37 0 (what Emacs 22 does). (I am typing this from memory, away from 
my computer, so some of the details might be different.) This is a complete 
surprise to the code below, of course, since that code expects Vietnamese to 
have ch2 zero.
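
Something like the following could be used to look at the three values side by 
side. This is only an untested sketch: I am assuming the text property in 
question is named 'charset, and the exact values printed will depend on the 
Emacs version. Evaluate it (e.g. with M-:) with point on the character.

```elisp
;; Diagnostic sketch: compare what the text property, char-charset and
;; split-char each report for the character at point.
;; ASSUMPTION: the property name 'charset is a guess, not taken from the code.
(let* ((ch (following-char))                       ; character code at point
       (prop (get-text-property (point) 'charset)) ; charset recorded as a property
       (parts (split-char ch)))                    ; (CHARSET CODE1 [CODE2])
  (message "char=#x%x property=%S char-charset=%S split-char=%S"
           ch prop (char-charset ch) parts))
```

If the property and the charset reported by split-char disagree, that would 
match the mismatch described above.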

So char-charset is priority-based; it seems that following-char (or the 
encoding of the current buffer) is also priority-based; and split-char does 
the wrong thing because it occasionally receives a UTF-8 value rather than an 
emacs-mule value. I don't know how that logic works, but obviously it does not 
happen all the time, since the Korean and Japanese sections before that work, 
or at least do not appear to throw an error.

I hope you can see some/all of the things I describe here.

I can't seem to find any detailed description of the emacs-mule encoding, nor 
of split-char (the current Emacs Lisp manual doesn't list the latter, and has 
very little info on the former).

_______________________________________________
Cjk maillist  -  Cjk@ffii.org
https://lists.ffii.org/mailman/listinfo/cjk
