Sorry, the last patch has a minor problem: it treats the whole of 0x0EXX as
Thai, but 0x0E80-0x0EFF is Lao. While it is unlikely(?) that somebody would want
to use both (is there a LaTeX package for Lao?), we had better do it correctly.
BTW, your TUGboat article from 11 years ago says the word-breaking algorithm
might be in the next version of emacs... Hasn't that happened yet? :-).
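
For anyone who wants to check the range logic outside emacs, here is a rough
sketch in Python (not the lisp from the patch). The function names are made up
for illustration, and it uses Python's own "tis_620" codec rather than emacs's
encode-char, so treat it as a cross-check under those assumptions and not as
what cjk-enc.el does internally.

```python
# Illustrative sketch only: helper names are hypothetical, not from cjk-enc.el.
# Thai occupies U+0E00-U+0E7F; Lao occupies U+0E80-U+0EFF.


def is_thai(ch):
    """True if ch has high byte 0x0E and low byte below 0x80 (Thai block)."""
    cp = ord(ch)
    return (cp >> 8) == 0x0E and (cp & 0xFF) < 0x80


def is_lao(ch):
    """True if ch has high byte 0x0E and low byte 0x80 or above (Lao block)."""
    cp = ord(ch)
    return (cp >> 8) == 0x0E and (cp & 0xFF) >= 0x80


def to_tis620_or_none(ch):
    """Return the TIS-620 byte value for a Thai character, else None."""
    if not is_thai(ch):
        return None
    try:
        return ch.encode("tis_620")[0]
    except UnicodeEncodeError:
        # Code point lies inside the Thai block but has no TIS-620 mapping.
        return None


if __name__ == "__main__":
    for sample in ("\u0e01", "\u0e81", "A"):
        print(hex(ord(sample)), is_thai(sample), is_lao(sample),
              to_tis620_or_none(sample))
```

Checking only the high byte (the old behaviour) would have made is_thai return
true for U+0E81 as well, which is the mistake the patch below corrects.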
--- On Tue, 27/12/11, Hin-Tak Leung <hintak_le...@yahoo.co.uk> wrote:
> Here is another patch - I really
> would like to have Thai inputs in utf8 instead of TIS620, so
> it happened :-). plus example input file and change log.
>
> This patch is a somewhat unusual approach - it isn't using
> C70 font definition, nor doing font re-encoding, but uses
> emacs's character encoding capability to transform
> unicode-Thai to tis620-Thai before doing word-breaking.
>
> Do you think it is worth adding similar
> unicode->regional hooks for the other babel
> single-byte-encodings? (I read up on emacs-mule and it is
> really a family of encodings rather than a single one like
> unicode...that's possibly how emacs preserves charset info)
> - one definitely does not want to add the double-byte ones.
> I suppose only Thai is dependent on an external
> word-breaking program.
>
> While knowledge of lisp isn't as common as that of C/C++,
> emacs is (currently) more portable/ported than swath... So
> what are the advantages of using ThaiLaTeX? (besides the
> obvious and vague one like 'written by a native' - there are
> a lot of ugly latex things from the Chinese as well...)
From 8bcb26d2739e8e2cf4c59edcfca0f9d80d174ad7 Mon Sep 17 00:00:00 2001
From: Hin-Tak Leung <ht...@users.sourceforge.net>
Date: Fri, 30 Dec 2011 14:30:17 +0000
Subject: [PATCH] [cjk-enc.el] Correct minor issue with last commit.
Thai is 0x0E00-0x0E7F only. The previous commit mistakenly
treats all of 0x0EXX (the upper range being Laos) as Thai.
---
ChangeLog | 6 ++++++
utils/lisp/emacs/cjk-enc.el | 7 ++++---
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 43d2df0..6edace8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2011-12-30 Hin-Tak Leung <ht...@users.sourceforge.net>
+ [cjk-enc.el] Correct minor issue with last commit.
+
+ Thai is 0x0E00-0x0E7F only. The previous commit mistakenly
+ treats all of 0x0EXX (the upper range being Laos) as Thai.
+
2011-12-27 Hin-Tak Leung <ht...@users.sourceforge.net>
[cjk-enc.el] Accept Thai inputs in utf-8 encoding.
diff --git a/utils/lisp/emacs/cjk-enc.el b/utils/lisp/emacs/cjk-enc.el
index 7aa9615..11d26e1 100644
--- a/utils/lisp/emacs/cjk-enc.el
+++ b/utils/lisp/emacs/cjk-enc.el
@@ -657,9 +657,10 @@
(if (eq charset 'unicode)
(let ((l (split-char ch)))
(progn
- ;; Unicode 0x0EXX is Thai. Transform back to TIS620
- (setq ch2 (nth 2 l))
- (if (eq ch2 14)
+ ;; Unicode 0x0E00-0x0E7F is Thai. Transform back to TIS620
+ (setq ch2 (nth 2 l)
+ ch3 (nth 3 l))
+ (if (and (eq ch2 14) (< ch3 128))
(setq charset 'thai-tis620
ch (encode-char ch 'thai-tis620))))))
--
1.7.7.4