> > The only solution which seems to work is to replace all Big5
> > characters where the second byte is `\', `{', or `}' with
> > \CJKchar{...}{...}.
>
> Sorry for this stupid question, but I am not familiar with the Big5
> character code and its decimal, octal or hexadecimal notation and its
> 7bit or 8bit representation. (to be honest: I am not familiar with
> character codes at all).  How can I figure out the specific values?
> Is there a table/scheme where I look for the specific values or
> something like this?

Please read the CJK documentation first!  In Big5 encoding, the first
byte is in the range 0xA1-0xFE, and the second byte is in the range
0x40-0x7E or 0xA1-0xFE.  The three problem characters are `\' (0x5C),
`{' (0x7B), and `}' (0x7D), which all fall into the lower second-byte
range.  Below is a Perl script which does the conversion for you -- it
has not been tested well, so please be careful.


    Werner


======================================================================


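#!/usr/bin/perl
# Replace every Big5 character whose second byte is `\', `{', or `}'
# with \CJKchar{<first byte>}{<second byte>} (byte values in decimal).
# Reads the files named on the command line (or stdin), writes to stdout.
use strict;
use warnings;

while (<>) {
  s/([\xA1-\xFE])([\x40-\x7E\xA1-\xFE])/
    ($2 eq "\\" || $2 eq "{" || $2 eq "}")
      ? sprintf("\\CJKchar{%d}{%d}", ord($1), ord($2))
      : "$1$2";
   /eg;
  print;
}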
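
Since the script is not well tested, it may help to try the same
substitution on a few known byte sequences before running it on real
files.  The following is only a sketch: the helper name escape_big5 and
the sample bytes are my own choices for illustration, not part of the
script above.  It assumes the input is handled as raw bytes (no UTF-8
decoding layer).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: applies the same substitution as the script above
# to one string of raw Big5 bytes and returns the converted string.
sub escape_big5 {
    my ($line) = @_;
    $line =~ s/([\xA1-\xFE])([\x40-\x7E\xA1-\xFE])/
        my ($first, $second) = ($1, $2);
        index("\\{}", $second) >= 0
            ? sprintf("\\CJKchar{%d}{%d}", ord($first), ord($second))
            : $first . $second;
    /eg;
    return $line;
}

# 0xA5 0x5C: a Big5 pair whose second byte is a backslash (must be replaced).
# 0xA4 0xA4: a Big5 pair with no special second byte (must stay as it is).
my $input    = "\xA5\x5C\xA4\xA4";
my $expected = "\\CJKchar{165}{92}\xA4\xA4";
print escape_big5($input) eq $expected ? "ok\n" : "not ok\n";

# Plain ASCII braces and backslashes are not preceded by a Big5 first
# byte, so they pass through unchanged.
print escape_big5("a{b}\\") eq "a{b}\\" ? "ok\n" : "not ok\n";
```

Note that the check only covers well-formed input; a stray byte in the
range 0xA1-0xFE which is not really the start of a Big5 character would
still be paired with the byte after it.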

