On 7 Sep 2006 01:27:55 -0700, "GM" <[EMAIL PROTECTED]> wrote:
>Could you all give me some guide on how to convert my big5 string to
>unicode using python? I already knew that I might use cjkcodecs or
>python 2.4 but I still don't have idea on what exactly I should do.
>Please give me some sample code if you could. Thanks a lot

Gary,

I used this Java program quite a few years ago to convert various Big5
files to UTF-16. (Sorry it's Java, not Python, but I'm a very recent
convert to the latter.)

My newsgroup reader has messed the formatting up somewhat. If this
causes a problem, email me and I'll send you the source directly.

-Richard Schulman

/* This program converts an input file in one encoding to an output file in
 * another encoding. It is mainly used to convert Big5 text files to Unicode
 * text files.
 */

import java.io.*;

public class ConvertEncoding {

    public static void main(String[] args) {
        // A missing argument is passed as null, which convert() treats as
        // standard input or standard output respectively.
        String infile = args.length > 0 ? args[0] : null;
        String outfile = args.length > 1 ? args[1] : null;
        try {
            // Arguments: input file, output file, input encoding, output encoding.
            convert(infile, outfile, "BIG5", "UTF-16LE");
            // Other encodings can be substituted in the call above, e.g.
            //     convert(infile, outfile, "GB2312", "UTF8");
            // or numerous variations thereon. Among the possible choices for
            // input or output: "GB2312", "BIG5", "UTF8", "UTF-16LE".
            // The last named is the MS UCS-2 format.
        } catch (Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }

    public static void convert(String infile, String outfile, String from, String to)
            throws IOException, UnsupportedEncodingException {
        // Set up the byte streams, falling back to stdin/stdout.
        InputStream in;
        if (infile != null) in = new FileInputStream(infile);
        else in = System.in;
        OutputStream out;
        if (outfile != null) out = new FileOutputStream(outfile);
        else out = System.out;

        // Set up the character streams: decode with "from", encode with "to".
        Reader r = new BufferedReader(new InputStreamReader(in, from));
        Writer w = new BufferedWriter(new OutputStreamWriter(out, to));

        // Byte-order mark; this character signals Unicode in the NT environment.
        w.write("\ufeff");

        // Copy the decoded characters through in 4096-char chunks.
        char[] buffer = new char[4096];
        int len;
        while ((len = r.read(buffer)) != -1)
            w.write(buffer, 0, len);
        r.close();
        w.flush();
        w.close();
    }
}
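
If the Big5 data is already in memory as a string of bytes rather than in a file, the same decode-then-encode step can be sketched more directly. This is only a minimal illustration and not part of the program above: the class name Big5ToUtf16 and its convert method are made up for the example, and it assumes the JDK in use ships the "Big5" charset (Charset.forName throws an exception if it does not).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named for this example only.
public class Big5ToUtf16 {

    /** Decodes Big5 bytes and re-encodes them as UTF-16LE, prefixed with a BOM. */
    public static byte[] convert(byte[] big5Bytes) {
        // Assumption: the running JDK provides the "Big5" charset.
        Charset big5 = Charset.forName("Big5");
        // Step 1: bytes -> characters (decode).
        String text = new String(big5Bytes, big5);
        // Step 2: characters -> bytes (encode), with the byte-order mark first.
        return ("\uFEFF" + text).getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Two Big5-encoded characters, two bytes each.
        byte[] big5 = {(byte) 0xA4, (byte) 0xA4, (byte) 0xA4, (byte) 0xE5};
        byte[] utf16 = convert(big5);

        // Print the result as hex so the BOM and byte order are visible.
        StringBuilder hex = new StringBuilder();
        for (byte b : utf16) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

The output should begin with FF FE, the little-endian byte-order mark, followed by two bytes per character with the low byte first.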