PatchSet 7633
Date: 2007/12/31 16:32:34
Author: robilad
Branch: HEAD
Tag: (none)
Log:
cleaned up readme in developer dir

2007-12-31  Dalibor Topic  <[EMAIL PROTECTED]>

    * developers/README.EUC_JP: Removed encoding docs for older releases.
    * developers/README: Removed references to deleted files.
    * Makefile.am (EXTRA_DIST): Removed developers/README.EUC_JP.

Members:
    ChangeLog:1.5131->1.5132
    Makefile.am:1.136->1.137
    developers/README:1.16->1.17
    developers/README.EUC_JP:1.2->1.3(DEAD)

Index: kaffe/ChangeLog
diff -u kaffe/ChangeLog:1.5131 kaffe/ChangeLog:1.5132
--- kaffe/ChangeLog:1.5131 Mon Dec 31 16:22:35 2007
+++ kaffe/ChangeLog Mon Dec 31 16:32:34 2007
@@ -1,5 +1,11 @@
 2007-12-31  Dalibor Topic  <[EMAIL PROTECTED]>
 
+    * developers/README.EUC_JP: Removed encoding docs for older releases.
+    * developers/README: Removed references to deleted files.
+    * Makefile.am (EXTRA_DIST): Removed developers/README.EUC_JP.
+
+2007-12-31  Dalibor Topic  <[EMAIL PROTECTED]>
+
     * developers/dumpClass.pl, developers/JavaClass.pm, developers/utf8munge.pl: Removed.
     * developers/README: Removed dumpClass.pl,utf8munge.pl and JavaClass.pm.
     * Makefile.am (EXTRA_DIST): Removed developers/dumpClass.pl, utf8munge.pl and
Index: kaffe/Makefile.am
diff -u kaffe/Makefile.am:1.136 kaffe/Makefile.am:1.137
--- kaffe/Makefile.am:1.136 Mon Dec 31 16:22:35 2007
+++ kaffe/Makefile.am Mon Dec 31 16:32:36 2007
@@ -142,7 +142,6 @@
     developers/FullTest.sh \
     developers/GCJ.note.1 \
     developers/README \
-    developers/README.EUC_JP \
     scripts/GCCWarning.pm \
     scripts/JikesWarning.pm \
     scripts/LogWarning.pm \
Index: kaffe/developers/README
diff -u kaffe/developers/README:1.16 kaffe/developers/README:1.17
--- kaffe/developers/README:1.16 Mon Dec 31 16:22:37 2007
+++ kaffe/developers/README Mon Dec 31 16:32:39 2007
@@ -22,10 +22,6 @@
 fixup.c
     create GCJ fixup modules
 
-Encode.java: Encode a i18n Character Map into a resource we can load.
-
-EncodeEUC_JP: Encode EUC_JP Character Map.
-
 FullTest.sh: A shell script that builds/checks/installs many
 configurations of Kaffe "automatically".  Usefull for checking many
 configurations to make sure you didn't break anything.
===================================================================
Checking out kaffe/developers/README.EUC_JP
RCS:  /home/cvs/kaffe/kaffe/developers/Attic/README.EUC_JP,v
VERS: 1.2
***************
--- kaffe/developers/README.EUC_JP Mon Dec 31 16:34:23 2007
+++ /dev/null Sun Aug 4 19:57:58 2002
@@ -1,88 +0,0 @@
-# This is a historical document.
-# Classes kaffe.io.ByteToCharEUC_JP and kaffe.io.CharToByteEUC_JP use
-# Classes kaffe.io.ByteToCharIconv and kaffe.io.CharToByteIconv
-# respectively and the tables ByteToCharEUC_JP.tbl and CharToByteEUC_JP.tbl
-# are no longer used. (Dec 12, 2003, Ito Kazumitsu <[EMAIL PROTECTED]>)
-
-Extended UNIX Code (EUC) Encoding Scheme
-
-The EUC encoding scheme defines a set of encoding rules that can
-support one to four character sets. The encoding rules are based on
-the ISO2022 definition for the encoding of 7-bit and 8-bit data. The
-EUC encoding scheme uses control characters to identify some of the
-character sets. The EUC encoding table shows the basic structure of
-all EUC encoding.
-
-    CS0    0xxxxxxx
-
-    CS1    1xxxxxxx
-           1xxxxxxx 1xxxxxxx
-           1xxxxxxx 1xxxxxxx 1xxxxxxx
-           ...
-
-    CS2    10001110 1xxxxxxx
-           10001110 1xxxxxxx 1xxxxxxx
-           10001110 1xxxxxxx 1xxxxxxx 1xxxxxxx
-           ...
-
-    CS3    10001111 1xxxxxxx
-           10001111 1xxxxxxx 1xxxxxxx
-           10001111 1xxxxxxx 1xxxxxxx 1xxxxxxx
-           ...
-
-The term EUC denotes these general encoding rules. A code set based on
-EUC conforms to the EUC encoding rules but also identifies the
-specific character sets associated with the specific instances. For
-example, IBM-eucJP for Japanese refers to the encoding of the Japanese
-Industrial Standard characters according to the EUC encoding rules.
-
-The first set (CS0) always contains an ISO646 character set. All of
-the other sets must have the most significant bit (MSB) set to 1 and
-can use any number of bytes to encode the characters. In addition, all
-characters within a set must have:
-
-    o Same number of bytes to encode all characters
-    o Same column display width (number of columns on a fixed-width
-      terminal)
-
-All characters in the third set (CS2) are always preceded with the
-control character SS2 (single-shift 2, 0x8e). Code sets that conform
-to EUC do not use the SS2 control character other than to identify the
-third set.
-
-All characters in the fourth set (CS3) are always preceded with the
-control character SS3 (single-shift 3, 0x8f). Code sets that conform
-to EUC do not use the SS3 control character other than to identify the
-fourth set.
-
-
-The following table illustrates the Japanese representation of EUC
-packed format:
-
-    EUC Code Sets                                   Encoding Range
-    ^^^^^^^^^^^^^                                   ^^^^^^^^^^^^^^
-    Code set 0 (ASCII or JIS X 0201-1976 Roman):    0x21-0x7E
-    Code set 1 (JIS X 0208):                        0xA1A1-0xFEFE
-    Code set 2 (half-width katakana):               0x8EA1-0x8EDF
-    Code set 3 (JIS X 0212-1990):                   0x8FA1A1-0x8FFEFE
-
-
-Classes kaffe.io.ByteToCharEUC_JP and kaffe.io.CharToByteEUC_JP use
-external tables build by class EncodeEUC_JP in developers directory.
-
-(1) Get files JIS*.TXT from
-    http://www.unicode.org/Public/MAPPINGS/EASTASIA/JIS/
-
-(2) run 'kaffe EncodeEUC_JP'
-
-(3) copy ByteToCharEUC_JP.tbl and CharToByteEUC_JP.tbl in the directory
-    kaffe/io.
-
-By default, classes ByteToCharEUC_JP and CharToByteEUC_JP in package
-kaffe.io use full US-ASCII. If you want use exceptions defined by JIS
-X 0201-1976 (aka 0x5C is U+00A5 and 0x7E is U+203E), you must change
-US_ASCII to false in both classes.
-
-Ito Kazumitsu <[EMAIL PROTECTED]>
-Edouard G. Parmelan <[EMAIL PROTECTED]>
-Nov 20 2000

_______________________________________________
kaffe mailing list
kaffe@kaffe.org
http://kaffe.org/cgi-bin/mailman/listinfo/kaffe
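
Note on the removed README.EUC_JP: its code set table can be read as a
simple lead-byte dispatch. The sketch below is only an illustration of
those rules, not code from Kaffe. The names split_euc_jp, SS2 and SS3 are
hypothetical, and the trailing-byte check uses the general 0xA1-0xFE range
rather than the narrower 0xA1-0xDF range the table gives for code set 2.

```python
SS2 = 0x8E  # single-shift 2: introduces code set 2 (half-width katakana)
SS3 = 0x8F  # single-shift 3: introduces code set 3 (JIS X 0212-1990)


def split_euc_jp(data: bytes):
    """Split an EUC-JP byte string into (code_set, raw_bytes) pairs.

    Illustrative only: it classifies bytes by code set and does not
    map them to Unicode.
    """
    result = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            code_set, size = 0, 1      # CS0: one byte, MSB clear
        elif lead == SS2:
            code_set, size = 2, 2      # CS2: SS2 + one byte
        elif lead == SS3:
            code_set, size = 3, 3      # CS3: SS3 + two bytes
        elif 0xA1 <= lead <= 0xFE:
            code_set, size = 1, 2      # CS1: two bytes, both MSB set
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02X} at offset {i}")

        chunk = data[i:i + size]
        if len(chunk) < size:
            raise ValueError(f"truncated sequence at offset {i}")
        # Every byte after the lead byte must fall in 0xA1-0xFE.
        if any(not 0xA1 <= b <= 0xFE for b in chunk[1:]):
            raise ValueError(f"invalid trailing byte at offset {i}")

        result.append((code_set, chunk))
        i += size
    return result
```

For example, split_euc_jp(b"A\xa4\xa2\x8e\xb1") yields one code set 0
byte, one two-byte code set 1 sequence, and one SS2-prefixed code set 2
sequence.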