Character.toChars currently generates the high char of a UTF-16 surrogate
pair incorrectly: it shifts the code point without first subtracting
0x10000. This patch fixes the problem by using the algorithm given in the
Unicode spec to generate the surrogate pair. The output is now correct,
but the division and modulo it uses may not be optimally efficient.
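
To make the difference concrete, here is a rough standalone sketch, not
part of the patch. The class name SurrogateSketch and its method names are
made up for illustration. oldHigh mirrors the expression removed by the
diff, patched mirrors the replacement lines, and shifted is one possible
shift/mask variant that should give the same result for code points in the
range 0x10000 to 0x10FFFF.

```java
// Illustrative only: names here are hypothetical, not Classpath API.
final class SurrogateSketch {

    // Expression from the removed lines: the 0x10000 offset is never
    // subtracted, so the high surrogate comes out too large.
    static char oldHigh(int codePoint) {
        return (char) ((codePoint >> 10) + 0xd800);
    }

    // Division/modulo form used by the patch.
    static char[] patched(int codePoint) {
        int offset = codePoint - 0x10000;
        return new char[] {
            (char) ((offset / 0x400) + 0xd800),  // high surrogate
            (char) ((offset % 0x400) + 0xdc00)   // low surrogate
        };
    }

    // Assumed-equivalent shift/mask form; valid because offset is
    // non-negative for supplementary code points.
    static char[] shifted(int codePoint) {
        int offset = codePoint - 0x10000;
        return new char[] {
            (char) ((offset >>> 10) + 0xd800),
            (char) ((offset & 0x3ff) + 0xdc00)
        };
    }

    public static void main(String[] args) {
        int cp = 0x1D11E; // a supplementary code point
        char[] p = patched(cp);
        System.out.printf("old high: %04X%n", (int) oldHigh(cp));
        System.out.printf("patched:  %04X %04X%n", (int) p[0], (int) p[1]);
    }
}
```

For U+1D11E the old expression gives a high char of 0xD874, whereas the
patched code gives the pair 0xD834 0xDD1E. The low char is the same either
way, because 0x10000 is a multiple of 0x400.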

I haven't committed this yet; I'd like to solicit some feedback on it
first. Please comment.
-- 
Chris Burdess
  "They that can give up essential liberty to obtain a little safety
  deserve neither liberty nor safety." - Benjamin Franklin
Index: java/lang/Character.java
===================================================================
RCS file: /cvsroot/classpath/classpath/java/lang/Character.java,v
retrieving revision 1.40
diff -u -r1.40 Character.java
--- java/lang/Character.java    17 Sep 2005 21:58:41 -0000      1.40
+++ java/lang/Character.java    7 Jan 2006 21:01:36 -0000
@@ -2410,9 +2410,8 @@
       {
         // Write second char first to cause IndexOutOfBoundsException
         // immediately.
-        dst[dstIndex + 1] = (char) ((codePoint & 0x3ff)
-                                    + (int) MIN_LOW_SURROGATE );
-        dst[dstIndex] = (char) ((codePoint >> 10) + (int) MIN_HIGH_SURROGATE);
+        dst[dstIndex + 1] = (char) (((codePoint - 0x10000) % 0x400) + 0xdc00);
+        dst[dstIndex] = (char) (((codePoint - 0x10000) / 0x400) + 0xd800);
         result = 2;
     }
     else

_______________________________________________
Classpath-patches mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/classpath-patches
