Character.toChars currently generates the high char of a UTF-16 surrogate pair incorrectly: it shifts the code point right by 10 bits without first subtracting 0x10000, so the high surrogate comes out 0x40 too large. This patch fixes the problem and ensures that the output is correct. It uses the algorithm given in the Unicode spec to generate the surrogate pair, with division and modulo rather than shifts and masks, so it may not be the most efficient implementation.
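To make the off-by-0x40 error concrete, here is a minimal sketch, not part of the patch. The class and method names (SurrogateSketch, oldHigh, specHigh, specLow, shiftHigh, maskLow) are hypothetical and exist only for illustration. It assumes the code point is a supplementary one (0x10000 to 0x10FFFF), in which case the shift/mask forms should give the same results as the division/modulo forms used in the patch.

```java
// Hypothetical illustration only; none of these names exist in Classpath.
class SurrogateSketch {
    // Old computation from revision 1.40: 0x10000 is never subtracted.
    static char oldHigh(int codePoint) {
        return (char) ((codePoint >> 10) + 0xd800);
    }

    // Computation used in the patch (division and modulo, as in the Unicode spec).
    static char specHigh(int codePoint) {
        return (char) (((codePoint - 0x10000) / 0x400) + 0xd800);
    }

    static char specLow(int codePoint) {
        return (char) (((codePoint - 0x10000) % 0x400) + 0xdc00);
    }

    // Shift/mask variant. Assumes 0x10000 <= codePoint <= 0x10ffff, so the
    // offset value is non-negative and >> behaves like division by 0x400.
    static char shiftHigh(int codePoint) {
        return (char) (((codePoint - 0x10000) >> 10) + 0xd800);
    }

    // 0x10000 is a multiple of 0x400, so masking the raw code point
    // yields the same low 10 bits as masking the offset value.
    static char maskLow(int codePoint) {
        return (char) ((codePoint & 0x3ff) + 0xdc00);
    }

    public static void main(String[] args) {
        int cp = 0x1d11e; // MUSICAL SYMBOL G CLEF
        System.out.println(Integer.toHexString(oldHigh(cp)));   // d874 (wrong)
        System.out.println(Integer.toHexString(specHigh(cp)));  // d834
        System.out.println(Integer.toHexString(specLow(cp)));   // dd1e
        System.out.println(Integer.toHexString(shiftHigh(cp))); // d834
        System.out.println(Integer.toHexString(maskLow(cp)));   // dd1e
    }
}
```

If the shift/mask variant is preferred for speed, only the high surrogate line needs the extra subtraction; the old low surrogate expression already agrees with the spec for supplementary code points.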
I haven't committed this yet; I'd like to solicit some feedback on it first. Please comment.
-- 
Chris Burdess
"They that can give up essential liberty to obtain a little safety deserve neither liberty nor safety." - Benjamin Franklin
Index: java/lang/Character.java
===================================================================
RCS file: /cvsroot/classpath/classpath/java/lang/Character.java,v
retrieving revision 1.40
diff -u -r1.40 Character.java
--- java/lang/Character.java 17 Sep 2005 21:58:41 -0000 1.40
+++ java/lang/Character.java 7 Jan 2006 21:01:36 -0000
@@ -2410,9 +2410,8 @@
{
// Write second char first to cause IndexOutOfBoundsException
// immediately.
- dst[dstIndex + 1] = (char) ((codePoint & 0x3ff)
- + (int) MIN_LOW_SURROGATE );
- dst[dstIndex] = (char) ((codePoint >> 10) + (int) MIN_HIGH_SURROGATE);
+ dst[dstIndex + 1] = (char) (((codePoint - 0x10000) % 0x400) + 0xdc00);
+ dst[dstIndex] = (char) (((codePoint - 0x10000) / 0x400) + 0xd800);
result = 2;
}
else
_______________________________________________ Classpath-patches mailing list [email protected] http://lists.gnu.org/mailman/listinfo/classpath-patches
