Hello,
can anyone review this?
- Vlad
On 8/27/2012 6:51 PM, Vladislav Karnaukhov wrote:
Hello,
could you please review a new version?
Please find webrev here:
http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.04/
Regards,
- Vlad
On 6/29/2012 2:09 AM, Phil Race wrote:
That would work but I think its cleaner to just move it all into
mapNumericReference
as below.
-phil.
diff --git
a/src/share/classes/javax/swing/text/html/parser/Parser.java
b/src/share/classes/javax/swing/text/html/parser/Parser.java
--- a/src/share/classes/javax/swing/text/html/parser/Parser.java
+++ b/src/share/classes/javax/swing/text/html/parser/Parser.java
@@ -952,7 +952,7 @@
ch = readCh();
break;
}
- char data[] = {mapNumericReference((char) n)};
+ char data[] = mapNumericReference(n);
return data;
}
addString('#');
@@ -1021,7 +1021,7 @@
}
/**
- * Converts numeric character reference to Unicode character.
+ * Converts numeric character reference to char array.
*
* Normally the code in a reference should be always converted
* to the Unicode character with the same code, but due to
@@ -1030,13 +1030,21 @@
* to displayable characters with other codes.
*
* @param c the code of numeric character reference.
- * @return the character corresponding to the reference code.
+ * @return a char array corresponding to the reference code.
*/
- private char mapNumericReference(char c) {
- if (c < 130 || c > 159) {
- return c;
+ private char[] mapNumericReference(int c) {
+ char[] data;
+ if (c >= 0xffff) { // outside unicode BMP.
+ try {
+ data = Character.toChars(c);
+ } catch (IllegalArgumentException e) {
+ data = new char[0];
}
- return cp1252Map[c - 130];
+ } else {
+ data = new char[1];
+ data[0] = (c < 130 || c > 159) ? (char)c : cp1252Map[c -
130];
+ }
+ return data;
}
On 06/28/12 07:58 AM, Vladislav Karnaukhov wrote:
Hello Phil, Pavel,
thank you for your comments. I've reworked fix and testcase, please
find new webrev here:
http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.03/
Regards,
- Vlad
On 6/27/2012 10:13 PM, Phil Race wrote:
Well its not only unnecessary but is likely wrong .. I don't think
you looked at what mapNumericReference() does.
-phil.
On 6/27/2012 11:03 AM, Vladislav Karnaukhov wrote:
Hello Phil,
I used Character.toChars() in both branches because I wanted to
delegate code point conversion to char or surrogate pair entirely
to Character class...
Regards,
- Vlad
On 6/26/2012 9:49 PM, Phil Race wrote:
I don't understand why you call Character.toChars() if you've
just determined
you don't need to ?
ie what was wrong with
data = ( n >>> 16 == 0) ? {mapNumericReference((char) n)} :
Character.toChars(n);
?
In the case of an invalid supplementary pair, maybe it would be
safer to return { ' ' } ?
One thing I see in the parsing code that is not new or changed
here, that
may bear examination, is that there's a loop that keeps on
reading so long
so long there are new digits. I am not sure its wise to keep
going once
you overflow.
-phil.
On 6/26/2012 12:37 AM, Vladislav Karnaukhov wrote:
Hello Pavel,
I can provide you with the link to 6u19, but this is direct
forward-port and no code changes were made.
I'll make changes as you've pointed out in 1) and 2)
About 3) - is it a requirement to use "? :" operator? I
personally prefer single-line if-else, but I don't want to argue
over code style, and surely I'll follow code design practices.
Regards,
- Vlad
On 6/25/2012 6:43 PM, Pavel Porvatov wrote:
Hi Vladislav,
Do you have a link to the fix for 6u19?
I didn't investigate the fix deeply, but
1.
private final int MAX_BMP_BOUND = 65535;
should be static (otherwise variable name should be in lower case)
2. Add a space in single line comments
3.
+ char data[];
+ if (n <= MAX_BMP_BOUND) {
+ data =
Character.toChars(mapNumericReference((char) n));
+ } else {
+ data = Character.toChars(n);
+ }
+
return data;
can be written in one line via "? :" operator and looks more
readable for me
Thanks, Pavel
Hello,
please review the fix for 6836089: Swing HTML parser can't
properly decode codepoints outside the Unicode Plane 0 into a
surrogate pair. This is a forward port from JDK6 (fixed
escalated issue, fix integrated) to JDK7.
The issue is a defect in Swing HTML Parser: if the codepoint
is outside BMP (Unicode Plain 0), Parser incorrectly decodes
codepoint into surrogate pair. The fix is to use
Character.toChars() method if codepoint value is greater than
upper bound of BMP.
Webrev: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.00/
Bug description:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6836089
Regards,
- Vlad