Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
On 18.12.2013 06:18, huizhe wang wrote: Why didn't you like this very short version? : JAXP currently maintains a code level 1.5 (Apache Xerces at 1.4). While we're getting close to the end of JAXP standalone, we may consider newer/advanced features in JDK9. But we'll get to that discussion some time later. Thanks for your feedback, I didn't know that and additionally I wasn't aware that Arrays.copyOf is so young. Good catch! I'll remember to apply this cleanup in a separate or related patch, for JDK9. For the current issue, I want to keep my promise to RT that this is going to be a short and low-risk patch. We have a very small window for JDK8. Reasonable! -Ulf
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
Hi Joe, The fix looks good - though I wonder at whether incrementing whiteSpaceLookup by a fix amount wouldn't be better than doubling its length. best regards, -- daniel On 12/16/13 8:31 PM, huizhe wang wrote: Hi, This is a quick fix for a whitespace buffer that was not adjusted properly in one of the two cases. The buffer, whiteSpaceLookup, is filled in two cases and adjusted properly the 2nd time. The code is moved into a method storeWhiteSpace so that it's shared for the 1st case as well. Note at line 1175, there is no need to save character 0x20 since all whitespace characters will later be replaced with character 0x20. webrevs: http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Thanks, Joe
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
Joe, I thought this looked OK also On Dec 17, 2013, at 12:26 PM, huizhe wang wrote: On 12/17/2013 4:10 AM, Daniel Fuchs wrote: Hi Joe, The fix looks good - though I wonder at whether incrementing whiteSpaceLookup by a fix amount wouldn't be better than doubling its length. Both would be okay. The case, as demonstrated in the report, that there are hundreds of LFs in an attribute, is very rare. Best, Joe best regards, -- daniel On 12/16/13 8:31 PM, huizhe wang wrote: Hi, This is a quick fix for a whitespace buffer that was not adjusted properly in one of the two cases. The buffer, whiteSpaceLookup, is filled in two cases and adjusted properly the 2nd time. The code is moved into a method storeWhiteSpace so that it's shared for the 1st case as well. Note at line 1175, there is no need to save character 0x20 since all whitespace characters will later be replaced with character 0x20. webrevs: http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Thanks, Joe Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 Oracle Java Engineering 1 Network Drive Burlington, MA 01803 lance.ander...@oracle.com
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
Also you could increment fCurrentEntity.position at the end of the loop, save 1 if and indent correctly with 8 spaces: 1166 for (; fCurrentEntity.positionfCurrentEntity.count; fCurrentEntity.position++) { 1167 c = fCurrentEntity.ch[fCurrentEntity.position]; 1168 if ((c == quote 1169 (!fCurrentEntity.literal || isExternal)) || 1170 c == '%' || !XMLChar.isContent(c)) { 1171 break; 1172 } 1173 if (whiteSpaceInfoNeeded c == ’\t’) { 1174 storeWhiteSpace(fCurrentEntity.position); 1175 } 1176 } -Ulf On 17.12.2013 20:49, Ulf Zibis wrote: Hi Joe, I would use ’\t’ instead of 0x9, to stay consistent with ahead code: 1175 if (c == ’\t’) { Lines 1214..1221 could be simpler: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) { 1215 int [] tmp = new int[whiteSpaceLookup.length*2]; 1216 System.arraycopy(whiteSpaceLookup, 0, tmp, 0, whiteSpaceLookup.length); 1217 whiteSpaceLookup = tmp; 1218 } 1219 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; Or even shorter: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) 1215 whiteSpaceLookup = Arrays.copyOf(whiteSpaceLookup, whiteSpaceLookup.length*2); 1216 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; (please insert spaces around if clause, else and after commas.) -Ulf On 16.12.2013 20:31, huizhe wang wrote: Hi, This is a quick fix for a whitespace buffer that was not adjusted properly in one of the two cases. The buffer, whiteSpaceLookup, is filled in two cases and adjusted properly the 2nd time. The code is moved into a method storeWhiteSpace so that it's shared for the 1st case as well. Note at line 1175, there is no need to save character 0x20 since all whitespace characters will later be replaced with character 0x20. webrevs: http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Thanks, Joe
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
Thanks for the comments. I incorporated the suggested changes. In terms of code format, for small source files, I would just hit the source-format button. For big files like this one, it'd unfortunately create too much distraction from real changes. http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Joe On 12/17/2013 1:03 PM, Ulf Zibis wrote: Also you could increment fCurrentEntity.position at the end of the loop, save 1 if and indent correctly with 8 spaces: 1166 for (; fCurrentEntity.positionfCurrentEntity.count; fCurrentEntity.position++) { 1167 c = fCurrentEntity.ch[fCurrentEntity.position]; 1168 if ((c == quote 1169 (!fCurrentEntity.literal || isExternal)) || 1170 c == '%' || !XMLChar.isContent(c)) { 1171 break; 1172 } 1173 if (whiteSpaceInfoNeeded c == ’\t’) { 1174 storeWhiteSpace(fCurrentEntity.position); 1175 } 1176 } -Ulf On 17.12.2013 20:49, Ulf Zibis wrote: Hi Joe, I would use ’\t’ instead of 0x9, to stay consistent with ahead code: 1175 if (c == ’\t’) { Lines 1214..1221 could be simpler: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) { 1215 int [] tmp = new int[whiteSpaceLookup.length*2]; 1216 System.arraycopy(whiteSpaceLookup, 0, tmp, 0, whiteSpaceLookup.length); 1217 whiteSpaceLookup = tmp; 1218 } 1219 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; Or even shorter: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) 1215 whiteSpaceLookup = Arrays.copyOf(whiteSpaceLookup, whiteSpaceLookup.length*2); 1216 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; (please insert spaces around if clause, else and after commas.) -Ulf On 16.12.2013 20:31, huizhe wang wrote: Hi, This is a quick fix for a whitespace buffer that was not adjusted properly in one of the two cases. The buffer, whiteSpaceLookup, is filled in two cases and adjusted properly the 2nd time. The code is moved into a method storeWhiteSpace so that it's shared for the 1st case as well. Note at line 1175, there is no need to save character 0x20 since all whitespace characters will later be replaced with character 0x20. webrevs: http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Thanks, Joe
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
On 18.12.2013 00:45, huizhe wang wrote: Thanks for the comments. I incorporated the suggested changes. In terms of code format, for small source files, I would just hit the source-format button. For big files like this one, it'd unfortunately create too much distraction from real changes. http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Joe Fine. But: ERROR: Line 1216 should be removed !!! Why didn't you like this very short version? : Or even shorter: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) 1215 whiteSpaceLookup = Arrays.copyOf(whiteSpaceLookup, whiteSpaceLookup.length*2); 1216 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; Another shortage: 1095 do { 1098 newlines++; 1099 fCurrentEntity.lineNumber++; 1100 fCurrentEntity.columnNumber = 1; 1101 if (++fCurrentEntity.position == fCurrentEntity.count) { 1102 invokeListeners(newlines); 1103 offset = 0; 1104 fCurrentEntity.position = newlines; 1105 if (load(newlines, false)) { 1106 break; 1107 } 1108 } 1097 if (c == '\r') { 1109 if (fCurrentEntity.ch[fCurrentEntity.position] == '\n') { 1110 fCurrentEntity.position++; offset++; 1112 } 1113 /*** NEWLINE NORMALIZATION ***/ 1114 else { 1115 newlines++; 1116 } 1118 } else { 1130 /*** NEWLINE NORMALIZATION *** 1131 * if (fCurrentEntity.ch[fCurrentEntity.position] == '\r') { 1133 * fCurrentEntity.position++; 1134 * offset++; 1135 * }***/ 1137 } 1096 c = fCurrentEntity.ch[fCurrentEntity.position]; 1141 } while (fCurrentEntity.position fCurrentEntity.count - 1 (c == '\n' || c == '\r')); 1142 1143 for (int i = offset; i fCurrentEntity.position; i++) { 1144 fCurrentEntity.ch[i] = '\n'; 1145 storeWhiteSpace(i); 1146 } -Ulf
Re: RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Joe Fine. But: ERROR: Line 1216 should be removed !!! Done, thanks. Why didn't you like this very short version? : JAXP currently maintains a code level 1.5 (Apache Xerces at 1.4). While we're getting close to the end of JAXP standalone, we may consider newer/advanced features in JDK9. But we'll get to that discussion some time later. Or even shorter: 1214 if (whiteSpaceLen = whiteSpaceLookup.length) 1215 whiteSpaceLookup = Arrays.copyOf(whiteSpaceLookup, whiteSpaceLookup.length*2); 1216 whiteSpaceLookup[whiteSpaceLen++] = whiteSpacePos; Another shortage: Good catch! I'll remember to apply this cleanup in a separate or related patch, for JDK9. For the current issue, I want to keep my promise to RT that this is going to be a short and low-risk patch. We have a very small window for JDK8. Thanks, Joe 1095 do { 1098 newlines++; 1099 fCurrentEntity.lineNumber++; 1100 fCurrentEntity.columnNumber = 1; 1101 if (++fCurrentEntity.position == fCurrentEntity.count) { 1102 invokeListeners(newlines); 1103 offset = 0; 1104 fCurrentEntity.position = newlines; 1105 if (load(newlines, false)) { 1106 break; 1107 } 1108 } 1097 if (c == '\r') { 1109 if (fCurrentEntity.ch[fCurrentEntity.position] == '\n') { 1110 fCurrentEntity.position++; offset++; 1112 } 1113 /*** NEWLINE NORMALIZATION ***/ 1114 else { 1115 newlines++; 1116 } 1118 } else { 1130 /*** NEWLINE NORMALIZATION *** 1131 * if (fCurrentEntity.ch[fCurrentEntity.position] == '\r') { 1133 * fCurrentEntity.position++; 1134 * offset++; 1135 * }***/ 1137 } 1096 c = fCurrentEntity.ch[fCurrentEntity.position]; 1141 } while (fCurrentEntity.position fCurrentEntity.count - 1 (c == '\n' || c == '\r')); 1142 1143 for (int i = offset; i fCurrentEntity.position; i++) { 1144 fCurrentEntity.ch[i] = '\n'; 1145 storeWhiteSpace(i); 1146 } -Ulf
RFR (JAXP) 8029955 : AIOB in XMLEntityScanner.scanLiteral upon parsing literals with 100 LF chars
Hi, This is a quick fix for a whitespace buffer that was not adjusted properly in one of the two cases. The buffer, whiteSpaceLookup, is filled in two cases and adjusted properly the 2nd time. The code is moved into a method storeWhiteSpace so that it's shared for the 1st case as well. Note at line 1175, there is no need to save character 0x20 since all whitespace characters will later be replaced with character 0x20. webrevs: http://cr.openjdk.java.net/~joehw/jdk8/8029955/webrev/ Thanks, Joe