On 4/19/2016 6:54 AM, Langer, Christoph wrote:
Hi again,I've updated the testcase to increase the probability of it really hitting the issue. The old version was prone to miss the issue in case something changed in the parser. http://cr.openjdk.java.net/~clanger/webrevs/8153781.2/
Looks good overall.Error msgs: you made an extra effort :-) normally, we don't need to do any translation. The I18n team has a revolving task that will periodically pick up these changes.
License header: it is correct to update to the latest Apache License format. However, we would need to preserve the "reserved comment block" if we've not changed the code, which is the case for class such as HTTPInputSource.java.
Best, Joe
Best regards Christoph-----Original Message----- From: Langer, Christoph Sent: Montag, 18. April 2016 23:04 To: 'huizhe wang' <huizhe.w...@oracle.com> Cc: core-libs-dev@openjdk.java.net Subject: RE: RFR: JDK-8153781 Issue in XMLScanner: EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping large DOCTYPE section with CRLF at wrong place Hi Joe, here is the updated webrev where I incorporated your suggestions: http://cr.openjdk.java.net/~clanger/webrevs/8153781.1/ I also added a testcase. As for the message " DoctypedeclNotClosed ": I did it in several languages but there are some languages where I don't have the knowledge to create a localized string. Is there some process for this or would we just be ok with reverting to the English text for that particular message? Thanks in advance for your review :) All the best Christoph-----Original Message----- From: huizhe wang [mailto:huizhe.w...@oracle.com] Sent: Dienstag, 12. April 2016 23:28 To: Langer, Christoph <christoph.lan...@sap.com> Cc: core-libs-dev@openjdk.java.net Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping large DOCTYPE section with CRLF at wrong place On 4/12/2016 11:50 AM, Langer, Christoph wrote:Hi Joe, thanks as always for your suggestions and I'll try to work it in. It willprobablytake me a little while as I'm currently busy with another thing. I'll update my webrev eventually and add a testcase, too. Ok.But one question: I understand that the fix in skipDTD will be sufficient to fixthe issue. Nevetheless, can you explain me why the change in scanData is not beneficial or could even cause issues? There are several places in scanData when further loads are done. But only at this point where there's exactly one character after CRLF at the end of the buffer the method just returns without further load. If it wouldn't leave here it seems to me as if it would continue correctly and load more data. That would probably also be better in the sense of performance I guess?? It's there to handle the situation where a surrogate pair got split in between buffers. -JoeThanks Christoph-----Original Message----- From: huizhe wang [mailto:huizhe.w...@oracle.com] Sent: Dienstag, 12. April 2016 19:54 To: Langer, Christoph <christoph.lan...@sap.com> Cc: core-libs-dev@openjdk.java.net Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET whenskippinglarge DOCTYPE section with CRLF at wrong place Also, EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET was a wrong msg id. It would be good to change that to DoctypedeclNotClosed and add a message to XMLMessages.properties right beforeDoctypedeclUnterminated,sth. like the following: DoctypedeclNotClosed = The document type declaration for root element type \"{0}\" must be closed with '']''. Thanks, Joe On 4/11/2016 5:49 PM, huizhe wang wrote:On 4/7/2016 1:45 PM, Langer, Christoph wrote:Hi, I've run into a peculiar issue with Xerces. The problem is happening when a DTD shall be skipped, the DTD is larger than the buffer of the current entity and a CRLF sequence occurs just one char before the buffer end. The reason is that method skipDTD of class XMLDTDScannerImpl (about line 389) calls XMLEntityScanner.scanData() to scan for the next occurrence of ']'. The scanData method might return true which indicates that the delimiter ']' was not found yet and more data is to scan. Other users of scanData would handle this and call this method in a loop until it returns false or some other condition happens. So I've fixed that place like at the other callers of scanData.This part of the change looks good.Nevertheless, the scanData method would usually load more data when it is at the end of the buffer. But in the special case when CRLF is found at the end of buffer - 1, scanData would just return true. So I also removed that check at line 1374 in XMLEntityScanner. Do you see any problem with that? Is there any reason behind it which I'm overseeing?No need to remove this after the above change. The parser needs to retain what's in the xml, e.g., not removing new lines.Furthermore I took the chance for further little cleanups. I've added the new copyright header to the files... is that the correct one?Yes, that's the right license header. However,I also aligned the calls to invokeListeners(position) in XMLEntityScanner to always pass the actual position from which the load is started. Do you think this is correct?Yes.Here is the bug: https://bugs.openjdk.java.net/browse/JDK-8153781 Here is the webrev: http://cr.openjdk.java.net/~clanger/webrevs/8153781.0/ Please give me some comments before I finalize my change including a jtreg testcase.It would be better if you had included the testcase so that the review can be done together with the code change. Thanks, JoeThanks & Best regards Christoph