Hi again, I've updated the testcase to increase the probability of it really hitting the issue. The old version was prone to miss the issue in case something changed in the parser.
http://cr.openjdk.java.net/~clanger/webrevs/8153781.2/ Best regards Christoph > -----Original Message----- > From: Langer, Christoph > Sent: Montag, 18. April 2016 23:04 > To: 'huizhe wang' <huizhe.w...@oracle.com> > Cc: core-libs-dev@openjdk.java.net > Subject: RE: RFR: JDK-8153781 Issue in XMLScanner: > EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping > large DOCTYPE section with CRLF at wrong place > > Hi Joe, > > here is the updated webrev where I incorporated your suggestions: > > http://cr.openjdk.java.net/~clanger/webrevs/8153781.1/ > > I also added a testcase. As for the message " DoctypedeclNotClosed ": I did > it in > several languages but there are some languages where I don't have the > knowledge to create a localized string. Is there some process for this or > would > we just be ok with reverting to the English text for that particular message? > > Thanks in advance for your review :) > > All the best > Christoph > > > -----Original Message----- > > From: huizhe wang [mailto:huizhe.w...@oracle.com] > > Sent: Dienstag, 12. April 2016 23:28 > > To: Langer, Christoph <christoph.lan...@sap.com> > > Cc: core-libs-dev@openjdk.java.net > > Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: > > EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping > > large DOCTYPE section with CRLF at wrong place > > > > > > On 4/12/2016 11:50 AM, Langer, Christoph wrote: > > > Hi Joe, > > > > > > thanks as always for your suggestions and I'll try to work it in. It will > probably > > take me a little while as I'm currently busy with another thing. I'll > > update my > > webrev eventually and add a testcase, too. > > > > Ok. > > > > > > But one question: I understand that the fix in skipDTD will be sufficient > > > to fix > > the issue. Nevetheless, can you explain me why the change in scanData is not > > beneficial or could even cause issues? There are several places in scanData > > when further loads are done. But only at this point where there's exactly > > one > > character after CRLF at the end of the buffer the method just returns > > without > > further load. If it wouldn't leave here it seems to me as if it would > > continue > > correctly and load more data. That would probably also be better in the > > sense > > of performance I guess?? > > > > It's there to handle the situation where a surrogate pair got split in > > between buffers. > > > > -Joe > > > > > > > > Thanks > > > Christoph > > > > > >> -----Original Message----- > > >> From: huizhe wang [mailto:huizhe.w...@oracle.com] > > >> Sent: Dienstag, 12. April 2016 19:54 > > >> To: Langer, Christoph <christoph.lan...@sap.com> > > >> Cc: core-libs-dev@openjdk.java.net > > >> Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: > > >> EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when > > skipping > > >> large DOCTYPE section with CRLF at wrong place > > >> > > >> Also, EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET was a > > >> wrong msg > > >> id. It would be good to change that to DoctypedeclNotClosed and add a > > >> message to XMLMessages.properties right before > > DoctypedeclUnterminated, > > >> sth. like the following: > > >> > > >> DoctypedeclNotClosed = The document type declaration for root element > > >> type \"{0}\" must be closed with '']''. > > >> > > >> Thanks, > > >> Joe > > >> > > >> On 4/11/2016 5:49 PM, huizhe wang wrote: > > >>> On 4/7/2016 1:45 PM, Langer, Christoph wrote: > > >>>> Hi, > > >>>> > > >>>> > > >>>> > > >>>> I've run into a peculiar issue with Xerces. > > >>>> > > >>>> > > >>>> > > >>>> The problem is happening when a DTD shall be skipped, the DTD is > > >>>> larger than the buffer of the current entity and a CRLF sequence > > >>>> occurs just one char before the buffer end. > > >>>> > > >>>> > > >>>> > > >>>> The reason is that method skipDTD of class XMLDTDScannerImpl (about > > >>>> line 389) calls XMLEntityScanner.scanData() to scan for the next > > >>>> occurrence of ']'. The scanData method might return true which > > >>>> indicates that the delimiter ']' was not found yet and more data is > > >>>> to scan. Other users of scanData would handle this and call this > > >>>> method in a loop until it returns false or some other condition > > >>>> happens. So I've fixed that place like at the other callers of > > >>>> scanData. > > >>> This part of the change looks good. > > >>>> > > >>>> > > >>>> Nevertheless, the scanData method would usually load more data when > > >>>> it is at the end of the buffer. But in the special case when CRLF is > > >>>> found at the end of buffer - 1, scanData would just return true. So I > > >>>> also removed that check at line 1374 in XMLEntityScanner. Do you see > > >>>> any problem with that? Is there any reason behind it which I'm > > >>>> overseeing? > > >>> No need to remove this after the above change. The parser needs to > > >>> retain what's in the xml, e.g., not removing new lines. > > >>>> Furthermore I took the chance for further little cleanups. I've added > > >>>> the new copyright header to the files... is that the correct one? > > >>> Yes, that's the right license header. However, > > >>>> > > >>>> I also aligned the calls to invokeListeners(position) in > > >>>> XMLEntityScanner to always pass the actual position from which the > > >>>> load is started. Do you think this is correct? > > >>> Yes. > > >>>> > > >>>> > > >>>> Here is the bug: > > >>>> > > >>>> https://bugs.openjdk.java.net/browse/JDK-8153781 > > >>>> > > >>>> > > >>>> > > >>>> Here is the webrev: > > >>>> > > >>>> http://cr.openjdk.java.net/~clanger/webrevs/8153781.0/ > > >>>> > > >>>> > > >>>> > > >>>> Please give me some comments before I finalize my change including a > > >>>> jtreg testcase. > > >>> It would be better if you had included the testcase so that the review > > >>> can be done together with the code change. > > >>> > > >>> Thanks, > > >>> Joe > > >>> > > >>>> > > >>>> > > >>>> Thanks & Best regards > > >>>> > > >>>> Christoph > > >>>> > > >>>> > > >>>>