certain sequences (such as endstrea[^m]) are eaten by BaseParser#readUntilEndStream
------------------------------------------------------------------------------------
Key: PDFBOX-910
URL: https://issues.apache.org/jira/browse/PDFBOX-910
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 1.3.1
Reporter: Adam Nichols
Assignee: Adam Nichols
Priority: Minor
Fix For: 1.4.0

Reported on the dev list by Martijn List:

It looks like there are two missing else branches in BaseParser#readUntilEndStream.

The last step when trying to match "endstream" contains this:

    if(byteRead==M){
        //found the whole marker
        pdfSource.unread( ENDSTREAM );
        return;
    }

But what happens when the last character is not "m" (for example, endstreaX)? Because there is no else branch, the already-consumed "endstrea" is never written to the output. Shouldn't it be:

    if(byteRead==M){
        //found the whole marker
        pdfSource.unread( ENDSTREAM );
        return;
    } else {
        out.write(ENDSTREAM, 0, 8);
    }

A similar thing happens below when matching "endobj". If the last character does not match "j", the already-consumed "endob" is not written:

    if(byteRead==J){
        //found whole marker
        pdfSource.unread( ENDOBJ );
        return;
    }

Shouldn't it be:

    if(byteRead==J){
        //found whole marker
        pdfSource.unread( ENDOBJ );
        return;
    } else {
        out.write(ENDOBJ, 0, 5);
    }
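
To illustrate the problem outside of PDFBox, here is a minimal, self-contained sketch of a marker scan that does not lose partially matched bytes. It is not the BaseParser code and not the committed fix: the class EndMarkerScan and the methods readUntilMarker and copyUntilEndStream are hypothetical names, and java.io.PushbackInputStream stands in for PDFBox's own stream class. On a mismatch the sketch writes only the first matched byte and pushes the remaining bytes back for re-scanning, which is a slightly different strategy from the proposed out.write(ENDSTREAM, 0, 8) but also covers inputs such as "endstrendstream", where a new marker starts inside the failed match.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for BaseParser#readUntilEndStream.
class EndMarkerScan {
    static final byte[] ENDSTREAM = "endstream".getBytes(StandardCharsets.ISO_8859_1);

    // Copies bytes from in to out until the marker is found. The marker is
    // pushed back onto the stream and is never written to out.
    // Assumes the pushback buffer holds at least marker.length bytes.
    static void readUntilMarker(PushbackInputStream in, OutputStream out, byte[] marker)
            throws IOException {
        int matched = 0; // number of marker bytes matched so far
        while (true) {
            int byteRead = in.read();
            if (byteRead == (marker[matched] & 0xFF)) {
                matched++;
                if (matched == marker.length) {
                    in.unread(marker); // found the whole marker
                    return;
                }
            } else if (matched > 0) {
                // The "missing else": the partial match was ordinary data.
                if (byteRead == -1) {
                    out.write(marker, 0, matched); // end of input, flush it all
                    return;
                }
                out.write(marker[0]);              // first byte is definitely data
                in.unread(byteRead);               // re-scan the mismatched byte
                in.unread(marker, 1, matched - 1); // and the rest of the partial match
                matched = 0;
            } else if (byteRead == -1) {
                return; // end of input, no marker found
            } else {
                out.write(byteRead);
            }
        }
    }

    // Convenience wrapper for experiments: returns everything before "endstream".
    static String copyUntilEndStream(String input) {
        byte[] data = input.getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (PushbackInputStream in = new PushbackInputStream(
                new ByteArrayInputStream(data), ENDSTREAM.length)) {
            readUntilMarker(in, out, ENDSTREAM);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The "endstreaX" sequence must survive in the copied data.
        System.out.println("[" + copyUntilEndStream("abc endstreaX def endstream tail") + "]");
    }
}
```

With the else branch removed, the bytes "endstrea" in the example input would be silently dropped from the output, which is the behaviour reported in this issue.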