Done, https://issues.apache.org/jira/browse/PDFBOX-1268
Be careful with the attachment / PDF file. It contains a malicious payload for vulnerable Windows systems.

On Mon, Mar 26, 2012 at 8:12 AM, Andreas Lehmkuehler <[email protected]> wrote:
> Hi,
>
> On 26.03.2012 07:42, Christophe Vandeplas wrote:
>
>> Hello List,
>>
>> I'm working on a PDF scanning tool, and with a specific (malicious) PDF
>> I always get OutOfMemoryErrors.
>>
>> The backtrace is:
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>     at org.apache.pdfbox.filter.FlateFilter.decodePredictor(FlateFilter.java:218)
>>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:170)
>>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:279)
>>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:221)
>>     at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:156)
>>     at ScanPdf.checkCOSBaseObject(ScanPdf.java:199)
>>     ...
>>
>> Looking at the PDFBox code, FlateFilter.java:218 is
>>     byte[] lastline = new byte[rowlength];
>>
>> In that context rowlength = 1073741838 => seems rather big, no?
>> Looking back in the code, it seems that it is colors that is so big.
>> Colors seems to be extracted from the dict in FlateFilter.java:96:
>>     colors = dict.getInt(COSName.COLORS);
>>
>> The (malicious) PDF does indeed contain the definition: /Colors 1073741838
>
> Hmm, that sounds quite large, but the PDF spec describes the Colors value as
> follows:
>
> "(May be used only if Predictor is greater than 1) The number of interleaved
> colour components per sample. Valid values are 1 to 4 (PDF 1.0) and 1 or
> greater (PDF 1.3). Default value: 1."
>
>> So my question is now:
>> Is this something I need to catch in my own code, or should PDFBox be
>> patched to catch such issues? (like the caught OutOfMemoryError in
>> FlateFilter:124)
>
> PDFBox should handle that. Please create an issue on JIRA [1] and attach the
> PDF in question.
>
>> Thanks for your expertise
>> Christophe
>
> BR
> Andreas Lehmkühler
>
> [1] https://issues.apache.org/jira/browse/PDFBOX
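
For anyone hitting the same problem in their own scanner in the meantime, here is a minimal sketch of the kind of guard discussed above: compute the predictor row length with overflow-checked arithmetic and reject it if it exceeds a cap before allocating the buffer. This is not the PDFBox code or the fix tracked in the JIRA issue. The class name PredictorRowGuard, the method checkedRowLength and the limit MAX_ROW_BYTES are hypothetical, and the row-length formula (bits per row rounded up to whole bytes) is an assumption that may differ from what FlateFilter actually computes.

```java
import java.io.IOException;

// Hypothetical helper, not part of PDFBox.
final class PredictorRowGuard {

    // Assumed upper bound for one decoded row; choose a value that suits your heap.
    static final long MAX_ROW_BYTES = 64L * 1024 * 1024;

    // Returns the row length in bytes, or throws instead of letting the caller
    // allocate a huge array from attacker-controlled /Colors, /BitsPerComponent
    // or /Columns values.
    static int checkedRowLength(int colors, int bitsPerComponent, int columns)
            throws IOException {
        if (colors < 1 || bitsPerComponent < 1 || columns < 1) {
            throw new IOException("Invalid predictor parameters");
        }
        long bits;
        try {
            // multiplyExact throws ArithmeticException on long overflow.
            bits = Math.multiplyExact(
                    Math.multiplyExact((long) colors, (long) bitsPerComponent),
                    (long) columns);
        } catch (ArithmeticException e) {
            throw new IOException("Predictor row size overflows", e);
        }
        // Round up to whole bytes without risking overflow on bits + 7.
        long bytes = bits / 8 + (bits % 8 != 0 ? 1 : 0);
        if (bytes > MAX_ROW_BYTES) {
            throw new IOException("Predictor row too large: " + bytes + " bytes");
        }
        return (int) bytes;
    }
}
```

With these assumptions, a stream declaring /Colors 1073741838 at 8 bits per component and 1 column is rejected with an IOException, while ordinary values such as 3 colors, 8 bits and 100 columns yield a 300-byte row.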

