On 08/09/15 23:03, Even Rouault wrote:
> On Tuesday 08 September 2015 14:43:07, Adrian Johnson wrote:
>> On 08/09/15 21:06, Even Rouault wrote:
>>> Hi,
>>>
>>> A too huge number may cause the gmallocn() in Catalog::cachePageTree()
>>> to crash even if we call it with a low page number.
>>>
>>> Even
>>>
>>> + // to avoid too huge memory allocations layer and avoid crashes
>>> + // This is the maximum number of indirect objects as per ISO-32000:2008 (Table C-1)
>>
>> Table C-1 is a list of minimum limits for 32-bit readers.
>
> Ah indeed. But they also state "Because Acrobat implementations are subject to
> these limits, applications producing PDF files are strongly advised to remain
> within them", so that might make sense to check that (even if Acrobat goes
> 64bit, which is perhaps the case, but anyway, does a 8 million page PDF make
> sense ?)

A page count limit does not make sense. A limit that may be appropriate
for a 64-bit 16GB desktop would not be appropriate on a 32-bit embedded
system with limited memory. Better to just check for the out of memory
condition and report an error.

>>> + // We could probably decrease that number again. PDFium for example uses 1 Mi
>>> + else if (numPages > 8 * 1024 * 1024) {
>>> + error(errSyntaxWarning, -1,
>>> + "Page count ({0:d}) too big. Limiting number of reported pages to 8 Mi",
>>> + numPages);
>>
>> Instead of imposing an arbitrary limit we should just add a check for
>> gmallocn() returning NULL and print an error.
>
> That would be another possibility. Just looked a bit more complicated to do it
> right and not leak memory for someone not familiar with the code base.
>
>> For broken PDFs that report an invalid size (see bug 85140) we could
>> check if the page count exceeds the number of objects in the XRef.
>
> What would be the criterion to decide that a PDF is broken ? Or do you mean we
> should always check that the reported page count is no bigger than the number
> of objects in the XRef ? And in that case, should we limit the reported page
> count to the number of objects in the XRef, or just return 0 with an error ?

Since you did not provide a sample PDF that demonstrates the problem I
assumed that you have a broken PDF that claims to have a much higher
page count than the actual number of pages. If the PDF is not broken and
really does have more than 8 million pages it makes no sense to limit
the page count as this would prevent machines with sufficient memory
from being able to read the entire PDF.

_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler
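
For illustration only, the two checks suggested in this thread (report an error when the allocation fails, and reject a page count larger than the number of objects in the XRef) might look roughly like the sketch below. It is not poppler code: tryAllocArray and pageCountPlausible are hypothetical names, plain std::malloc stands in for poppler's allocator, and the XRef comparison relies on the assumption that every page is its own indirect object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical helper (not a poppler API): allocate count * elemSize bytes.
// Returns nullptr on a non-positive count, on size overflow, or when the
// system is out of memory, so the caller can report an error and carry on.
static void *tryAllocArray(long long count, std::size_t elemSize) {
    assert(elemSize > 0); // a zero element size is a programming error
    if (count <= 0) {
        return nullptr;
    }
    const unsigned long long n = static_cast<unsigned long long>(count);
    if (n > std::numeric_limits<std::size_t>::max() / elemSize) {
        return nullptr; // count * elemSize would not fit in size_t
    }
    return std::malloc(static_cast<std::size_t>(n) * elemSize);
}

// Hypothetical sanity check: assuming each page is a separate indirect
// object, a claimed page count above the number of XRef entries cannot be
// genuine, whatever the amount of memory available on the machine.
static bool pageCountPlausible(long long numPages, long long numXRefObjects) {
    return numPages >= 0 && numPages <= numXRefObjects;
}
```

A caller would test pageCountPlausible() first and then treat a nullptr from tryAllocArray() as an out-of-memory error, so no fixed page limit is needed and a genuinely large document can still be opened on a machine with enough memory.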