Hello again, we recently had another issue with podofo crashing while parsing a pdf. The crashing pdf had 130 updates, but not with trailers this time, but xref streams. Sadly I cannot share the file, bc it contains sensitive data of a third party.
With the previous patch, try-catch clauses Since PdfParser::ReadXRefStreamContents had it's try-catch-throw clause not patched with the last patch, it triggered the stack overflow when processing the RecursionGuard exception. I created a patch for this in a similar fashion (for the current master), removing the try-catch-throw clause: --- podofo/src/podofo/private/PdfParser.cpp Mon Mar 3 16:15:00 2025 +++ podofo/src/podofo/private/PdfParser.cpp Mon Mar 3 16:18:00 2025 @@ -550,24 +550,10 @@ if (!skipFollowPrevious) { - try - { - - // PDFs that have been through multiple PDF tools may have a mix of xref tables (ISO 32000-1 7.5.4) - // and XRefStm streams (ISO 32000-1 7.5.8.1) and in the Prev chain, - // so call ReadXRefContents (which deals with both) instead of ReadXRefStreamContents - ReadXRefContents(device, previousOffset, false); - } - catch (PdfError& e) - { - // Be forgiving, the error happens when an entry in XRef - // stream points to a wrong place (offset) in the PDF file. - if (e != PdfErrorCode::InvalidNumber) - { - PODOFO_PUSH_FRAME(e); - throw; - } - } + // PDFs that have been through multiple PDF tools may have a mix of xref tables (ISO 32000-1 7.5.4) + // and XRefStm streams (ISO 32000-1 7.5.8.1) and in the Prev chain, + // so call ReadXRefContents (which deals with both) instead of ReadXRefStreamContents + ReadXRefContents(device, previousOffset, false); } } } But as you can see, doing so changes the code slightly: Before, InvalidNumber errors where ignored, but by removing the try-catch they get treated as every other error. I don't know if its important to keep ignoring InvalidNumber errors, but I know that removing the try-catch clause prevents the stack overflow (tested that out). So the InvalidNumber errors likely need to be handled differently, I guess. Greetings, F.E. Am Mo., 12. Aug. 2024 um 00:07 Uhr schrieb Francesco Pretto < cez...@gmail.com>: > On Tue, 13 Feb 2024 at 14:57, F. E. <exler7...@gmail.com> wrote: > >> today I had some time for testing whether reducing the try-catch clauses >> solves the stack overflow, and it does: >> > > Hello, I finally had the time to look at the issue in 1.0.0-dev and I came > up with a similar fix: > > https://github.com/podofo/podofo/commit/31e0139113f5860cc35d59bbe5c2844a1a37ec6c > > >> Minor side-note: >> Why does the PdfTokenizer::RecursionGuard throw a "ePdfError_InvalidXRef" >> error when the rmax recursion depth is reached? >> It would make more sense to throw a special "MaxRecursionReached" error >> here. >> > > Your suggestion makes a lot of sense. I will have it in time for 1.0.0. > > Cheers, > Francesco >
--- podofo/src/podofo/private/PdfParser.cpp Mon Mar 3 16:15:00 2025 +++ podofo/src/podofo/private/PdfParser.cpp Mon Mar 3 16:18:00 2025 @@ -550,24 +550,10 @@ if (!skipFollowPrevious) { - try - { - - // PDFs that have been through multiple PDF tools may have a mix of xref tables (ISO 32000-1 7.5.4) - // and XRefStm streams (ISO 32000-1 7.5.8.1) and in the Prev chain, - // so call ReadXRefContents (which deals with both) instead of ReadXRefStreamContents - ReadXRefContents(device, previousOffset, false); - } - catch (PdfError& e) - { - // Be forgiving, the error happens when an entry in XRef - // stream points to a wrong place (offset) in the PDF file. - if (e != PdfErrorCode::InvalidNumber) - { - PODOFO_PUSH_FRAME(e); - throw; - } - } + // PDFs that have been through multiple PDF tools may have a mix of xref tables (ISO 32000-1 7.5.4) + // and XRefStm streams (ISO 32000-1 7.5.8.1) and in the Prev chain, + // so call ReadXRefContents (which deals with both) instead of ReadXRefStreamContents + ReadXRefContents(device, previousOffset, false); } } }
_______________________________________________ Podofo-users mailing list Podofo-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/podofo-users