Hello again, I think I get what you mean here, but I'm not so confident in implementing such a change myself. We are still working with v0.9.8, and while the implementation of PdfParser did not change drastically, PdfTokenizer did receive some notable changes. So every potential change to a later version then v.9.8.0 I wouldn't be able to test properly.
I was thinking about another way of tackling this: Right now, there's just the one PdfError class, which is used for every error, but objects can be distinguished via EPdfError enum. So to check the type of a given PdfError to decide whether to rethrow, we have to check it's enum in the catch-clause. And that's what's triggering the stack overflow with big pdfs. Alternatively, if we had distinct error classes, at least for the errors we want to check for, we could just use different catch clauses, with no need for type checking in the catch clause. This would prevent the stack overflow and still keep the previous behavior of ignoring NoNumber errors. All that's required for this is to derive a specialized NoNumberError from PdfError, throw that one instead of a general PdfError, and adjust the catch clause. I'm fully aware that this would mean a drastic change in podofo error design, and I'm somewhat expecting this proposal to be disregarded. I just wanted to point out that this might be a solution as well. Since v0.9.8 is no longer receiving any updates, so I have to patch it on my own, introducing a special error class for NoNumber is the easier implementation for me to get right. Regards, F.E. Am Mo., 10. März 2025 um 11:53 Uhr schrieb Francesco Pretto < cez...@gmail.com>: > On Mon, 3 Mar 2025 at 16:59, F. E. <exler7...@gmail.com> wrote: > > > > we recently had another issue with podofo crashing while parsing a pdf. > The crashing pdf had 130 updates, but not with trailers this time, but xref > streams. > > [...] > > I created a patch for this in a similar fashion (for the current > master), removing the try-catch-throw clause: > > > > [...] > > But as you can see, doing so changes the code slightly: > > Before, InvalidNumber errors where ignored, but by removing the > try-catch they get treated as every other error. > > I don't know if its important to keep ignoring InvalidNumber errors, but > I know that removing the try-catch clause prevents the stack overflow > (tested that out). > > So the InvalidNumber errors likely need to be handled differently, I > guess. > > > > Hello, > > I understand the problem. I think the solution as you state is just > removing the try-catch, but we must ensure > `PdfErrorCode::InvalidNumber` is not thrown in a nested frame, which I > believe can happen in the following two locations: > - > https://github.com/podofo/podofo/blob/0e0ba2d387d6e8426cbf8f216fed7ef15a0db3e5/src/podofo/private/PdfParser.cpp#L302 > - > https://github.com/podofo/podofo/blob/0e0ba2d387d6e8426cbf8f216fed7ef15a0db3e5/src/podofo/private/PdfParser.cpp#L370 > > Locally we can determine if parsing can continue or we throw an > exception that won't be ignored. Also we should not throw > `PdfErrorCode::InvalidNumber` in determining token type in these two > places: > - > https://github.com/podofo/podofo/blob/0e0ba2d387d6e8426cbf8f216fed7ef15a0db3e5/src/podofo/main/PdfTokenizer.cpp#L301 > - > https://github.com/podofo/podofo/blob/0e0ba2d387d6e8426cbf8f216fed7ef15a0db3e5/src/podofo/main/PdfTokenizer.cpp#L314 > > `PdfTokenizer::DetermineDataType` should not throw in general, as > it's used by `Try<something>...` functions which now have the > semantics of not throwing in case of error. > > As usual I can't help but point out how little I get interested by > such code security issues (which are extremely boring). So, unless > a comprehensive solution with the above recommendations is contributed > by someone else, a fix will come before 1.0 but with less priority than > other stuff I want to finish earlier. > > Regards, > Francesco >
_______________________________________________ Podofo-users mailing list Podofo-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/podofo-users