On 08.03.24 at 13:34, Tilman Hausherr wrote:
On 08.03.2024 07:13, Tilman Hausherr wrote:
regression test result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_3.0.1_vs_3.0.2.tar.xz
Thanks for running the regression tests.
Re exceptions:
- The OOM can't be reproduced
- The two others are related to the zip bomb protection and (probably) a
recent change (PDFBOX-5704)
I've found a solution for that case, see PDFBOX-5783
Andreas
Re text extraction:
commoncrawl3/TQ/TQVMNMW5ACPU3CZL46OBNGWMPSSXC5MO: that file is a mess
anyway
commoncrawl3/Y2/Y2PVHNL43FBNKZRAJTSX5J5BLLHMCNLY: same
bug_trackers/pdf.js/pdf.js-11651-0.pdf: might be related to the
exception I mentioned, the stack trace looks similar. The result is that
a broken font is no longer replaced. It can be fixed by catching the
exception when fontFile.createView() is called in PDFontFactory and
returning null.
bug_trackers/poppler-gitlab/poppler-748-0.tgz-1.pdf: messy file. But
there is an NPE on page 2 that can be fixed easily.
commoncrawl3/JP/JPO3LX6ABADSDNC5BIX3KZJBRFT5BIEQ: messy file
commoncrawl3/4L/4L2UKWSZNPXPSGS3OTXQZBBKJH6XF7G4: same
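
For the pdf.js-11651-0.pdf case, the catch-and-return-null idea could look
roughly like the sketch below. This is not the actual PDFBox code: FontFile is
a hypothetical stand-in for the embedded font stream, openViewOrNull is a
made-up helper name, and IOException is only an assumption about which
exception is thrown. The point is just that a failing createView() is reported
as "no font file", so the caller can fall back to a replacement font.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FontViewSketch {

    /** Hypothetical stand-in for the embedded font stream; not the PDFBox type. */
    interface FontFile {
        InputStream createView() throws IOException;
    }

    /**
     * Returns a view of the font file, or null if the view cannot be created,
     * so that the caller can treat the font as missing and substitute one.
     */
    static InputStream openViewOrNull(FontFile fontFile) {
        try {
            return fontFile.createView();
        } catch (IOException e) {
            // Assumed exception type: a broken stream is reported as "no font file".
            System.err.println("Could not read embedded font: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        // A readable font stream and one that fails while being opened.
        FontFile good = () -> new ByteArrayInputStream(new byte[] {0, 1, 0, 0});
        FontFile broken = () -> { throw new IOException("simulated decode failure"); };

        System.out.println("good view present: " + (openViewOrNull(good) != null));
        System.out.println("broken view is null: " + (openViewOrNull(broken) == null));
    }
}
```

Whether null or an explicit fallback object is the better signal depends on
how the calling code already handles a missing font file.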
Tilman
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org