[Podofo-users] PdfXRef rewriting

Francesco Pretto Mon, 24 Nov 2025 16:54:31 -0800

Hello,

I just wanted to inform you that an important update has just been
pushed: the PdfXRef class has been mostly rewritten[1] and cleaned up to
address a serious PoDoFo bug. Per ISO 32000-2:2020, section 7.5.4
"Cross-reference table": "For a PDF file that has never been
incrementally updated, the cross-reference section shall contain only
one subsection, whose object numbering begins at 0". PoDoFo was not
strictly following this requirement, and it sometimes produced
files that were unreadable[2] in Adobe Acrobat, especially when flat
saving files that had been incrementally updated several times and with
many deleted objects.

This change looked easy at first, but 1 planned working day turned out
to be at least 5, because the approach previously adopted in PdfXRef
used several insertions with manual sorting, plus brute-force iteration
in other places; in my evaluation, neither was readable or easy to
prove correct. It ended up being a large piece of work that touches
several places in the code and fixes a multitude of bugs related to the
handling of XRef sections. Since this is a sensitive change, I urge you
to have a look
and test it: you should notice better use of free object XRef
entries, which no longer grow in number at each save. PoDoFo unit
testing is still lacking, and the couple of tests I added are certainly
not enough to fill the gap. Still, a lot of testing is done under the
hood, and I hope more PDF writing unit tests will come in the future,
also based on any feedback received.
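
To make the quoted requirement concrete, here is a minimal sketch of a
classic cross-reference section emitted as a single subsection starting
at object 0, with gaps filled by free entries chained from object 0.
This is not the code from the commit and not PoDoFo's PdfXRef API: the
InUseObject type and the BuildSingleSubsectionXRef function are
hypothetical names, and giving every gap generation 65535 (never
reusable) is a simplifying assumption; a real writer would track the
actual generation of each deleted object.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical type for illustration only; not part of PoDoFo.
struct InUseObject {
    std::uint64_t offset;      // byte offset of the object in the file
    std::uint16_t generation;  // generation number of the object
};

// Builds "xref", one subsection header "0 <count>", and <count> entries
// of exactly 20 bytes each. Object 0 and every unused number become
// free ("f") entries linked into a list that ends by pointing at 0.
std::string BuildSingleSubsectionXRef(
    const std::map<std::uint32_t, InUseObject>& inUse)
{
    // The subsection covers 0 .. highest in-use object number.
    std::uint32_t count = inUse.empty() ? 1 : inUse.rbegin()->first + 1;

    // Object 0 is always free; so is every gap in the numbering.
    std::vector<std::uint32_t> freeNumbers;
    for (std::uint32_t n = 0; n < count; ++n) {
        if (n == 0 || inUse.find(n) == inUse.end())
            freeNumbers.push_back(n);
    }

    std::string out = "xref\n0 " + std::to_string(count) + "\n";
    char line[32];
    std::size_t freeIndex = 0;
    for (std::uint32_t n = 0; n < count; ++n) {
        auto it = inUse.find(n);
        if (n != 0 && it != inUse.end()) {
            std::snprintf(line, sizeof line, "%010llu %05u n \n",
                static_cast<unsigned long long>(it->second.offset),
                static_cast<unsigned>(it->second.generation));
        } else {
            // First field of a free entry: number of the next free
            // object; the last free entry links back to object 0.
            ++freeIndex;
            std::uint32_t next =
                freeIndex < freeNumbers.size() ? freeNumbers[freeIndex] : 0;
            // Assumption: generation 65535 marks the slot as not reusable.
            std::snprintf(line, sizeof line, "%010u %05u f \n",
                static_cast<unsigned>(next), 65535u);
        }
        out += line;
    }
    return out;
}
```

With in-use objects 1 and 3 only, the sketch yields the header "0 4"
and four entries: objects 0 and 2 are free and linked 0 -> 2 -> 0,
rather than being split into several subsections.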


I hope to have more exciting news for you soon.

Cheers,
Francesco

[1] https://github.com/podofo/podofo/commit/599de1b8cb01b7db55a196b6a4071d98963879cc
[2] https://github.com/podofo/podofo-resources/blob/master/TestFixInvalidCrossReferenceTable.pdf


_______________________________________________
Podofo-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/podofo-users
