Hello Connor,

I'm the maintainer of pdfmm, a PoDoFo fork, and in that role I had to
evaluate what PoDoFo does about PDF linearization. I will try to
answer fairly and to the best of my knowledge. First, let's clarify
the main capabilities, among others, that PDF linearization should
enable. According to Annex F of PDF 32000-1:2008, linearization:
- allows a reader to "display the first page as quickly as possible"
(not necessarily page 0);
- when the user requests another page of an open document, allows the
reader to "display that page as quickly as possible".

PDF linearization as described by Annex F is implemented by
encapsulating the content of the document's first page in an
"incremental update"-like serialization that must be at the beginning
of the document, together with a "linearization dictionary" that
should be the first object of the document. The rest of the document
is appended after this fake "incremental update", where "the pages
shall be contiguous and shall be ordered by page number", "the objects
required to display that page shall be grouped together" and "the
order of objects referenced from the page object should facilitate
[...] incremental display of the page data as it arrives".
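
As a rough illustration of the layout above (a sketch of mine under
stated assumptions, not PoDoFo or pdfmm code): Annex F also requires
the linearization dictionary to lie entirely within the first 1024
bytes of the file, and its /L entry to equal the actual file length.
The helper names below are hypothetical, and the plain string search
is a crude stand-in for a real PDF tokenizer.

```cpp
#include <cstddef>
#include <string>

// PDF white-space characters (NUL, TAB, LF, FF, CR, SPACE).
static bool IsPdfSpace(char c)
{
    return c == '\0' || c == '\t' || c == '\n' ||
           c == '\f' || c == '\r' || c == ' ';
}

// Hypothetical helper: true if the head of the file appears to carry a
// linearization dictionary within its first 1024 bytes.
bool LooksLinearized(const std::string& head)
{
    const std::string window = head.substr(0, 1024);
    return window.find("/Linearized") != std::string::npos;
}

// Hypothetical helper: returns the integer that follows the /L key in
// the first 1024 bytes, or -1 if no such entry is found.
long long DeclaredFileLength(const std::string& head)
{
    const std::string window = head.substr(0, 1024);
    std::size_t pos = 0;
    while ((pos = window.find("/L", pos)) != std::string::npos) {
        std::size_t i = pos + 2;
        // The name must end right after "/L"; otherwise this is a
        // longer name such as /Linearized and we keep searching.
        if (i < window.size() && IsPdfSpace(window[i])) {
            while (i < window.size() && IsPdfSpace(window[i]))
                ++i;
            long long value = 0;
            bool anyDigit = false;
            while (i < window.size() &&
                   window[i] >= '0' && window[i] <= '9') {
                value = value * 10 + (window[i] - '0');
                anyDigit = true;
                ++i;
            }
            if (anyDigit)
                return value;
        }
        pos += 2;
    }
    return -1;
}

// A reader should only trust the linearization data when /L matches
// the real file size; a mismatch typically means the file was modified
// (for example by a real incremental update) after being linearized.
bool IsLinearizationUsable(const std::string& head, long long actualSize)
{
    return LooksLinearized(head) &&
           DeclaredFileLength(head) == actualSize;
}
```

A caller would read the first 1024 bytes of the file into a
std::string and pass them together with the file size to
IsLinearizationUsable(); on a false result the file should be treated
as an ordinary, non-linearized PDF.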

Let's distinguish between PDF linearization read support, meaning the
ability to exploit the organization of a linearized PDF document, and
write support, meaning the ability to create a compliant linearized
PDF. PoDoFo attempted to have linearization read support, but it was
disabled in 2009[1]. Moreover, just reading the document structure
(not the object content) performs a lot of seeks, which defeats the
purpose of linearization (I actually removed those seeks in pdfmm).

About PDF linearization write support, which I think you are most
interested in: PoDoFo appears to do some work related to linearization
in the PdfVecObjects class[2], but all the work related to creating
the linearization dictionary was disabled in PdfWriter[3] even
earlier, in 2007. There is also no sign of the needed fake incremental
update that contains the content of the first page.

My conclusion is that PoDoFo linearization support overall (read and
write) has always been incomplete at best, and almost certainly
broken enough to be disabled quite early in PoDoFo development.
Nothing about PDF linearization works now, and the leftover API that
seems to enable linearization is just code that has rotted (that's
why I decided to remove it completely in pdfmm). If someone decided to
revamp the PDF linearization support, I would recommend reading the
specification and starting from scratch rather than building on the
leftover code in PoDoFo, but that is weeks or months of full-time
work. Of course I would love to reintroduce it in pdfmm, where the
situation is much cleaner than in PoDoFo, but unfortunately that work
is not among my top priorities.

I hope I was factually correct about the current state of PoDoFo.
Other people may add further details or correct me if I was wrong.

Regards,
Francesco

[1] https://sourceforge.net/p/podofo/code/HEAD/tree/podofo/trunk/src/podofo/base/PdfParser.cpp#l300
[2] https://sourceforge.net/p/podofo/code/HEAD/tree/podofo/trunk/src/podofo/base/PdfVecObjects.cpp#l308
[3] https://sourceforge.net/p/podofo/code/HEAD/tree/podofo/trunk/src/podofo/base/PdfWriter.cpp#l274


On Thu, 17 Feb 2022 at 23:00, Connor Black <cbl...@iconect.com> wrote:
>
> Hey,
>
>
>
> I am currently evaluating this library for use in a commercial product and I
> was curious what linearization would look like using PoDoFo. I have spent
> the last couple of days looking through documentation and trying to look into
> how it would work, but the most I can grasp is that PdfWriter has the option
> to set linearization through SetLineralization – but I have not been able
> to find any examples or successfully use the PdfWriter class to produce these
> results. I was wondering if you could provide a little code snippet showing
> how PdfWriter would be used in that regard and maybe a breakdown of how
> linearization is supported using this library.
>
>
>
> Thanks,
>
> Connor Black
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users

