Re: [Podofo-users] GetStructTreeRoot returns null on some tagged PDF documents?

2009-06-23 Thread Mark Rogers
There seems to be some sort of tagged text in there:
- the Read Out Loud feature of Adobe Reader does a good job of reading out the document and synchronising the reading to highlighted text on the document
- the online PDF to HTML converter at Adobe gets all the document structure right (includ

Re: [Podofo-users] GetStructTreeRoot returns null on some tagged PDF documents?

2009-06-23 Thread Dominik Seichter
According to the document catalog, the StructTreeRoot is in object 74 0 R, which is missing from this PDF. Maybe Acrobat Reader just checks whether there is a StructTreeRoot entry in the catalog when it reports that a document is tagged. But from my understanding there is no StructTreeRoot dictionary in thi
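
For anyone who wants to repeat that kind of check by hand, here is a rough sketch in plain C++17 (no PoDoFo involved). It looks for the indirect reference after /StructTreeRoot in the raw file text and then tests whether an object with that number is defined anywhere. The names ObjRef, findStructTreeRootRef and hasObjectDefinition are made up for this illustration and are not part of PoDoFo. It also assumes the catalog and the target object are stored uncompressed; objects kept inside object streams (PDF 1.5 and later) will not be found by a text scan like this, so a negative result is only a hint.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper type: an indirect reference such as "74 0 R".
struct ObjRef {
    int number;
    int generation;
};

// Finds the first "/StructTreeRoot N G R" entry in the raw PDF text.
// Returns std::nullopt when no such entry is present.
std::optional<ObjRef> findStructTreeRootRef(const std::string& pdf) {
    static const std::regex re(R"(/StructTreeRoot\s+(\d+)\s+(\d+)\s+R\b)");
    std::smatch m;
    if (!std::regex_search(pdf, m, re)) {
        return std::nullopt;
    }
    return ObjRef{std::stoi(m[1].str()), std::stoi(m[2].str())};
}

// Reports whether "N G obj" appears in the raw PDF text.
// The leading guard stops "174 0 obj" from matching a search for object 74.
bool hasObjectDefinition(const std::string& pdf, ObjRef ref) {
    assert(ref.number >= 0 && ref.generation >= 0);
    const std::regex re("(^|[^0-9])" + std::to_string(ref.number) + "\\s+" +
                        std::to_string(ref.generation) + "\\s+obj\\b");
    return std::regex_search(pdf, re);
}
```

If findStructTreeRootRef returns a reference but hasObjectDefinition is false for it, the file is in the situation described above: the catalog points at a structure tree root object that is not there (or at least not stored as a plain, uncompressed object).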

[Podofo-users] GetStructTreeRoot returns null on some tagged PDF documents?

2009-06-23 Thread Mark Rogers
I've been trying to use PoDoFo to extract accessible text from PDFs. I have some PDF documents tagged for accessibility which show as tagged in Adobe Reader properties (Tagged: Yes), but PoDoFo::PdfDocument::GetStructTreeRoot returns null. My (limited) understanding of ISO 32000 is that the ta

[Podofo-users] Large pdf documents ( >5,000 pages )

2009-06-23 Thread Bart McDonald
Hey everyone, I have been challenged to create a PDF document that has 200,000 or more pages in it. I have used both PdfMemDocument and PdfStreamedDocument, but both seem to slow down more and more until they finally crash (at around 4,500 pages) without producing any PDF output. Is there a way to flus

[Podofo-users] Vista 64

2009-06-23 Thread Bart McDonald
Hey everyone, I have a Vista 64 machine and a Windows XP 32 machine. I am using the same DLL (which has the PoDoFo API) on both systems. The resulting PDF documents differ quite a bit. I suspect it is the font resources or something. Has anyone seen the same problem or have any id