I've been trying to use PoDoFo to extract accessible text from PDFs.
I have some PDF documents that are tagged for accessibility and show as
tagged in Adobe Reader's document properties (Tagged: Yes), but
PoDoFo::PdfDocument::GetStructTreeRoot returns null for them. My (limited)
understanding of ISO 32000 is
that the tagged text all lives under StructTreeRoot.
When I open one of the problem PDFs in PoDoFoBrowser it shows a reference
for StructTreeRoot, but the referenced node can't be expanded to show its
contents. The StructTreeRoot node can be expanded in other tagged PDFs.
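In case it helps narrow things down, here is a rough sketch of a check
that does not use PoDoFo at all. It scans the raw file bytes for a
"/StructTreeRoot N G R" entry and then looks for a matching "N G obj"
header in plain form. The names StructTreeRef, readInt, findStructTreeRef
and hasPlainObject are made up for illustration, and the check assumes the
catalog itself is stored uncompressed and that object headers are written
with single spaces. If the reference is found but the object header is
not, one possibility (only a guess on my part) is that the object sits
inside a compressed object stream, which PDF 1.5 and later allow.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Result of scanning raw PDF bytes for a "/StructTreeRoot N G R" entry.
// Hypothetical type, not part of PoDoFo.
struct StructTreeRef {
    bool found = false;  // true if the key was followed by an indirect reference
    long objNum = 0;     // object number N
    long genNum = 0;     // generation number G
};

// Hypothetical helper: skip whitespace at pos, then read an unsigned integer.
// Returns false (pos may have advanced past whitespace) if no digit follows.
static bool readInt(const std::string& s, std::size_t& pos, long& out) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    const std::size_t start = pos;
    long value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    if (pos == start) return false;
    out = value;
    return true;
}

// Find the first "/StructTreeRoot N G R" in the raw bytes.
// Occurrences such as "/Type /StructTreeRoot" are skipped because no
// indirect reference follows them.
StructTreeRef findStructTreeRef(const std::string& data) {
    static const std::string key = "/StructTreeRoot";
    StructTreeRef result;
    std::size_t pos = data.find(key);
    while (pos != std::string::npos) {
        std::size_t p = pos + key.size();
        long n = 0, g = 0;
        if (readInt(data, p, n) && readInt(data, p, g)) {
            while (p < data.size() && std::isspace(static_cast<unsigned char>(data[p]))) ++p;
            if (p < data.size() && data[p] == 'R') {
                result.found = true;
                result.objNum = n;
                result.genNum = g;
                return result;
            }
        }
        pos = data.find(key, pos + key.size());
    }
    return result;
}

// True if an "N G obj" header appears in plain (uncompressed) form.
// Assumes single spaces between the tokens, which is common but not required.
bool hasPlainObject(const std::string& data, long objNum, long genNum) {
    const std::string header =
        std::to_string(objNum) + " " + std::to_string(genNum) + " obj";
    std::size_t pos = data.find(header);
    while (pos != std::string::npos) {
        // Reject matches that are the tail of a larger number, e.g. "112 0 obj".
        if (pos == 0 || !std::isdigit(static_cast<unsigned char>(data[pos - 1]))) {
            return true;
        }
        pos = data.find(header, pos + 1);
    }
    return false;
}
```

To use it, read the whole PDF into a std::string in binary mode, call
findStructTreeRef on it, and if a reference is found pass its numbers to
hasPlainObject.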
An example of a problem document is:
http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf
Any pointers or suggestions would be gratefully accepted.
Regards
Mark Rogers - mark.rog...@electrum.co.uk
_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users