I've been trying to use PoDoFo to extract accessible text from PDFs.

I have some PDF documents that are tagged for accessibility and show as
tagged in Adobe Reader's document properties (Tagged: Yes), but
PoDoFo::PdfDocument::GetStructTreeRoot returns null for them. My (limited)
understanding of ISO 32000 is that the tagged text all lives under
StructTreeRoot.

When I open one of the problem PDFs in PoDoFoBrowser, it shows a reference
for StructTreeRoot, but the referenced node can't be expanded to show its
contents. The StructTreeRoot node can be expanded in other tagged PDFs.

An example of a problem document is:

http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf

Any pointers or suggestions would be gratefully accepted.

Regards

Mark Rogers - mark.rog...@electrum.co.uk

 

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
