PDFBox doesn't handle XFA, so you're on your own there, or you should buy a product that can (I think iText can do it).

XFA is XML-based, so once you have called getDocument() you need to inspect the XML you get back. The XFA specification is 1500 pages long.

If all the documents you want to handle have the same structure, then you might be able to get what you need without reading the specification.
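
A rough sketch of what "looking at the XML" could mean, using only the JDK's DOM API. It assumes the form's fields show up as elements with the local name "field" carrying a "name" attribute, which you should verify against your own documents. The class and method names (XfaFieldLister, parse, listFieldNames) are made up for illustration, and parse() is only a stand-in for the org.w3c.dom.Document you would get from getDocument(). It also assumes the Document was built by a namespace-aware parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XfaFieldLister {

    // Stand-in for the Document returned by getDocument(): parses an XML
    // string so the sketch can run without a PDF. Hypothetical helper.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Collects the "name" attribute of every element whose local name is
    // "field", in document order, whatever namespace it lives in.
    // Assumption: your XFA template marks fields this way.
    static List<String> listFieldNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList fields = doc.getElementsByTagNameNS("*", "field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (field.hasAttribute("name")) {
                names.add(field.getAttribute("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Tiny made-up sample, not a real XFA packet.
        String sample = "<xdp><template><subform name=\"form1\">"
                + "<field name=\"firstName\"/><field name=\"lastName\"/>"
                + "</subform></template></xdp>";
        for (String name : listFieldNames(parse(sample))) {
            System.out.println(name);
        }
    }
}
```

With a real PDF you would pass the Document from getDocument() straight to listFieldNames(). Where the filled-in values live is a separate question that you would have to work out from your own files or from the specification.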

Tilman

On 23.02.2019 at 02:55, Nick Westerly wrote:
Hi, my ultimate goal is to extract text data from PDF forms that use XFA. Is
it possible to use PDFBox to flatten PDFs with XFA forms (to simplify text
extraction)?

If not, can the fields themselves be easily parsed?

I see
https://stackoverflow.com/questions/14454387/pdfbox-how-to-flatten-a-pdf-form
which seems to say that XFA is not flattenable?

I see this class,
https://pdfbox.apache.org/docs/1.8.12/javadocs/org/apache/pdfbox/pdmodel/interactive/form/PDXFA.html,
once I call getDocument(), how can I get the fields (by name/type) and their contents?

Thanks!


