Hi, I have tested PDFBox 1.6.0. The text extraction feature does not work with PDF forms designed in Adobe LiveCycle.
The output text file contains only this:

"To view the full contents of this document, you need a later version of the PDF viewer. You can upgrade to the latest version of Adobe Reader from www.adobe.com/products/acrobat/readstep2.html For further support, go to www.adobe.com/support/products/acrreader.html"

I have tried several other open source tools and they all have the same problem; as far as I can see, none of them can decode this kind of content. Here is the explanation I found:

"PDFs created in LiveCycle Designer aren't the same as PDFs created in Acrobat or third-party tools. They're really a different format entirely. They contain an XDP (Acrobat XML Data Package File) payload within a PDF wrapper."

Please advise, thank you.

Regards,
Wu, Han
Programmer Analyst

UMDNJ-ICPH
Dept. of Microbiology and Molecular Genetics
ICPH-E250R
Phone: 973-972-9005 (Tue, Thur)

UMDNJ-NJMS
Office of Research and Sponsored Programs
MSB-C690
Phone: 973-972-5237 (Mon, Wed)

