[ https://issues.apache.org/jira/browse/PDFBOX-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673299#comment-13673299 ]
Maruan Sahyoun commented on PDFBOX-1502:
----------------------------------------

Hi,

as far as I can see, the text extraction works as expected. Text extraction is meant to extract the boilerplate text, not the field values. This is similar to what Adobe Reader does: if you save the filled-out form as text, you will not get the field values either.

So from my perspective the software works as designed and in line with what Adobe Reader does.

BR
Maruan

> Not Extracting Text from PDF Document
> -------------------------------------
>
>                 Key: PDFBOX-1502
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1502
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator, 1.7.1, 1.8.0
>         Environment: Mac OS, jdk 1.7
>            Reporter: deepak
>            Assignee: Andreas Lehmkühler
>         Attachments: PDFBOX1502-RenewalAdvice.txt, Renewal_Advice_Edited_Extracted_Text.txt, Renewal_Advice_Edited.pdf, Renewal Advice .pdf
>
>
> PDDocument document = PDDocument.load(Inputstream);
> PDFTextStripper stripper = new PDFTextStripper();
> stripper.getText(document) is not returning some text content in the attached PDF Document. It is just returning the form fields but the values are empty. The bug is reproducible both in 1.8.0-Snapshot and 1.7.1 codebase.
> Please help in resolving the issue