[ https://issues.apache.org/jira/browse/PDFBOX-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-3960.
-----------------------------------
    Resolution: Not A Bug

> Spaces are ignored when reading text using TextPosition
> -------------------------------------------------------
>
>                 Key: PDFBOX-3960
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3960
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Text extraction
>    Affects Versions: 1.8.13
>            Reporter: Rampradeep B
>         Attachments: spark1.pdf
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Hi,
> The following code snippet should print each character from the PDF
> (including spaces as characters). For some PDFs, however, the returned
> text is missing the spaces between words.
> public class PDFTextStripperProcessor extends PDFTextStripper {
>     @Override
>     public void processTextPosition( TextPosition text ) {
>         System.out.println( text.getCharacter() );
>     }
> }
> TextPosition getCharacter() does not return the spaces between words, so
> the text I get has no spaces. Please provide a solution to avoid the
> missing spaces.
> The input PDF is attached.
> Thanks,
> Rampradeep B