[
https://issues.apache.org/jira/browse/PDFBOX-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr closed PDFBOX-4769.
-----------------------------------
Resolution: Not A Bug
> Problem pdf version 1.4
> -----------------------
>
> Key: PDFBOX-4769
> URL: https://issues.apache.org/jira/browse/PDFBOX-4769
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.17
> Environment: java, maven,
> Reporter: NathanJ
> Priority: Blocker
>
> Here is my problem. I have to read PDF files and I decided to use PDFBox. I'm
> using the following code to read my file line by line and run some actions
> on each line:
> File tempFile = new File("myPdfFile");
> try (PDDocument document = PDDocument.load(tempFile)) {
>     if (!document.isEncrypted()) {
>         // Created and configured, but not used for the extraction below.
>         PDFTextStripperByArea stripper = new PDFTextStripperByArea();
>         stripper.setSortByPosition(true);
>         PDFTextStripper tStripper = new PDFTextStripper();
>         String pdfFileInText = tStripper.getText(document);
>         // Split on both Windows ("\r\n") and Unix ("\n") line endings.
>         String lines[] = pdfFileInText.split("\\r?\\n");
>     }
> }
> For a PDF in format version 1.7, everything works well. But sometimes I have
> to work with PDF version 1.4, and then there is a problem: the
> PDFTextStripper is unable to read the PDF, and my "pdfFileInText" variable
> gets the value "\r\n\r\n" and nothing else.
>
> I didn't find any solutions on the web.
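
The symptom described above (extracted text consisting only of "\r\n\r\n") can at least be detected before the per-line processing starts. The following is a minimal sketch that uses only the JDK; the class and method names are hypothetical and are not part of PDFBox. It does not explain why the stripper returned no text, which the ticket does not state.

```java
import java.util.Arrays;

final class ExtractedTextCheck {

    // Splits extracted text into lines, accepting "\r\n" and "\n" endings.
    // String.split drops trailing empty strings, so text made only of
    // line breaks yields an empty array.
    static String[] splitLines(String text) {
        return text.split("\\r?\\n");
    }

    // True when the text holds nothing but whitespace and line breaks.
    static boolean isBlank(String text) {
        return text.trim().isEmpty();
    }

    public static void main(String[] args) {
        String normal = "first line\r\nsecond line\n";
        String reported = "\r\n\r\n"; // the value quoted in the report

        System.out.println(Arrays.toString(splitLines(normal)));
        System.out.println(splitLines(reported).length); // 0 lines to process
        System.out.println(isBlank(reported));           // true
    }
}
```

Checking isBlank on the extracted string before splitting makes the "nothing was extracted" case explicit instead of silently looping over zero lines.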
--
This message was sent by Atlassian Jira
(v8.3.4#803005)