Re: PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException

Tilman Hausherr Sat, 07 Mar 2015 06:37:34 -0800

Sorry, wrong links, use these:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/1.8.9-SNAPSHOT/
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/


Tilman

On 07.03.2015 at 14:21, Tilman Hausherr wrote:

The best approach would be to test whether that file can be handled by newer versions of PDFBox (1.8.9 and 2.0):
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/1.8.9-SNAPSHOT/
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.0-SNAPSHOT/
Download the jar files, and for each one:

    - run java -jar <jarfile> ExtractText <yourfile>
    - see what happens
    - tell us the result
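
If there are several jars to try, the steps above can be scripted. This is only a rough sketch in Python, not part of PDFBox: the helper names build_command, try_jar and main are made up for illustration, and it assumes that "java" is on the PATH. It runs the same ExtractText command for each jar and prints the exit code and any error output, so the results can be pasted into a reply.

```python
import subprocess
import sys


def build_command(jar_path, pdf_path):
    """Build the same command line as in the steps above (hypothetical helper)."""
    return ["java", "-jar", jar_path, "ExtractText", pdf_path]


def try_jar(jar_path, pdf_path):
    """Run ExtractText with one jar and return (exit code, stderr text)."""
    result = subprocess.run(
        build_command(jar_path, pdf_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


def main(argv):
    # Expected call: python try_jars.py <yourfile> <jarfile> [<jarfile> ...]
    if len(argv) < 3:
        print("usage: try_jars.py <yourfile> <jarfile> [<jarfile> ...]")
        return 2
    pdf_path = argv[1]
    for jar_path in argv[2:]:
        code, errors = try_jar(jar_path, pdf_path)
        print(f"{jar_path}: exit code {code}")
        if errors:
            # A stack trace, if any, normally shows up on stderr.
            print(errors)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

For example, "python try_jars.py yourfile.pdf pdfbox-app-1.8.9.jar pdfbox-app-2.0.0.jar", where both jar names are placeholders for whatever files were actually downloaded. A non-zero exit code or a stack trace for one version but not the other is the interesting part to report.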

Your paste indicates a problem in RandomAccessBuffer.java.

Tilman

On 06.03.2015 at 21:05, [email protected] wrote:
Hello,
I am getting a PDFParser error, caused by: org.apache.pdfbox.exceptions.WrappedIOException
The complete stack trace is at the following link:
( http://apaste.info/DRD )
I am trying to import a 4 GB PDF into Solr using Tika. I was able to import files of up to 500 MB.
Please suggest a workaround, if there is one.

Thanks
G

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
---------------------------------------------------------------------
