Hi,

[email protected] wrote:
Hello,
I am trying to do the PDF to Text conversion in the Websphere Environment using RAD7.

Please see below for the code snippet.

I keep getting the error below, along with an empty output file.

Jan 20, 2010 3:18:45 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: BDC
Jan 20, 2010 3:18:45 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: g
This is just logging output. During text extraction, some operators that
aren't needed are disabled for performance reasons; others aren't yet
supported by PDFBox. In general, all operators needed for a successful text
extraction should be supported.
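
If the INFO lines are just noise for you, raising the log level for the PDFBox packages should hide them. The following is a minimal sketch, assuming the messages end up in java.util.logging (the timestamp format in your output suggests that, but WebSphere may route logging differently). The class and method names are made up for illustration; only the logger name "org.apache.pdfbox" is taken from the package in your log output.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an otherwise unreferenced logger can be lost.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    // Hypothetical helper: hide INFO and below for everything under org.apache.pdfbox.
    public static Logger silenceInfoMessages() {
        PDFBOX_LOGGER.setLevel(Level.WARNING);
        return PDFBOX_LOGGER;
    }

    public static void main(String[] args) {
        silenceInfoMessages();
        // Child loggers without their own level inherit it from the parent logger.
        Logger child = Logger.getLogger("org.apache.pdfbox.util.PDFStreamEngine");
        System.out.println("INFO loggable: " + child.isLoggable(Level.INFO));
        System.out.println("WARNING loggable: " + child.isLoggable(Level.WARNING));
    }
}
```

This only changes what gets logged. It has no effect on the extraction result, so it won't fix the empty output file.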

                        int ITERATIONS = 10;
  ....SNIP
Your code seems to be OK, but I have one question: why do you iterate 10 times
over the text extraction part? The problem is probably that you try to extract
the text multiple times without reinitializing the PDF parser.
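
To illustrate what I mean, here is a small stand-in that uses only the JDK. It is not PDFBox code: the "parser" is just a plain InputStream and all names are hypothetical. It only shows the general effect I suspect, namely that a source which has already been consumed yields nothing on later passes unless it is opened again in each iteration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReinitPerIteration {
    // Hypothetical stand-in for "parse and extract": drains the stream into a String.
    static String extract(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Opens the source once and reuses it; returns the result of the last pass.
    public static String lastResultSharedSource(byte[] data, int iterations) {
        try (InputStream shared = new ByteArrayInputStream(data)) {
            String last = "";
            for (int i = 0; i < iterations; i++) {
                last = extract(shared); // already drained after the first pass
            }
            return last;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reopens the source in every iteration; returns the result of the last pass.
    public static String lastResultFreshSource(byte[] data, int iterations) {
        String last = "";
        for (int i = 0; i < iterations; i++) {
            try (InputStream fresh = new ByteArrayInputStream(data)) {
                last = extract(fresh);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return last;
    }

    public static void main(String[] args) {
        byte[] data = "some page text".getBytes(StandardCharsets.UTF_8);
        System.out.println("shared: [" + lastResultSharedSource(data, 10) + "]");
        System.out.println("fresh:  [" + lastResultFreshSource(data, 10) + "]");
    }
}
```

If your loop writes the output file in every pass, an empty last pass would overwrite a good first result. That would match the empty file you are seeing, but I can't tell for sure without the snipped part of your code.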

I've successfully extracted the text from the PDF reference using the
ExtractText class [1] that ships with PDFBox.
Thanks,
_____________________________________________
Varma Padmaraju
BR
Andreas Lehmkühler

[1] http://svn.apache.org/repos/asf/pdfbox/trunk/src/main/java/org/apache/pdfbox/ExtractText.java
