The code below, running on Solaris with PDFBox 1.2.0, has extracted images
from PDF documents without problems. We recently started accepting files
from a new scanning source in our UK office, and these files cause the
program to throw a java.lang.Error. The code and the stack trace are
shown below.
Does anyone know of a valid workaround?
try {
    // Render the whole page to an image.
    image = page.convertToImage();
} catch (Exception eImage) {
    // Note: java.lang.Error (see the trace below) is not an Exception,
    // so this block is not entered for that failure.
    /* new extraction method: pull the embedded image directly */
    PDResources resources = page.getResources();
    Map images = resources.getImages();
    if (images != null) {
        // only expects one image per page
        Iterator imageIter = images.keySet().iterator();
        while (imageIter.hasNext()) {
            String key = (String) imageIter.next();
            PDXObjectImage imagePDX = (PDXObjectImage) images.get(key);
            image = imagePDX.getRGBImage();
        }
    }
    /* end new extraction method */
}
java.lang.Error: Error 5
    at com.sun.media.imageioimpl.plugins.tiff.TIFFFaxDecompressor.decodeT6(TIFFFaxDecompressor.java:1129)
    at com.sun.media.imageioimpl.plugins.tiff.TIFFFaxDecompressor.decodeRaw(TIFFFaxDecompressor.java:651)
    at com.sun.media.imageioimpl.plugins.tiff.TIFFCodecLibFaxDecompressor.decodeRaw(TIFFCodecLibFaxDecompressor.java:112)
    at com.sun.media.imageio.plugins.tiff.TIFFDecompressor.decode(TIFFDecompressor.java:2488)
    at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.decodeTile(TIFFImageReader.java:963)
    at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.read(TIFFImageReader.java:1240)
    at javax.imageio.ImageIO.read(ImageIO.java:1400)
    at javax.imageio.ImageIO.read(ImageIO.java:1322)
    at org.apache.pdfbox.pdmodel.graphics.xobject.PDCcitt.getRGBImage(PDCcitt.java:120)
    at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:74)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:567)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:250)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:208)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:112)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:718)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:689)
    at oz.ozReadBarcode.oz_pdf_process(ozReadBarcode.java:963)
    at oz.ozReadBarcode.oz_pdf_process_files(ozReadBarcode.java:1178)
    at oz.ozReadBarcode.main(ozReadBarcode.java:350)
Process exited with exit code 0.
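One observation on the trace above: the failure is a java.lang.Error, and Error is not a subclass of Exception, so a catch (Exception ...) block would not intercept it. The sketch below illustrates that point using only the standard library. It is not a fix for the decoder itself, and every name in it is hypothetical: decodePrimary stands in for page.convertToImage() and decodeFallback for the per-image extraction. No PDFBox calls are made.

```java
// Hypothetical stand-in for the extraction step; names are illustrative only.
class FallbackDemo {

    // Simulates a decoder that fails with java.lang.Error, as in the trace
    // (assumption: the real failure surfaces the same way).
    static String decodePrimary(boolean corrupt) {
        if (corrupt) {
            throw new Error("Error 5");
        }
        return "primary";
    }

    // Placeholder for an alternative extraction path.
    static String decodeFallback() {
        return "fallback";
    }

    static String render(boolean corrupt) {
        try {
            return decodePrimary(corrupt);
        } catch (Exception e) {
            // Not reached for java.lang.Error.
            return "caught-exception";
        } catch (Error e) {
            // Catching Error is normally discouraged; rethrow JVM-level
            // failures such as OutOfMemoryError rather than hiding them.
            if (e instanceof VirtualMachineError) {
                throw e;
            }
            return decodeFallback();
        }
    }

    public static void main(String[] args) {
        System.out.println(render(false)); // prints "primary"
        System.out.println(render(true));  // prints "fallback"
    }
}
```

Whether a fallback path would succeed on the same image data is a separate question; the sketch only shows how the Error could be intercepted so that one can be attempted.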
Glenn Hirshon
Och-Ziff
9 W 57th St, 13th Fl.
NY, NY 10019
646.562.4583
646.562.4683 F