The patch is in the SVN repository. Update your workspace from the SVN location
http://svn.apache.org/repos/asf/incubator/pdfbox/trunk/
and then build the project using Ant.

-----Original Message-----
From: Abid Hussain [mailto:[email protected]]
Sent: Wednesday, January 21, 2009 9:44 AM
To: [email protected]
Subject: Re: extract images

Thanks for the help. Where can I find the provided patch? I looked in JIRA but didn't find anything. Maybe I have overlooked something?

Regards,
Abid

[email protected] wrote:
> Abid,
>
> This bug may be the same bug that was just patched.
> The line of code it is blowing up on is the same as in another bug report:
> "RE: java.io.EOFException: Unexpected end of ZLIB input stream"
>
> Please get the patch that Andreas talks about and try that.
>
> Good luck,
> Peter
>
>
> Hi Peter,
>
> I've checked all critical locations in org.apache.pdfbox.filter.FlateFilter
> and provided a patch.
>
> Thank you for your help.
>
> BR
> Andreas
>
> [email protected] wrote:
>> I forgot to add the number of bytes available in the variable mayRead
>> to the while statement in the earlier message. Version 2 is below.
>>
>>
>> int mayRead=compressedData.available(); // pjl
>> while ((mayRead > 0 &&
>> (amountRead = decompressor.read(buffer, 0,
>> Math.min(mayRead,BUFFER_SIZE))) != -1))
>>
>> -----Original Message-----
>> From: Lenahan, Peter
>> Sent: Friday, January 16, 2009 10:26 AM
>> To: [email protected]
>> Subject: RE: java.io.EOFException: Unexpected end of ZLIB input stream
>> error message on UNIX box
>>
>> I did a Google search on your issue. There are a couple of solutions.
>> InflaterInputStream read Unexpected end of ZLIB
>> It came up with: Results 1 - 10 of about 854
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4040920
>>
>> Work Around
>> The workaround is to never attempt to read more bytes than the entry
>> contains. Call ZipEntry.getSize() to get the actual size of the entry,
>> then use this value to keep track of the number of bytes remaining in
>> the entry while reading from it.
>> To take the previous example:
>>
>> This code change may solve the issue for PDFBox.
>>
>> at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
>> Add the Math.min() to reduce the number of bytes you are trying to read.
>>
>> int mayRead=compressedData.available();
>> while ((amountRead = decompressor.read(buffer, 0,
>> Math.min(mayRead,BUFFER_SIZE))) != -1)
>>
>>
>> I found another potential issue like this with a solution on the Sun
>> site.
>> It was described using Windows, but the same could happen on UNIX.
>> It suggests that the issue could happen if you are running several
>> processes against the same directory. Please look this over to see if
>> this is the problem. Are you running multiple processes to accomplish
>> the job faster?
>>
>> http://forums.sun.com/thread.jspa?threadID=5316308
>>
>> paul.miner
>> Posts:2,639
>> Registered: 10/8/07
>> Re: Unexpected end of ZLIB input stream error while compiling
>> Jul 22, 2008 6:54 AM (reply 1 of 2) (In reply to original post )
>>
>> koko191 wrote:
>> Main batch :
>> start /B %SWIFT_LOCAL_HOME%\scripts\rmicAll.bat
>> start /B %SWIFT_LOCAL_HOME%\scripts\create_jar.bat
>>
>> The "start" command does not wait for the command to finish, so both
>> those batch files would be running in parallel. If they both work on
>> the same jar, this could be a problem.
>>
>> If you want to run the batch files in sequence, use "call".
>>
>> -----Original Message-----
>> From: Balasubramaniam, Balaji
>> [mailto:[email protected]]
>> Sent: Tuesday, January 13, 2009 7:05 PM
>> To: [email protected]
>> Subject: java.io.EOFException: Unexpected end of ZLIB input stream
>> error message on UNIX box
>>
>> Hello,
>>
>> I'm trying to use PDFBox to identify whether a PDF file is corrupted.
>> We are trying to automate a process that loops
>> through a given folder and counts how many of the PDF files are
>> corrupted.
>> This program works fine in a Windows XP environment (OS
>> Version: x86 Windows XP 5.1, Java version:
>> Java HotSpot(tm) Client VM 1.5.0-15-b04). When we ran this
>> application on a UNIX box (OS Version: PA_RISC2.0 HP-UX B.11.23, Java
>> Version: Java
>> HotSpot(tm) Client VM 1.5.0.11 jinteg:11.07.07-09:52 PA2.0(aCC_AP)) it
>> throws the following error.
>>
>> NOTE: This error does not happen all the time. It is thrown
>> only for some of the PDF files. Those PDF files are not
>> corrupted; I can open them manually and they open fine.
>>
>> java.io.EOFException: Unexpected end of ZLIB input stream
>> at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:216)
>> at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:134)
>> at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
>> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:290)
>> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:235)
>> at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:170)
>> at org.pdfbox.pdmodel.common.COSStreamArray.getUnfilteredStream(COSStreamArray.java:200)
>> at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:101)
>> at ProcessDefinitions.RunAuditProcess.RunAuditProcessGenerateAuditLogMessage.invoke(RunAuditProcessGenerateAuditLogMessage.java:212)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:585)
>> at com.tibco.plugin.java.JavaActivity.eval(JavaActivity.java:383)
>> at com.tibco.pe.plugin.Activity.eval(Activity.java:209)
>> at com.tibco.pe.core.TaskImpl.eval(TaskImpl.java:540)
>> at com.tibco.pe.core.Job.a(Job.java:712)
>> at com.tibco.pe.core.Job.k(Job.java:501)
>> at com.tibco.pe.core.JobDispatcher$JobCourier.a(JobDispatcher.java:249)
>> at com.tibco.pe.core.JobDispatcher$JobCourier.run(JobDispatcher.java:200)
>>
>> Sample code snippet I use to do the task.
>>
>> PDDocument document = PDDocument.load(<input stream>);
>> List pages = document.getDocumentCatalog().getAllPages();
>> if (pages != null && pages.size() > 0) {
>>     PDPage page = (PDPage)pages.get(i);
>>     PDStream contents = page.getContents();
>>     PDFStreamParser parser = null;
>>     try {
>>         parser = new PDFStreamParser(contents.getStream());
>>     } catch(Exception e) {
>>         System.err.println("This PDF cannot be read. Most possibly it could be corrupted. " + pdfFileName);
>>     }
>> }
>>
>> Could somebody shed some light on this one?
>>
>> Thank you.
>>
>
>
> --
> The packaging said "requires Windows 9x/2000/XP or BETTER", so I installed Linux.
>
>
> -----Original Message-----
> From: Abid Hussain [mailto:[email protected]]
> Sent: Tuesday, January 20, 2009 6:17 AM
> To: [email protected]
> Subject: extract images
>
> Hello everybody,
>
> I'm trying to extract images from a PDF file, which won't work... :-(
>
> I tried ExtractImages.exe, which results in:
> >ExtractImages.exe "C:\path\to\pdf_file"
> Exception in thread "main" java.lang.NullPointerException
> at org.pdfbox.ExtractImages.extractImages(ExtractImages.java:138)
> at org.pdfbox.ExtractImages.main(ExtractImages.java:72)
>
> Then I tried to extract the images using code I copied from the ExtractImages class.
> Here's a snippet:
> PDXObjectImage image = (PDXObjectImage) images.get(key);
> String name = getUniqueFileName(key, image.getSuffix());
> image.write2file(name);
>
> The execution of the last line results in:
> java.util.zip.ZipException: unknown compression method
> at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:140)
> at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:110)
> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:290)
> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:235)
> at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:170)
> at org.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:226)
> at org.pdfbox.pdmodel.common.PDStream.getByteArray(PDStream.java:481)
> at org.pdfbox.pdmodel.graphics.xobject.PDPixelMap.getRGBImage(PDPixelMap.java:138)
> at org.pdfbox.pdmodel.graphics.xobject.PDPixelMap.write2OutputStream(PDPixelMap.java:166)
> at org.pdfbox.pdmodel.graphics.xobject.PDXObjectImage.write2file(PDXObjectImage.java:118)
> at de.thecode.pdf.pdfbox.ExtractImages.extractImages(ExtractImages.java:52)
> at de.thecode.pdf.pdfbox.ExtractImages.main(ExtractImages.java:30)
>
> Does anybody know how to get the image extraction to work correctly?
>
> Best regards,
>
> Abid
>
--
Abid Hussain
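
For anyone following the "Unexpected end of ZLIB input stream" part of this thread, here is a minimal sketch of one way to tolerate a truncated Flate stream using only java.util.zip. It is not the patch committed to FlateFilter and not PDFBox code. The class name TolerantInflate, the method names and the buffer size are all made up for illustration. Unlike the Math.min(mayRead, BUFFER_SIZE) suggestion quoted above, it does not depend on available(), which only reports what can be read without blocking. It simply keeps whatever was decoded before the stream ended.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper for illustration; not part of PDFBox.
final class TolerantInflate {
    private static final int BUFFER_SIZE = 2048; // arbitrary size for this sketch

    // Inflates a zlib stream. If the input ends early, the bytes decoded
    // so far are returned instead of propagating the EOFException.
    static byte[] inflateLeniently(InputStream compressedData) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] buffer = new byte[BUFFER_SIZE];
        try (InflaterInputStream decompressor = new InflaterInputStream(compressedData)) {
            int amountRead;
            while ((amountRead = decompressor.read(buffer, 0, BUFFER_SIZE)) != -1) {
                result.write(buffer, 0, amountRead);
            }
        } catch (EOFException e) {
            // "Unexpected end of ZLIB input stream": keep the partial output.
        }
        return result.toByteArray();
    }

    // Compresses data with zlib; used here only to build demo input.
    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(data);
        }
        return out.toByteArray();
    }
}
```

To try it, compress some bytes with deflate(), cut the result in half, and pass both versions to inflateLeniently() wrapped in a ByteArrayInputStream. The intact input should round-trip, and the truncated one should yield a prefix of the original data. Other failures, such as the "unknown compression method" ZipException from the image extraction report, are deliberately not caught here.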
