On 19.12.2016 at 20:34, Michael Ng wrote:
Hi everyone,
First of all, thanks to the PDFBox developers for their work.
I run into an OutOfMemoryError when rendering 600 dpi scanned PDF files.
Here are two sample one-page files (45 KB and 517 KB respectively):
https://drive.google.com/open?id=0B9p0h5LPh-3FQTZ0OGhoYkpMUjg
https://drive.google.com/open?id=0B9p0h5LPh-3FRUxhT0RJRl9FekE
The files are not large, but rendering either of them takes more than 256 MB
of memory and throws an error. If I tell the Java VM to allocate more memory
by passing the -Xmx parameter, the image is rendered successfully. (There
also seems to be no problem with files scanned at 300 dpi or below.)
I would like to know whether this is a known issue. Why would rendering of a
small one-page scanned PDF file take up so much memory?
Is there any way to get around this memory issue?
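This isn't really an issue - this is expected. Images are expanded
internally to an uncompressed raster, so a 600dpi image will use a lot of
space regardless of how small the compressed file is. A 600dpi image will
use 4 times as much space as a 300dpi image, because doubling the resolution
doubles both the width and the height in pixels.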
You don't really need 600dpi, unless you want high quality OCR. My
clients request 200 or 300dpi.
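To make the arithmetic concrete, here is a rough sketch of the estimate. It
assumes a US Letter page (8.5 x 11 inches) and 3 bytes per pixel (8-bit RGB);
the sample files may well differ on both counts. The class and method names
are made up for illustration and are not part of PDFBox.

```java
public class RasterMemoryEstimate {

    /**
     * Approximate size in bytes of one uncompressed raster for a page
     * scanned at the given resolution. Hypothetical helper, not PDFBox API.
     */
    public static long estimateBytes(double widthInches, double heightInches,
                                     int dpi, int bytesPerPixel) {
        long widthPx = Math.round(widthInches * dpi);
        long heightPx = Math.round(heightInches * dpi);
        return widthPx * heightPx * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Assumed page size: US Letter, 8-bit RGB (3 bytes per pixel).
        long at300 = estimateBytes(8.5, 11.0, 300, 3);
        long at600 = estimateBytes(8.5, 11.0, 600, 3);

        System.out.println("300 dpi raster: " + at300 + " bytes");
        System.out.println("600 dpi raster: " + at600 + " bytes");
        System.out.println("ratio: " + (at600 / at300));

        // Compare against the heap limit the JVM was started with (-Xmx).
        System.out.println("max heap: " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

Under these assumptions a single 600dpi raster is about 101 million bytes,
against about 25 million at 300dpi. The stack trace in the original message
shows DCTFilter.fromBGRtoRGB allocating a second raster of the same size, so
the peak during decoding is likely around twice the single-raster figure,
before counting the rendered page image itself.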
Tilman
The following is the error message:
Exception in thread "ViewerLoadDocumentThread" java.lang.OutOfMemoryError: Java heap space
    at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:92)
    at java.awt.image.ComponentSampleModel.createDataBuffer(ComponentSampleModel.java:445)
    at sun.awt.image.ByteInterleavedRaster.<init>(ByteInterleavedRaster.java:90)
    at sun.awt.image.ByteInterleavedRaster.createCompatibleWritableRaster(ByteInterleavedRaster.java:1281)
    at sun.awt.image.ByteInterleavedRaster.createCompatibleWritableRaster(ByteInterleavedRaster.java:1292)
    at org.apache.pdfbox.filter.DCTFilter.fromBGRtoRGB(DCTFilter.java:246)
    at org.apache.pdfbox.filter.DCTFilter.decode(DCTFilter.java:171)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:69)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:162)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:235)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.<init>(PDImageXObject.java:160)
    at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:70)
    at org.apache.pdfbox.pdmodel.PDResources.getXObject(PDResources.java:409)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:53)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:486)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:460)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:189)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:145)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:68)
Thanks in advance,
Michael Ng