[
https://issues.apache.org/jira/browse/PDFBOX-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr updated PDFBOX-3768:
------------------------------------
Labels: optimization (was: )
> Optimize SampledImageReader.from1Bit()
> --------------------------------------
>
> Key: PDFBOX-3768
> URL: https://issues.apache.org/jira/browse/PDFBOX-3768
> Project: PDFBox
> Issue Type: Improvement
> Components: Rendering
> Affects Versions: 2.0.5
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Labels: optimization
> Fix For: 2.0.6, 3.0.0
>
>
> The from1Bit() path passes a raster to {{colorSpace.toRGBImage(raster)}},
> which creates an RGB BufferedImage. For scanned images this means a large
> memory footprint.
> I tried to optimize this by creating a smaller BufferedImage directly from the
> raster. Instead of calling {{colorSpace.toRGBImage(raster)}}, which copies the
> raster into an RGB image, I did this:
> {code}
> byte[] indexedValues = new byte[] { 0, (byte)0xFF };
> ColorModel colorModel = new IndexColorModel(1, 2, indexedValues,
> indexedValues, indexedValues);
> return new BufferedImage(colorModel, raster, false, null);
> {code}
> Sadly, this resulted in a bigger memory footprint.
> Lowest possible -Xmx setting to convert a file with 300 dpi A4 scans: 76m
> Lowest possible -Xmx setting with the optimization: 123m
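
For a sense of scale, here is a rough sketch of the raw raster sizes for a single A4 page at 300 dpi. The class and method names are made up for illustration, and the numbers cover only the pixel data. They do not account for the rest of the heap, so they are not expected to add up to the -Xmx values above. The 4 bytes per pixel case corresponds to an int-packed image such as the DataBufferInt in the stack trace that follows.

```java
/** Back-of-the-envelope raster sizes for one A4 page scanned at 300 dpi. */
class ScanFootprint {
    /** Pixels along one edge: millimetres -> inches -> pixels, rounded. */
    static long pixels(double millimetres, int dpi) {
        return Math.round(millimetres / 25.4 * dpi);
    }

    /** Bytes for a raster that stores a whole number of bytes per pixel. */
    static long bytes(long width, long height, int bytesPerPixel) {
        return width * height * bytesPerPixel;
    }

    /** Bytes for a 1-bit raster whose rows are padded to whole bytes. */
    static long packedBytes(long width, long height) {
        return ((width + 7) / 8) * height;
    }

    public static void main(String[] args) {
        long w = pixels(210, 300); // A4 is 210 mm wide
        long h = pixels(297, 300); // and 297 mm high
        System.out.println(w + " x " + h + " pixels");
        System.out.println("1 bit/pixel, packed: " + packedBytes(w, h) + " bytes");
        System.out.println("1 byte/pixel (gray): " + bytes(w, h, 1) + " bytes");
        System.out.println("3 bytes/pixel (RGB): " + bytes(w, h, 3) + " bytes");
        System.out.println("4 bytes/pixel (int): " + bytes(w, h, 4) + " bytes");
    }
}
```

In round numbers that is about 1.1 MB packed, 8.7 MB at one byte per pixel, and 26 to 35 MB for an RGB copy, depending on whether it uses 3 or 4 bytes per pixel.
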
> The stack trace suggests that Java copies the image to an RGB image:
> {code}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at java.awt.image.DataBufferInt.<init>(Unknown Source)
> at java.awt.image.Raster.createPackedRaster(Unknown Source)
> at java.awt.image.DirectColorModel.createCompatibleWritableRaster(Unknown Source)
> at java.awt.image.BufferedImage.<init>(Unknown Source)
> at sun.java2d.loops.GraphicsPrimitive.convertFrom(Unknown Source)
> at sun.java2d.loops.GraphicsPrimitive.convertFrom(Unknown Source)
> at sun.java2d.loops.MaskBlit$General.MaskBlit(Unknown Source)
> at sun.java2d.loops.Blit$GeneralMaskBlit.Blit(Unknown Source)
> at sun.java2d.pipe.DrawImage.blitSurfaceData(Unknown Source)
> at sun.java2d.pipe.DrawImage.renderImageCopy(Unknown Source)
> at sun.java2d.pipe.DrawImage.copyImage(Unknown Source)
> at sun.java2d.pipe.DrawImage.copyImage(Unknown Source)
> at sun.java2d.pipe.ValidatePipe.copyImage(Unknown Source)
> at sun.java2d.SunGraphics2D.copyImage(Unknown Source)
> at sun.java2d.pipe.DrawImage.makeBufferedImage(Unknown Source)
> at sun.java2d.pipe.DrawImage.renderImageXform(Unknown Source)
> at sun.java2d.pipe.DrawImage.transformImage(Unknown Source)
> at sun.java2d.pipe.DrawImage.transformImage(Unknown Source)
> at sun.java2d.pipe.DrawImage.transformImage(Unknown Source)
> at sun.java2d.pipe.ValidatePipe.transformImage(Unknown Source)
> at sun.java2d.SunGraphics2D.drawImage(Unknown Source)
> at org.apache.pdfbox.rendering.PageDrawer.drawBufferedImage(PageDrawer.java:1007)
> {code}
> After I mentioned this on the dev mailing list, [~pslabycz] replied:
> {quote}
> your message caught my attention, so I could not resist to try and
> investigate it a little. I did not get too far and do not have the time to do
> any tests, but maybe at least a small hint. To at least have a chance that
> the sun java2d machinery draws the image without converting it first,
> BufferedImage.getType() must return something else than TYPE_CUSTOM. (At
> least I think so) For IndexColorModel, the raster has to be either
> BytePackedRaster or ByteComponentRaster. ByteComponentRaster resulting in
> BufferedImage type TYPE_BYTE_INDEXED is a safer bet.
> {quote}
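
One way to check the hint in the quote is to print what BufferedImage.getType() reports for differently constructed images. This is only a diagnostic sketch: the class name and helper methods are hypothetical, the two raster layouts are illustrative choices and not necessarily what from1Bit() uses, and the reported type may depend on the JDK. No expected output is given for that reason.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.IndexColorModel;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

class ImageTypeProbe {
    /** Two-entry black/white palette, as in the snippet from the issue. */
    static IndexColorModel blackWhite() {
        byte[] v = { 0, (byte) 0xFF };
        return new IndexColorModel(1, 2, v, v, v);
    }

    /** Wraps a 1-bit packed raster (8 pixels per byte) in a BufferedImage. */
    static BufferedImage fromPackedRaster(int w, int h) {
        WritableRaster raster =
                Raster.createPackedRaster(DataBuffer.TYPE_BYTE, w, h, 1, 1, null);
        return new BufferedImage(blackWhite(), raster, false, null);
    }

    /** Wraps a single-band raster (1 byte per pixel) in a BufferedImage. */
    static BufferedImage fromBandedRaster(int w, int h) {
        WritableRaster raster =
                Raster.createBandedRaster(DataBuffer.TYPE_BYTE, w, h, 1, null);
        return new BufferedImage(blackWhite(), raster, false, null);
    }

    /** Maps the type constants of interest to their names. */
    static String describe(BufferedImage img) {
        switch (img.getType()) {
            case BufferedImage.TYPE_CUSTOM:       return "TYPE_CUSTOM";
            case BufferedImage.TYPE_BYTE_BINARY:  return "TYPE_BYTE_BINARY";
            case BufferedImage.TYPE_BYTE_INDEXED: return "TYPE_BYTE_INDEXED";
            case BufferedImage.TYPE_BYTE_GRAY:    return "TYPE_BYTE_GRAY";
            default:                              return "type " + img.getType();
        }
    }

    public static void main(String[] args) {
        System.out.println("packed raster + IndexColorModel: "
                + describe(fromPackedRaster(16, 4)));
        System.out.println("banded raster + IndexColorModel: "
                + describe(fromBandedRaster(16, 4)));
        System.out.println("predefined TYPE_BYTE_BINARY:     "
                + describe(new BufferedImage(16, 4, BufferedImage.TYPE_BYTE_BINARY)));
        System.out.println("predefined TYPE_BYTE_GRAY:       "
                + describe(new BufferedImage(16, 4, BufferedImage.TYPE_BYTE_GRAY)));
    }
}
```
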
> So I looked at the source of BufferedImage, and everything created by a user
> is TYPE_CUSTOM. Thus I tried using a TYPE_BYTE_BINARY image, but I got the
> same OOM stack trace, which suggests that a copy is still being made. I tried
> to step into drawImage in the debugger but couldn't. However, a look at the
> source code
> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/sun/java2d/pipe/DrawImage.java
> shows at line 381 that Java wants a "helper", and if there isn't one, it
> converts the image to RGB / ARGB. That is what happens according to the stack
> trace. What I didn't look up in the source code is which "helpers" are
> available.
> Then, in an act of desperation, I tried TYPE_BYTE_GRAY. This worked! It uses
> 1 byte per pixel, and thus saves 2/3 of the RGB footprint as well as the
> intermediate raster.
> The minimal -Xmx setting went down to 26m.
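
A minimal sketch of the TYPE_BYTE_GRAY idea, written as a standalone method and not taken from the actual PDFBox change. The class name, method name, and input layout (packed bits, most significant bit first, rows padded to whole bytes, set bit meaning white) are assumptions for illustration; real image data may need the opposite polarity.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

class OneBitToGray {
    /**
     * Expands packed 1-bit samples into a TYPE_BYTE_GRAY image.
     * Assumed layout: most significant bit first, each row padded to a
     * whole byte, set bit = white (255), clear bit = black (0).
     */
    static BufferedImage toGray(byte[] packed, int width, int height) {
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        // Write straight into the image's backing array: one byte per pixel,
        // no intermediate raster and no RGB copy.
        byte[] gray = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
        int rowBytes = (width + 7) / 8;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int b = packed[y * rowBytes + (x >> 3)];
                int bit = (b >> (7 - (x & 7))) & 1;
                gray[y * width + x] = (byte) (bit == 0 ? 0 : 0xFF);
            }
        }
        return image;
    }

    public static void main(String[] args) {
        // Two rows of three pixels: white-black-white, then black-white-black.
        byte[] packed = { (byte) 0b10100000, (byte) 0b01000000 };
        BufferedImage image = toGray(packed, 3, 2);
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 3; x++) {
                System.out.print(image.getRaster().getSample(x, y, 0) + " ");
            }
            System.out.println();
        }
    }
}
```
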