[
https://issues.apache.org/jira/browse/PDFBOX-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164861#comment-17164861
]
Tilman Hausherr commented on PDFBOX-4921:
-----------------------------------------
Sadly there is more to this: you would also have to adjust the transform in
the content stream. It currently looks like this:
{noformat}
q
14400 0 0 11286.311 0 0 cm
/Img1 Do
Q
{noformat}
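To make the numbers concrete, here is a minimal sketch of what that matrix implies. It assumes default user space (1 unit = 1/72 inch, no UserUnit entry) and that no other transform is active; the class and method names are made up for illustration. Since an image XObject is painted into the unit square, the first and fourth operands of {{cm}} give the painted width and height directly.

```java
// Hypothetical helper: converts the scale factors of the "cm" operator
// above into physical sizes. Assumes 72 user-space units per inch.
public final class CmMatrixSize {
    private static final double POINTS_PER_INCH = 72.0;
    private static final double METERS_PER_INCH = 0.0254;

    static double pointsToInches(double points) {
        return points / POINTS_PER_INCH;
    }

    static double pointsToMeters(double points) {
        return pointsToInches(points) * METERS_PER_INCH;
    }

    public static void main(String[] args) {
        // a and d from "14400 0 0 11286.311 0 0 cm"
        double widthPt = 14400;
        double heightPt = 11286.311;
        System.out.printf("width:  %.1f in = %.2f m%n",
                pointsToInches(widthPt), pointsToMeters(widthPt));
        System.out.printf("height: %.1f in = %.2f m%n",
                pointsToInches(heightPt), pointsToMeters(heightPt));
    }
}
```

The width works out to 200 inches, which is just over 5 m.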
You could set the mediabox to A4 landscape (or maybe letter?) and adjust the
content stream to match (I'm not sure of the values; the scan is obviously
higher than 72 dpi, I suspect it is 200 dpi), but IMHO the easiest would be to
take that PDF, "print" it into another PDF, and use that one. That is much
less work than changing the content stream. (Or is this for many similar
files?) Also contact the person/vendor who created that PDF and ask why an
invoice must have a size of 5m x 4m.
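If this does turn out to affect many files, one possible guard is to estimate the raster size before rendering and lower the DPI when it would be too large. The following is only a rough sketch: the class name, the pixel budget, and the 300 dpi value are assumptions (the reporter's RESOLUTION constant is not shown), the 4 bytes per pixel figure assumes an int-backed RGB image as the DataBufferInt frame in the stack trace suggests, and the floor-based rounding is an approximation of what the renderer does. The width and height in points would come from the page's mediabox.

```java
// Hypothetical pre-render check; pure arithmetic, no PDFBox calls.
public final class RenderBudget {
    private static final double POINTS_PER_INCH = 72.0;
    private static final int BYTES_PER_PIXEL = 4; // assumption: int-backed RGB image

    // Approximate pixel count of a page rendered at the given DPI.
    static long pixelsAt(double widthPt, double heightPt, double dpi) {
        long w = Math.max((long) Math.floor(widthPt * dpi / POINTS_PER_INCH), 1);
        long h = Math.max((long) Math.floor(heightPt * dpi / POINTS_PER_INCH), 1);
        return w * h;
    }

    static long bytesAt(double widthPt, double heightPt, double dpi) {
        return pixelsAt(widthPt, heightPt, dpi) * BYTES_PER_PIXEL;
    }

    // Returns requestedDpi if it fits the budget, otherwise a lower DPI
    // scaled so that the pixel count stays within maxPixels.
    static double cappedDpi(double widthPt, double heightPt,
                            double requestedDpi, long maxPixels) {
        double scale = requestedDpi / POINTS_PER_INCH;
        double pixels = (widthPt * scale) * (heightPt * scale);
        if (pixels <= maxPixels) {
            return requestedDpi;
        }
        // Pixel count grows with the square of the DPI.
        return requestedDpi * Math.sqrt(maxPixels / pixels);
    }

    public static void main(String[] args) {
        double widthPt = 14400;      // page size implied by the content stream
        double heightPt = 11286.311;
        double requestedDpi = 300;   // assumed value, not from the report
        long maxPixels = 50_000_000; // arbitrary budget for illustration

        System.out.println("pixels at requested dpi: "
                + pixelsAt(widthPt, heightPt, requestedDpi));
        System.out.println("fits in a Java array: "
                + (pixelsAt(widthPt, heightPt, requestedDpi) <= Integer.MAX_VALUE));
        System.out.println("capped dpi: "
                + cappedDpi(widthPt, heightPt, requestedDpi, maxPixels));
    }
}
```

At the assumed 300 dpi this page would need more pixels than a single Java array can hold, so a larger heap alone would not help. The capped value could then be passed to renderImageWithDPI in place of the fixed resolution.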
> java.lang.OutOfMemoryError: Java heap space when converting large pdf to tiff
> -----------------------------------------------------------------------------
>
> Key: PDFBOX-4921
> URL: https://issues.apache.org/jira/browse/PDFBOX-4921
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Affects Versions: 2.0.20
> Environment: Java Version: 11
> Java Runtime Version: 11+28
> Java Home: OpenJDK11_x64
> Java Vendor: Oracle Corporation
> Java Vendor URL: http://java.oracle.com/
> Reporter: Florent Juillet
> Priority: Major
> Attachments: After Orientation.pdf
>
>
> Hello,
> I run into this issue when I try to convert only the first page of a PDF
> image to a TIFF image file.
> This is my Java method:
> {code:java}
> private static ByteArrayOutputStream extractFirstPageAsTiff(File pdfsource)
>         throws IOException {
>     ByteArrayOutputStream out = new ByteArrayOutputStream();
>     ImageOutputStream imageOut = new MemoryCacheImageOutputStream(out);
>     // Load the PDF
>     try (PDDocument pdf = PDDocument.load(pdfsource)) {
>         // Initialize PDF renderer
>         PDFRenderer ren = new PDFRenderer(pdf);
>         // Set up the image writer
>         ImageWriter writer = ImageIO.getImageWritersBySuffix("tiff").next();
>         writer.setOutput(imageOut);
>         // Set up the image writer parameters
>         ImageWriteParam params = writer.getDefaultWriteParam();
>         params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
>         params.setCompressionType(COMPRESSION_TYPE_GROUP4FAX);
>         // Render the first page to an image
>         BufferedImage src = ren.renderImageWithDPI(0, RESOLUTION);
>         int[] cmap = new int[] { 0xFF000000, 0xFFFFFFFF };
>         BufferedImage src4BitColourDepth = ConvertUtil.convert4(src, cmap);
>         // Prepare the image writer
>         writer.prepareWriteSequence(null);
>         writer.writeToSequence(new IIOImage(src4BitColourDepth, null, null),
>                 params);
>         // End the writer sequence
>         writer.endWriteSequence();
>         imageOut.close();
>     }
>     return out;
> }
> {code}
>
> It produces this stack trace:
> {noformat}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at java.desktop/java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
>   at java.desktop/java.awt.image.Raster.createPackedRaster(Raster.java:467)
>   at java.desktop/java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
>   at java.desktop/java.awt.image.BufferedImage.<init>(BufferedImage.java:324)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:296)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:215)
>   at sandbox.tess4j.DetectOrientation.extractFirstPageAsTiff(DetectOrientation.java:173)
> {noformat}
> The PDF document After Orientation.pdf is attached.
>
> Run with:
> {noformat}
> Java Version: 11
> Java Runtime Version: 11+28
> Java Home: OpenJDK11_x64
> Java Vendor: Oracle Corporation
> Java Vendor URL: http://java.oracle.com/
> {noformat}
>