[ 
https://issues.apache.org/jira/browse/PDFBOX-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164861#comment-17164861
 ] 

Tilman Hausherr commented on PDFBOX-4921:
-----------------------------------------

Sadly there is more to this: you would also have to adjust the transform in 
the content stream. It currently looks like this:
{noformat}
q
  14400 0 0 11286.311 0 0 cm
  /Img1 Do
Q
{noformat}
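To put numbers on that matrix: the first and fourth operands of cm scale the 
1 x 1 image square to a width and height in user space units, which default to 
1/72 inch. The sketch below converts those two values to a physical size and 
compares the aspect ratio with A4 and letter landscape. It is plain arithmetic 
with made-up class and method names, and it assumes the default user unit and 
no other transform in effect.

```java
// Illustrative arithmetic only; CmMatrixSketch and its methods are
// hypothetical names, not part of PDFBox.
final class CmMatrixSketch {
    static final double POINTS_PER_INCH = 72.0;   // default PDF user space unit
    static final double METERS_PER_INCH = 0.0254;

    static double pointsToInches(double points) {
        return points / POINTS_PER_INCH;
    }

    static double pointsToMeters(double points) {
        return pointsToInches(points) * METERS_PER_INCH;
    }

    static double aspectRatio(double width, double height) {
        return width / height;
    }

    public static void main(String[] args) {
        // Scale factors taken from "14400 0 0 11286.311 0 0 cm".
        double widthPt = 14400;
        double heightPt = 11286.311;

        System.out.printf("drawn size: %.2f m x %.2f m%n",
                pointsToMeters(widthPt), pointsToMeters(heightPt));
        System.out.printf("aspect ratio of the drawn image: %.4f%n",
                aspectRatio(widthPt, heightPt));

        // Landscape page sizes in points (A4 rounded to two decimals).
        System.out.printf("A4 landscape aspect ratio:     %.4f%n",
                aspectRatio(841.89, 595.28));
        System.out.printf("letter landscape aspect ratio: %.4f%n",
                aspectRatio(792, 612));
    }
}
```

Comparing the printed ratios gives a rough hint about which page size the scan 
was meant for, but it does not say what the scan resolution really is.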
You could set the MediaBox to A4 landscape (or maybe letter?) and adjust the 
content stream to match. I am not sure of the values; the scan is obviously 
higher than 72 dpi, and I suspect it is 200 dpi. IMHO the easiest option is to 
take that PDF, "print" it into another PDF, and use that one. That is much 
less work than changing the content stream (or is this for many similar 
files?). Also contact the person/vendor who created that PDF and ask why an 
invoice must have a size of 5m x 4m.
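For a sense of why the heap runs out, here is a rough estimate of the image 
that would have to be allocated for such a page. It assumes the page is as 
large as the transform above (14400 x 11286.311 points), 4 bytes per pixel 
(the stack trace shows a DataBufferInt), and a resolution of 300 dpi, which is 
only a guess because the value of RESOLUTION is not shown. The class and 
method names are made up, and maxDpiFor is just one possible way to pick a 
resolution that stays within a pixel budget.

```java
// Rough estimate only; RenderBudgetSketch and its methods are hypothetical
// names, not part of PDFBox.
final class RenderBudgetSketch {
    static final double POINTS_PER_INCH = 72.0;
    static final int BYTES_PER_PIXEL = 4; // one int per pixel (DataBufferInt)

    // Number of pixels of an image covering the page at the given dpi.
    static long pixelsAt(double widthPt, double heightPt, double dpi) {
        long w = (long) Math.max(Math.floor(widthPt / POINTS_PER_INCH * dpi), 1);
        long h = (long) Math.max(Math.floor(heightPt / POINTS_PER_INCH * dpi), 1);
        return w * h;
    }

    // Approximate size of the pixel data in bytes.
    static long bytesAt(double widthPt, double heightPt, double dpi) {
        return pixelsAt(widthPt, heightPt, dpi) * BYTES_PER_PIXEL;
    }

    // Largest dpi at which the page still fits into maxPixels.
    static double maxDpiFor(double widthPt, double heightPt, long maxPixels) {
        double areaSqInches =
                (widthPt / POINTS_PER_INCH) * (heightPt / POINTS_PER_INCH);
        return Math.sqrt(maxPixels / areaSqInches);
    }

    public static void main(String[] args) {
        double widthPt = 14400;       // assumed page width in points
        double heightPt = 11286.311;  // assumed page height in points
        double dpi = 300;             // assumed; RESOLUTION is not shown

        long pixels = pixelsAt(widthPt, heightPt, dpi);
        System.out.println("pixels at " + dpi + " dpi: " + pixels);
        System.out.println("bytes for the pixel data: "
                + bytesAt(widthPt, heightPt, dpi));
        System.out.println("exceeds max Java array length: "
                + (pixels > Integer.MAX_VALUE));

        long budget = 50_000_000L;    // arbitrary example budget
        System.out.println("dpi that fits " + budget + " pixels: "
                + maxDpiFor(widthPt, heightPt, budget));
    }
}
```

A resolution computed this way could be passed to renderImageWithDPI instead 
of a fixed one, but for a page of this size it ends up far too low for a 
readable scan, so repairing the page size remains the better fix.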

> java.lang.OutOfMemoryError: Java heap space when converting large pdf to tiff
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-4921
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4921
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.20
>         Environment: Java Version: 11
> Java Runtime Version: 11+28
> Java Home: OpenJDK11_x64
> Java Vendor: Oracle Corporation
> Java Vendor URL: http://java.oracle.com/
>            Reporter: Florent Juillet
>            Priority: Major
>         Attachments: After Orientation.pdf
>
>
> Hello, 
> I run into this issue when I try to convert only the first page of a PDF 
> image to a TIFF image file.
> This is my Java method: 
> {code:java}
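> // Imports needed by this method. RESOLUTION, COMPRESSION_TYPE_GROUP4FAX and
> // ConvertUtil are defined elsewhere (not shown).
> import java.awt.image.BufferedImage;
> import java.io.ByteArrayOutputStream;
> import java.io.File;
> import java.io.IOException;
> import javax.imageio.IIOImage;
> import javax.imageio.ImageIO;
> import javax.imageio.ImageWriteParam;
> import javax.imageio.ImageWriter;
> import javax.imageio.stream.ImageOutputStream;
> import javax.imageio.stream.MemoryCacheImageOutputStream;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.rendering.PDFRenderer;
>
> private static ByteArrayOutputStream extractFirstPageAsTiff(File pdfsource)
>     throws IOException {
>   ByteArrayOutputStream out = new ByteArrayOutputStream();
>   ImageOutputStream imageOut = new MemoryCacheImageOutputStream(out);
>   // Load the PDF
>   try (PDDocument pdf = PDDocument.load(pdfsource)) {
>     // Initialize PDF renderer
>     PDFRenderer ren = new PDFRenderer(pdf);
>     // Set up the image writer
>     ImageWriter writer = ImageIO.getImageWritersBySuffix("tiff").next();
>     writer.setOutput(imageOut);
>     // Set up the image writer parameters
>     ImageWriteParam params = writer.getDefaultWriteParam();
>     params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
>     params.setCompressionType(COMPRESSION_TYPE_GROUP4FAX);
>     // Render the first page (index 0); this is where the OutOfMemoryError occurs
>     BufferedImage src = ren.renderImageWithDPI(0, RESOLUTION);
>     // Reduce the rendered image to a black/white palette
>     int[] cmap = new int[] { 0xFF000000, 0xFFFFFFFF };
>     BufferedImage src4BitColourDepth = ConvertUtil.convert4(src, cmap);
>     // Prepare the image writer
>     writer.prepareWriteSequence(null);
>     writer.writeToSequence(new IIOImage(src4BitColourDepth, null, null), params);
>     // End the writer sequence
>     writer.endWriteSequence();
>     imageOut.close();
>   }
>   return out;
> }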
> {code}
>  
> This produces the following stack trace: 
> {noformat}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>  at java.desktop/java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
>  at java.desktop/java.awt.image.Raster.createPackedRaster(Raster.java:467)
>  at java.desktop/java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
>  at java.desktop/java.awt.image.BufferedImage.<init>(BufferedImage.java:324)
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:296)
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243)
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:215)
>  at sandbox.tess4j.DetectOrientation.extractFirstPageAsTiff(DetectOrientation.java:173)
> {noformat}
> The PDF document, After Orientation.pdf, is attached.
>  
> Run with: 
> {noformat}
> Java Version: 11
> Java Runtime Version: 11+28
> Java Home: OpenJDK11_x64
> Java Vendor: Oracle Corporation
> Java Vendor URL: http://java.oracle.com/
> {noformat}
>  


