[ 
https://issues.apache.org/jira/browse/PDFBOX-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573416#comment-17573416
 ] 

Andreas Lehmkühler commented on PDFBOX-5462:
--------------------------------------------

To sum it up, 2.0.x did some caching under the hood, which was removed in 3.0.x. 
Many aspects of the 3.0.x code base were refactored to reduce the memory 
footprint. However, code written for 2.0.x can still run into an 
OutOfMemoryError in some corner cases if it isn't adjusted to those changes 
in behaviour when porting it to 3.0.x. 

The good news is that in many cases one can "simulate" the 2.0.x behaviour 
when using an InputStream:

* {{MemoryUsageSetting.setupMainMemoryOnly()}} -> use 
{{org.apache.pdfbox.io.RandomAccessReadBuffer}}, which copies the whole 
InputStream into memory. This works fine for small files.
* {{MemoryUsageSetting.setupTempFileOnly()}} -> copy the InputStream to a 
(temp) file and use {{org.apache.pdfbox.io.RandomAccessReadBufferedFile}} or 
{{org.apache.pdfbox.io.RandomAccessReadMemoryMappedFile}}. This is the right 
choice for bigger files and/or environments with limited resources.
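
The "copy the InputStream to a (temp-) file" step from the second bullet could 
look roughly like the sketch below. It uses only the JDK; the class and method 
names ({{TempFileSpool}}, {{spoolToTempFile}}) are made up for illustration. 
The PDFBox call is shown only as a comment and assumes the 
{{Loader.loadPDF(RandomAccessRead)}} overload of 3.0.x.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempFileSpool {

    /**
     * Copies the whole stream to a new temp file and returns its path.
     * The caller keeps ownership of the stream and must delete the file.
     */
    public static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("pdfbox-input-", ".pdf");
        try {
            // createTempFile already created the file, so overwrite it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Don't leave a half-written temp file behind.
            Files.deleteIfExists(tmp);
            throw e;
        }
        return tmp;
    }

    // Hypothetical usage with PDFBox 3.0.x (not compiled here):
    //
    //   Path tmp = TempFileSpool.spoolToTempFile(inputStream);
    //   try (PDDocument doc = Loader.loadPDF(
    //           new RandomAccessReadBufferedFile(tmp.toFile()))) {
    //       // ... work with doc, then doc.save(os) ...
    //   } finally {
    //       Files.deleteIfExists(tmp);
    //   }
}
```

For the first bullet no helper is needed: {{new RandomAccessReadBuffer(inputStream)}} 
can be passed to {{Loader.loadPDF}} directly.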

Only the third case, {{org.apache.pdfbox.io.MemoryUsageSetting.setupMixed()}}, 
i.e. the mixed usage of both, isn't supported any more. You have to decide 
which one to use, or provide your own cache by implementing the interface 
{{org.apache.pdfbox.io.RandomAccessRead}}.
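
One possible way to "decide which one to use" per document is a simple size 
threshold: keep the input in memory while it is small, otherwise spill it to a 
temp file. The following is only a sketch of that idea using the JDK (Java 11+ 
for {{readNBytes}}); {{InputSourceChooser}}, {{Source}} and the threshold 
parameter are assumptions for illustration, not part of PDFBox, and this is 
not a replacement for what {{setupMixed()}} did internally.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class InputSourceChooser {

    /** Result holder: exactly one of the two fields is non-null. */
    public static final class Source {
        public final byte[] inMemory; // set when the input fit the threshold
        public final Path tempFile;   // set when the input was spilled to disk

        public Source(byte[] inMemory, Path tempFile) {
            this.inMemory = inMemory;
            this.tempFile = tempFile;
        }
    }

    /**
     * Reads at most maxInMemoryBytes into memory; if the stream holds more,
     * writes everything (bytes already read plus the rest) to a temp file.
     * The stream is not closed here; the caller deletes the temp file.
     */
    public static Source choose(InputStream in, int maxInMemoryBytes) throws IOException {
        if (maxInMemoryBytes < 0 || maxInMemoryBytes == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("threshold out of range");
        }
        // Read one byte more than allowed to detect an oversized input.
        byte[] head = in.readNBytes(maxInMemoryBytes + 1);
        if (head.length <= maxInMemoryBytes) {
            return new Source(head, null);
        }
        Path tmp = Files.createTempFile("pdfbox-input-", ".pdf");
        try {
            // Replay the bytes already consumed, then the remainder.
            InputStream all = new SequenceInputStream(new ByteArrayInputStream(head), in);
            Files.copy(all, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tmp);
            throw e;
        }
        return new Source(null, tmp);
    }
}
```

With the result, {{inMemory}} could be wrapped in a {{RandomAccessReadBuffer}} 
and {{tempFile}} in a {{RandomAccessReadBufferedFile}} before calling 
{{Loader.loadPDF}}, assuming the 3.0.x overloads mentioned above.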

> OutOfMemoryError when watermaking in 3.0.0-RC1
> ----------------------------------------------
>
>                 Key: PDFBOX-5462
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5462
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.0 PDFBox
>            Reporter: Marian Ion
>            Priority: Major
>         Attachments: TestPdfBox.tgz, my-pdf-test.jar
>
>
> I am using the Maven *3.0.0-RC1* version and I encounter the following error 
> when watermarking a 5120 pages file:
> {quote} java.lang.OutOfMemoryError: Java heap space: failed reallocation of 
> scalar replaced objects
> {quote}
>  
> However, the *2.0.26* version code works without problem!
> The code is basically this :
> {code:java}
> private static final PDFont PDF_FONT = PDType1Font.HELVETICA;
> memoryUsageSetting = MemoryUsageSetting.setupMixed(2 * ONE_GIGA, 40 * ONE_GIGA);
> //try (PDDocument pdfDocument = PDDocument.load(is, memoryUsageSetting)) { // 2.0.26
> try (PDDocument pdfDocument = Loader.loadPDF(inputStream, memoryUsageSetting)) { // 3.0.0-RC1
>       int nbPages = addWatermark(watermarkText, pdfDocument);
>       pdfDocument.save(os);
> }
> ...
> private int addWatermark(String watermarkText, PDDocument document) throws IOException {
>       int numberOfPages = document.getNumberOfPages();
>       System.out.printf("Start adding watermark on a %d pages PDF document%n", numberOfPages);
>       long start = System.nanoTime();
>       int pageIndex = 0;
>       for(PDPage page : document.getPages()) {
>               ++pageIndex;
>               try (PDPageContentStream cs = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true, true)) {
>                       float width = page.getMediaBox().getWidth();
>                       float height = page.getMediaBox().getHeight();
>                       int rotation = page.getRotation();
>                       switch(rotation) {
>                               case 90:
>                                       width = page.getMediaBox().getHeight();
>                                       height = page.getMediaBox().getWidth();
>                                       cs.transform(Matrix.getRotateInstance(Math.toRadians(90), height, 0));
>                                       break;
>                               case 180:
>                                       cs.transform(Matrix.getRotateInstance(Math.toRadians(180), width, height));
>                                       break;
>                               case 270:
>                                       width = page.getMediaBox().getHeight();
>                                       height = page.getMediaBox().getWidth();
>                                       cs.transform(Matrix.getRotateInstance(Math.toRadians(270), 0, width));
>                                       break;
>                               default:
>                                       break;
>               }
>               double stringWidth = (double)PDF_FONT.getStringWidth(watermarkText) / 1000 * FONT_HEIGHT;
>               double diagonalLength = Math.sqrt((double)width * width + (double)height * height);
>               double angle = Math.atan2(height, width);
>               cs.transform(Matrix.getRotateInstance(angle, 0, 0));
>               cs.setFont(PDF_FONT, (float)FONT_HEIGHT);
>               //cs.setRenderingMode(RenderingMode.STROKE); // for "hollow" effect
>               PDExtendedGraphicsState gs = new PDExtendedGraphicsState();
>               gs.setNonStrokingAlphaConstant(0.2f);
>               gs.setStrokingAlphaConstant(0.2f);
>               gs.setBlendMode(BlendMode.MULTIPLY);
>               cs.setGraphicsStateParameters(gs);
>               // some API weirdness here. When int, range is 0..255.
>               // when float, this would be 0..1f
>               cs.setNonStrokingColor(0f, 0, 0);
>               cs.setStrokingColor(0f, 0, 0); // black
>               float x = (float)((diagonalLength - stringWidth) / 2); // "horizontal" position in rotated world
>               float y = (float)(-FONT_HEIGHT / 4); // 4 is a trial-and-error thing, this lowers the text a bit
>               cs.beginText();
>               cs.newLineAtOffset(x, y);
>               cs.showText(watermarkText);
>               cs.endText();
>       } finally {
>                               ...
>       }
>       return numberOfPages;
> }
> {code}


