[ https://issues.apache.org/jira/browse/PDFBOX-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844659#comment-17844659 ]

Tilman Hausherr commented on PDFBOX-5816:
-----------------------------------------

That's the JPEG2000 inkblot problem. You need to use the latest jpeg2000 decoder, i.e. 1.4.0
https://pdfbox.apache.org/2.0/dependencies.html
https://github.com/jai-imageio/jai-imageio-jpeg2000/
https://stackoverflow.com/questions/41977536/black-stain-when-extracting-page-to-image-on-pdfbox-2-0-4

> PDFRenderer.renderImage creates black line clouds on the text
> -------------------------------------------------------------
>
>                 Key: PDFBOX-5816
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5816
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel, Rendering
>    Affects Versions: 2.0.30, 2.0.31
>         Environment: Java 8
>            Reporter: ExSp
>            Priority: Major
>              Labels: JPEG2000, JPXDecode, JPXFilter
>         Attachments: image003.png, image0032.png
>
>
> For some PDF files, the PDFRenderer.renderImage method creates an image with
> black line clouds on the text that are not visible in the original PDF.
> Unfortunately, the files contain personal data, so I cannot make them
> available for examination. The attached screenshots hopefully give a first
> impression of the problem. Is there any way for me to narrow down the problem
> or analyze the PDF so that I can provide more information?
> The source code does something like this:
> {code:java}
> try (PDDocument document = PDDocument.load(pdfData)) {
>     PDFRenderer pdfRenderer = new PDFRenderer(document);
>     int pageCount = document.getNumberOfPages();
>     for (int pageIndex = 0; pageIndex < pageCount; ++pageIndex) {
>         PDPage page = document.getPage(pageIndex);
>         BufferedImage pageImage = pdfRenderer.renderImage(pageIndex);
>         ...
>     }
> }{code}
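
One way to narrow this down, as the reporter asks, is to check which JPEG2000 decoder the JVM actually picks up. The following is a minimal sketch, not part of PDFBox: the class name `Jpeg2000ReaderCheck` and the helper `describeReaders` are made up for illustration, and it assumes that the decoder is looked up through ImageIO under the format name "JPEG2000". It lists every registered reader together with the location it was loaded from, so an outdated jai-imageio-jpeg2000 jar should show up in the path.

```java
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical diagnostic helper, not part of PDFBox.
public class Jpeg2000ReaderCheck {

    /** Returns one line per ImageIO reader registered for the given format name. */
    static List<String> describeReaders(String formatName) {
        List<String> found = new ArrayList<>();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        while (readers.hasNext()) {
            ImageReader reader = readers.next();
            // The code source is usually the jar the reader class came from;
            // it can be null for classes loaded by the bootstrap loader.
            CodeSource source = reader.getClass().getProtectionDomain().getCodeSource();
            String origin = (source == null || source.getLocation() == null)
                    ? "unknown location (possibly JDK built-in)"
                    : source.getLocation().toString();
            found.add(reader.getClass().getName() + " from " + origin);
            reader.dispose();
        }
        return found;
    }

    public static void main(String[] args) {
        // Pick up plugins that were added to the classpath after ImageIO was initialised.
        ImageIO.scanForPlugins();

        // Assumption: "JPEG2000" is the format name the decoder registers under.
        List<String> readers = describeReaders("JPEG2000");
        if (readers.isEmpty()) {
            System.out.println("No JPEG2000 reader found on the classpath.");
        } else {
            readers.forEach(System.out::println);
        }
    }
}
```

Run it with the same classpath as the application that calls `PDFRenderer.renderImage`. If the printed jar name carries a version older than 1.4.0, or if more than one reader is listed, the dependency setup is worth a closer look.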