amiladi created PDFBOX-4530:
-------------------------------

             Summary: PDFRenderer adding horizontal white lines to exported image
                 Key: PDFBOX-4530
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4530
             Project: PDFBox
          Issue Type: Bug
          Components: Rendering
    Affects Versions: 2.0.15
            Reporter: amiladi
         Attachments: PdfBoxTestCase.zip, page-3-1.jpeg, page-3.pdf, 
page-7-1.jpeg, page-7.pdf

Hello,

I recently started using PDFBox to extract a datamatrix code from a PDF file.

The image extraction works pretty fine.

We found out that the source of the PDFs attaches the datamatrix neither as an 
embedded object nor as an inline image; it is encoded in the PDF as black 
squares.

Then, the idea was to convert the pdf to an image and parse the code.

The only problem is that the conversion sometimes adds white lines inside the 
datamatrix, which makes it unparsable (see attachments page-3-1.jpeg and 
page-3.pdf).

In some other cases, the datamatrix squares differ in size in the exported 
image although they are all the same size in the original PDF file (see 
attachments page-7-1.jpeg and page-7.pdf).

The outcome is the same in both cases: the parser is not able to recognize the 
datamatrix content.

The code I am using to convert a page to a BufferedImage is pretty 
straightforward:
{code:java}
// Render page i at 600 DPI as a 1-bit black-and-white image
BufferedImage bi = new PDFRenderer(pdDocument).renderImageWithDPI(i, 600, ImageType.BINARY);
{code}
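
To make the white-line symptom checkable without a barcode parser, here is a 
minimal sketch that scans a rendered BufferedImage for rows containing no black 
pixel. It uses only java.awt.image and is not part of PDFBox. The class and 
method names (WhiteRowCheck, findWhiteRows) are hypothetical. It assumes the 
image has already been cropped to the datamatrix symbol. The symbol's solid 
finder edge should put at least one black pixel in every row, so a fully white 
row inside it would be a rendering artifact.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
class WhiteRowCheck {

    // True if the pixel at (x, y) is pure black, ignoring the alpha channel.
    static boolean isBlack(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) & 0xFFFFFF) == 0;
    }

    // Returns the y coordinates of rows that contain no black pixel and lie
    // strictly between the first and the last row that do contain one.
    static List<Integer> findWhiteRows(BufferedImage img) {
        int width = img.getWidth();
        int height = img.getHeight();
        boolean[] hasBlack = new boolean[height];
        int first = -1;
        int last = -1;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (isBlack(img, x, y)) {
                    hasBlack[y] = true;
                    break;
                }
            }
            if (hasBlack[y]) {
                if (first < 0) {
                    first = y;
                }
                last = y;
            }
        }
        List<Integer> gaps = new ArrayList<>();
        for (int y = first + 1; y < last; y++) {
            if (!hasBlack[y]) {
                gaps.add(y);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a cropped rendering: a black block with one
        // white row drawn through it at y = 3.
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_BYTE_BINARY);
        for (int x = 0; x < img.getWidth(); x++) {
            img.setRGB(x, 3, 0xFFFFFFFF);
        }
        System.out.println("White rows inside the symbol: " + findWhiteRows(img));
    }
}
```

Run against a crop of the image returned by renderImageWithDPI, a non-empty 
result would point at the rows where the white lines appear.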

Is this problem caused by the way I am using the renderer, or is it simply a 
bug in the software?

I am attaching the test project reproducing the behavior.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
