Hi, all,


I tried to convert a scanned image file (see attached: original_image.png) into
a PDF file (see attached: converted_pdf) using the ImageToPdf example code.
It works very well after some adjustment. However, the converted PDF still
shows some grey or dark marks. Is there any way to clean them up? I have seen
commercial software that can scan a Home Depot receipt into a very clean
PDF, and I am not sure whether PDFBox can do the same thing. Would I need an
OCR package to process the image further?


I also copied the code I used below. The PDFBox version is 2.0.9.


Thanks for any comments,


Arthur


*****************************

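import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

// imagePath and pdfPath are Strings defined elsewhere
try (PDDocument doc = new PDDocument())
{
    PDPage page = new PDPage();
    doc.addPage(page);

    PDImageXObject pdImage = PDImageXObject.createFromFile(imagePath, doc);

    // draw the image at half size, with its lower-left corner at (x=-20, y=-80)
    try (PDPageContentStream contents = new PDPageContentStream(doc, page))
    {
        contents.drawImage(pdImage, -20, -80, pdImage.getWidth() / 2, pdImage.getHeight() / 2);
    }
    doc.save(pdfPath);
}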



*****************************
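
On the question of the grey marks: OCR is probably not required just to
remove them. One common approach is to threshold the scan to pure black and
white before it is embedded in the PDF. The following is only a minimal
sketch of that idea, using nothing but java.awt.image. The class name
ScanCleaner, the method name threshold, and the cutoff parameter are
hypothetical names chosen for illustration, not part of PDFBox, and this has
not been tried on the attached image.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
final class ScanCleaner {

    /**
     * Returns a 1-bit black-and-white copy of src.
     * Pixels whose brightness is at or above cutoff (0-255) become white,
     * all other pixels become black.
     */
    static BufferedImage threshold(BufferedImage src, int cutoff) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness (BT.601 weights).
                int luminance = (299 * r + 587 * g + 114 * b) / 1000;
                // Light grey smudges end up white, dark text ends up black.
                out.setRGB(x, y, luminance >= cutoff ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

The scan could be loaded with ImageIO.read(new File(imagePath)), passed
through ScanCleaner.threshold, and the result embedded with PDFBox's
LosslessFactory.createFromImage(doc, cleaned) in place of
PDImageXObject.createFromFile. A suitable cutoff depends on the scan: too
low and faint text disappears, too high and the grey marks survive as black
specks.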
