Re: PdfBox - extra symbols when converting pdf page to image

Tilman Hausherr Sun, 28 Jun 2015 04:33:55 -0700

Already answered in
https://stackoverflow.com/questions/31097691/extra-symbols-when-converting-pdf-to-image-with-pdfbox


Feel free to ask additional questions about the 2.0 version. See also
https://pdfbox.apache.org/downloads.html#scm
and
https://pdfbox.apache.org/2.0/getting-started.html
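
For readers following the pointer to 2.0, here is a minimal sketch of the same page-to-PNG loop, assuming the rendering API as released in PDFBox 2.0 (`PDFRenderer`, `ImageType`, and `ImageIOUtil` from the separate pdfbox-tools artifact). The class name `RenderPages`, the helper `outputPath`, and the command-line arguments are made up for illustration; this is not the code from the linked answer.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

// Hypothetical example class; requires pdfbox and pdfbox-tools 2.0.x on the classpath.
public class RenderPages {

    // Hypothetical helper: builds names such as "/home/file-1.png".
    static String outputPath(String prefix, int pageNumber) {
        return prefix + "-" + pageNumber + ".png";
    }

    // Renders every page of the PDF to an RGB PNG at the given DPI
    // and returns the number of pages processed.
    static int renderAll(File pdf, String prefix, int dpi) throws IOException {
        // try-with-resources closes the document even if rendering fails
        try (PDDocument document = PDDocument.load(pdf)) {
            PDFRenderer renderer = new PDFRenderer(document);
            int pageCount = document.getNumberOfPages();
            for (int i = 0; i < pageCount; i++) {
                // Page indexes are zero-based in the 2.0 renderer
                BufferedImage bim = renderer.renderImageWithDPI(i, dpi, ImageType.RGB);
                // The image format is taken from the file extension
                ImageIOUtil.writeImage(bim, outputPath(prefix, i + 1), dpi);
            }
            return pageCount;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: RenderPages <input.pdf> <output-prefix>");
            return;
        }
        int pages = renderAll(new File(args[0]), args[1], 300);
        System.out.println("Rendered " + pages + " page(s)");
    }
}
```

Unlike the 1.8 code quoted below, rendering here goes through a `PDFRenderer` built from the document rather than through `PDPage.convertToImage`.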

Tilman

On 28.06.2015 at 10:49, Александр Свиридов wrote:

I use Apache PDFBox 1.8.9. I have a one-page PDF file that contains text, and I 
want to convert this page to an image. I created the PDF with LibreOffice. I use 
the following code:
    PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
    List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
    int page = 0;
    for (PDPage pdPage : pdPages) {
        ++page;
        BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
        ImageIOUtil.writeImage(bim, "png", "/home/file" + "-" + page, 300);
    }
    document.close();
The code works and I get a PNG image. The problem is that the image contains 
many strange extra symbols that make the text very hard to read. How can I fix this?
The image is here: http://i.stack.imgur.com/OUyLO.png


