I am using Apache PDFBox 1.8.9. I have a one-page PDF file that contains text,
and I want to convert this page to an image. I created the PDF file with
LibreOffice. I use the following code:
PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
int page = 0;
for (PDPage pdPage : pdPages) {
    ++page;
    BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
    ImageIOUtil.writeImage(bim, "png", "/home/file" + "-" + page, 300);
}
document.close();
The code works and I get a PNG image. The problem is that the image contains a
lot of strange extra symbols, which make the text very difficult to read. How
can I fix this? The image is here: http://i.stack.imgur.com/OUyLO.png



-- 
Александр Свиридов
