I use Apache PDFBox 1.8.9. I have a one-page PDF file that contains text, and I
want to convert this page to an image. I created the PDF file with LibreOffice.
I use the following code:
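    // Load the PDF with the non-sequential parser (no scratch file)
    PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
    List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
    int page = 0;
    for (PDPage pdPage : pdPages) {
        ++page;
        // Render the page as an RGB image at 300 dpi
        BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
        // Write one PNG per page, with the page number in the file name
        ImageIOUtil.writeImage(bim, "png", "/home/file" + "-" + page, 300);
    }
    document.close();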
The code works and I get a PNG image. The problem is that the image contains
many strange extra symbols that make the text very difficult to read. How can I
fix this?
The image is here: http://i.stack.imgur.com/OUyLO.png
--
Александр Свиридов