Re: Issue with PDF - Image conversion

Andreas Lehmkuehler Tue, 11 Jun 2013 05:08:32 -0700

Hi,

On 10.06.2013 11:15, Robin Thomas Panicker wrote:

Thanks a lot, Gilad, for responding. I was not sure what more information
to provide. Now that you have asked for specific details, let me provide
you with more information.


I am using the below code to do the conversion of PDF - image. (Trying to
save the first page of the pdf as an image file)

  String pdfFile = "d:/hs/4.pdf";
  document = PDDocument.load( pdfFile );

  List pages = document.getDocumentCatalog().getAllPages();
  PDPage page = ( PDPage ) pages.get( 0 );
  int width = ( int ) page.getArtBox().getWidth();
  int height = ( int ) page.getArtBox().getHeight();
  BufferedImage image = page.convertToImage( imageType, resolution );


On the machine (prod server) where the conversion DOES NOT work, I have
Ubuntu 12.04 and OpenOffice 3.0, while on the machine (development machine)
where the conversion works, I have Ubuntu 10.10 and OpenOffice 3.0.

Both machines run the same code, and the PDFBox version on both is 1.8.1.

The issue that I face is that the image conversion simply doesn't work
correctly (I can see parts of the image / text garbled or missing). There is
no error or warning in the log output.

Please let me know if I can provide you with any more information to help
in understanding the problem.

Without a sample pdf this is just a guess:

The fact that you are using OpenOffice 3.0 leads to the assumption that the PDF
in question contains fonts as embedded subsets. Those are not fully supported
by PDFBox. There are various issues with those kinds of fonts.
As you are using different platforms (Ubuntu 10.10 vs 12.04), you are most likely
using different versions of the JDK (1.6 vs 1.7). There are some 1.7-specific
issues with embedded font subsets.
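
One way to check the JDK guess is to print the runtime details on both
machines and compare them. The following is only a minimal sketch that reads
standard Java system properties; the class name RuntimeInfo is made up for
illustration and is not part of PDFBox.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: collects the JVM and OS details
// worth comparing between the working and the failing machine.
public class RuntimeInfo {

    // Standard system property keys defined by the Java platform.
    private static final String[] KEYS = {
        "java.version", "java.vendor", "java.home",
        "os.name", "os.version", "os.arch"
    };

    // Returns the properties in a fixed order so two outputs can be diffed.
    public static Map<String, String> describe() {
        Map<String, String> info = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            info.put(key, System.getProperty(key));
        }
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : describe().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

If java.version differs between the two machines (for example 1.6 on one and
1.7 on the other), running the conversion on the failing machine with the
other JDK would be a cheap way to confirm or rule out this explanation.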

Thanks,
Robin



On Mon, Jun 10, 2013 at 2:25 PM, Gilad Denneboom
<[email protected]> wrote:

A lot of information is missing there... How are you converting the PDF
files, exactly? What type of problems do you encounter? Which version of
PDFBox do you use? And what does it have to do with your Office suite?

Without more information it's impossible to help you with your problem.


On Mon, Jun 10, 2013 at 8:22 AM, Robin Thomas Panicker <[email protected]>
wrote:

Hi,
          I am using PDFBox to convert PDF documents into images. However,
on some machines I am facing an issue. The conversion does not happen
correctly. I can see missing text / images etc.

Please note that this happens only on a few machines. I use Ubuntu and
OpenOffice. I have tried a variety of combinations of different
versions of Ubuntu and OpenOffice (and even LibreOffice).

However I am unable to find out why it does not work on some machines.

Can anyone please help?

Thanks,
Robin


BR
Andreas Lehmkühler
