On 28.09.2016 at 15:02, Manuel Fomitescu wrote:
Hello,

I want to use PDFBox in my project instead of the Adobe PDFL C library.
To calculate the width and height of the first page of a document, I used
the following code:

PDDocument document = PDDocument.load(new File(args[0]));
PDPage pdPage = document.getPage(0);

System.out.println("PDFBOX - NoPage: " + document.getNumberOfPages());
System.out.println("PDFBOX - FirstPage Height: " +
    pdPage.getMediaBox().getHeight());
System.out.println("PDFBOX - FirstPage Width: " +
    pdPage.getMediaBox().getWidth());


To obtain the same thing with PDFL I run a command line with some
parameters and read the response from a file, which is uglier from Java
code.

But the performance is a big problem.

With a 90 MB PDF document I measured 400 ms with PDFBox and 50 ms with
PDFL. With a 1.7 GB document I measured 47106 ms with PDFBox and 151 ms
with PDFL. That makes PDFBox roughly 8 times slower on the smaller file
and roughly 312 times slower on the larger one.

The main problem is that to access the first page I have to load the
entire document first.
PDFL has a document constructor that takes a page parameter and loads
only that page from the document, which is why it is so fast.

Best regards,
Manuel.


This is a known problem that can't be solved in a few hours / days. PDFBox does not "parse on demand".
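
To illustrate what "parse on demand" means in general: a PDF file ends
with the keyword startxref, followed by the byte offset of the
cross-reference section and the %%EOF marker, so a reader can locate
objects by looking at the tail of the file instead of scanning all of it.
The sketch below uses only the Java standard library. The class XrefTail
and the method findStartXref are hypothetical names made up for this
example, the 1024-byte tail size is an assumption, and this is not how
PDFBox or PDFL is actually implemented.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of PDFBox or PDFL.
final class XrefTail {

    // Returns the byte offset written after the last "startxref" keyword,
    // reading at most the last 1024 bytes of the file (assumed to be enough).
    static long findStartXref(Path pdf) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(pdf.toFile(), "r")) {
            int tailLen = (int) Math.min(1024, raf.length());
            byte[] tail = new byte[tailLen];
            raf.seek(raf.length() - tailLen); // jump straight to the tail
            raf.readFully(tail);

            // ISO-8859-1 maps every byte to one char, so indexes stay aligned.
            String text = new String(tail, StandardCharsets.ISO_8859_1);
            int idx = text.lastIndexOf("startxref");
            if (idx < 0) {
                throw new IOException("no startxref keyword in the last "
                        + tailLen + " bytes");
            }

            // The decimal offset follows the keyword, separated by whitespace.
            String rest = text.substring(idx + "startxref".length()).trim();
            return Long.parseLong(rest.split("\\s+")[0]);
        }
    }
}
```

The cost of this lookup depends on the tail size, not on the file size. A
parser working on demand would continue from the returned offset and read
only the objects needed for the requested page.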

Tilman

