Marcus Korinth created PDFBOX-5809:
--------------------------------------

             Summary: PDDocument#importPage slowed down by factor 1300
                 Key: PDFBOX-5809
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5809
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 3.0.2 PDFBox
            Reporter: Marcus Korinth


We are using the *PDDocument#importPage* method in our own splitter, where we 
copy pages from a _SourceDocument_ to a _TargetDocument_. To do so, we first 
fetch the page with the following code:
{code:java}
final PDPage sourcePage = sourceDocument.getPage(pageNumber);
{code}

Immediately afterwards we call:
{code:java}
final PDPage targetPage = targetDocument.importPage(sourcePage);
{code}

This approach worked just fine with *pdfbox 2.0.26*.
We decided to upgrade to version *3.0.2* since it tackles a lot of the problems.

Unfortunately, the *PDDocument#importPage* method slowed down by a factor of 
around 1300. In version *2.0.26* a call took 15 ms on average. With the latest 
*3.0.2* it takes 20000 ms on average. That is a deal breaker for us, as we 
usually have to split documents that have several thousand pages.
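
For reference, here is a minimal sketch of how per-call averages like the ones above can be measured and compared. It uses only the JDK and is not our actual splitter code; the class name _ImportTiming_ and the helpers _averageMillis_ and _slowdownFactor_ are hypothetical names chosen for illustration, not part of PDFBox.

```java
import java.util.function.IntConsumer;

class ImportTiming {

    /** Returns the average wall-clock time in milliseconds per call of op. */
    static double averageMillis(int runs, IntConsumer op) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            op.accept(i); // i can serve as the page index
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / runs;
    }

    /** Returns how many times slower the current average is than the baseline. */
    static double slowdownFactor(double baselineMillis, double currentMillis) {
        if (baselineMillis <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return currentMillis / baselineMillis;
    }

    public static void main(String[] args) {
        // Averages quoted in this report: 15 ms (2.0.26) versus 20000 ms (3.0.2).
        double factor = slowdownFactor(15.0, 20000.0);
        System.out.printf("slowdown factor: %.0f%n", factor);

        // Stand-in workload; a real measurement would call importPage here.
        double avg = averageMillis(1000, i -> Math.sqrt(i));
        System.out.printf("average per call: %.6f ms%n", avg);
    }
}
```

To time the real call, the stand-in workload would be replaced by a lambda that invokes targetDocument.importPage(sourceDocument.getPage(i)). Since importPage declares IOException, the lambda has to catch and wrap it.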

Note: The same applies when using *PDDocument#addPage*.
Note: The problem does not appear in *3.0.1*. But we can't use that version 
since it has other major problems that break our application.

I have prepared an example document with which you can replicate the issue. Due 
to the file size limitation I had to prepare a WeTransfer-Link for you: 
https://we.tl/t-lfN2wz7cAs



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
