[ 
https://issues.apache.org/jira/browse/PDFBOX-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843776#comment-17843776
 ] 

Nicolò Rossi commented on PDFBOX-5815:
--------------------------------------

Sorry, I didn't see it. Thanks!

> Can't split the document into individual pages
> ----------------------------------------------
>
>                 Key: PDFBOX-5815
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5815
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.2 PDFBox
>            Reporter: Nicolò Rossi
>            Priority: Critical
>              Labels: Links, java.lang.nullPointerException, link, split
>         Attachments: CTU.pdf
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> If I try to split a document that contains links to internal pages into single 
> pages, the Splitter class throws an {*}NPE{*}.
>  
> This is our code:
>  
> {code:java}
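> // splitter was not declared in the original snippet; a default Splitter is assumed
> Splitter splitter = new Splitter();
> PDDocument pdfDocument = Loader.loadPDF(new File("path/to/file.pdf"));
> List<PDDocument> splitted = splitter.split(pdfDocument); {code}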
>  
> This is the exception:
>  
> {code:java}
> java.lang.NullPointerException: Cannot invoke "org.apache.pdfbox.pdmodel.PDPage.getCOSObject()" because the return value of "org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination.getPage()" is null
>     at org.apache.pdfbox.multipdf.Splitter.fixDestinations(Splitter.java:153)
>     at org.apache.pdfbox.multipdf.Splitter.split(Splitter.java:136){code}
>  
> I searched for the error and found that it fails in the Splitter class, in the 
> _fixDestinations_ method.
>  
> Here is the method definition:
> {code:java}
> private void fixDestinations(PDDocument destinationDocument)
> {
>     PDPageTree pageTree = destinationDocument.getPages();
>     for (PDPageDestination pageDestination : destToFixSet)
>     {
>         COSDictionary srcPageDict = pageDestination.getPage().getCOSObject();
>         COSDictionary dstPageDict = pageDictMap.get(srcPageDict);
>         PDPage dstPage = new PDPage(dstPageDict);
>         // Find whether destination is inside or outside
>         if (pageTree.indexOf(dstPage) >= 0)
>         {
>             pageDestination.setPage(dstPage);
>         }
>         else
>         {
>             pageDestination.setPage(null);
>         }
>     }
> } {code}
> h2. What's the problem:
> _+pageDestination.getPage()+_ returns null because the document contains 
> links to internal pages: after splitting by page, the resulting document no 
> longer contains a valid page for the link to target.
>  
> h2. Possible solution:
> Check the returned page and, if it is null, set the page of +_pageDestination_+ 
> to null. I would suggest something like this:
>  
> {code:java}
> private void fixDestinations(PDDocument destinationDocument)
> {
>     PDPageTree pageTree = destinationDocument.getPages();
>     for (PDPageDestination pageDestination : destToFixSet)
>     {
>         PDPage srcPage = pageDestination.getPage();
>         if (srcPage != null)
>         {
>             COSDictionary srcPageDict = srcPage.getCOSObject();
>             COSDictionary dstPageDict = pageDictMap.get(srcPageDict);
>             PDPage dstPage = new PDPage(dstPageDict);
>             // Find whether destination is inside or outside
>             if (pageTree.indexOf(dstPage) >= 0)
>             {
>                 pageDestination.setPage(dstPage);
>             }
>             else
>             {
>                 pageDestination.setPage(null);
>             }
>         }
>         else
>         {
>             pageDestination.setPage(null);
>         }
>     }
> } {code}
>  
> I've attached an example file. Thanks.
>  
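
The decision logic of the null guard proposed in the quoted report can be tried out without PDFBox. The following is only a minimal sketch under stated assumptions: Page, Destination, sourceToCopy and resultPages are hypothetical stand-ins for PDPage, PDPageDestination, pageDictMap and the page tree of the split result. It is not the PDFBox implementation and not necessarily the fix that was applied in the project.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DestinationFixSketch {

    /** Hypothetical stand-in for a page; not the PDFBox PDPage class. */
    static final class Page {
        final String label;
        Page(String label) { this.label = label; }
    }

    /** Hypothetical stand-in for a page destination whose target may be null. */
    static final class Destination {
        private Page page;
        Destination(Page page) { this.page = page; }
        Page getPage() { return page; }
        void setPage(Page page) { this.page = page; }
    }

    /**
     * Re-points each destination at the copied page, or clears it when the
     * source page is missing or its copy is not part of the split result.
     */
    static void fixDestinations(List<Destination> destinations,
                                Map<Page, Page> sourceToCopy,
                                List<Page> resultPages) {
        for (Destination destination : destinations) {
            Page source = destination.getPage();
            // Guard first: a destination without a resolvable page has no copy.
            Page copy = (source == null) ? null : sourceToCopy.get(source);
            // Keep the link only if the copied page is inside the result document.
            boolean inside = copy != null && resultPages.contains(copy);
            destination.setPage(inside ? copy : null);
        }
    }

    public static void main(String[] args) {
        Page source1 = new Page("source 1");
        Page copy1 = new Page("copy 1");
        Page source2 = new Page("source 2");
        Page copy2 = new Page("copy 2");

        Map<Page, Page> sourceToCopy = new HashMap<>();
        sourceToCopy.put(source1, copy1);
        sourceToCopy.put(source2, copy2);

        // The split result holds only the copy of the first page.
        List<Page> resultPages = Arrays.asList(copy1);

        Destination inside = new Destination(source1);
        Destination outside = new Destination(source2);
        Destination unresolved = new Destination(null);

        fixDestinations(Arrays.asList(inside, outside, unresolved),
                sourceToCopy, resultPages);

        System.out.println(inside.getPage() == copy1);
        System.out.println(outside.getPage() == null);
        System.out.println(unresolved.getPage() == null);
    }
}
```

In this model the unresolved destination takes the same path as a destination that points outside the split result: both end up with a null page, and no exception is thrown.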



