Nicolò Rossi created PDFBOX-5815:
------------------------------------

             Summary: Can't split the document into individual pages
                 Key: PDFBOX-5815
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5815
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 3.0.2 PDFBox
            Reporter: Nicolò Rossi
             Fix For: 3.0.3 PDFBox
         Attachments: CTU.pdf

If I try to split a document that contains links to internal pages into single pages, the Splitter class throws an *NPE*.

This is our code:
{code:java}
PDDocument pdfDocument = Loader.loadPDF(new File("path/to/file.pdf"));
// Default Splitter settings: one page per resulting document
Splitter splitter = new Splitter();
List<PDDocument> splitted = splitter.split(pdfDocument);
{code}

This is the exception:
{code:java}
java.lang.NullPointerException: Cannot invoke "org.apache.pdfbox.pdmodel.PDPage.getCOSObject()" because the return value of "org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination.getPage()" is null
    at org.apache.pdfbox.multipdf.Splitter.fixDestinations(Splitter.java:153)
    at org.apache.pdfbox.multipdf.Splitter.split(Splitter.java:136)
{code}

I looked into the error and found that it fails in the _fixDestinations_ method of the Splitter class. Here is the method definition:
{code:java}
private void fixDestinations(PDDocument destinationDocument)
{
    PDPageTree pageTree = destinationDocument.getPages();
    for (PDPageDestination pageDestination : destToFixSet)
    {
        // NPE is thrown here when getPage() returns null
        COSDictionary srcPageDict = pageDestination.getPage().getCOSObject();
        COSDictionary dstPageDict = pageDictMap.get(srcPageDict);
        PDPage dstPage = new PDPage(dstPageDict);
        // Find whether destination is inside or outside
        if (pageTree.indexOf(dstPage) >= 0)
        {
            pageDestination.setPage(dstPage);
        }
        else
        {
            pageDestination.setPage(null);
        }
    }
}
{code}

h2. What's the problem

_pageDestination.getPage()_ returns null. The document contains links to internal pages, so after splitting page by page, the resulting single-page documents no longer contain a valid page for some of those links to point to.

h2. Possible solution

Check the page returned by _getPage()_ and, if it is null, set the page of _pageDestination_ to null. I would suggest something like this:
{code:java}
private void fixDestinations(PDDocument destinationDocument)
{
    PDPageTree pageTree = destinationDocument.getPages();
    for (PDPageDestination pageDestination : destToFixSet)
    {
        PDPage srcPage = pageDestination.getPage();
        if (srcPage != null)
        {
            COSDictionary srcPageDict = srcPage.getCOSObject();
            COSDictionary dstPageDict = pageDictMap.get(srcPageDict);
            PDPage dstPage = new PDPage(dstPageDict);
            // Find whether destination is inside or outside
            if (pageTree.indexOf(dstPage) >= 0)
            {
                pageDestination.setPage(dstPage);
            }
            else
            {
                pageDestination.setPage(null);
            }
        }
        else
        {
            // No source page to map: clear the destination instead of dereferencing null
            pageDestination.setPage(null);
        }
    }
}
{code}

I've attached an example file, thanks.
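
To show the intended behaviour without needing PDFBox on the classpath, here is a minimal sketch of the same guard using plain stand-in classes. {{Page}}, {{Destination}} and {{DestinationFixSketch}} are hypothetical names invented for this illustration, not PDFBox types, and the extra check for a missing map entry is my own assumption rather than something taken from the Splitter source.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DestinationFixSketch {

    /** Hypothetical stand-in for PDPage; only object identity matters here. */
    static final class Page {
        final String label;
        Page(String label) { this.label = label; }
    }

    /** Hypothetical stand-in for PDPageDestination; its page may be null. */
    static final class Destination {
        private Page page;
        Destination(Page page) { this.page = page; }
        Page getPage() { return page; }
        void setPage(Page page) { this.page = page; }
    }

    /**
     * Re-points each destination at the copied page if that copy is part of
     * the result document, and clears it otherwise.
     */
    static void fixDestinations(List<Destination> toFix,
                                Map<Page, Page> srcToDst,
                                List<Page> resultPages) {
        for (Destination dest : toFix) {
            Page src = dest.getPage();
            // Guard: a destination without a page has nothing to map
            Page dst = (src == null) ? null : srcToDst.get(src);
            // Assumption: an unmapped source page is treated like an outside page
            boolean inside = dst != null && resultPages.contains(dst);
            dest.setPage(inside ? dst : null);
        }
    }

    public static void main(String[] args) {
        Page src1 = new Page("source page 1");
        Page src2 = new Page("source page 2");
        Page copy1 = new Page("copy of page 1");

        Map<Page, Page> srcToDst = new HashMap<>();
        srcToDst.put(src1, copy1);

        // Split by single page: this result document holds only the copy of page 1
        List<Page> resultPages = List.of(copy1);

        Destination inside = new Destination(src1);  // link to a page that was copied
        Destination outside = new Destination(src2); // link to a page in another part
        Destination broken = new Destination(null);  // link whose page is already null

        fixDestinations(List.of(inside, outside, broken), srcToDst, resultPages);

        System.out.println("inside kept:     " + (inside.getPage() == copy1));
        System.out.println("outside cleared: " + (outside.getPage() == null));
        System.out.println("broken cleared:  " + (broken.getPage() == null));
    }
}
```

In this sketch the third destination is the case from the attached file: its page is null before the fix-up runs, and the guard clears it instead of throwing.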