[ 
https://issues.apache.org/jira/browse/PDFBOX-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386931#comment-14386931
 ] 

Tilman Hausherr commented on PDFBOX-2730:
-----------------------------------------

{quote}
I fixed another pdf large split size by doing
annotation.getDictionary().removeItem(COSName.PARENT);
{quote}
Yeah, you're closer to the cause of the problem than I was... some more 
research on the page 2 file:
{code}
4 0 obj
<<
/Annots [5 0 R 6 0 R 7 0 R 8 0 R 9 0 R 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R
15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 23 0 R 24 0 R
25 0 R 26 0 R 27 0 R 28 0 R 29 0 R 30 0 R 31 0 R 32 0 R]
...
/Type /Page
>>
endobj
5 0 obj
<<
...
/P 36 0 R        <======================
...
/Type /Annot
>>
endobj
{code}
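Annotation object 5 is listed in the /Annots of page object 4, yet its /P 
entry names object 36 as its page. Object 36 is the wrong parent: it is a page 
that still hangs in a page tree (object 43) with six kids: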
{code}
36 0 obj
<<
/Annots [5 0 R 6 0 R 7 0 R 8 0 R 9 0 R 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R
15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 23 0 R 24 0 R
25 0 R 26 0 R 27 0 R 28 0 R 29 0 R 30 0 R 31 0 R 32 0 R]
...
/Parent 43 0 R
...
/Type /Page
>>
endobj

43 0 obj
<<
/Count 6
/Kids [46 0 R 36 0 R 47 0 R 48 0 R 49 0 R 50 0 R]
/Parent 51 0 R
/Type /Pages
>>
endobj
{code}
This suggests that Splitter.processAnnotations() doesn't update the /P entry 
of an annotation when splitting, so the stale reference pulls object 36 and 
its page tree into the split file. One could add
{code}
if (annotation.getPage() != null)
{
    annotation.setPage(imported);
}
{code}
but I'm reluctant to touch that: the structures are not cloned IIRC, so I'm 
afraid to mess with the source page.
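
To illustrate why the stale /P matters, here is a rough, self-contained sketch 
that is not PDFBox code. It models only the references visible in the dump 
above (the entries elided with "..." are left out) and counts the objects 
reachable from page object 4. The class and method names are made up for this 
illustration, and it assumes a writer that saves every object it can reach.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not part of PDFBox: object numbers mapped to the
// object numbers they reference, taken from the dump in this comment.
final class StaleParentDemo {

    /** Builds the reference graph; annotation 5's /P points at pageRef. */
    static Map<Integer, List<Integer>> buildGraph(int pageRef) {
        Map<Integer, List<Integer>> refs = new HashMap<>();
        List<Integer> annots = new ArrayList<>();
        for (int n = 5; n <= 32; n++) {
            annots.add(n);
        }
        refs.put(4, annots);                              // page 4: /Annots
        refs.put(5, Collections.singletonList(pageRef));  // annotation 5: /P
        List<Integer> otherPage = new ArrayList<>(annots);
        otherPage.add(43);                                // page 36: /Annots and /Parent
        refs.put(36, otherPage);
        // Page tree node 43: /Kids and /Parent
        refs.put(43, Arrays.asList(46, 36, 47, 48, 49, 50, 51));
        return refs;
    }

    /** Collects every object number reachable from start by following references. */
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs, int start) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Integer> todo = new ArrayDeque<>();
        todo.push(start);
        while (!todo.isEmpty()) {
            int obj = todo.pop();
            if (seen.add(obj)) {
                for (int target : refs.getOrDefault(obj, Collections.<Integer>emptyList())) {
                    todo.push(target);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("stale /P 36: " + reachable(buildGraph(36), 4).size() + " objects");
        System.out.println("fixed /P 4:  " + reachable(buildGraph(4), 4).size() + " objects");
    }
}
```

With /P still at 36, the walk reaches the other page object, the page tree 
node 43 and all of its kids. With /P pointing at 4, it stays within the page 
and its annotations. In a real file each of those kids would bring its own 
content along, which is not modelled here.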

And just deleting the parent isn't always right... from the spec:
{quote}
(Optional except as noted below; PDF 1.3; not used in FDF files) An indirect 
reference to the page object with which this annotation is associated.
This entry shall be present in screen annotations associated with rendition 
actions (PDF 1.5; see 12.5.6.18, “Screen Annotations” and 12.6.4.13, “Rendition 
Actions”).
{quote}
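Read literally, that would allow dropping /P everywhere except on screen 
annotations that carry a rendition action. A minimal sketch of such a check 
follows, using plain maps as a stand-in for the annotation dictionary. The 
class and method names are hypothetical and none of this is PDFBox API. It 
assumes the action sits in /A or /AA and does not follow /Next chains.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: dictionaries are modelled as
// Map<String, Object> with name values stored as plain strings.
final class PageEntryPolicy {

    /** True for a screen annotation whose /A or /AA holds a rendition action. */
    static boolean mustKeepPageEntry(Map<String, Object> annot) {
        if (!"Screen".equals(annot.get("Subtype"))) {
            return false;
        }
        if (isRendition(annot.get("A"))) {
            return true;
        }
        Object additional = annot.get("AA");
        if (additional instanceof Map) {
            for (Object action : ((Map<?, ?>) additional).values()) {
                if (isRendition(action)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** An action dictionary is a rendition action if its /S entry is Rendition. */
    private static boolean isRendition(Object action) {
        return action instanceof Map
                && "Rendition".equals(((Map<?, ?>) action).get("S"));
    }

    /** Removes /P only where the check above allows it; returns true if removed. */
    static boolean dropPageEntryIfOptional(Map<String, Object> annot) {
        if (mustKeepPageEntry(annot)) {
            return false;
        }
        return annot.remove("P") != null;
    }

    public static void main(String[] args) {
        Map<String, Object> link = new HashMap<>();
        link.put("Subtype", "Link");
        link.put("P", "36 0 R");
        System.out.println("link /P removed: " + dropPageEntryIfOptional(link));
    }
}
```

Even where removal is allowed, pointing /P at the imported page keeps more 
information than deleting it.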
Over to you, [~lehmi]. You probably know more, as you worked on this in 
PDFBOX-785, which is a similar issue.

> PDFSplit slow
> -------------
>
>                 Key: PDFBOX-2730
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2730
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.0.0
>            Reporter: simon steiner
>         Attachments: document-2.pdf
>
>
> PDF from PDFBOX-1298
> java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFSplit 
> document.pdf



