[
https://issues.apache.org/jira/browse/PDFBOX-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386931#comment-14386931
]
Tilman Hausherr commented on PDFBOX-2730:
-----------------------------------------
{quote}
I fixed another pdf large split size by doing
annotation.getDictionary().removeItem(COSName.PARENT);
{quote}
Yeah, you're closer to the cause of the problem than I was... some more
research on the page 2 file:
{code}
4 0 obj
<<
/Annots [5 0 R 6 0 R 7 0 R 8 0 R 9 0 R 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R
15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 23 0 R 24 0 R
25 0 R 26 0 R 27 0 R 28 0 R 29 0 R 30 0 R 31 0 R 32 0 R]
...
/Type /Page
>>
endobj
5 0 obj
<<
...
/P 36 0 R <======================
...
/Type /Annot
>>
endobj
{code}
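Annotation object 5 claims object 36 as its parent (/P). But object 36 is the wrong
parent: the page that lists the annotation is object 4, while object 36 is a
different page object that still hangs in a page tree with six kids: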
{code}
36 0 obj
<<
/Annots [5 0 R 6 0 R 7 0 R 8 0 R 9 0 R 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R
15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 23 0 R 24 0 R
25 0 R 26 0 R 27 0 R 28 0 R 29 0 R 30 0 R 31 0 R 32 0 R]
...
/Parent 43 0 R
...
/Type /Page
>>
endobj
43 0 obj
<<
/Count 6
/Kids [46 0 R 36 0 R 47 0 R 48 0 R 49 0 R 50 0 R]
/Parent 51 0 R
/Type /Pages
>>
endobj
{code}
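This suggests that when splitting, the parent of an annotation isn't updated in
Splitter.processAnnotations(), so the split file still references the old page
and, through it, the whole page tree, which would explain the large size. One
could add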
{code}
if (annotation.getPage() != null)
{
    annotation.setPage(imported);
}
{code}
but I'm reluctant to touch that: the structures are not cloned IIRC, so I'm
afraid to mess with the source page.
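To make the size effect concrete, here is a toy model in plain Java. It is not PDFBox code: the class and method names are made up for illustration, and the graph contains only the references visible in the dumps above (with a single annotation standing in for the whole /Annots array). It simply collects every object reachable from the split page, once with /P pointing at object 36 and once with /P pointing back at object 4.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: these are not PDFBox classes.
final class StaleParentDemo {

    // Object numbers from the dumps, mapped to the objects they reference.
    // annotationParent is the target of /P in annotation object 5.
    static Map<Integer, List<Integer>> graph(int annotationParent) {
        Map<Integer, List<Integer>> refs = new HashMap<>();
        refs.put(4, Arrays.asList(5));                // split page: /Annots (one annotation modeled)
        refs.put(5, Arrays.asList(annotationParent)); // annotation: /P
        refs.put(36, Arrays.asList(5, 43));           // other page: /Annots, /Parent
        refs.put(43, Arrays.asList(46, 36, 47, 48, 49, 50, 51)); // page tree node: /Kids, /Parent
        return refs;
    }

    // Every object that is reachable when following references from root.
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs, int root) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Integer> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            Integer obj = todo.pop();
            if (seen.add(obj)) {
                for (Integer target : refs.getOrDefault(obj, Collections.<Integer>emptyList())) {
                    todo.push(target);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("stale /P: " + reachable(graph(36), 4));
        System.out.println("fixed /P: " + reachable(graph(4), 4));
    }
}
```

In this model the stale /P makes ten objects reachable from page 4 (including the page tree node 43 and all of its kids), while the retargeted /P leaves only objects 4 and 5. The real file has many more references per object, so the actual difference would be larger.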
And just deleting the parent isn't always right... from the spec:
{quote}
(Optional except as noted below; PDF 1.3; not used in FDF files) An indirect
reference to the page object with which this annotation is associated.
This entry shall be present in screen annotations associated with rendition
actions (PDF 1.5; see 12.5.6.18, “Screen Annotations” and 12.6.4.13, “Rendition
Actions”).
{quote}
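Read literally, the quoted rule boils down to a small predicate. The sketch below is plain Java with hypothetical names, not a PDFBox API; it assumes the caller has already determined the annotation's /Subtype and whether it is associated with a rendition action.

```java
// Hypothetical helper, not part of PDFBox.
final class PageEntryRule {

    // True when the spec text quoted above says /P shall be present,
    // i.e. simply removing the entry would produce a non-conforming file.
    static boolean pageEntryRequired(String subtype, boolean hasRenditionAction) {
        return "Screen".equals(subtype) && hasRenditionAction;
    }
}
```

For every other annotation the entry is optional, so removing it is allowed by the spec, although updating it to the new page keeps more information.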
Over to you, [~lehmi]. You probably know more, as you worked on this in
PDFBOX-785, which is a similar issue.
> PDFSplit slow
> -------------
>
> Key: PDFBOX-2730
> URL: https://issues.apache.org/jira/browse/PDFBOX-2730
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 2.0.0
> Reporter: simon steiner
> Attachments: document-2.pdf
>
>
> PDF from PDFBOX-1298
> java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFSplit
> document.pdf