[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

Michael Klink (Jira) Mon, 26 Oct 2020 11:44:21 -0700


    [ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220923#comment-17220923
 ]


Michael Klink commented on PDFBOX-4297:
---------------------------------------

You cannot guarantee that the analysis will need less than 5 MB.

For example, one can blow up the *Catalog* object alone to more than 5 MB 
simply by adding a large number of small, simple entries.

This example is not a common case, but if you have to handle arbitrary inputs 
from the wild, you have to keep this possibility in mind as the basis of a 
possible DoS attack.
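
To make the Catalog example concrete, here is a minimal sketch that uses only the JDK (no PDFBox) and writes a tiny, syntactically valid PDF whose Catalog dictionary is padded with trivial entries. The class name BloatedCatalog, the /Xnnnnnnnn key names and the 6 MiB target are assumptions made for illustration; none of them come from the issue or from PDFBox.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration: a PDF whose Catalog object alone exceeds a size budget.
public class BloatedCatalog {

    /** Builds a PDF whose Catalog dictionary is at least minCatalogBytes long. */
    static byte[] build(int minCatalogBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        write(out, "%PDF-1.4\n");

        // Object 1: the Catalog, padded with many small "/Xnnnnnnnn 0" entries.
        // The key names are made up; any distinct names would do.
        int catalogOffset = out.size();
        StringBuilder catalog =
                new StringBuilder("1 0 obj\n<< /Type /Catalog /Pages 2 0 R\n");
        for (int i = 0; catalog.length() < minCatalogBytes; i++) {
            catalog.append(String.format(Locale.ROOT, "/X%08d 0\n", i));
        }
        catalog.append(">>\nendobj\n");
        write(out, catalog.toString());

        // Object 2: an empty page tree, so the Catalog's /Pages reference resolves.
        int pagesOffset = out.size();
        write(out, "2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n");

        // Classic cross-reference table: one 20-byte entry per object.
        int xrefOffset = out.size();
        write(out, "xref\n0 3\n");
        write(out, "0000000000 65535 f \n");
        write(out, String.format(Locale.ROOT, "%010d 00000 n \n", catalogOffset));
        write(out, String.format(Locale.ROOT, "%010d 00000 n \n", pagesOffset));
        write(out, "trailer\n<< /Size 3 /Root 1 0 R >>\nstartxref\n"
                + xrefOffset + "\n%%EOF\n");

        return out.toByteArray();
    }

    // All content is ASCII, so one character corresponds to one byte.
    private static void write(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] pdf = build(6 * 1024 * 1024); // arbitrary target above the 5 MB budget
        System.out.println("Generated PDF size in bytes: " + pdf.length);
    }
}
```

A parser that materialises the whole Catalog dictionary in memory would have to hold all of these entries at once, so a fixed 5 MB budget could only be kept by a parser that streams or skips dictionary entries instead of loading them.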

> Allow to space efficiently analyse large PDFs
> ---------------------------------------------
>
>                 Key: PDFBOX-4297
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4297
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>            Reporter: Ralf Hauser
>            Priority: Major
>
> Assume you get a large PDF (300+ MB) and need to know
> 1) the file names of embedded files if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) certification level (and whether it is signed)
> This should not use more than 5 MB (extra) memory
>  
> P.S.: this seems to be an example of https://pdfbox.apache.org/ideas.html 
> "Handle large PDF files"
>  



