[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

Tilman Hausherr (Jira) Sun, 03 Jan 2021 01:18:40 -0800


    [ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257700#comment-17257700
 ]


Tilman Hausherr commented on PDFBOX-4297:
-----------------------------------------

I've tried with your file. It takes almost two minutes to download, but 
after that the content is there, and the rest is done in 1 second.

I'll see what happens when going with streams.

> Allow to space efficiently analyse large PDFs
> ---------------------------------------------
>
>                 Key: PDFBOX-4297
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4297
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>            Reporter: Ralf Hauser
>            Priority: Major
>         Attachments: programWinter2015_20210103_091853-sig_LTV.pdf
>
>
> Assume you get a large PDF (300+ MB) and need to know:
> 1) the file names of embedded files, if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) the certification level (and whether it is signed)
> This should not use more than 5 MB of (extra) memory.
>  
> P.S.: This seems to be an example of https://pdfbox.apache.org/ideas.html "Handle 
> large PDF files".
>  
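
As an illustration of the memory bound asked for in the description above, 
question 2 (is the file encrypted?) can be roughly approximated by reading 
only the last few kilobytes of the file and looking for an /Encrypt entry 
near the trailer. This is a minimal sketch using only the Java standard 
library, not PDFBox's parser and not what the comment above tested. The 
class name PdfTailProbe, the method tailContains and the 64 KiB window are 
assumptions made for the example. It does not parse the PDF, so it can give 
wrong answers in both directions (for example on linearized files, or when 
the token appears in unrelated trailing data).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: a heuristic tail scan.
final class PdfTailProbe {

    /**
     * Reads at most maxBytes from the end of the file and reports whether
     * the given ASCII token occurs in that window. Memory use is bounded
     * by maxBytes, independent of the file size.
     */
    static boolean tailContains(Path file, String token, int maxBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            int toRead = (int) Math.min(length, (long) maxBytes);
            byte[] tail = new byte[toRead];
            raf.seek(length - toRead);   // jump straight to the tail window
            raf.readFully(tail);
            // ISO-8859-1 maps every byte to exactly one char, so binary
            // data in the window cannot break the decoding.
            String text = new String(tail, StandardCharsets.ISO_8859_1);
            return text.contains(token);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PdfTailProbe <file.pdf>");
            return;
        }
        Path pdf = Paths.get(args[0]);
        int window = 64 * 1024; // assumed window size, far below the 5 MB budget
        System.out.println("/Encrypt found near trailer: "
                + tailContains(pdf, "/Encrypt", window));
    }
}
```

Questions 1 and 3 (embedded file names, certification level) need real 
object parsing and are not covered by this sketch.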



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
