[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2021-01-03 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 10:55 AM:
---

I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do: the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1" (update: it turns out that this one isn't part of our 
repository, because it contains code from another project).
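
The proposed method itself is not quoted in this thread, so the following is 
only a generic Java sketch of how a method can end up returning a closed input 
stream. All names (ClosedStreamPitfall, TrackingStream, openBroken, open) are 
hypothetical and not part of PDFBox. The first method opens the stream in a 
try-with-resources block and returns it, so the stream is closed before the 
caller can read from it; the second leaves closing to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClosedStreamPitfall {

    // Hypothetical wrapper that refuses reads after close().
    // (ByteArrayInputStream alone ignores close(), which would hide the problem.)
    static final class TrackingStream extends FilterInputStream {
        private boolean closed;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
            return super.read(b, off, len);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Broken: try-with-resources closes the stream on the way out,
    // so the caller receives a stream that is already closed.
    static InputStream openBroken(byte[] data) throws IOException {
        try (InputStream in = new TrackingStream(new ByteArrayInputStream(data))) {
            return in;
        }
    }

    // Working: the stream is returned open and the caller must close it.
    static InputStream open(byte[] data) {
        return new TrackingStream(new ByteArrayInputStream(data));
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3};
        try (InputStream ok = open(data)) {
            System.out.println("first byte: " + ok.read());
        }
        try {
            openBroken(data).read();
        } catch (IOException e) {
            System.out.println("broken variant failed: " + e.getMessage());
        }
    }
}
```

With the second variant the caller typically wraps the call in its own 
try-with-resources block, as main does here.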


was (Author: tilman):
I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do, the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1".

> Allow to space efficiently analyse large PDFs
> -
>
> Key: PDFBOX-4297
> URL: https://issues.apache.org/jira/browse/PDFBOX-4297
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Reporter: Ralf Hauser
> Priority: Major
> Attachments: programWinter2015_20210103_091853-sig_LTV.pdf
>
>
> Assume you get a large PDF (300+ MB) and need to know:
> 1) the file names of embedded files, if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) the certification level (and whether it is signed)
> This should not use more than 5 MB of (extra) memory.
>  
> P.S.: This seems to be an example of https://pdfbox.apache.org/ideas.html "Handle 
> large PDF files"
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2021-01-03 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 10:37 AM:
---

I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do: the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1".


was (Author: tilman):
I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do, the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1", and {{Signature.update()}} doesn't work with input 
streams.







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2021-01-03 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 9:58 AM:
--

I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do: the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1", and {{Signature.update()}} doesn't work with input 
streams.
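
java.security.Signature has no update() overload that accepts an InputStream, 
but it does accept byte arrays, so a stream can be fed to it chunk by chunk. 
The following is a minimal sketch of that idea, not the code used in PDFBox; 
the class name, the method name and the 8 KB buffer size are assumptions made 
for illustration, and the algorithm name is left as a parameter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

final class StreamingSignatureCheck {

    // Feeds the stream to the Signature object in fixed-size chunks, so the
    // memory needed is bounded by the buffer size rather than the input size.
    static boolean verify(InputStream in, PublicKey key, byte[] sigBytes, String algorithm)
            throws IOException, GeneralSecurityException {
        Signature sig = Signature.getInstance(algorithm);
        sig.initVerify(key);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            sig.update(buffer, 0, n);
        }
        return sig.verify(sigBytes);
    }

    public static void main(String[] args) throws Exception {
        // Demo with a throwaway RSA key pair and in-memory data.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        byte[] data = new byte[100_000];

        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] sigBytes = signer.sign();

        boolean valid = verify(new ByteArrayInputStream(data),
                pair.getPublic(), sigBytes, "SHA256withRSA");
        System.out.println("signature valid: " + valid);
    }
}
```

The same loop works with any InputStream, for example one opened on a large 
file, since only one buffer's worth of data is held at a time.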


was (Author: tilman):
I was able to check your file with streams in the same amount of time. Btw the 
proposed method doesn't work because it returns a closed input stream. There is 
more work to do, the method you mentioned, and then the check for 
"adbe.x509.rsa_sha1".







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-26 Thread Ralf Hauser (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220789#comment-17220789
 ] 

Ralf Hauser edited comment on PDFBOX-4297 at 10/26/20, 4:09 PM:


RE  [#comment-17211657]

What are the test methods for points 1) to 3)?


was (Author: hau...@acm.org):
RE PDFBOX-4297#comment-17211657

What are the test methods for point 1) - 3)  ?







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-26 Thread Ralf Hauser (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220789#comment-17220789
 ] 

Ralf Hauser edited comment on PDFBOX-4297 at 10/26/20, 4:08 PM:


RE PDFBOX-4297#comment-17211657

What are the test methods for points 1) to 3)?


was (Author: hau...@acm.org):
RE 
https://issues.apache.org/jira/browse/PDFBOX-4297?focusedCommentId=17211657=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17211657

What are the test methods for point 1) - 3)  ?







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-26 Thread Ralf Hauser (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220789#comment-17220789
 ] 

Ralf Hauser edited comment on PDFBOX-4297 at 10/26/20, 4:07 PM:


RE 
https://issues.apache.org/jira/browse/PDFBOX-4297?focusedCommentId=17211657=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17211657

What are the test methods for points 1) to 3)?


was (Author: hau...@acm.org):
RE comment-17211657 

What are the test methods for point 1) - 3)  ?







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-10 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211640#comment-17211640
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 10/10/20, 10:23 AM:
-

Commit 1882383 from Tilman Hausherr in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1882383 ]

PDFBOX-4297: use the new method to get the signature contents; remove line 
forgotten in a previous refactoring


was (Author: jira-bot):
Commit 1882383 from Tilman Hausherr in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1882383 ]

PDFBOX-4297: introduce new method to get the signature contents; remove line 
forgotten in a previous refactoring







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-10 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211641#comment-17211641
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 10/10/20, 10:23 AM:
-

Commit 1882384 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1882384 ]

PDFBOX-4297: use the new method to get the signature contents


was (Author: jira-bot):
Commit 1882384 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1882384 ]

PDFBOX-4297: introduce new method to get the signature contents







[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs

2020-10-05 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208198#comment-17208198
 ] 

Tilman Hausherr edited comment on PDFBOX-4297 at 10/5/20, 5:15 PM:
---

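{code}
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSString;

// sig is assumed to be a PDSignature; read /Contents straight from its dictionary
COSString contents = (COSString) sig.getCOSObject().getDictionaryObject(COSName.CONTENTS);
byte[] ba = contents.getBytes();
{code}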
But I guess you'd like a direct PDSignature method. I have no idea why the two 
methods we are offering read the whole file.
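
The snippet above only fetches the signature value. The signed data itself is 
delimited by the signature's /ByteRange array (pairs of offset and length), and 
it can be processed without loading the whole file. The following is a sketch 
under that assumption, using only the JDK; the class name, method name and 
buffer size are hypothetical and this is not how PDFBox implements it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ByteRangeDigest {

    // byteRange holds offset/length pairs, laid out like a PDF /ByteRange array.
    // Only one buffer's worth of the file is held in memory at any time.
    static byte[] digest(String path, long[] byteRange, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            for (int i = 0; i + 1 < byteRange.length; i += 2) {
                raf.seek(byteRange[i]);
                long remaining = byteRange[i + 1];
                while (remaining > 0) {
                    int n = raf.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                    if (n == -1) {
                        throw new IOException("Byte range exceeds file length");
                    }
                    md.update(buffer, 0, n);
                    remaining -= n;
                }
            }
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ByteRangeDigest.java <file> <offset1> <length1> <offset2> <length2>
        long[] range = new long[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            range[i - 1] = Long.parseLong(args[i]);
        }
        byte[] hash = digest(args[0], range, "SHA-256");
        System.out.println(hash.length + " digest bytes computed");
    }
}
```

The offsets and lengths would come from the signature dictionary of the 
document being checked; none are hard-coded here.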


was (Author: tilman):
{\{code}}
COSString contents = (COSString) 
sig.getCOSObject().getDictionaryObject(COSName.CONTENTS);
byte [] ba = contents.getBytes();
{\{code}}
But I guess you'd like a direct PDSignature method. I have no idea why the two 
methods we are offering read the whole file.



