[jira] [Commented] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258030#comment-17258030 ]

Ralf Hauser commented on PDFBOX-5067:
-------------------------------------

Thanks for committing the "uncontroversial" part. I tried to make the second half acceptable as well with [^patch_PDFBOX-5067.txt].

What do you think?

> make PDVisibleSignDesigner memory aware
> ---------------------------------------
>
>                 Key: PDFBOX-5067
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5067
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Signing
>    Affects Versions: 2.0.23
>            Reporter: Ralf Hauser
>            Priority: Major
>         Attachments: patch_PDFBOX-2512.txt, patch_PDFBOX-5067.txt
>
> PDFBOX-2512 might have failed earlier if I hadn't used
> MemoryUsageSetting.setupMixed(1500)
> to limit the memory usage of PDDocument document to 15 MB in
> CreateVisibleSignature in
>
> a) setVisibleSignDesigner() and used the now memory-aware constructor of
> PDVisibleSignDesigner
> and
> b) in signPDF(), reused PDDocument
> setTsaUrl(tsaUrl);
> PDDocument doc = null;
> if (null != visibleSignDesigner) {
>     doc = visibleSignDesigner.getDocument();
> }
> if (null == doc) {
>     doc = Loader.loadPDF(inputFile, memoryUsageSetting);
> }
> // creating output document and prepare the IO streams.
> ...

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ralf Hauser updated PDFBOX-5067:
--------------------------------
    Attachment: patch_PDFBOX-5067.txt
[jira] [Commented] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257950#comment-17257950 ]

Tilman Hausherr commented on PDFBOX-5067:
-----------------------------------------

I wanted to commit the "uncontroversial" parts before the new year "inbox madness" at work starts. Does this help?
[jira] [Commented] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257949#comment-17257949 ]

ASF subversion and git services commented on PDFBOX-5067:
---------------------------------------------------------

Commit 1885091 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1885091 ]

PDFBOX-5067: allow the passing of a MemoryUsageSetting, as suggested by Ralf Hauser
[jira] [Commented] (PDFBOX-5059) java.io.IOException: expected number, actual=COSFloat{18446744073521659909} at offset 4932600
[ https://issues.apache.org/jira/browse/PDFBOX-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257937#comment-17257937 ]

Tilman Hausherr commented on PDFBOX-5059:
-----------------------------------------

It means that your PDF does not respect the PDF specification, due to a bug in the software that created the file. The related issue PDFBOX-4495 has an explanation.

Also try opening your file with NOTEPAD++, go to byte offset (not line offset!) 4932600 and see how it looks.

Maybe the Adobe Viewer displays the file, but this only means that it has better error recovery.

> java.io.IOException: expected number, actual=COSFloat{18446744073521659909}
> at offset 4932600
> ---------------------------------------------------------------------------
>
>                 Key: PDFBOX-5059
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5059
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.3
>         Environment: linux
>            Reporter: Ling Hock Hin, Daniel
>            Priority: Major
>
> Encountered this error while trying to upload pdf. Seems to apply only for
> certain pdfs. Can't share more due to confidentiality.
>
> java.io.IOException: expected number, actual=COSFloat{18446744073521659909} at offset 4932600
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:162)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:274)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:207)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:854)
>     at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:757)
>     at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:726)
>     at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:657)
>     at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:617)
>     at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:215)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:249)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1093)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
[jira] [Commented] (PDFBOX-5059) java.io.IOException: expected number, actual=COSFloat{18446744073521659909} at offset 4932600
[ https://issues.apache.org/jira/browse/PDFBOX-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257883#comment-17257883 ]

Ling Hock Hin, Daniel commented on PDFBOX-5059:
-----------------------------------------------

May I have a brief description of what this error is/means so that I can explain it to the people I am working with?
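
For readers who want to look at the error in this thread themselves, here is a minimal sketch in plain Java (no PDFBox calls). It does two things: it checks that the number from the exception message does not fit into a signed 64-bit integer, and it prints the raw bytes around a byte offset, which is what the Notepad++ advice amounts to. The class and method names are made up for illustration. The observation that the low 64 bits of the value read as a small negative number is only a hint at what the producing software may have done, not a confirmed diagnosis; PDFBOX-4495 remains the reference for the explanation.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of PDFBox.
final class ExpectedNumberDiagnostics {

    /** True if the digit string parses as a signed 64-bit integer. */
    static boolean fitsInSignedLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Keeps only the low 64 bits of the value and reads them as a signed long. */
    static long wrapToSignedLong(String digits) {
        return new BigInteger(digits).longValue();
    }

    /** Returns up to 'length' bytes starting at byte offset 'offset', one char per byte. */
    static String peek(String path, long offset, int length) {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            if (length <= 0 || offset < 0 || offset >= raf.length()) {
                return "";
            }
            int n = (int) Math.min(length, raf.length() - offset);
            byte[] buf = new byte[n];
            raf.seek(offset);
            raf.readFully(buf);
            // ISO-8859-1 maps every byte to exactly one char, so binary data survives.
            return new String(buf, StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String value = "18446744073521659909"; // taken from the exception message
        System.out.println("fits in long: " + fitsInSignedLong(value));
        System.out.println("low 64 bits as signed long: " + wrapToSignedLong(value));
        if (args.length == 2) {
            // args: <pdf file> <byte offset>, e.g. the offset from the exception message
            System.out.println(peek(args[0], Long.parseLong(args[1]), 80));
        }
    }
}
```

Run without arguments, this reports that the value does not fit into a long and that its low 64 bits correspond to -187891707. Run with a file name and 4932600 as arguments, it prints the 80 bytes the parser was looking at.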
[jira] [Commented] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257857#comment-17257857 ]

Ralf Hauser commented on PDFBOX-5067:
-------------------------------------

Thanks for the feedback.

I tested it on the basis of CreateVisibleSignature2.java (also with setupMixed 15 MB).

It needed 8 MB more Xmx (==> 69m). But PDDocument was only loaded once, so in this case the getter is not used. Although when you close PDVisibleSignDesigner it would be deallocated anyway.

> Some of the constructors set that field, some don't (the one that calls
> {{calculatePageSizeFromStream}})

I only added it where the test did a Loader.loadPDF() - there may well be more places where it could be added.

But still, I assume it is quicker and not much worse for memory if the load only happens once.
[jira] [Commented] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
[ https://issues.apache.org/jira/browse/PDFBOX-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257834#comment-17257834 ]

Tilman Hausherr commented on PDFBOX-5067:
-----------------------------------------

- The old PDVisibleSignDesigner should call the new one to avoid duplicated code
- The new field should be on top
- Why the new "PDDocument" field? You're exposing internals, and this prevents closing of the document.
- Some of the constructors set that field, some don't (the one that calls {{calculatePageSizeFromStream}})

Why not use CreateVisibleSignature2.java as a starting point? This is easier to understand.
[jira] [Updated] (PDFBOX-5068) OutOfMemory while signing large documents - continued
[ https://issues.apache.org/jira/browse/PDFBOX-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ralf Hauser updated PDFBOX-5068:
--------------------------------
    Description:
Continuation of PDFBOX-2512

In COSWriter.prepareIncrement(), for the test case cosDoc.getXrefTable().keySet() has size 5925. For each of these keys, cosDoc.getObjectFromPool() gets an object that does not just reference some part of the input document, but duplicates it (which is unavoidable in the case where they are decompressed with FlateFilter - albeit this could possibly be done "lazy").

-Xmx20m  746/5925
-Xmx25m 1615/5925
-Xmx30m 2800/5925
-Xmx40m 3872/5925
-Xmx55m 5773/5925

With 60m, it gets them all, but dies later with the less telling
java.lang.OutOfMemoryError: GC overhead limit exceeded

This assumes the patch of PDFBOX-5067 is already in place.

  was: the same text without the final sentence about the PDFBOX-5067 patch.

> OutOfMemory while signing large documents - continued
> -----------------------------------------------------
>
>                 Key: PDFBOX-5068
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5068
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Signing
>    Affects Versions: 2.0.23
>            Reporter: Ralf Hauser
>            Priority: Major
[jira] [Updated] (PDFBOX-5068) OutOfMemory while signing large documents - continued
[ https://issues.apache.org/jira/browse/PDFBOX-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ralf Hauser updated PDFBOX-5068:
--------------------------------
    Summary: OutOfMemory while signing large documents - continued  (was: OutOfMemory while signing large documents)
[jira] [Created] (PDFBOX-5068) OutOfMemory while signing large documents
Ralf Hauser created PDFBOX-5068:
-----------------------------------

             Summary: OutOfMemory while signing large documents
                 Key: PDFBOX-5068
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5068
             Project: PDFBox
          Issue Type: Improvement
          Components: Signing
    Affects Versions: 2.0.23
            Reporter: Ralf Hauser

Continuation of PDFBOX-2512

In COSWriter.prepareIncrement(), for the test case cosDoc.getXrefTable().keySet() has size 5925. For each of these keys, cosDoc.getObjectFromPool() gets an object that does not just reference some part of the input document, but duplicates it (which is unavoidable in the case where they are decompressed with FlateFilter - albeit this could possibly be done "lazy").

-Xmx20m  746/5925
-Xmx25m 1615/5925
-Xmx30m 2800/5925
-Xmx40m 3872/5925
-Xmx55m 5773/5925

With 60m, it gets them all, but dies later with the less telling
java.lang.OutOfMemoryError: GC overhead limit exceeded
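
To illustrate what decompressing "lazily" could mean here, the following is a minimal sketch using only java.util.zip. It is not PDFBox code and does not touch COSWriter or FlateFilter; the class name and methods are hypothetical. The idea shown: keep only the compressed bytes, and inflate through a stream with a small fixed buffer each time the content is needed, instead of holding a decompressed copy of every object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration; the byte array stands in for a slice of the input file.
final class LazyFlateSource {
    private final byte[] compressed;

    LazyFlateSource(byte[] compressed) {
        this.compressed = compressed;
    }

    /** Opens a fresh decoding stream; nothing is inflated until the caller reads. */
    InputStream open() {
        return new InflaterInputStream(new ByteArrayInputStream(compressed));
    }

    /** Walks the decoded data with a 4 KB buffer, so heap use does not grow with the decoded size. */
    long decodedLength() {
        byte[] buf = new byte[4096];
        long total = 0;
        try (InputStream in = open()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Test helper: zlib-compresses the given bytes. */
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] packed = deflate(new byte[1_000_000]);
        LazyFlateSource source = new LazyFlateSource(packed);
        System.out.println("compressed bytes held: " + packed.length);
        System.out.println("decoded bytes streamed: " + source.decodedLength());
    }
}
```

The trade-off is the usual one: every access pays the inflate cost again, in exchange for not keeping thousands of decoded objects on the heap at once. Whether that fits the writer's access pattern is exactly the open question of this issue.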
[jira] [Commented] (PDFBOX-2602) Enhance command line tools
[ https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257765#comment-17257765 ]

ASF subversion and git services commented on PDFBOX-2602:
---------------------------------------------------------

Commit 1885066 from le...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1885066 ]

PDFBOX-2602: remove version override

> Enhance command line tools
> --------------------------
>
>                 Key: PDFBOX-2602
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2602
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.8, 2.0.0
>            Reporter: Maruan Sahyoun
>            Assignee: Maruan Sahyoun
>            Priority: Minor
>             Fix For: 3.0.0 PDFBox
>
> The command line tools shall be enhanced to have the same behavior across all
> tools.
> From the discussion on the dev mailing list:
> - add an -h option to print the usage
> - print the usage to System.err and use an exit code of 1 if there was an
> invalid command line parameter
> - print messages on exceptions to System.err
> - rethrow the exception so java can handle it if it will terminate afterwards
> anyway
> - use an exit code of 1 if rethrowing doesn't make sense
> Additional input:
> https://clig.dev/
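
The conventions listed in the issue can be sketched in a few lines of plain Java. This is an illustration only, not the PDFBox tools' actual code: the tool name, the usage text and the `run` method are made up, and sending the `-h` output to stdout with exit code 0 is an assumption, since the issue only says that `-h` prints the usage.

```java
import java.io.PrintStream;

// Hypothetical command line skeleton following the bullet points of the issue.
final class CliConventions {
    static final String USAGE = "Usage: demo-tool [-h] <inputfile>";

    /** Returns the exit code instead of calling System.exit, which keeps the logic testable. */
    static int run(String[] args, PrintStream out, PrintStream err) {
        if (args.length == 1 && "-h".equals(args[0])) {
            out.println(USAGE); // assumption: explicit help request goes to stdout, code 0
            return 0;
        }
        if (args.length != 1 || args[0].startsWith("-")) {
            err.println(USAGE); // invalid parameter: usage on stderr, exit code 1
            return 1;
        }
        try {
            out.println("processing " + args[0]); // placeholder for the real work
            return 0;
        } catch (RuntimeException ex) {
            // Exception messages go to stderr; exit code 1 where rethrowing makes no sense.
            err.println("Error: " + ex.getMessage());
            return 1;
        }
    }
}
```

A `main` method would then consist of a single line, `System.exit(run(args, System.out, System.err));`, so every tool shares the same behavior for help, bad parameters and failures.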
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257744#comment-17257744 ]

ASF subversion and git services commented on PDFBOX-4297:
---------------------------------------------------------

Commit 1885065 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1885065 ]

PDFBOX-4297: Sonar fix

> Allow to space efficiently analyse large PDFs
> ---------------------------------------------
>
>                 Key: PDFBOX-4297
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4297
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>            Reporter: Ralf Hauser
>            Priority: Major
>         Attachments: programWinter2015_20210103_091853-sig_LTV.pdf
>
> Assume you get a 300+ MB large pdf and need to know
> 1) the file names of embedded files if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) certification level (and whether it is signed)
> This should not use more than 5 MB (extra) memory
>
> P.S.: seems to be an example of https://pdfbox.apache.org/ideas.html "Handle
> large PDF files"
[jira] [Commented] (PDFBOX-5030) Create Migration guide for 3.0.0
[ https://issues.apache.org/jira/browse/PDFBOX-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257728#comment-17257728 ]

Andreas Lehmkühler commented on PDFBOX-5030:
--------------------------------------------

(y) thanks for starting this

> Create Migration guide for 3.0.0
> --------------------------------
>
>                 Key: PDFBOX-5030
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5030
>             Project: PDFBox
>          Issue Type: Task
>          Components: Documentation
>            Reporter: Maruan Sahyoun
>            Assignee: Maruan Sahyoun
>            Priority: Major
>             Fix For: 3.0.0 PDFBox
>
> To start educating about the migration efforts needed to get to 3.0.0, there
> should be a migration guide (evolving over time) to prepare for the release
[jira] [Commented] (PDFBOX-5066) ShowSignature: say which digest algorithm was used, detect forged content
[ https://issues.apache.org/jira/browse/PDFBOX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257725#comment-17257725 ]

Tilman Hausherr commented on PDFBOX-5066:
-----------------------------------------

1. {{certFromSignedData.getSigAlgName()}} returns "SHA256withRSA". I can change the success line to
{code}
System.out.println(certFromSignedData.getSigAlgName() + " signature verified");
{code}
2. The check is missing, because this is based on code from another project. Here's the segment currently:
{code}
case "adbe.x509.rsa_sha1":
{
    // example: PDFBOX-2693.pdf
    COSString certString = (COSString) sigDict.getDictionaryObject(COSName.CERT);
    //TODO this could also be an array.
    if (certString == null)
    {
        System.err.println("The /Cert certificate string is missing in the signature dictionary");
        return;
    }
    byte[] certData = certString.getBytes();
    CertificateFactory factory = CertificateFactory.getInstance("X.509");
    ByteArrayInputStream certStream = new ByteArrayInputStream(certData);
    Collection certs = factory.generateCertificates(certStream);
    System.out.println("certs=" + certs);

    X509Certificate cert = (X509Certificate) certs.iterator().next();

    // https://forums.adobe.com/thread/530277
    // Contents = contains the crypted message digest
    // Cert = contains the X509 certificate
    // to verify signature, see code at
    // https://stackoverflow.com/questions/43383859/
    // inspired by:
    // https://www.programcreek.com/java-api-examples/index.php?source_dir=pades_signing_2.1.5-master/src/main/java/com/opentrust/spi/pdf/PDFEnvelopedSignature.java
    // https://github.com/OpenTrust/pades_signing_2.1.5/blob/master/src/main/java/com/opentrust/spi/pdf/PDFEnvelopedSignature.java

    ASN1InputStream asn1IS = new ASN1InputStream(new ByteArrayInputStream(contents));
    ASN1Primitive asn1prim = asn1IS.readObject();
    if (!(asn1prim instanceof ASN1OctetString))
    {
        // 276434.pdf
        throw new IOException("ASN1 octet string expected, but got " + asn1prim.getClass().getSimpleName());
    }
    ASN1OctetString oct = (ASN1OctetString) asn1prim;
    Signature signature = Signature.getInstance("SHA1withRSA");
    signature.initVerify(cert.getPublicKey());
    int by;
    while ((by = signedContentAsStream.read()) != -1)
    {
        signature.update((byte) by);
    }
    System.out.println("Verification result: " + signature.verify(oct.getOctets()));

    // get digest algorithm
    Cipher c = Cipher.getInstance("RSA/NONE/PKCS1Padding", SecurityProvider.getProvider());
    c.init(Cipher.DECRYPT_MODE, cert.getPublicKey());
    byte[] raw = c.doFinal(oct.getOctets());
    DigestInfo di = DigestInfo.getInstance(raw);
    String algID = di.getAlgorithmId().getAlgorithm().getId();

    try
    {
        if (sig.getSignDate() != null)
        {
            cert.checkValidity(sig.getSignDate().getTime());
            System.out.println("Certificate valid at signing time");
        }
        else
        {
            System.err.println("Certificate cannot be verified without signing time");
        }
    }
    catch (CertificateExpiredException ex)
    {
        System.err.println("Certificate expired at signing time");
    }
    catch (CertificateNotYetValidException ex)
    {
        System.err.println("Certificate not yet valid at signing time");
    }
    if (CertificateVerifier.isSelfSigned(cert))
    {
        System.err.println("Certificate for " + cert.getSubjectX500Principal().getName() + " is self-signed, LOL!");
    }
    else
    {
        System.out.println("Certificate is not self-signed");
        if (sig.getSignDate() != null)
        {
            @SuppressWarnings("unchecked")
            Store store = new JcaCertStore(certs);
            SigUtils.verifyCertificateChain(store, cert, sig.getSignDate().getTime());
        }
    }
    break;
{code}

> ShowSignature: say which digest algorithm was used, detect forged content
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-5066
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5066
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Signing
>    Affects Versions: 2.0.23
>            Reporter: Ralf Hauser
>            Priority: Minor
>
> 1) SHA256 was used by the signer to get the content digests of
> target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be
> mentioned like
> System.out.println("Signature found");
> so maybe
> System.out.println("Signature algorithm: "+algo);
> where 'algo' is for example "sha256WithRSAEncryption" (as per
> [http://oidref.com/1.2.840.113549.1.1.11])
> 2) for subFilter="adbe.x509.rsa_sha1" it is not detected if the pdf content
> is altered.
>
> See also PDFBOX-4297
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257722#comment-17257722 ]

ASF subversion and git services commented on PDFBOX-4297:
---------------------------------------------------------

Commit 1885054 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1885054 ]

PDFBOX-4297: use stream instead of byte buffer to handle huge files with a small memory footprint
[jira] [Commented] (PDFBOX-2512) OutOfMemory while signing large documents
[ https://issues.apache.org/jira/browse/PDFBOX-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257721#comment-17257721 ]

Ralf Hauser commented on PDFBOX-2512:
-------------------------------------

ok, let's address this in PDFBOX-5067

> OutOfMemory while signing large documents
> -----------------------------------------
>
>                 Key: PDFBOX-2512
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2512
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Signing
>    Affects Versions: 1.8.7
>            Reporter: Thomas Chojecki
>            Assignee: Thomas Chojecki
>            Priority: Major
>             Fix For: 1.8.8
>
>         Attachments: keystore.p12
>
> While working with large documents, we found some memory issues.
> 1. The method close() in the COSDocument clones the object pool and does not
> clean it properly. The cloning in getObjects() causes an OutOfMemory exception.
> 2. The COSWriter copies the whole pdf into memory for signing and does not
> use a BufferedInputStream for the FileInputStream, which also has a big
> performance impact. (PDFBOX-1798)
> 3. The cloning of COSStreams causes an OutOfMemory exception.
> I used the CreateSignature example with a document of about 150 MB from here:
> https://cdn-reichelt.de/bilder/downloads/reichelt_01-2015_DE_B_HQ.pdf
> Additionally I added a RandomAccessFile to the PDDocument.load in the
> CreateSignature class.
> PDDocument doc = PDDocument.load(document,new RandomAccessFile(new
> File("d:\\temp.bin"), "rw")); (this prevents the OOM for the third case)
> The use of a BufferedInputStream in case two reduces the signing time
> from more than 5 minutes to less than 1 minute.
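
Point 2 of the quoted description is easy to picture with a small stand-alone sketch. It is not the COSWriter code; the class name is hypothetical and SHA-256 is just a stand-in for whatever consumes the bytes. The file is read one byte at a time, as a signing loop might do, but through a BufferedInputStream, so each `read()` is served from an in-memory buffer instead of going to the underlying FileInputStream, and the file is never loaded into memory as a whole.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of buffered, constant-memory reading.
final class BufferedDigestSketch {

    /** Digests the file byte by byte; the buffer keeps single-byte reads cheap. */
    static byte[] sha256(String path) {
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            int b;
            while ((b = in.read()) != -1) {
                md.update((byte) b);
            }
            return md.digest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 1) {
            StringBuilder hex = new StringBuilder();
            for (byte b : sha256(args[0])) {
                hex.append(String.format("%02x", b));
            }
            System.out.println(hex);
        }
    }
}
```

The timings in the issue (more than 5 minutes versus less than 1 minute) are the reporter's measurements for the signing case; this sketch only shows the pattern and makes no claim about the speed-up in other setups.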
[jira] [Created] (PDFBOX-5067) make PDVisibleSignDesigner memory aware
Ralf Hauser created PDFBOX-5067:
-----------------------------------

             Summary: make PDVisibleSignDesigner memory aware
                 Key: PDFBOX-5067
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5067
             Project: PDFBox
          Issue Type: Improvement
          Components: Signing
    Affects Versions: 2.0.23
            Reporter: Ralf Hauser
         Attachments: patch_PDFBOX-2512.txt

PDFBOX-2512 might have failed earlier if I hadn't used
MemoryUsageSetting.setupMixed(1500)
to limit the memory usage of PDDocument document to 15 MB in CreateVisibleSignature in

a) setVisibleSignDesigner() and used the now memory-aware constructor of PDVisibleSignDesigner
and
b) in signPDF(), reused PDDocument

setTsaUrl(tsaUrl);
PDDocument doc = null;
if (null != visibleSignDesigner) {
    doc = visibleSignDesigner.getDocument();
}
if (null == doc) {
    doc = Loader.loadPDF(inputFile, memoryUsageSetting);
}
// creating output document and prepare the IO streams.
...
[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709 ]

Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 10:55 AM:

I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1" (update: turns out that this one isn't part of our repository, because it contains code from another project).

was (Author: tilman):
I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1".

> Allow to space efficiently analyse large PDFs
>
> Key: PDFBOX-4297
> URL: https://issues.apache.org/jira/browse/PDFBOX-4297
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Reporter: Ralf Hauser
> Priority: Major
> Attachments: programWinter2015_20210103_091853-sig_LTV.pdf
>
> Assume you get a 300+ MB PDF and need to know
> 1) the file names of embedded files, if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) certification level (and whether it is signed)
> This should not use more than 5 MB (extra) memory
>
> P.S.: seems to be an example of https://pdfbox.apache.org/ideas.html "Handle large PDF files"
[jira] [Commented] (PDFBOX-2512) OutOfMemory while signing large documents
[ https://issues.apache.org/jira/browse/PDFBOX-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257718#comment-17257718 ] Tilman Hausherr commented on PDFBOX-2512: This is a closed issue, please don't write there.
[jira] [Commented] (PDFBOX-2512) OutOfMemory while signing large documents
[ https://issues.apache.org/jira/browse/PDFBOX-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257716#comment-17257716 ]

Ralf Hauser commented on PDFBOX-2512:

Did a quick test with [^programWinter2015_20210103_091853-sig_LTV.pdf] (35 MB).

With -Xmx70m, the signature works.

With -Xmx50m:

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1281)
    at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1212)
    at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.privateReadObjectNumbers(PDFObjectStreamParser.java:104)
    at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parseObject(PDFObjectStreamParser.java:77)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectStreamObject(COSParser.java:779)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:637)
    at org.apache.pdfbox.pdfparser.COSParser.dereferenceCOSObject(COSParser.java:586)
    at org.apache.pdfbox.cos.COSObject.getObject(COSObject.java:115)
    at org.apache.pdfbox.pdfwriter.COSWriter.prepareIncrement(COSWriter.java:327)
    at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1425)
    at org.apache.pdfbox.pdmodel.PDDocument.saveIncremental(PDDocument.java:997)
    ...

With -Xmx30m:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
    at org.apache.pdfbox.cos.COSStream.createView(COSStream.java:218)
    at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.<init>(PDFObjectStreamParser.java:48)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectStreamObject(COSParser.java:778)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:637)
    at org.apache.pdfbox.pdfparser.COSParser.dereferenceCOSObject(COSParser.java:586)
    at org.apache.pdfbox.cos.COSObject.getObject(COSObject.java:115)
    at org.apache.pdfbox.pdfwriter.COSWriter.prepareIncrement(COSWriter.java:327)
    at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1425)
    at org.apache.pdfbox.pdmodel.PDDocument.saveIncremental(PDDocument.java:997)
    at org.apache.pdfbox.examples.signature.CreateVisibleSignature.signPDF(CreateVisibleSignature.java:...
    ...
[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709 ] Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 10:37 AM: I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1". was (Author: tilman): I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1", and {{Signature.update()}} doesn't work with input streams.
[jira] [Comment Edited] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709 ] Tilman Hausherr edited comment on PDFBOX-4297 at 1/3/21, 9:58 AM: I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1", and {{Signature.update()}} doesn't work with input streams. was (Author: tilman): I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1".
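On the remark about {{Signature.update()}} in the 9:58 AM edit (dropped again in the 10:37 AM edit): java.security.Signature has no overload that takes an InputStream, but update(byte[], int, int) may be called repeatedly, so a stream can be fed to it in chunks. The sketch below shows that pattern with JDK classes only. It is not the ShowSignature code, and the class name StreamSignatureCheck, the method verify, and the "SHA256withRSA" algorithm in the demo are assumptions made for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

final class StreamSignatureCheck {
    // Verifies signatureBytes over the stream content. Only the 8 KB buffer is
    // held in memory, regardless of how long the stream is.
    static boolean verify(InputStream signedContent, PublicKey key,
                          byte[] signatureBytes, String algorithm)
            throws IOException, GeneralSecurityException {
        Signature sig = Signature.getInstance(algorithm);
        sig.initVerify(key);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = signedContent.read(buffer)) != -1) {
            sig.update(buffer, 0, n);
        }
        return sig.verify(signatureBytes);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical demo data: a throwaway RSA key pair and 50000 zero bytes.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        byte[] content = new byte[50_000];

        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(content);
        byte[] signature = signer.sign();

        System.out.println(verify(new ByteArrayInputStream(content),
                pair.getPublic(), signature, "SHA256withRSA"));
    }
}
```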
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257709#comment-17257709 ] Tilman Hausherr commented on PDFBOX-4297: I was able to check your file with streams in the same amount of time. Btw the proposed method doesn't work because it returns a closed input stream. There is more work to do, the method you mentioned, and then the check for "adbe.x509.rsa_sha1".
[jira] [Updated] (PDFBOX-5066) ShowSignature: say which digest algorithm was used, detect forged content
[ https://issues.apache.org/jira/browse/PDFBOX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ralf Hauser updated PDFBOX-5066:

Description:
1) SHA256 was used by the signer to get the content digests of target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be mentioned, like

    System.out.println("Signature found");

so maybe

    System.out.println("Signature algorithm: " + algo);

where 'algo' is for example "sha256WithRSAEncryption" (as per [http://oidref.com/1.2.840.113549.1.1.11])

2) for subFilter="adbe.x509.rsa_sha1", it is not detected if the PDF content is altered.

See also PDFBOX-4297

was:
1) SHA256 was used by the signer to get the content digests of target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be mentioned, like

    System.out.println("Signature found");

so maybe

    System.out.println("Signature algorithm: " + algo);

where 'algo' is for example "sha256WithRSAEncryption" (as per [http://oidref.com/1.2.840.113549.1.1.11])

2) for subFilter="adbe.x509.rsa_sha1", it is not detected if the PDF content is altered.

> ShowSignature: say which digest algorithm was used, detect forged content
>
> Key: PDFBOX-5066
> URL: https://issues.apache.org/jira/browse/PDFBOX-5066
> Project: PDFBox
> Issue Type: Improvement
> Components: Signing
> Affects Versions: 2.0.23
> Reporter: Ralf Hauser
> Priority: Minor
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257700#comment-17257700 ] Tilman Hausherr commented on PDFBOX-4297: I've tried with your file, it takes almost two minutes to download the file. However after that the content is there. The rest is done in 1 second. I'll see what happens when going with streams.
[jira] [Updated] (PDFBOX-5066) ShowSignature: say which digest algorithm was used, detect forged content
[ https://issues.apache.org/jira/browse/PDFBOX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ralf Hauser updated PDFBOX-5066:

Description:
1) SHA256 was used by the signer to get the content digests of target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be mentioned, like

    System.out.println("Signature found");

so maybe

    System.out.println("Signature algorithm: " + algo);

where 'algo' is for example "sha256WithRSAEncryption" (as per [http://oidref.com/1.2.840.113549.1.1.11])

2) for subFilter="adbe.x509.rsa_sha1", it is not detected if the PDF content is altered.

was:
1) SHA256 was used by the signer to get the content digests of target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be mentioned, like

    System.out.println("Signature found");

so maybe

    System.out.println("Signature algorithm: " + algo);

where also is for example "sha256WithRSAEncryption" (as per http://oidref.com/1.2.840.113549.1.1.11)

2) for subFilter="adbe.x509.rsa_sha1", it is not detected if the PDF content is altered.
[jira] [Created] (PDFBOX-5066) ShowSignature: say which digest algorithm was used, detect forged content
Ralf Hauser created PDFBOX-5066:

Summary: ShowSignature: say which digest algorithm was used, detect forged content
Key: PDFBOX-5066
URL: https://issues.apache.org/jira/browse/PDFBOX-5066
Project: PDFBox
Issue Type: Improvement
Components: Signing
Affects Versions: 2.0.23
Reporter: Ralf Hauser

1) SHA256 was used by the signer to get the content digests of target/pdfs/notCertified_368835_Sig_en_201026090509.pdf; this should be mentioned, like

    System.out.println("Signature found");

so maybe

    System.out.println("Signature algorithm: " + algo);

where 'algo' is for example "sha256WithRSAEncryption" (as per http://oidref.com/1.2.840.113549.1.1.11)

2) for subFilter="adbe.x509.rsa_sha1", it is not detected if the PDF content is altered.
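Item 1 asks for a readable algorithm name such as "sha256WithRSAEncryption" for OID 1.2.840.113549.1.1.11. One possible shape for that lookup is sketched below with a small hand-written table. The class name SignatureAlgorithmNames and the method describe are hypothetical, the entries other than 1.2.840.113549.1.1.11 are the neighbouring PKCS #1 RSA signature OIDs added for illustration, and how ShowSignature would obtain the OID from the signature is not shown.

```java
import java.util.HashMap;
import java.util.Map;

final class SignatureAlgorithmNames {
    // Hand-maintained OID-to-name table; deliberately incomplete.
    private static final Map<String, String> NAMES = new HashMap<>();

    static {
        NAMES.put("1.2.840.113549.1.1.5", "sha1WithRSAEncryption");
        NAMES.put("1.2.840.113549.1.1.11", "sha256WithRSAEncryption");
        NAMES.put("1.2.840.113549.1.1.12", "sha384WithRSAEncryption");
        NAMES.put("1.2.840.113549.1.1.13", "sha512WithRSAEncryption");
    }

    // Returns the readable name, or the OID itself marked as unknown.
    static String describe(String oid) {
        String name = NAMES.get(oid);
        return name != null ? name : "unknown (" + oid + ")";
    }

    public static void main(String[] args) {
        String algo = describe("1.2.840.113549.1.1.11");
        System.out.println("Signature algorithm: " + algo);
    }
}
```

Falling back to the raw OID keeps the output useful for algorithms that are missing from the table.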
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257693#comment-17257693 ]

Ralf Hauser commented on PDFBOX-4297:

Please find a bigger test file attached: programWinter2015_20210103_091853-sig_LTV.pdf

> We could change the code of ShowSignature, but then we'd probably get criticism for being slow.

If streams are properly implemented, they might even be quicker, as there will not be any swapping of memory pages by the operating system.
[jira] [Updated] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ralf Hauser updated PDFBOX-4297: Attachment: programWinter2015_20210103_091853-sig_LTV.pdf
[jira] [Commented] (PDFBOX-4297) Allow to space efficiently analyse large PDFs
[ https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17257685#comment-17257685 ]

Tilman Hausherr commented on PDFBOX-4297:

This is sample code, so nothing prevents you from writing your own. I assume you mean the "buf" parameter. This is used only once, to calculate the digest. To calculate the digest from a stream, see here: https://stackoverflow.com/a/304350/535646

We could change the code of ShowSignature, but then we'd probably get criticism for being slow.

About the missing method, did you try to implement it yourself? Yes, maybe we could do that, but I think getSignedContentAsStream() would be better.
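For the digest-from-a-stream suggestion in the comment above, a minimal JDK-only sketch could look like the following. It is not taken from ShowSignature or from the linked answer; the class name StreamDigest and its methods are hypothetical, and the 8 KB buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StreamDigest {
    // Reads the stream in fixed-size chunks, so only the buffer is held in
    // memory instead of the whole signed content.
    static byte[] digest(InputStream in, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            md.update(buffer, 0, n);
        }
        return md.digest();
    }

    // Lower-case hex rendering of a digest.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] demo = "abc".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new ByteArrayInputStream(demo)) {
            System.out.println(toHex(digest(in, "SHA-256")));
        }
    }
}
```

With a stream-returning accessor such as the getSignedContentAsStream() proposed in the comment (not an existing method at the time of this thread), the same loop would apply to the signed byte ranges.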