Re: Jenkins Error
Hi,

John Hewson j...@jahewson.com wrote on 10 September 2014 at 22:11:

> I’m getting strange build errors on Jenkins with HTTP 401 “Unauthorized” from https://repository.apache.org. Here’s the log:
>
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.8.1:deploy (default-deploy) on project pdfbox-parent: Failed to deploy artifacts: Could not transfer artifact org.apache.pdfbox:pdfbox-parent:pom:2.0.0-20140910.200319-587 from/to apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots): Failed to transfer file: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/pdfbox-parent-2.0.0-20140910.200319-587.pom. Return code is: 401, ReasonPhrase: Unauthorized. -> [Help 1]

According to infra@ there was an issue with Nexus yesterday which should be solved by now. I've already triggered a build manually to check that.

> -- John

BR
Andreas Lehmkühler
Jenkins build is back to normal : PDFBox-trunk » PDFBox parent #1264
See https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/1264/
[jira] [Commented] (PDFBOX-2340) Overhaul PDFBox Documentation
[ https://issues.apache.org/jira/browse/PDFBOX-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129732#comment-14129732 ]

Maruan Sahyoun commented on PDFBOX-2340:

For now I’ll take a slightly different approach: keep the current web site as is, with the Cookbook becoming a kind of detached microsite, so that the changes are minimized. At a later stage we can still reintegrate it and drive it directly from the examples package if we choose to do so.

Overhaul PDFBox Documentation

Key: PDFBOX-2340
URL: https://issues.apache.org/jira/browse/PDFBOX-2340
Project: PDFBox
Issue Type: Improvement
Components: Documentation
Reporter: Maruan Sahyoun
Attachments: Mockup_Documentation.png

In order to make it easier for users of PDFBox to work with the library, there shall be enhanced documentation consisting of an introduction, API references, and more well-documented examples and code snippets (Cookbook). In order to make it easier to contribute, the Cookbook shall be built automatically from the examples/snippet 'repository'.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: PDFBox 1.8.7 release?
Hi Andreas,

what are your current plans to cut the new release? Depending on that, I could do https://issues.apache.org/jira/browse/PDFBOX-91 [Comb Fields] as a quick fix to the 1.8 branch this weekend.

BR
Maruan

On 14 August 2014 at 09:08, Andreas Lehmkühler andr...@lehmi.de wrote:

> Andreas Lehmkühler andr...@lehmi.de wrote on 7 August 2014 at 12:35:
>
>> Hi,
>>
>> there is already a number of solved issues and I guess it's time for a new bugfix release. I'm working on PDFBOX-2250 and I'd like to finish that first, but how about a new release in 2 or 3 weeks from now? WDYT?
>
> As there weren't any objections, I'm targeting the first week of September to cut the release.
>
> BR
> Andreas Lehmkühler
[jira] [Commented] (PDFBOX-2337) Add an example for highlighting text based on a string
[ https://issues.apache.org/jira/browse/PDFBOX-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129762#comment-14129762 ]

Maruan Sahyoun commented on PDFBOX-2337:

Do we really need an ICLA / CCLA in this case? As per [~lehmi]’s comment on the users mailing list, that might not be necessary. Could we come up with a licensing header for such cases, as others might be interested in writing up samples or helping to enhance the documentation? I would like to see us make this process as simple as possible.

I’d also vote for taking this sample as is and including it in 1.8.x, if that doesn’t hold up the release process for long, and fixing it for 2.0 after that. That could be done by Joël or by us. WDYT?

Add an example for highlighting text based on a string

Key: PDFBOX-2337
URL: https://issues.apache.org/jira/browse/PDFBOX-2337
Project: PDFBox
Issue Type: New Feature
Components: Utilities
Reporter: Joël Kuiper

An often heard request is to be able to highlight a certain text within a PDF programmatically, similar to the highlight functionality in Acrobat or Preview.app. The actual implementation of this functionality is trickier than it appears, since it requires the calculation of bounding boxes from TextPositions. An example class may help people with implementing this (common) functionality.
(see for example this discussion https://mail-archives.apache.org/mod_mbox/pdfbox-users/201409.mbox/%3CC8340BB9-E299-4A76-A50B-6155504A0D5B%40joelkuiper.eu%3E)
[jira] [Commented] (PDFBOX-1614) Digitally sign PDFs without file system access
[ https://issues.apache.org/jira/browse/PDFBOX-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129782#comment-14129782 ]

Andrei Solntsev commented on PDFBOX-1614:

Hi! Can you say when PDFBox 2.0 will be released? We are waiting for this feature to be available.

Digitally sign PDFs without file system access

Key: PDFBOX-1614
URL: https://issues.apache.org/jira/browse/PDFBOX-1614
Project: PDFBox
Issue Type: Improvement
Components: Signing
Affects Versions: 1.8.1
Reporter: Thierry Boschat
Assignee: Thomas Chojecki
Fix For: 2.0.0

Hi, I'm using pdfbox-1.8.1 to digitally sign PDFs. I found the sample below to handle it, but in this example I have to use a FileInputStream, whereas I want to do it only through streams (without any file system access). I tried to extend FileInputStream to deal with it, but I failed. Any tips for me about that problem? Thanks.

{code}
File outputDocument = new File("resources/signed" + document.getName());
FileInputStream fis = new FileInputStream(document);
FileOutputStream fos = new FileOutputStream(outputDocument);

// copy the original document to the output file
// (the buffer declaration is not part of the reported excerpt; its size is an assumption)
byte[] buffer = new byte[8 * 1024];
int c;
while ((c = fis.read(buffer)) != -1)
{
    fos.write(buffer, 0, c);
}
fis.close();
fis = new FileInputStream(outputDocument);

// load document
PDDocument doc = PDDocument.load(document);

// create signature dictionary
PDSignature signature = new PDSignature();
signature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE); // default filter
// subfilter for basic and PAdES Part 2 signatures
signature.setSubFilter(PDSignature.SUBFILTER_ADBE_PKCS7_DETACHED);
signature.setName("signer name");
signature.setLocation("signer location");
signature.setReason("reason for signature");

// the signing date, needed for valid signature
signature.setSignDate(Calendar.getInstance());

// register signature dictionary and sign interface
doc.addSignature(signature, this);

// write incremental (only for signing purpose)
doc.saveIncremental(fis, fos);
{code}
[jira] [Commented] (PDFBOX-2337) Add an example for highlighting text based on a string
[ https://issues.apache.org/jira/browse/PDFBOX-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129830#comment-14129830 ]

Andreas Lehmkühler commented on PDFBOX-2337:

{quote}
Do we really need a ICLA / CCLA in this case?
{quote}
There are no explicit rules to decide that, just a rule of thumb: only substantial changes require signing a CLA. IMO an example of how to use PDFBox doesn't qualify for that. It simply uses PDFBox and doesn't add any new features.

{quote}
The Apache header you've used is for ASF projects, e.g. "Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements." isn't true unless you've signed a CLA and contributed this code to Apache.
{quote}
Good catch. That should be changed to the more general version of the AL 2.0, which can be found [here|http://apache.org/licenses/LICENSE-2.0] in the APPENDIX section at the end.

{quote}
I’d also vote for taking this sample as is and include that in 1.8.x
{quote}
+1, we agreed to stop adding any new features to the 1.8 branch. But as this is just an example, I don't see any reason not to add it.
Re: Jenkins Error
Hi,

infra@ just blogged about the incident: https://blogs.apache.org/infra/entry/nexus_reduced_performance_issues_resolved

BR
Andreas Lehmkühler

Andreas Lehmkühler andr...@lehmi.de wrote on 11 September 2014 at 08:45:

> Hi,
>
> John Hewson j...@jahewson.com wrote on 10 September 2014 at 22:11:
>
>> I’m getting strange build errors on Jenkins with HTTP 401 “Unauthorized” from https://repository.apache.org. Here’s the log:
>>
>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.8.1:deploy (default-deploy) on project pdfbox-parent: Failed to deploy artifacts: Could not transfer artifact org.apache.pdfbox:pdfbox-parent:pom:2.0.0-20140910.200319-587 from/to apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots): Failed to transfer file: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/pdfbox-parent-2.0.0-20140910.200319-587.pom. Return code is: 401, ReasonPhrase: Unauthorized. -> [Help 1]
>
> According to infra@ there was an issue with Nexus yesterday which should be solved by now. I've already triggered a build manually to check that.
>
>> -- John
>
> BR
> Andreas Lehmkühler
Re: PDFBox 1.8.7 release?
Hi Maruan,

Maruan Sahyoun sahy...@fileaffairs.de wrote on 11 September 2014 at 09:32:

> Hi Andreas,
>
> what are your current plans to cut the new release? Depending on that, I could do https://issues.apache.org/jira/browse/PDFBOX-91 [Comb Fields] as a quick fix to the 1.8 branch this weekend.

I'm targeting next week, so your plan would fit in.

> BR
> Maruan

BR
Andreas Lehmkühler

> On 14 August 2014 at 09:08, Andreas Lehmkühler andr...@lehmi.de wrote:
>
>> Andreas Lehmkühler andr...@lehmi.de wrote on 7 August 2014 at 12:35:
>>
>>> Hi,
>>>
>>> there is already a number of solved issues and I guess it's time for a new bugfix release. I'm working on PDFBOX-2250 and I'd like to finish that first, but how about a new release in 2 or 3 weeks from now? WDYT?
>>
>> As there weren't any objections, I'm targeting the first week of September to cut the release.
>>
>> BR
>> Andreas Lehmkühler
[jira] [Commented] (PDFBOX-2301) RandomAccessBuffer consumes too much memory.
[ https://issues.apache.org/jira/browse/PDFBOX-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129848#comment-14129848 ]

Andreas Lehmkühler commented on PDFBOX-2301:

The original issue (PDFBOX-1625) was about merging PDFs, and the underlying issue was about the usage of the scratch file when merging PDFs (see PDFBOX-1586). PDFBOX-1586 reduces the usage of the scratch file and PDFBOX-1625 tries to detach the source and the destination PDF by cloning. Both are just workarounds, and in the case of PDFBOX-1625 it has some side effects.

IMHO, we have to overhaul the stream handling within the COS layer and we shouldn't expose the scratch file anymore. The whole thing should be handled under the hood. The only thing the user should decide is whether to use the file system or the memory as temp area.

RandomAccessBuffer consumes too much memory.

Key: PDFBOX-2301
URL: https://issues.apache.org/jira/browse/PDFBOX-2301
Project: PDFBox
Issue Type: Bug
Components: PDModel
Reporter: gee
Attachments: clone.diff

RandomAccessBuffer holds the uncompressed image during operation, because that is exactly what pdfbox ExtractImages does. But holding the uncompressed image in memory instead of the compressed one consumes too much memory, and this includes the many PDF XObjects that can use a filter to compress themselves.
It would be good if pdfbox provided an option that reverts to the COSObject state just before the RandomAccess object was created (the state in which the PDF XObject stream has been parsed and the COSDictionary objects haven't been created because the user hasn't requested them via the get() method).
It is a crucial feature so that pdfbox can analyze huge PDF files (100MB).
In the current source, one must close the COSStream unless it is required (and I know a closed stream cannot be reopened).

Class Name                                                        | Shallow Heap | Retained Heap
--------------------------------------------------------------------------------------------------
org.apache.pdfbox.cos.COSObject @ 0x5ad4940                       |           24 |     8,187,264
|- class class org.apache.pdfbox.cos.COSObject @ 0x58c4020        |            0 |             0
|- generationNumber org.apache.pdfbox.cos.COSInteger @ 0x5ad0080  |           24 |            24
|- baseObject org.apache.pdfbox.cos.COSStream @ 0x5b25ea0         |           32 |     8,187,216
|  |- class class org.apache.pdfbox.cos.COSStream @ 0x58c3e00     |            8 |             8
|  |- items java.util.LinkedHashMap @ 0x5b2a0f0                   |           56 |           552
|  |- file org.apache.pdfbox.io.RandomAccessBuffer @ 0x5b2a128
[jira] [Created] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
simon steiner created PDFBOX-2341:

Summary: WriteDecodedDoc cant decrypt pdf correctly
Key: PDFBOX-2341
URL: https://issues.apache.org/jira/browse/PDFBOX-2341
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.0
Reporter: simon steiner

java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar WriteDecodedDoc aes256_57.pdf tmp.pdf

Kind Regards missing
[jira] [Updated] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

simon steiner updated PDFBOX-2341:

Attachment: aes256_57.pdf
[jira] [Updated] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

simon steiner updated PDFBOX-2341:

Description:
java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar WriteDecodedDoc aes256_57.pdf tmp.pdf

Kind Regards missing

I guess you will ask me to use nonseq
[jira] [Commented] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129876#comment-14129876 ]

Tilman Hausherr commented on PDFBOX-2341:

There's a part cut off. End of correct stream:
{code}
BT
/F15 10 Tf
1 0 0 -1 0 359.34899902 Tm
[002000130016001B00010021000600170004000C001B0012] TJ
1 0 0 -1 0 371.34899902 Tm
[00220023002400250026002700010024000300260025002800250027002900210024 74 0029 18 002A0021] TJ
ET
Q
{code}
End of bad stream:
{code}
BT
/F15 10 Tf
1 0 0 -1 0 359.34899902 Tm
[002000130016001B00010021000600170004000C001B0012] TJ
1 0 0 -1 0 371.34899902 Tm
[002200230024002500260027000100
{code}
However, PDFDebugger shows the correct contents (which is where I got the above data). PDFReader can't render; it has the bad stream and throws an exception "Missing closing bracket for hex string".
[jira] [Created] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
simon steiner created PDFBOX-2342:

Summary: WriteDecodedDoc cant decrypt pdf form correctly
Key: PDFBOX-2342
URL: https://issues.apache.org/jira/browse/PDFBOX-2342
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.0
Reporter: simon steiner

java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar WriteDecodedDoc -nonSeq test.pdf

country selection is wrong
[jira] [Updated] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

simon steiner updated PDFBOX-2342:

Attachment: test.pdf
[jira] [Commented] (PDFBOX-2337) Add an example for highlighting text based on a string
[ https://issues.apache.org/jira/browse/PDFBOX-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129931#comment-14129931 ]

Joël Kuiper commented on PDFBOX-2337:

I'll change the License header and code style (assuming the Eclipse settings in the repo are still up to date), also I'd be happy to sign a CLA if needed. I could port the functionality to 2.0, however I need (a modified version) of this code in production which still runs 1.8 … so I'll probably only do that after a public release of 2.0 (even a pre-release would be fine, as long as it in Maven)
[jira] [Comment Edited] (PDFBOX-2337) Add an example for highlighting text based on a string
[ https://issues.apache.org/jira/browse/PDFBOX-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129931#comment-14129931 ]

Joël Kuiper edited comment on PDFBOX-2337 at 9/11/14 11:56 AM:

I'll change the License header and code style (assuming the Eclipse settings in the repo are still up to date), also I'd be happy to sign a CLA if needed. I could port the functionality to 2.0, however I need (a modified version) of this code in production which still runs 1.8 … so I'll probably only do that after a public release of 2.0 (even a pre-release would be fine, as long as it is in Maven)
[jira] [Commented] (PDFBOX-2301) RandomAccessBuffer consumes too much memory.
[ https://issues.apache.org/jira/browse/PDFBOX-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129954#comment-14129954 ]

Timo Boehme commented on PDFBOX-2301:

I'm also interested in fixing the RandomAccess buffer issue, since the cloning workaround seems a bit problematic to me too. I tried to understand the issue behind this workaround and came to the following conclusions:
# the problem is that the buffer stream is closed when a document is closed, while COS objects may still be used somewhere else and need access to the buffered data
# the workaround is implemented for COSStream objects only and clones the buffer if it is a RandomAccessBuffer; in the case of RandomAccessFile this is not done, so the problem from point 1 still persists there (?)
# since only the COSStream will access the data it writes into the buffer, why not simply create a new buffer object instead of cloning? At least this will not duplicate data that is already in the buffer but not used by the stream. The buffer will be garbage collected when there is no reference left to the COSStream or to any input stream created on the buffer

WDYT?
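The difference between cloning a shared buffer and starting with a new one, as suggested in the third conclusion of Timo Boehme's comment, can be sketched with plain JDK classes. This is only an illustration under simplifying assumptions: `BufferStrategyDemo`, `cloneAndAppend` and `freshAndAppend` are hypothetical names, and `ByteArrayOutputStream` merely stands in for PDFBox's RandomAccessBuffer, whose real behaviour may differ.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical model of the two strategies; none of these names are PDFBox API. */
class BufferStrategyDemo {

    /** Clone strategy: copy everything already in the shared buffer, then append the stream data. */
    static ByteArrayOutputStream cloneAndAppend(byte[] sharedBuffer, byte[] streamData) {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        copy.write(sharedBuffer, 0, sharedBuffer.length); // duplicates bytes this stream never reads
        copy.write(streamData, 0, streamData.length);
        return copy;
    }

    /** Fresh strategy: start with an empty buffer that only ever holds this stream's data. */
    static ByteArrayOutputStream freshAndAppend(byte[] streamData) {
        ByteArrayOutputStream fresh = new ByteArrayOutputStream();
        fresh.write(streamData, 0, streamData.length);
        return fresh;
    }

    public static void main(String[] args) {
        byte[] shared = new byte[1024]; // data written earlier by other streams
        byte[] mine = new byte[16];     // data written by this stream
        System.out.println("clone: " + cloneAndAppend(shared, mine).size() + " bytes");
        System.out.println("fresh: " + freshAndAppend(mine).size() + " bytes");
    }
}
```

In this model the cloned buffer carries the 1024 unrelated bytes along with the 16 bytes of stream data, while the fresh buffer holds only the latter and becomes eligible for garbage collection as soon as nothing references it.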
[jira] [Created] (PDFBOX-2343) Giving NullPoint exception when we call PDType1Font.HELVETICA_BOLD.getStringWidth(Some String)
Gayan Wijenayaka created PDFBOX-2343:

Summary: Giving NullPoint exception when we call PDType1Font.HELVETICA_BOLD.getStringWidth(Some String)
Key: PDFBOX-2343
URL: https://issues.apache.org/jira/browse/PDFBOX-2343
Project: PDFBox
Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Gayan Wijenayaka

When we call PDType1Font.HELVETICA_BOLD.getStringWidth("Some String"), it throws a java.lang.NullPointerException since the pdfbox-app-2.0.0-20140903.210612-518 release. Could you please fix this issue?
[jira] [Resolved] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-2341.

Resolution: Won't Fix
Assignee: Tilman Hausherr

The old parser works as described; I traced through it, saved intermediate files etc. What really happened is a border case that can't be solved without creating trouble elsewhere: the encrypted stream has a hex 0D at the end. In the sequential parser that 0x0D isn't taken into the stream because it is assumed that it doesn't belong to it, i.e. one expects "stream-data 0D 0A endstream", or "stream-data 0D endstream". The non-sequential parser just takes the stream according to the length.

To prove that theory I changed your file so that there is one more 0D in the stream. Now it works with the sequential parser. This suggests that the missing 0D has a meaning for the decryption routine. I also did the opposite test: I changed the length of the stream so that the 0D isn't included, and now it failed with the non-sequential parser.

Long story short: use nonseq :-)
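The border case described in this comment can be modelled in a few lines. What follows is a minimal sketch and not PDFBox code: the class `StreamEndDemo` and its methods are hypothetical, and the two readers only imitate the behaviour described above (one scans for the endstream keyword and drops a trailing 0D or 0D 0A in front of it, the other trusts the declared length).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of the two ways to find the end of a stream; not PDFBox API. */
class StreamEndDemo {
    private static final byte[] ENDSTREAM = "endstream".getBytes(StandardCharsets.US_ASCII);

    /** Length-based read: take exactly the declared number of bytes. */
    static byte[] readByLength(byte[] raw, int start, int length) {
        return Arrays.copyOfRange(raw, start, start + length);
    }

    /** Keyword-based read: scan for "endstream" and drop a trailing CR LF or CR before it. */
    static byte[] readUntilKeyword(byte[] raw, int start) {
        int end = indexOf(raw, ENDSTREAM, start);
        if (end < 0) {
            throw new IllegalArgumentException("no endstream keyword found");
        }
        if (end - start >= 2 && raw[end - 2] == '\r' && raw[end - 1] == '\n') {
            end -= 2; // "stream-data 0D 0A endstream"
        } else if (end - start >= 1 && raw[end - 1] == '\r') {
            end -= 1; // "stream-data 0D endstream"
        }
        return Arrays.copyOfRange(raw, start, end);
    }

    private static int indexOf(byte[] haystack, byte[] needle, int from) {
        for (int i = from; i <= haystack.length - needle.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** Three payload bytes whose last byte is 0x0D, followed directly by the keyword. */
    static byte[] sample() {
        byte[] payload = {0x41, 0x42, 0x0D};
        byte[] raw = Arrays.copyOf(payload, payload.length + ENDSTREAM.length);
        System.arraycopy(ENDSTREAM, 0, raw, payload.length, ENDSTREAM.length);
        return raw;
    }

    public static void main(String[] args) {
        byte[] raw = sample();
        System.out.println("by length:  " + readByLength(raw, 0, 3).length + " bytes");
        System.out.println("by keyword: " + readUntilKeyword(raw, 0).length + " bytes");
    }
}
```

With a payload that legitimately ends in 0x0D, the keyword-based reader returns one byte fewer than the length-based reader. For encrypted data a lost byte of that kind would be enough to truncate the decrypted output, which is consistent with the test results reported in the comment.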
[jira] [Reopened] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr reopened PDFBOX-2341:

oops, meant to close only
[jira] [Commented] (PDFBOX-2341) WriteDecodedDoc cant decrypt pdf correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130166#comment-14130166 ]

simon steiner commented on PDFBOX-2341:

I tried moving to nonseq but blocked by PDFBOX-2342
[jira] [Commented] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130228#comment-14130228 ]

Tilman Hausherr commented on PDFBOX-2342:

To clarify: Country drop down contents are garbage when WriteDecodedDoc is used with the -nonSeq option
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130271#comment-14130271 ]

ASF subversion and git services commented on PDFBOX-2261:

Commit 1624334 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1624334 ]

PDFBOX-2261: move setValue to super class PDChoice

Extremely long hang during getFields() on a few PDF files

Key: PDFBOX-2261
URL: https://issues.apache.org/jira/browse/PDFBOX-2261
Project: PDFBox
Issue Type: Bug
Components: AcroForm
Affects Versions: 1.8.6
Reporter: Tim Allison
Assignee: Andreas Lehmkühler
Priority: Minor
Fix For: 2.0.0
Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png

When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang.
[jira] [Resolved] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-2261.

Resolution: Fixed

I'm done here. PDFBOX-2333 is a follow up for the creation of the appearance stream
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130341#comment-14130341 ]

Tim Allison commented on PDFBOX-2261:

Thank you, all!
[jira] [Commented] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130367#comment-14130367 ]

ASF subversion and git services commented on PDFBOX-2342:

Commit 1624347 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1624347 ]

PDFBOX-2342: decrypt COSArray too, not just COSString
[jira] [Commented] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14130370#comment-14130370 ] ASF subversion and git services commented on PDFBOX-2342: Commit 1624348 from [~tilman] in branch 'pdfbox/trunk' [ https://svn.apache.org/r1624348 ] PDFBOX-2342: allow public access to decryptArray
[jira] [Commented] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14130386#comment-14130386 ] Tilman Hausherr commented on PDFBOX-2342: The non-sequential parser doesn't decrypt recursively like the sequential one does (why?). In a COSDictionary, only COSStrings are decrypted; the rest is left untouched. There's already a suspicious TODO there. For now, I've just added the decryption of COSArray, which solves [~ssteiner1]'s problem. But I wonder what else is incorrectly (not) decrypted, e.g. a COSDictionary within a COSDictionary. Why aren't we using this nice do-everything method? {code} private void decrypt(COSBase obj, long objNum, long genNum) throws IOException {code}
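The difference between decrypting only the direct strings of a dictionary and decrypting recursively can be illustrated with a small, self-contained sketch. This is not PDFBox code: plain Java String, List and Map stand in for COSString, COSArray and COSDictionary, the XOR routine is a toy stand-in for the real cipher, and all class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecursiveDecryptSketch {

    // Toy stand-in for a real cipher (assumption): XOR is symmetric,
    // so the same call both "encrypts" and "decrypts".
    static String toyCipher(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append((char) (c ^ 0x1F));
        }
        return sb.toString();
    }

    // Shallow variant: only strings sitting directly in the dictionary are
    // decrypted; nested lists and maps are passed through untouched.
    static Map<String, Object> decryptShallow(Map<String, Object> dict) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : dict.entrySet()) {
            Object v = e.getValue();
            out.put(e.getKey(), v instanceof String ? toyCipher((String) v) : v);
        }
        return out;
    }

    // Recursive variant: descends into lists and maps so that strings at any
    // nesting depth are decrypted.
    static Object decryptDeep(Object obj) {
        if (obj instanceof String) {
            return toyCipher((String) obj);
        }
        if (obj instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) obj) {
                out.add(decryptDeep(item));
            }
            return out;
        }
        if (obj instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) obj).entrySet()) {
                out.put(String.valueOf(e.getKey()), decryptDeep(e.getValue()));
            }
            return out;
        }
        return obj; // numbers, booleans, etc. are not encrypted
    }

    public static void main(String[] args) {
        // A form-field-like dictionary whose option list is an array of
        // encrypted strings, similar in shape to a drop-down's choices.
        Map<String, Object> field = new LinkedHashMap<>();
        field.put("T", toyCipher("Country"));
        field.put("Opt", Arrays.<Object>asList(toyCipher("France"), toyCipher("Germany")));

        System.out.println("shallow: " + decryptShallow(field));
        System.out.println("deep:    " + decryptDeep(field));
    }
}
```

With the shallow variant the field name is readable but the option list stays scrambled, which mirrors the garbage drop-down contents reported here. The recursive variant also handles a dictionary nested inside a dictionary, the case raised in the comment.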
[jira] [Comment Edited] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14130386#comment-14130386 ] Tilman Hausherr edited comment on PDFBOX-2342 at 9/11/14 5:52 PM: The non-sequential parser doesn't decrypt recursively like the sequential one does (why?). In a COSDictionary, only COSStrings are decrypted; the rest is left untouched. There's already a suspicious TODO there. For now, I've just added the decryption of COSArray, which solves [~ssteiner1]'s problem. But I wonder what else is incorrectly (not) decrypted, e.g. a COSDictionary within a COSDictionary. Why aren't we using this nice do-everything method in the SecurityHandler class? {code} private void decrypt(COSBase obj, long objNum, long genNum) throws IOException {code}
[jira] [Comment Edited] (PDFBOX-2342) WriteDecodedDoc cant decrypt pdf form correctly
[ https://issues.apache.org/jira/browse/PDFBOX-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14130228#comment-14130228 ] Tilman Hausherr edited comment on PDFBOX-2342 at 9/11/14 5:53 PM: To clarify: the contents of the Country drop-down box are garbage when WriteDecodedDoc is used with the -nonSeq option, but they are fine with the old parser. was (Author: tilman): To clarify: Country drop down contents are garbage when WriteDecodedDoc is used with the -nonSeq option