[jira] [Commented] (PDFBOX-5840) When splitting, keep named page destinations that are part of target document(s)

2024-06-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855210#comment-17855210
 ] 

ASF subversion and git services commented on PDFBOX-5840:
---------------------------------------------------------

Commit 1918352 from Tilman Hausherr in branch 'pdfbox/branches/3.0'
[ https://svn.apache.org/r1918352 ]

PDFBOX-5840: fix imports

> When splitting, keep named page destinations that are part of target 
> document(s)
> --------------------------------------------------------------------------------
>
>              Key: PDFBOX-5840
>              URL: https://issues.apache.org/jira/browse/PDFBOX-5840
>          Project: PDFBox
>       Issue Type: Improvement
>       Components: Utilities
> Affects Versions: 2.0.31, 3.0.2 PDFBox
>         Reporter: Tilman Hausherr
>         Assignee: Tilman Hausherr
>         Priority: Minor
>          Fix For: 2.0.32, 3.0.3 PDFBox, 4.0.0
>
>      Attachments: 410609.pdf, named-dest-handling abandoned code.txt
>
>
> Keep named destinations. The current code just ignores them. I wrote some 40 
> lines that would create a name tree in the destination document, but this 
> didn't work because the destination name gets modified when retrieved as a 
> string. So I just keep the actual destination and forget the name, which is a 
> single code line. It's a new document anyway and the average user expectation 
> is that the links "just work".
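
The approach described above (forget the name, keep the destination it 
resolves to) can be illustrated with a small library-independent model. This 
is only a sketch under stated assumptions, not the committed PDFBox change: 
the types Link and Destination and the method resolveForSplit are hypothetical, 
and a plain Map stands in for the name tree of the source document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Library-independent model; every type and method here is hypothetical
// and only illustrates the idea, it is not PDFBox code.
class NamedDestinationSplitSketch
{
    /** An explicit destination: a page index in the source document. */
    static final class Destination
    {
        final int sourcePage;

        Destination(int sourcePage)
        {
            this.sourcePage = sourcePage;
        }
    }

    /** A link holding either a destination name or an explicit destination. */
    static final class Link
    {
        final String name;             // null once the name has been dropped
        final Destination destination; // null while only the name is known

        Link(String name, Destination destination)
        {
            this.name = name;
            this.destination = destination;
        }
    }

    /**
     * Replaces named links by explicit ones when the named destination
     * points to a page that is part of the split result. All other links
     * are kept unchanged.
     */
    static List<Link> resolveForSplit(List<Link> links,
            Map<String, Destination> sourceNames, Set<Integer> pagesInResult)
    {
        List<Link> result = new ArrayList<>();
        for (Link link : links)
        {
            // look up the name in the source document's name table, if any
            Destination target = link.name == null ? null : sourceNames.get(link.name);
            if (target != null && pagesInResult.contains(target.sourcePage))
            {
                // keep the actual destination and forget the name
                result.add(new Link(null, target));
            }
            else
            {
                result.add(link);
            }
        }
        return result;
    }
}
```

In this model, a link named "chapter2" whose name maps to page 3 becomes an 
explicit link to page 3 when the split result contains pages 2 to 4, while a 
link whose target lies outside the result is left as it was.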



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Created] (PDFBOX-5841) Split result document misses metadata after split

2024-06-15 Thread Tilman Hausherr (Jira)
Tilman Hausherr created PDFBOX-5841:
------------------------------------

         Summary: Split result document misses metadata after split
             Key: PDFBOX-5841
             URL: https://issues.apache.org/jira/browse/PDFBOX-5841
         Project: PDFBox
      Issue Type: Bug
      Components: Writing
Affects Versions: 3.0.3 PDFBox, 4.0.0
        Reporter: Tilman Hausherr
         Fix For: 3.0.3 PDFBox, 4.0.0
     Attachments: splitresult1.pdf, splitresult2.pdf

This happens with the test file of PDFBOX-5840 and can also be reproduced with 
the command line utility: the first split result file doesn't have the metadata.

Alternatively it can be reproduced programmatically by adding this code below 
{{assertEquals(5, pageTree.indexOf(pd5.getPage()));}} in 
{code:java}
assertNotNull(dstDoc.getDocumentCatalog().getMetadata());
ByteArrayOutputStream baos = new ByteArrayOutputStream();
dstDoc.save(baos);
PDDocument reloadedDoc = Loader.loadPDF(baos.toByteArray());
assertNotNull(reloadedDoc.getDocumentCatalog().getMetadata());
reloadedDoc.close();
{code}
I believe this is another writing problem: the metadata exists but gets lost 
during the first save, not during a second one (which is not part of the test 
code). The metadata is expected to be object 116. This doesn't happen with 2.0. 
Attached: two files saved after splitting in such a way that the entire file is 
the result.






[jira] [Updated] (PDFBOX-5841) First split result document misses metadata after split

2024-06-15 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-5841:
------------------------------------
    Summary: First split result document misses metadata after split
             (was: Split result document misses metadata after split)

> First split result document misses metadata after split
> -------------------------------------------------------
>
>              Key: PDFBOX-5841
>              URL: https://issues.apache.org/jira/browse/PDFBOX-5841
>          Project: PDFBox
>       Issue Type: Bug
>       Components: Writing
> Affects Versions: 3.0.3 PDFBox, 4.0.0
>         Reporter: Tilman Hausherr
>         Priority: Major
>          Fix For: 3.0.3 PDFBox, 4.0.0
>
>      Attachments: splitresult1.pdf, splitresult2.pdf
>
>
> This happens with the test file of PDFBOX-5840 and can also be reproduced 
> with the command line utility: the first split result file doesn't have the 
> metadata.
> Alternatively it can be reproduced programmatically by adding this code below 
> {{assertEquals(5, pageTree.indexOf(pd5.getPage()));}} in 
> {code:java}
> assertNotNull(dstDoc.getDocumentCatalog().getMetadata());
> ByteArrayOutputStream baos = new ByteArrayOutputStream();
> dstDoc.save(baos);
> PDDocument reloadedDoc = Loader.loadPDF(baos.toByteArray());
> assertNotNull(reloadedDoc.getDocumentCatalog().getMetadata());
> reloadedDoc.close();
> {code}
> I believe this is another writing problem: the metadata exists but gets lost 
> during the first save, not during a second one (which is not part of the 
> test code). The metadata is expected to be object 116. This doesn't happen 
> with 2.0. Attached: two files saved after splitting in such a way that the 
> entire file is the result.






[jira] [Updated] (PDFBOX-5834) [PATCH] PDF split missing names from documentCatalog

2024-06-15 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-5834:
------------------------------------
    Attachment: 801500.pdf

> [PATCH] PDF split missing names from documentCatalog
> ----------------------------------------------------
>
>              Key: PDFBOX-5834
>              URL: https://issues.apache.org/jira/browse/PDFBOX-5834
>          Project: PDFBox
>       Issue Type: Bug
>         Reporter: Simon Steiner
>         Priority: Major
>      Attachments: 726725.pdf, 801500.pdf, tmp.patch
>
>
> java -jar app/target/pdfbox-app-2.0.32-SNAPSHOT.jar PDFSplit xxx.pdf
> I would expect the split result to contain the names dict inside the 
> documentCatalog, which is used to store PDF templates.






[jira] [Updated] (PDFBOX-5834) [PATCH] PDF split missing names from documentCatalog

2024-06-15 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-5834:
------------------------------------
    Attachment: 726725.pdf




[jira] [Commented] (PDFBOX-5834) [PATCH] PDF split missing names from documentCatalog

2024-06-15 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855217#comment-17855217
 ] 

Tilman Hausherr commented on PDFBOX-5834:
-----------------------------------------

I have attached two files. This is really weird stuff, which relies on JS 
usage. I wonder why this would have to be split at all.




[jira] [Commented] (PDFBOX-5836) PDF A-1 falsely validated as invalid for ICC color profile regression

2024-06-15 Thread Jochen Stärk (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855241#comment-17855241
 ] 

Jochen Stärk commented on PDFBOX-5836:
--------------------------------------

JDK 11 on Windows. I'm away for a week now, but the week after that I'll try to 
deliver source code for a small sample.

> PDF A-1 falsely validated as invalid for ICC color profile regression
> ---------------------------------------------------------------------
>
>              Key: PDFBOX-5836
>              URL: https://issues.apache.org/jira/browse/PDFBOX-5836
>          Project: PDFBox
>       Issue Type: Bug
>       Components: Preflight
> Affects Versions: 3.0.2 PDFBox
>         Reporter: Jochen Stärk
>         Priority: Major
>      Attachments: MustangGnuaccountingBeispielRE-20190610_507blanko.pdf
>
>
> PreflightParser.validate(theFile.toFile()).isValid() throws an "Unable to 
> parse the ICC Profile" error on the attached, LibreOffice-generated PDF/A-1. 
> VeraPDF validates the file as valid. It worked with PDFBox 2 and I need it to 
> be fixed in the context of my upgrade to PDFBox 3 
> (https://github.com/ZUGFeRD/mustangproject/issues/373).






[jira] [Commented] (PDFBOX-5836) PDF A-1 falsely validated as invalid for ICC color profile regression

2024-06-15 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855242#comment-17855242
 ] 

Tilman Hausherr commented on PDFBOX-5836:
-----------------------------------------

I used the command line application. My jdk11 version:
java version "11.0.21" 2023-10-17 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.21+9-LTS-193)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.21+9-LTS-193, mixed mode)
