[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements

2021-04-27 Thread GitBox


valerybokov commented on pull request #107:
URL: https://github.com/apache/pdfbox/pull/107#issuecomment-828184529


   > I don't know and I prefer not to test this, because in the worst case we 
would either not clone enough, or have doubles. This is a very sensitive part 
of the code, I had some hard time around xmas 2018 fixing bugs. The structure 
of PDF isn't a pure tree, there's a lot of references going around, e.g. 
between the structure tree and the pages.
   
   Thanks. I found a few minor improvements, but I'll try to find more and then 
make a commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334464#comment-17334464
 ] 

Tilman Hausherr edited comment on PDFBOX-5169 at 4/28/21, 5:43 AM:
---

I have no trouble opening the result file created with 3.0; maybe this is 
related to bugs that have since been fixed. Please retry with the snapshot build.


was (Author: tilman):


I have no trouble opening the result file created with 3.0; maybe this is 
related to bugs that have since been fixed. Please retry with the snapshot build.

> PDFMerger produces overly large output PDF
> --
>
> Key: PDFBOX-5169
> URL: https://issues.apache.org/jira/browse/PDFBOX-5169
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.22, 2.0.23
> Environment: Debian 10
>Reporter: Jakov Vežić
>Priority: Minor
>
> Using PDFMerger to combine
> [https://www.dropbox.com/s/kprk7aeggni420c/1.pdf?dl=1]
> with
> [https://www.dropbox.com/s/0h8bced4tm3gppz/2.pdf?dl=1]
> results in an overly large file. The two input files are 1,25 MB and 16,3 MB 
> in size, while the output file is just over 400 MB. The action also 
> consumes about 1 GB of memory. No errors are produced during the merge, as 
> far as I can tell.
> The command is:
> {code:java}
> java -Xmx2500M -jar pdfbox-app-2.0.23.jar PDFMerger 1.pdf 2.pdf output.pdf
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334464#comment-17334464
 ] 

Tilman Hausherr commented on PDFBOX-5169:
-



I have no trouble opening the result file created with 3.0; maybe this is 
related to bugs that have since been fixed. Please retry with the snapshot build.




[jira] [Commented] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334462#comment-17334462
 ] 

Tilman Hausherr commented on PDFBOX-2602:
-

[~jmvezic] please retry with a snapshot build:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNAPSHOT/
Note that the call without parameters is not expected to work on a headless system.

I have no trouble opening the result file created with 3.0; maybe this is 
related to bugs that have since been fixed.

> Enhance command line tools
> --
>
> Key: PDFBOX-2602
> URL: https://issues.apache.org/jira/browse/PDFBOX-2602
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 1.8.8, 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
>Priority: Minor
> Fix For: 3.0.0 PDFBox
>
>
> The command line tools shall be enhanced to have the same behavior across all 
> tools.
> From the discussion on the dev mailing list
> - add an -h option to print the usage
> - print the usage to System.err and use an exit code of 1 if there was an 
> invalid command line parameter
> - print messages on exceptions to System.err
> - rethrow the exception so Java can handle it if it will terminate afterwards 
> anyway
> - use an exit code of 1 if rethrowing doesn't make sense
> Additional input:
> https://clig.dev/
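
The conventions listed in the issue description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the PDFBox implementation: the class name `CliConventionsSketch`, the `run` and `process` helpers, and the usage text are hypothetical, and printing the `-h` output to standard output with exit code 0 is an assumption (the list only fixes the behavior for invalid parameters).

```java
import java.io.PrintStream;

public class CliConventionsSketch {

    // Hypothetical usage text for an imaginary tool.
    static final String USAGE = "Usage: demo [-h] <infile>";

    /** Runs the tool and returns the exit code instead of calling System.exit directly. */
    static int run(String[] args, PrintStream out, PrintStream err) {
        if (args.length == 1 && args[0].equals("-h")) {
            out.println(USAGE);            // help was requested, so this is not an error
            return 0;
        }
        if (args.length != 1 || args[0].startsWith("-")) {
            err.println(USAGE);            // invalid parameters: usage goes to stderr
            return 1;                      // and the exit code is 1
        }
        try {
            process(args[0]);
            return 0;
        } catch (RuntimeException e) {
            err.println("Error: " + e.getMessage());  // exception messages go to stderr
            return 1;                      // exit code 1 when rethrowing makes no sense
        }
    }

    /** Hypothetical worker; rejects an empty file name to demonstrate the error path. */
    static void process(String infile) {
        if (infile.isEmpty()) {
            throw new IllegalArgumentException("empty file name");
        }
    }

    public static void main(String[] args) {
        System.exit(run(args, System.out, System.err));
    }
}
```

Returning the exit code from `run` keeps the behavior testable, because only `main` ever calls `System.exit`.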






[jira] [Comment Edited] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334462#comment-17334462
 ] 

Tilman Hausherr edited comment on PDFBOX-2602 at 4/28/21, 5:42 AM:
---

[~jmvezic] please retry with a snapshot build:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNAPSHOT/
Note that the call without parameters is not expected to work on a headless system.


was (Author: tilman):
[~jmvezic] please retry with a snapshot build:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNAPSHOT/
Note that the call without parameters is not expected to work on a headless system.

I have no trouble opening the result file created with 3.0; maybe this is 
related to bugs that have since been fixed.




[jira] [Commented] (PDFBOX-5180) Snapshot Deploy not working

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1733#comment-1733
 ] 

Andreas Lehmkühler commented on PDFBOX-5180:


Any idea on how to solve that issue?

> Snapshot Deploy not working
> ---
>
> Key: PDFBOX-5180
> URL: https://issues.apache.org/jira/browse/PDFBOX-5180
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.24
>Reporter: Oliver Schmidtmer
>Priority: Major
>
> When building and deploying snapshots, the build number recorded in the 
> metadata is higher than that of the last uploaded snapshot, so dependency 
> resolution for 2.0.24-SNAPSHOT does not work.
> It seems the deploy plugin is triggered twice, but artifacts are uploaded only 
> once. The other subprojects are also affected.
> [https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.24-SNAPSHOT/maven-metadata.xml]
> {code:java}
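> <metadata>
>   <groupId>org.apache.pdfbox</groupId>
>   <artifactId>pdfbox</artifactId>
>   <version>2.0.24-SNAPSHOT</version>
>   <versioning>
>     <snapshot>
>       <timestamp>20210426.180431</timestamp>
>       <buildNumber>154</buildNumber>
>     </snapshot>
>     <lastUpdated>20210426180431</lastUpdated>
>     <snapshotVersions>
>       <snapshotVersion>
>         <extension>jar</extension>
>         <value>2.0.24-20210426.180213-153</value>
>         <updated>20210426180431</updated>
>       </snapshotVersion>
>       <snapshotVersion>
>         <extension>pom</extension>
>         <value>2.0.24-20210426.180213-153</value>
>         <updated>20210426180431</updated>
>       </snapshotVersion>
>     </snapshotVersions>
>   </versioning>
> </metadata>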
> {code}
> In trunk / 3.0 this doesn't happen:
>  
> [https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/3.0.0-SNAPSHOT/maven-metadata.xml]
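
The mismatch described in the report can be checked mechanically. The following is a rough sketch, assuming the metadata uses the standard Maven layout (`snapshot`/`timestamp`/`buildNumber` plus one `value` per `snapshotVersion`); the class name `SnapshotMetadataCheck` and the helper `staleValues` are hypothetical, and the sample document only reuses the numbers quoted in the issue.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SnapshotMetadataCheck {

    // Numbers copied from the report; the element layout is assumed to be the standard one.
    static final String SAMPLE =
        "<metadata><groupId>org.apache.pdfbox</groupId><artifactId>pdfbox</artifactId>"
        + "<version>2.0.24-SNAPSHOT</version><versioning>"
        + "<snapshot><timestamp>20210426.180431</timestamp><buildNumber>154</buildNumber></snapshot>"
        + "<lastUpdated>20210426180431</lastUpdated><snapshotVersions>"
        + "<snapshotVersion><extension>jar</extension>"
        + "<value>2.0.24-20210426.180213-153</value><updated>20210426180431</updated></snapshotVersion>"
        + "<snapshotVersion><extension>pom</extension>"
        + "<value>2.0.24-20210426.180213-153</value><updated>20210426180431</updated></snapshotVersion>"
        + "</snapshotVersions></versioning></metadata>";

    /** Returns every snapshotVersion value that disagrees with the snapshot timestamp and build number. */
    static List<String> staleValues(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            // Expected file version: base version + "-" + timestamp + "-" + buildNumber.
            String expected = text(root, "version").replace("-SNAPSHOT", "")
                    + "-" + text(root, "timestamp") + "-" + text(root, "buildNumber");
            List<String> stale = new ArrayList<>();
            NodeList values = root.getElementsByTagName("value");
            for (int i = 0; i < values.getLength(); i++) {
                String value = values.item(i).getTextContent().trim();
                if (!value.equals(expected)) {
                    stale.add(value);
                }
            }
            return stale;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse metadata", e);
        }
    }

    /** Text of the first element with the given tag name. */
    private static String text(Element root, String tag) {
        return root.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println("Stale entries: " + staleValues(SAMPLE));
    }
}
```

For the numbers in the report, the expected version string ends in `-154` while both listed values end in `-153`, which matches the observation that the build number is ahead of the last upload.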






[jira] [Comment Edited] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333528#comment-17333528
 ] 

Andreas Lehmkühler edited comment on PDFBOX-2602 at 4/28/21, 4:34 AM:
--

It should help, as {{main}} isn't called when loading the class. The modifier 
has to be changed so that it isn't a static constant anymore, but that 
shouldn't be an issue at all.


was (Author: lehmi):
It should help, as {{main}} isn't called when loading the class. The modifier 
has to be changed so that it isn't a static constant anymore, but that 
shouldn't be an issue at all.




[jira] [Commented] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334424#comment-17334424
 ] 

ASF subversion and git services commented on PDFBOX-2602:
-

Commit 1889246 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1889246 ]

PDFBOX-2602: delay awt call which fails on headless systems when PDFBox.main is 
called




[jira] [Commented] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333528#comment-17333528
 ] 

Andreas Lehmkühler commented on PDFBOX-2602:


It should help, as {{main}} isn't called when loading the class. The modifier 
has to be changed so that it isn't a static constant anymore, but that 
shouldn't be an issue at all.




[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333487#comment-17333487
 ] 

Jakov Vežić commented on PDFBOX-5169:
-

In the meantime, I managed to get this merge running on Windows with 3.0.0-RC1, 
using:

{code:java}
java -jar pdfbox-app-3.0.0-RC1.jar merge -o=out.pdf -i=1.pdf -i=2.pdf
{code}
However, it takes a really long time to complete on a 6-core CPU: about 4 
minutes. Acrobat Reader hangs almost immediately when opening the resulting 
PDF and reports "not responding", so I have to force-close it. Opening the 
PDF in Chrome works flawlessly, however, which is weird.

 




[jira] [Comment Edited] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333481#comment-17333481
 ] 

Tilman Hausherr edited comment on PDFBOX-2602 at 4/27/21, 6:55 PM:
---

I wonder if it would help if the initialization
{code}
private static final int SHORCUT_KEY_MASK = 
Toolkit.getDefaultToolkit().getMenuShortcutKeyMask();
{code}
is moved to the PDFDebugger constructor. See also
https://stackoverflow.com/questions/8517121/java-what-is-the-difference-between-init-and-clinit
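
The difference between the two initialization points can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `queryShortcutMask` merely stands in for the AWT call and always throws, as that call does on a headless system, and `EagerCommand` and `DelayedCommand` are invented names. The sketch also delays the call until first use, which is one possible variation and not necessarily where the real code ends up.

```java
public class StaticInitSketch {

    // Hypothetical stand-in for the AWT toolkit call; it always fails here,
    // mimicking a headless system.
    static int queryShortcutMask() {
        throw new UnsupportedOperationException("no display available");
    }

    /** Eager variant: the call runs in the static initializer (<clinit>). */
    static class EagerCommand {
        private static final int MASK = queryShortcutMask();

        String help() {
            return "usage";
        }
    }

    /** Delayed variant: the call runs only when the mask is actually needed. */
    static class DelayedCommand {
        private Integer mask;   // stays null until first use

        String help() {
            return "usage";     // never touches the mask
        }

        int mask() {
            if (mask == null) {
                mask = queryShortcutMask();
            }
            return mask;
        }
    }

    /** Returns true if merely creating an EagerCommand fails during class initialization. */
    static boolean eagerFails() {
        try {
            new EagerCommand();
            return false;
        } catch (LinkageError e) {
            // First attempt: ExceptionInInitializerError.
            // Any later attempt: NoClassDefFoundError. Both are LinkageErrors.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("eager instantiation fails: " + eagerFails());
        System.out.println("delayed help output: " + new DelayedCommand().help());
    }
}
```

In the eager variant, instantiating the class is enough to trigger the failing call, even if only `help()` is wanted. In the delayed variant the failure is confined to the code path that really needs the value.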


was (Author: tilman):
I wonder if it would help if the initialization
{code}
private static final int SHORCUT_KEY_MASK = 
Toolkit.getDefaultToolkit().getMenuShortcutKeyMask();
{code}
is moved to the PDFDebugger constructor.




[jira] [Commented] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333481#comment-17333481
 ] 

Tilman Hausherr commented on PDFBOX-2602:
-

I wonder if it would help if the initialization
{code}
private static final int SHORCUT_KEY_MASK = 
Toolkit.getDefaultToolkit().getMenuShortcutKeyMask();
{code}
is moved to the PDFDebugger constructor.




[jira] [Commented] (PDFBOX-2602) Enhance command line tools

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333473#comment-17333473
 ] 

Jakov Vežić commented on PDFBOX-2602:
-

I'm getting all sorts of exceptions running PDFBox 3.0.0-RC1 on Debian 10 
(DigitalOcean) and can't even run a single command.

 

Just running
{code:java}
java -jar pdfbox-app-3.0.0-RC1.jar -help
{code}
gives the following:
{code:java}
Exception in thread "main" java.lang.ExceptionInInitializerError
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/java.lang.Class.newInstance(Class.java:584)
    at picocli.CommandLine$DefaultFactory.create(CommandLine.java:5486)
    at picocli.CommandLine$DefaultFactory.create(CommandLine.java:5512)
    at picocli.CommandLine$Model$CommandUserObject.getInstance(CommandLine.java:11813)
    at picocli.CommandLine$Model$CommandUserObject.get(CommandLine.java:11838)
    at picocli.CommandLine$Model$FieldBinding.set(CommandLine.java:11661)
    at picocli.CommandLine$Model$CommandReflection.initFromAnnotatedTypedMembers(CommandLine.java:11532)
    at picocli.CommandLine$Model$CommandReflection.initFromAnnotatedFields(CommandLine.java:11466)
    at picocli.CommandLine$Model$CommandReflection.extractCommandSpec(CommandLine.java:11399)
    at picocli.CommandLine$Model$CommandSpec.forAnnotatedObject(CommandLine.java:6202)
    at picocli.CommandLine.<init>(CommandLine.java:227)
    at picocli.CommandLine.toCommandLine(CommandLine.java:3517)
    at picocli.CommandLine.addSubcommand(CommandLine.java:373)
    at picocli.CommandLine.addSubcommand(CommandLine.java:354)
    at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:51)
Caused by: java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
    at java.desktop/sun.awt.HeadlessToolkit.getMenuShortcutKeyMask(HeadlessToolkit.java:135)
    at org.apache.pdfbox.debugger.PDFDebugger.<clinit>(PDFDebugger.java:154)
    ... 19 more

{code}
If I do what's recommended here 
([https://stackoverflow.com/questions/662421/no-x11-display-variable-what-does-it-mean]), 
I get:
{code:java}
Exception in thread "main" java.awt.AWTError: Can't connect to X11 window server using ':0.0' as the value of the DISPLAY variable.
    at java.desktop/sun.awt.X11GraphicsEnvironment.initDisplay(Native Method)
    at java.desktop/sun.awt.X11GraphicsEnvironment$1.run(X11GraphicsEnvironment.java:102)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.desktop/sun.awt.X11GraphicsEnvironment.<clinit>(X11GraphicsEnvironment.java:61)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:315)
    at java.desktop/java.awt.GraphicsEnvironment$LocalGE.createGE(GraphicsEnvironment.java:101)
    at java.desktop/java.awt.GraphicsEnvironment$LocalGE.<clinit>(GraphicsEnvironment.java:83)
    at java.desktop/java.awt.GraphicsEnvironment.getLocalGraphicsEnvironment(GraphicsEnvironment.java:129)
    at java.desktop/sun.awt.X11.XToolkit.<clinit>(XToolkit.java:231)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:315)
    at java.desktop/java.awt.Toolkit$2.run(Toolkit.java:588)
    at java.desktop/java.awt.Toolkit$2.run(Toolkit.java:583)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.desktop/java.awt.Toolkit.getDefaultToolkit(Toolkit.java:582)
    at org.apache.pdfbox.debugger.PDFDebugger.<clinit>(PDFDebugger.java:154)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/java.lang.Class.newInstance(Class.java:584)
    at picocli.CommandLine$DefaultFactory.create(CommandLine.java:5486)
    at picocli.CommandLine$DefaultFactory.create(CommandLine.java:5512)
    at picocli.CommandLine$Model$CommandUserObject.getInstance(CommandLine.java:11813)
    at picocli.CommandLine$Model$CommandUserObject.get(CommandLine.java:11838)
    at picocli.CommandLine$Model$Fiel

[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333437#comment-17333437
 ] 

Tilman Hausherr commented on PDFBOX-5169:
-

Yes, open a new ticket or comment in PDFBOX-2602. The cause is that PDFBox.java 
initializes the PDFDebugger class even if it isn't used.




[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333427#comment-17333427
 ] 

Jakov Vežić commented on PDFBOX-5169:
-

I'm not sure I understand the exception. I'm running PDFBox without a GUI on a 
headless server, which I assume is what the CLI is intended for anyway. It also 
needs to run as an automated task from PHP, so I can't do "export DISPLAY=:0.0". 
Is there a reason for this new requirement (2.0.23 doesn't require it, for 
example)? Maybe I should open a new ticket for this?




[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333418#comment-17333418
 ] 

Tilman Hausherr commented on PDFBOX-5169:
-

Re the exception, see here: 
https://stackoverflow.com/questions/662421/no-x11-display-variable-what-does-it-mean




[jira] [Resolved] (PDFBOX-5051) Slow rendering for specific PDF file

2021-04-27 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-5051.
-
  Assignee: Tilman Hausherr
Resolution: Fixed

Thanks [~Schmidor] !

> Slow rendering for specific PDF file
> 
>
> Key: PDFBOX-5051
> URL: https://issues.apache.org/jira/browse/PDFBOX-5051
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.21
> Environment: Amazon Corretto jdk11.0.3_7, OpenJDK 15
>Reporter: Stefan Mueller
>Assignee: Tilman Hausherr
>Priority: Major
> Fix For: 2.0.24, 3.0.0 PDFBox
>
> Attachments: slowpdfbox.pdf
>
>
> Rendering the attached PDF document is rather slow.
> It takes 18 seconds on a Core i7 machine with 32 GB of RAM and no memory 
> limit imposed on the JVM.
> {code:java}
> System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");
> long ms = System.currentTimeMillis();
> try (final PDDocument document = PDDocument.load(new File("C:\\temp\\slowpdfbox.pdf")))
> {
>     ms = System.currentTimeMillis() - ms;
>     System.out.println("Took " + ms + " milliseconds for loading");
>
>     PDFRenderer pdfRenderer = new PDFRenderer(document);
>     pdfRenderer.setSubsamplingAllowed(true);
>     for (int page = 0; page < 1; ++page)
>     {
>         ms = System.currentTimeMillis();
>         BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
>         ms = System.currentTimeMillis() - ms;
>         System.out.println("Took " + ms + " milliseconds for rendering");
>
>         String fileName = "c:\\temp\\test.jpg";
>         ImageIOUtil.writeImage(bim, fileName, 300); //<---this number
>     }
>     document.close();
> }
> catch (IOException e)
> {
>     System.err.println("Exception while reading pdf document - " + e);
> }
> {code}
> Console Output:
> Took 262 milliseconds for loading
> Dec 18, 2020 6:25:15 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased ensureDisplayProfile
> WARNING: ICC profile is Perceptual, ignoring, treating as Display class
> Took 17914 milliseconds for rendering






[jira] [Commented] (PDFBOX-5169) PDFMerger produces overly large output PDF

2021-04-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333407#comment-17333407
 ] 

Andreas Lehmkühler commented on PDFBOX-5169:


We changed the command line parameters; see 
https://pdfbox.apache.org/3.0/migration.html for some initial hints.

In your case:
{code}
Usage: pdfbox merge [-hV] -o= -i= [-i=]...
  -h, --help      Show this help message and exit.
  -i, --input=    the PDF files to merge.
  -o, --output=   the merged PDF file.
  -V, --version   Print version information and exit.
{code}

> PDFMerger produces overly large output PDF
> --
>
> Key: PDFBOX-5169
> URL: https://issues.apache.org/jira/browse/PDFBOX-5169
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.22, 2.0.23
> Environment: Debian 10
>Reporter: Jakov Vežić
>Priority: Minor
>
> Using PDFMerger to combine
> [https://www.dropbox.com/s/kprk7aeggni420c/1.pdf?dl=1]
> with
> [https://www.dropbox.com/s/0h8bced4tm3gppz/2.pdf?dl=1]
> results in an overly large file. The two input files are 1,25 MB and 16,3 MB 
> large, while the output file is just over 400 MB large. The action also 
> consumes about 1 GB of memory. No errors are produced during the merge that I 
> can tell.
> The command is:
> {code:java}
> java -Xmx2500M -jar pdfbox-app-2.0.23.jar PDFMerger 1.pdf 2.pdf output.pdf
> {code}






[jira] [Commented] (PDFBOX-2941) Improve PDFDebugger (2)

2021-04-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1783#comment-1783
 ] 

ASF subversion and git services commented on PDFBOX-2941:
-

Commit 1889242 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1889242 ]

PDFBOX-2941: check rotation input value; keep image type; improve java doc; 
support some negative values; use int instead of double

> Improve PDFDebugger (2)
> ---
>
> Key: PDFBOX-2941
> URL: https://issues.apache.org/jira/browse/PDFBOX-2941
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Priority: Major
> Attachments: gs-bugzilla694570.pdf, keep_zoom.diff, osx-tabs.png, 
> pdfdebugger-screenshot-reverted.png, pdfdebugger-screenshot-trunc.png, 
> screenshot_debugger_new.png, screenshot_debugger_not_aligned.png, 
> screenshot_debugger_old.png, screenshot_w7_fontsize.png, 
> separate_filter_choice_from_text_hex_views.diff, sonar_qube_resolve.diff, 
> sonar_qube_resolve_25_08.diff
>
>
> This is a follow-up issue to PDFBOX-2530 to implement extra ideas that came 
> up in GSoC2015, ideas that were not implemented due to lack of time, and new 
> ideas.
> *Viewing*
>  - refactor PDFDebugger.java
>  - ✓ render glyphs of fonts
>  - ✓ refactor StreamPane to share stream filtering among Text view and hex 
> view
>  - ✓ password dialog when hitting protected PDF
>  - show "pretty" XML
>  - display filtered streams even if the unfiltered stream is corrupt 
> (PDFBOX-2976)
>  - ✓ display the "caused by" part exception stack trace (nested exceptions)
>  - ✓ keep zoom
>  - ✓ integrate DrawPrintTextLocations into rendering
>  - integrate area text extraction with a mouse-created rectangle that shows 
> the coordinates in a status line
>  - ✓ show permission flags of {{Encrypt/P}} entry
>  - ✓ show signature flags of {{Root/AcroForm/SigFlags}} entry, see Table 219 
> in PDF spec
>  - ✓ show page labels additional to page number (see file from TIKA-2121 as 
> example)
>  - ✓ "reopen" menu item (useful when editing an existing PDF to create a 
> reduced PDF)
>  - choose zoom automatically so that PDF page can be seen in full
> *Editing*
>  - save modified PDFs
>  - editing in hex viewer
>  - remove nodes (e.g. elements from a COSDictionary)
>  - delete array or dictionary elements
>  - load content streams
>  - edit & keep content streams






[jira] [Commented] (PDFBOX-2941) Improve PDFDebugger (2)

2021-04-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1785#comment-1785
 ] 

ASF subversion and git services commented on PDFBOX-2941:
-

Commit 1889243 from Tilman Hausherr in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1889243 ]

PDFBOX-2941: check rotation input value; keep image type; improve java doc; 
support some negative values; use int instead of double







[jira] [Closed] (PDFBOX-5179) PDFBox PrintImageLocations: Extracts images in wrong orientation

2021-04-27 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-5179.
---
Resolution: Not A Bug

OK, good to hear that. You also helped discover two bugs, which I'll fix in 
PDFBOX-2941.

> PDFBox PrintImageLocations: Extracts images in wrong orientation
> 
>
> Key: PDFBOX-5179
> URL: https://issues.apache.org/jira/browse/PDFBOX-5179
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.21
> Environment: Windows 10, Tomcat 9, jdk 1.8_0161
>Reporter: Michael Carell
>Priority: Major
> Attachments: 0137d7db-7ca3-4573-a73a-1386699d922c.jpeg, 
> Testdokument.pdf, org_0137d7db-7ca3-4573-a73a-1386699d922c.jpeg
>
>
> The PrintImageLocation class writes images in the original and rotated 
> format. I used the algorithm in an image processing application. I found some 
> PDF documents where the extracted pictures have the wrong orientation. I have 
> built a document to test that case.
> How to reproduce:
> I have constructed an MS Word document that contains 4 images with different 
> rotation levels (0°, 90°, 180°, 270°). I exported the document to PDF format. 
> If you call PrintImageLocation with that PDF, the rotated pictures within the 
> document are displayed in the right order, and the "rotate" images have the 
> orientation from the document.






[GitHub] [pdfbox] THausherr commented on pull request #107: potential memory leaks and small performance improvements

2021-04-27 Thread GitBox


THausherr commented on pull request #107:
URL: https://github.com/apache/pdfbox/pull/107#issuecomment-827769059


   I don't know, and I prefer not to test this, because in the worst case we 
would either not clone enough or have duplicates. This is a very sensitive part 
of the code; I had a hard time around Christmas 2018 fixing bugs. The structure 
of a PDF isn't a pure tree; there are a lot of references going around, e.g. 
between the structure tree and the pages.
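
   To illustrate why such bookkeeping is hard to clear part-way through, here 
is a minimal, self-contained sketch of cloning an object graph with shared and 
cyclic references. It is not the PDFCloneUtility code: the GraphCloner and Node 
names are hypothetical, and the identity map merely stands in for the idea of 
remembering what has already been cloned.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a graph cloner; hypothetical, NOT the real PDFCloneUtility. */
final class GraphCloner {

    /** Hypothetical stand-in for a PDF object that can reference other objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Maps every original node to its clone for the whole clone operation.
    private final Map<Node, Node> cloned = new IdentityHashMap<>();

    Node cloneGraph(Node original) {
        Node existing = cloned.get(original);
        if (existing != null) {
            return existing; // already cloned: reuse it so shared references stay shared
        }
        Node copy = new Node(original.name);
        cloned.put(original, copy); // register before recursing so cycles terminate
        for (Node ref : original.refs) {
            copy.refs.add(cloneGraph(ref));
        }
        return copy;
    }

    public static void main(String[] args) {
        // A page and a structure element share one object, and the structure
        // element points back to the page, so the graph is not a tree.
        Node page = new Node("page");
        Node struct = new Node("structElem");
        Node shared = new Node("shared");
        page.refs.add(shared);
        page.refs.add(struct);
        struct.refs.add(shared);
        struct.refs.add(page);

        Node pageCopy = new GraphCloner().cloneGraph(page);
        Node structCopy = pageCopy.refs.get(1);
        System.out.println(pageCopy.refs.get(0) == structCopy.refs.get(0)); // true
        System.out.println(structCopy.refs.get(1) == pageCopy);             // true
    }
}
```

   In this sketch, dropping an entry from the map too early would clone the 
shared object a second time (the "duplicates" case) or recurse forever on the 
back reference.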





[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements

2021-04-27 Thread GitBox


valerybokov commented on pull request #107:
URL: https://github.com/apache/pdfbox/pull/107#issuecomment-827584201


   Hi, @THausherr! I looked into the PDFCloneUtility code and noticed one detail. 
This class has two collection fields (cloneVersion and clonedValues). As far as 
I can tell, these collections are never cleared. Is all their content really 
needed for cloning, or could some items be cleared at a specific stack level, 
before some method, or after an item has been retrieved from the collection?





[jira] [Created] (PDFBOX-5180) Snapshot Deploy not working

2021-04-27 Thread Oliver Schmidtmer (Jira)
Oliver Schmidtmer created PDFBOX-5180:
-

 Summary: Snapshot Deploy not working
 Key: PDFBOX-5180
 URL: https://issues.apache.org/jira/browse/PDFBOX-5180
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 2.0.24
Reporter: Oliver Schmidtmer


When building and deploying snapshots, the build number recorded in the metadata 
is higher than that of the last uploaded snapshot, so the dependency resolution 
for 2.0.24-SNAPSHOT does not work.

It seems the deploy plugin is triggered twice, but artifacts are uploaded only 
once. The other subprojects are also affected.

[https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.24-SNAPSHOT/maven-metadata.xml]
{code:java}

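<metadata>
  <groupId>org.apache.pdfbox</groupId>
  <artifactId>pdfbox</artifactId>
  <version>2.0.24-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20210426.180431</timestamp>
      <buildNumber>154</buildNumber>
    </snapshot>
    <lastUpdated>20210426180431</lastUpdated>
    <snapshotVersions>
      <snapshotVersion>
        <extension>jar</extension>
        <value>2.0.24-20210426.180213-153</value>
        <updated>20210426180431</updated>
      </snapshotVersion>
      <snapshotVersion>
        <extension>pom</extension>
        <value>2.0.24-20210426.180213-153</value>
        <updated>20210426180431</updated>
      </snapshotVersion>
    </snapshotVersions>
  </versioning>
</metadata>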
{code}
In trunk / 3.0 this doesn't happen:
 
[https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/3.0.0-SNAPSHOT/maven-metadata.xml]
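
The mismatch can be checked mechanically. The following is only a sketch: it 
assumes the standard Maven repository metadata element names (buildNumber and 
snapshotVersion/value), the class and method names are hypothetical, and the 
sample XML is a shortened copy of the values quoted above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: compares buildNumber with the snapshotVersion values. */
final class SnapshotMetadataCheck {

    /** Returns true if every <value> ends with "-" + <buildNumber>. */
    static boolean isConsistent(String metadataXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(metadataXml.getBytes(StandardCharsets.UTF_8)));
            String buildNumber =
                    doc.getElementsByTagName("buildNumber").item(0).getTextContent().trim();
            NodeList values = doc.getElementsByTagName("value");
            for (int i = 0; i < values.getLength(); i++) {
                String value = values.item(i).getTextContent().trim();
                if (!value.endsWith("-" + buildNumber)) {
                    return false; // metadata announces a build that was never uploaded
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read metadata", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<metadata><versioning>"
                + "<snapshot><timestamp>20210426.180431</timestamp>"
                + "<buildNumber>154</buildNumber></snapshot>"
                + "<snapshotVersions><snapshotVersion><extension>jar</extension>"
                + "<value>2.0.24-20210426.180213-153</value></snapshotVersion>"
                + "</snapshotVersions></versioning></metadata>";
        System.out.println(isConsistent(xml)); // false: build 154 vs. uploaded -153
    }
}
```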






[jira] [Commented] (PDFBOX-5178) Parsing differences between 2.0.23 and 3.0

2021-04-27 Thread Michael Klink (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333066#comment-17333066
 ] 

Michael Klink commented on PDFBOX-5178:
---

:) Yeah, there always are surprises in that "specification"...

Beware, though: a 0 is allowed as the third entry of *W*, and quite a number of 
documents make use of that. If that were the case here, I wouldn't know of a 
mechanism to find out which of those objects 14 is currently the correct 
one.
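
For illustration, here is a small sketch of how a cross-reference stream row is 
decoded when a *W* entry is 0. It follows my reading of the PDF specification 
(a zero width means the field is absent and its default applies: type 1 for the 
first field, generation 0 for the third field of a type 1 entry). The class and 
method names are made up, and the bytes are sample data, not taken from the 
attached file.

```java
import java.util.Arrays;

/** Hypothetical decoder for one row of a cross-reference stream. */
final class XrefRowSketch {

    /** Reads `width` bytes starting at `pos` as a big-endian unsigned number. */
    private static long readField(byte[] data, int pos, int width) {
        long value = 0;
        for (int i = 0; i < width; i++) {
            value = (value << 8) | (data[pos + i] & 0xFF);
        }
        return value;
    }

    /** Returns {type, field2, field3} for the given row, applying defaults for width 0. */
    static long[] decode(byte[] data, int rowIndex, int[] w) {
        int pos = rowIndex * (w[0] + w[1] + w[2]);
        long type = (w[0] == 0) ? 1 : readField(data, pos, w[0]); // default type is 1
        pos += w[0];
        long field2 = readField(data, pos, w[1]);
        pos += w[1];
        long field3 = (w[2] == 0) ? 0 : readField(data, pos, w[2]); // default generation 0
        return new long[] {type, field2, field3};
    }

    public static void main(String[] args) {
        int[] w = {1, 2, 0}; // third entry of W is 0: no generation bytes in the rows
        byte[] data = {0x01, 0x00, 0x10, 0x01, 0x01, 0x00};
        System.out.println(Arrays.toString(decode(data, 0, w))); // [1, 16, 0]
        System.out.println(Arrays.toString(decode(data, 1, w))); // [1, 256, 0]
    }
}
```

With such a *W*, every in-use entry ends up with generation 0, so the 
generation number cannot be used to tell several objects 14 apart.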

> Parsing differences between 2.0.23 and 3.0
> --
>
> Key: PDFBOX-5178
> URL: https://issues.apache.org/jira/browse/PDFBOX-5178
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing
>Affects Versions: 2.0.23, 3.0.0 PDFBox
>Reporter: Tilman Hausherr
>Assignee: Andreas Lehmkühler
>Priority: Major
> Attachments: poppler-704-0.pdf
>
>
> There are some weird differences in parsing the attached file: 2.0.23 shows 
> "BigTIFF.tif" in the /Contents of the first annotation and a loop at 
> Root/Pages/Kids/[0]/Annots/[0]/FS (always 14 0 R), while 3.0 doesn't have 
> the loop but also doesn't show "BigTIFF.tif". I'm not sure which one (if any) 
> is wrong.






[jira] [Commented] (PDFBOX-5179) PDFBox PrintImageLocations: Extracts images in wrong orientation

2021-04-27 Thread Michael Carell (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333049#comment-17333049
 ] 

Michael Carell commented on PDFBOX-5179:


One additional comment. The solution for the bug in my program was to take the 
page rotation into account:

int pageRotation = (int) page.getRotation();
Matrix ctmNew = getGraphicsState().getCurrentTransformationMatrix();
int objectRotation = (int) Math.round(Math.toDegrees(Math.atan2(ctmNew.getShearY(), ctmNew.getScaleY())));
int trueRotation = objectRotation - pageRotation;

If pageRotation is 90° and the image rotation is 90°, the true rotation is 0°. 
In this case the image is extracted as displayed.
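
The same calculation can be tried without PDFBox. This is only a sketch under 
assumptions: java.awt.geom.AffineTransform stands in for the PDFBox Matrix, the 
class and method names are hypothetical, and normalizing the result to 0..359 
with Math.floorMod is my addition. The atan2 formula is exact for multiples of 
90° or uniform scaling; I have not checked other cases.

```java
import java.awt.geom.AffineTransform;

/** Hypothetical sketch: rotation of an object relative to a rotated page. */
final class TrueRotationSketch {

    /** Rotation angle in whole degrees, derived from the shear/scale entries. */
    static int rotationDegrees(AffineTransform ctm) {
        return (int) Math.round(
                Math.toDegrees(Math.atan2(ctm.getShearY(), ctm.getScaleY())));
    }

    /** Object rotation minus page rotation, normalized to the range 0..359. */
    static int trueRotation(int objectRotation, int pageRotation) {
        return Math.floorMod(objectRotation - pageRotation, 360);
    }

    public static void main(String[] args) {
        // Image rotated by 90 degrees on a page that is itself rotated by 90 degrees.
        AffineTransform ctm = AffineTransform.getRotateInstance(Math.toRadians(90));
        int objectRotation = rotationDegrees(ctm);
        System.out.println(objectRotation);                   // 90
        System.out.println(trueRotation(objectRotation, 90)); // 0
    }
}
```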







[jira] [Commented] (PDFBOX-5179) PDFBox PrintImageLocations: Extracts images in wrong orientation

2021-04-27 Thread Michael Carell (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333048#comment-17333048
 ] 

Michael Carell commented on PDFBOX-5179:


Hello, sorry for any inconvenience,

I had a knot in the brain.

The algorithm works fine. My test case is wrong. The background is that I 
tried to reproduce a problem that I had with an image in a landscape-oriented 
PDF document. I tried to create a test document that contains images 
in different locations. I rotated the images within the document and expected 
that my program would be able to find out the angle by which I rotated them. My 
mistake. 

 



