[ 
https://issues.apache.org/jira/browse/PDFBOX-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-5623:
------------------------------------
    Description: 
We have an online service where our customers post their PDF files so that we 
can render them. 

One of our customers recently noticed that one of their signed documents did 
not show the image associated with the signature. They gave me permission to 
share this document, which is attached 
([^PDFBOX-issue-rendering-signature.pdf]).

The problem is on the last page, page 9. The issue can easily be reproduced 
using pdfbox-app-2.0*.jar PDFToImage.

Result with pdfbox 2.0.22 is:

!pdfbox22-page9-br.jpg!

Result with pdfbox 2.0.23 or later is:

!pdfbox23-page9-br.jpg!

The regression was introduced with commit (seen in git) 
[f34a33824c4363b9b683245cb582328dc92b79ca|https://github.com/apache/pdfbox/commit/f34a33824c4363b9b683245cb582328dc92b79ca],
 dated 2021-03-02 07:12:11+0000. The associated ticket was PDFBOX-5112.

The issue is in PDFXrefStreamParser's ObjectNumbers constructor, which assumes 
that the COSInteger objects in the COSArray are always sorted. In the attached 
PDF they are not, and this causes the parser to stop iterating over the array 
too soon.
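
To illustrate the idea, here is a minimal, self-contained sketch. It is not the 
PDFBox code and not the attached patch. It assumes the COSArray in question is 
the /Index array of a cross-reference stream, i.e. a flat list of 
(first object number, count) pairs. The class UnsortedIndexRanges, the method 
expand and the sample values are hypothetical. The point is that every pair is 
visited in the order it appears, with no end condition derived from the last 
pair only:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox.
class UnsortedIndexRanges
{
    /**
     * Expands a flat array of (first object number, count) pairs into the
     * object numbers it describes, in the order the pairs appear.
     * The pairs are not required to be sorted.
     */
    static List<Long> expand(long[] index)
    {
        if (index.length % 2 != 0)
        {
            throw new IllegalArgumentException("expected (first, count) pairs");
        }
        List<Long> numbers = new ArrayList<>();
        for (int i = 0; i < index.length; i += 2)
        {
            long first = index[i];
            long count = index[i + 1];
            if (first < 0 || count < 0)
            {
                throw new IllegalArgumentException("negative value in index array");
            }
            // Walk this range completely before moving on to the next pair,
            // regardless of whether the next pair starts lower or higher.
            for (long n = first; n < first + count; n++)
            {
                numbers.add(n);
            }
        }
        return numbers;
    }

    public static void main(String[] args)
    {
        // Made-up, deliberately unsorted ranges.
        long[] index = { 6789, 3, 5327, 1, 6485, 1 };
        System.out.println(expand(index));
    }
}
```

An iterator that stops as soon as the current number reaches the end of the 
last range would, with input like the above, stop before visiting all of the 
ranges. Whether that matches the exact failure mode in ObjectNumbers is an 
assumption on my side.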

I have a patch for that on branch 2.0: 
[^Fixing_the_problem_when_the_COSArray_is_not_sorted_in_increasing_order_.patch]

With this patch the image is rendered successfully. However, warnings appear 
that did not occur in version 2.0.22:
{noformat}
Jun 16, 2023 5:18:29 PM org.apache.pdfbox.pdfparser.COSParser findObjectKey
WARNING: found wrong object number. expected [6789] found [6791]
Jun 16, 2023 5:18:29 PM org.apache.pdfbox.pdfparser.COSParser findObjectKey
WARNING: found wrong object number. expected [6790] found [5327]
Jun 16, 2023 5:18:29 PM org.apache.pdfbox.pdfparser.COSParser findObjectKey
WARNING: found wrong object number. expected [6791] found [6485]
Jun 16, 2023 5:18:29 PM org.apache.pdfbox.pdfparser.COSParser findObjectKey
WARNING: found wrong object number. expected [6485] found [6789]
{noformat}
There may be additional fixes needed to fully support this PDF. I did not have 
time to investigate, and my knowledge of the codebase is fairly limited, so 
help would be appreciated here.

Thanks.


> Signature Image not Rendered starting with PDFBox 2.0.23 + patch provided
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-5623
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5623
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.23, 2.0.24, 2.0.25, 2.0.26, 2.0.27, 2.0.28
>         Environment: Java 8, Windows 10 and Ubuntu 22
>            Reporter: Lionel Fradin
>            Priority: Major
>         Attachments: 
> Fixing_the_problem_when_the_COSArray_is_not_sorted_in_increasing_order_.patch,
>  PDFBOX-issue-rendering-signature.pdf, pdfbox22-page9-br.jpg, 
> pdfbox23-page9-br.jpg
>


