[ https://issues.apache.org/jira/browse/FOP-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Gibson  updated FOP-3271:
------------------------------
    Description: 
We have PDF images that were exported directly from Excel (with accessibility 
enabled).  When FOP renders accessible PDF output, these images are missing 
from the final PDF.

FOP logs show an index out of bounds exception:
{code:java}
Caused by: java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 6
        at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
        at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
        at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
        at java.base/java.util.Objects.checkIndex(Objects.java:385)
        at java.base/java.util.ArrayList.get(ArrayList.java:427)
        at org.apache.fop.pdf.PDFStructElem.addKidInSpecificOrder(PDFStructElem.java:208)
        at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.createParents(StructureTreeMerger.java:209)
        at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.createParents(StructureTreeMerger.java:154)
        at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.copyStructure(StructureTreeMerger.java:89)
        at org.apache.fop.render.pdf.pdfbox.TaggedPDFConductor.handleLogicalStructure(TaggedPDFConductor.java:68)
        at org.apache.fop.render.pdf.pdfbox.AbstractPDFBoxHandler.createStreamForPDF(AbstractPDFBoxHandler.java:114)
        at org.apache.fop.render.pdf.pdfbox.PDFBoxImageHandler.handleImage(PDFBoxImageHandler.java:77)
        ... 62 more
{code}
 

This happens because the following method returns -1:
{code:java}
public final class StructureTreeMergerUtil {

    public static int findObjectPositionInKidsArray(COSObject kidObj) {
        COSDictionary kid = (COSDictionary) kidObj.getObject();
        COSObject parentObj = (COSObject) kid.getItem(COSName.P);
        COSDictionary parent = (COSDictionary) parentObj.getObject();
        COSBase kids = parent.getItem(COSName.K);
        if (kids instanceof COSArray) {
            COSArray kidsArray = (COSArray) kids;
            // indexOfObject returns -1 when kid is not in the parent's kids array
            return kidsArray.indexOfObject(kid);
        } else {
            return 0;
        }
    }
}
{code}
It turns out that the PDF images exported from Excel contain structure 
records that are not listed among the children of their own parent.  This can 
be seen in the attached images.  The affected records are always "Artifact" 
records (although I'm not sure whether every Artifact record is broken in 
this way).
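
To illustrate the failure mode without needing PDFBox on the classpath, here 
is a minimal stand-alone sketch in plain Java.  It assumes that a list lookup 
behaves like the kids-array lookup above, i.e. it yields -1 for a missing 
element.  The class name MissingKidDemo, the method names, and the tag strings 
are made up for this example and are not FOP or PDFBox code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the structure-tree lookup; not FOP or PDFBox code.
public class MissingKidDemo {

    // Models the position lookup: List.indexOf returns -1 when kid is absent.
    public static int positionInKids(List<String> parentKids, String kid) {
        return parentKids.indexOf(kid);
    }

    // Returns true when using the looked-up position as a list index throws.
    public static boolean lookupFails(List<String> parentKids, String kid) {
        int position = positionInKids(parentKids, kid);
        try {
            parentKids.get(position);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Six kids, none of which is the "Artifact" record that names this parent.
        List<String> kids =
                new ArrayList<>(List.of("P", "Figure", "Table", "P", "P", "P"));
        System.out.println("position of Artifact: " + positionInKids(kids, "Artifact"));
        System.out.println("lookup fails for Artifact: " + lookupFails(kids, "Artifact"));
        System.out.println("lookup fails for Figure: " + lookupFails(kids, "Figure"));
    }
}
```

In the real stack trace the -1 is consumed by ArrayList.get inside 
PDFStructElem.addKidInSpecificOrder; the sketch uses a single list for both 
steps purely to keep it short.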

I've also attached a reproduction consisting of a simple FO file and two PDF 
images that exhibit this issue.  The command to run it is:
{code:java}
fop.bat -a -fo test.fo -pdf test.pdf {code}
 

My knowledge of the PDF spec is limited, so I'm currently unsure whether 
Excel is producing broken PDFs, or whether there is a bug in FOP's pdf-image 
handling when it copies the structure tree of externally imported PDF images.

Hoping someone can shed some light here.

Maybe the fix would be as simple as returning 0 instead of -1 from the above 
method?
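
As a rough sketch of what such a guard could look like, the plain-Java 
example below shows two options: falling back to a default position (the 
"return 0" idea), or reporting the absence so the caller can decide to skip 
the record.  The class GuardedKidLookup and both method names are hypothetical 
and use java.util.List as a stand-in for the kids array; this is not a patch 
against FOP, and whether position 0 is semantically correct for the merged 
structure tree is something the sketch cannot answer.

```java
import java.util.List;
import java.util.OptionalInt;

// Hypothetical illustration of guarding a "not found" index; not FOP code.
public class GuardedKidLookup {

    // Option A: signal absence explicitly instead of leaking -1 to callers.
    public static OptionalInt findPosition(List<String> parentKids, String kid) {
        int index = parentKids.indexOf(kid);
        return index >= 0 ? OptionalInt.of(index) : OptionalInt.empty();
    }

    // Option B: substitute a fallback position when the kid is not listed.
    public static int findPositionOrDefault(List<String> parentKids, String kid,
                                            int fallback) {
        return findPosition(parentKids, kid).orElse(fallback);
    }

    public static void main(String[] args) {
        List<String> kids = List.of("P", "Figure", "Table");
        System.out.println(findPosition(kids, "Artifact").isPresent());
        System.out.println(findPositionOrDefault(kids, "Artifact", 0));
        System.out.println(findPositionOrDefault(kids, "Table", 0));
    }
}
```

Option A makes the missing-kid case visible at the call site, while Option B 
only avoids the exception and may place the record at a position it never had 
in the source PDF.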


> pdf-images: Fail to render accessible pdf image in accessible PDF output when 
> "Artifact" elements present in image
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: FOP-3271
>                 URL: https://issues.apache.org/jira/browse/FOP-3271
>             Project: FOP
>          Issue Type: Bug
>          Components: renderer/pdf
>    Affects Versions: 2.10, 2.11
>            Reporter: Mark Gibson 
>            Priority: Major
>         Attachments: image1-artifactNotInParentsChildren.png, 
> image2-artifactNotInParentsChildren.png
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
