Hi,

While trying to implement the parent tree for a structure tree root (PDF 32000-1:2008, section 14.7.2) using org.apache.pdfbox.pdmodel.common.PDNumberTreeNode, I found that the current implementation of PDNumberTreeNode rests on the following assumption: all values in the number tree have the same COS type and represent the same PD type (the valueType member variable). However, this does not hold for the ParentTree.

In a ParentTree number tree (see section 14.7.4.4) each value can be either "an indirect reference to the parent structure element" (-> COSDictionary) or "an array of indirect references to the sequences’ parent structure elements" (-> COSArray). So the values cannot be mapped to a single PD type (it would have to be either PDStructureElement or PDStructureElement[], depending on the entry).

I therefore suggest a different approach:

- A number tree node on the COS level:

class COSNumberTreeNode<E extends COSBase>
{
  public List<COSNumberTreeNode<E>> getKids();

  public void setKids(List<COSNumberTreeNode<E>> kids);

  public Integer getUpperLimit();

  public Integer getLowerLimit();

  protected void setUpperLimit(Integer upper);

  protected void setLowerLimit(Integer lower);

  // get COS object value
  protected E getValue(int index);

  protected void setValue(int index, E value);
}
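
To make the outline above a bit more concrete, here is a minimal, self-contained sketch of how such a generic node might behave. It is not PDFBox code: CosBaseStub and CosDictionaryStub are hypothetical stand-ins for COSBase and COSDictionary, NumberTreeNodeSketch is an invented name, and the node keeps its entries in memory instead of reading the Nums, Kids and Limits entries of a real dictionary. Unlike the PDF spec, the sketch does not enforce that a node has either values or kids but not both.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-ins for the PDFBox COS types, so the sketch compiles on its own.
class CosBaseStub {}
class CosDictionaryStub extends CosBaseStub {}

// Assumed in-memory shape of the proposed node: leaf entries plus optional kids.
class NumberTreeNodeSketch<E extends CosBaseStub> {
    private final SortedMap<Integer, E> numbers = new TreeMap<>();
    private List<NumberTreeNodeSketch<E>> kids = new ArrayList<>();

    List<NumberTreeNodeSketch<E>> getKids() {
        return kids;
    }

    void setKids(List<NumberTreeNodeSketch<E>> kids) {
        this.kids = new ArrayList<>(kids);
    }

    // Smallest key stored in this node or any of its kids, or null if empty.
    Integer getLowerLimit() {
        Integer lower = numbers.isEmpty() ? null : numbers.firstKey();
        for (NumberTreeNodeSketch<E> kid : kids) {
            Integer candidate = kid.getLowerLimit();
            if (candidate != null && (lower == null || candidate < lower)) {
                lower = candidate;
            }
        }
        return lower;
    }

    // Largest key stored in this node or any of its kids, or null if empty.
    Integer getUpperLimit() {
        Integer upper = numbers.isEmpty() ? null : numbers.lastKey();
        for (NumberTreeNodeSketch<E> kid : kids) {
            Integer candidate = kid.getUpperLimit();
            if (candidate != null && (upper == null || candidate > upper)) {
                upper = candidate;
            }
        }
        return upper;
    }

    // Returns the raw COS-level value for the key, or null if there is none.
    E getValue(int index) {
        E value = numbers.get(index);
        if (value != null) {
            return value;
        }
        for (NumberTreeNodeSketch<E> kid : kids) {
            Integer lower = kid.getLowerLimit();
            Integer upper = kid.getUpperLimit();
            // Only descend into kids whose limits cover the key.
            if (lower != null && lower <= index && index <= upper) {
                E found = kid.getValue(index);
                if (found != null) {
                    return found;
                }
            }
        }
        return null;
    }

    void setValue(int index, E value) {
        numbers.put(index, value);
    }
}
```

The point of the type parameter is that a subclass can narrow E when all values share one COS type (as for page labels), or leave it at the base type when they do not (as for the ParentTree).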


- A number tree node on the PD level with specific getters/setters:

class PDPageLabelTreeNode extends COSNumberTreeNode<COSDictionary>
{
  // using getValue(int) return value of type COSDictionary
  // create PDPageLabelRange from return value
  public PDPageLabelRange getPageLabelRange(int startPage);

  // using setValue(int, COSDictionary)
  public void setLabelItem(int startPage, PDPageLabelRange item);

  ...
}

or

class PDParentTreeNode2 extends COSNumberTreeNode<COSBase>
{
  // StructParents
  // using getValue(int), if return value is COSArray
  // select item from COSArray (index: mcid)
  // create PDStructureElement
  public PDStructureElement getStructParent(PDPage page, int mcid);

  // StructParent
  // using getValue(int), if return value is COSDictionary
  // create PDStructureElement
  public PDStructureElement getStructParent(PDAnnotation annotation);

  ...
}
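
The two getStructParent variants above boil down to a check on the runtime type of the stored value. A rough sketch of that dispatch, under simplifying assumptions: SketchBase, SketchDictionary and SketchArray are hypothetical stand-ins for COSBase, COSDictionary and COSArray, the tree is flattened to a plain map, and the methods take the StructParents / StructParent keys directly as ints instead of a PDPage or PDAnnotation, and return the dictionary rather than a PDStructureElement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for COSBase, COSDictionary and COSArray.
class SketchBase {}

class SketchDictionary extends SketchBase {
    final String name;

    SketchDictionary(String name) {
        this.name = name;
    }
}

class SketchArray extends SketchBase {
    final List<SketchBase> items = new ArrayList<>();
}

// Assumed, simplified parent tree: a flat map instead of a real number tree.
class ParentTreeSketch {
    private final Map<Integer, SketchBase> values = new HashMap<>();

    void setValue(int key, SketchBase value) {
        values.put(key, value);
    }

    // Page case: the key is the page's StructParents entry and the value is
    // expected to be an array indexed by marked-content ID (MCID).
    SketchDictionary getStructParent(int structParents, int mcid) {
        SketchBase value = values.get(structParents);
        if (!(value instanceof SketchArray)) {
            return null;
        }
        List<SketchBase> items = ((SketchArray) value).items;
        if (mcid < 0 || mcid >= items.size()) {
            return null;
        }
        SketchBase item = items.get(mcid);
        return item instanceof SketchDictionary ? (SketchDictionary) item : null;
    }

    // Annotation case: the key is the StructParent entry and the value is
    // expected to be a single dictionary.
    SketchDictionary getStructParent(int structParent) {
        SketchBase value = values.get(structParent);
        return value instanceof SketchDictionary ? (SketchDictionary) value : null;
    }
}
```

Returning null on a type mismatch is only one possible choice here; a real implementation might prefer to throw or log when the value found does not match the kind of key used.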

--
Johannes Koch
Fraunhofer Institute for Applied Information Technology FIT
Web Compliance Center
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628    Fax: +49-2241-142065
