[ 
https://issues.apache.org/jira/browse/PDFBOX-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627684#comment-16627684
 ] 

Tilman Hausherr commented on PDFBOX-4323:
-----------------------------------------

I ran this code:
{code:java}
        PDDocument doc = PDDocument.load(new URL("https://issues.apache.org/jira/secure/attachment/12941226/fda-form-356h-Scrubbed.pdf").openStream());
        PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
        for (PDField field : acroForm.getFieldTree())
        {
            List<PDAnnotationWidget> widgets = field.getWidgets();
            PDAnnotationWidget widget = widgets.get(0);
            if (widget != null)
            {
                int pageNo = doc.getPages().indexOf(widget.getPage());
                if (pageNo < 0)
                {
                    System.out.println(field.getFullyQualifiedName());
                }
            }
        }{code}
and got many results, e.g.

db_ind_rare_disease_desg_10_y
db_ind_rare_disease_desg_10_n
db_ind_rare_disease_desg_10

I could not find these fields on any page. On page 4 they only go up to 9. The 
field can be found at {{Root/AcroForm/Fields/[176]}}. Its /P entry does not 
point to a page but to a "Template", so that doesn't count. At the bottom of 
page 4 I found "Add Second Continuation Page for #15", so I assume this is some 
sort of dynamic PDF, i.e. one that adds pages.

So I added this line:
{code:java}
System.out.println(widget.getPage().getCOSObject().getItem(COSName.TYPE));{code}
It turned out that every widget with a page index < 0 references a template, 
not a page.
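
As a rough, library-free sketch of that check: plain maps stand in for the COS 
dictionaries, and the class and method names ({{WidgetPageSketch}}, {{dict}}, 
{{pageIndexOf}}, {{describe}}) are made up for illustration and are not PDFBox 
API. It only models the idea that a /P target is either one of the objects in 
the page tree or something else, whose /Type then tells what it is.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Library-free model of the check; none of these names are PDFBox API.
final class WidgetPageSketch {

    // Hypothetical stand-in for a COS dictionary that carries only a /Type entry.
    static Map<String, String> dict(String type) {
        Map<String, String> d = new HashMap<>();
        d.put("Type", type);
        return d;
    }

    // 0-based index of target in the page tree, or -1 if it is not one of the pages.
    // Identity comparison is an assumption of this model: two distinct objects
    // with equal contents still count as different pages.
    static int pageIndexOf(List<Map<String, String>> pageTree, Map<String, String> target) {
        for (int i = 0; i < pageTree.size(); i++) {
            if (pageTree.get(i) == target) {
                return i;
            }
        }
        return -1;
    }

    // Human-readable result for one widget's /P target (null models a missing /P).
    static String describe(List<Map<String, String>> pageTree, Map<String, String> target) {
        if (target == null) {
            return "no /P entry";
        }
        int index = pageIndexOf(pageTree, target);
        if (index >= 0) {
            return "page " + (index + 1);
        }
        return "not in page tree, /Type = " + target.get("Type");
    }

    public static void main(String[] args) {
        Map<String, String> page1 = dict("Page");
        Map<String, String> page2 = dict("Page");
        Map<String, String> template = dict("Template");

        List<Map<String, String>> pageTree = new ArrayList<>();
        pageTree.add(page1);
        pageTree.add(page2);

        System.out.println(describe(pageTree, page2));    // widget on a real page
        System.out.println(describe(pageTree, template)); // widget whose /P is a template
        System.out.println(describe(pageTree, null));     // widget without /P
    }
}
```

In this model a widget is only reported with a page number when its /P target 
is in the page tree; everything else is reported with its /Type instead of a 
bare -1.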

> Not able to determine the page (page number) of the some form fields
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-4323
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4323
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, PDModel
>    Affects Versions: 2.0.2
>            Reporter: Amit Maheshwari
>            Priority: Major
>         Attachments: fda-form-356h-Scrubbed.pdf
>
>
> I am not able to determine the page number of some form fields (especially 
> on pages 4 and 5 of the attached PDF).
> How I'm trying to get page number:
>  # First I get list of all pages (as in 'PDPageTree') of pdf using 
> 'pdDocumentCatalog.GetPages()'
>  # Then I get 'PDAcroForm' for the same pdf using 'getAcroForm()' method
>  # Then I get list of all Fields (as in 'PDFieldTree') from previously got 
> AcroForm
> I use all this information in the following code to get the page number:
>  
> var widgets = field.getWidgets();
>  var widget = (widgets.toArray()[0] as PDAnnotationWidget);
>  if (widget != null)
>  {
>         int pageNo = pages.indexOf(widget.getPage());
>  }
>  
> There is no error; I just get pageNo = -1 for some fields, because the list 
> of pages doesn't contain the page returned by 'widget.getPage()'.
>  
> Let me know if more clarification is required.



