[jira] [Commented] (PDFBOX-1792) Different metadata extracted with NonSequentialPDFParser vs classic parser on some documents

JIRA Tue, 10 Dec 2013 01:06:03 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844104#comment-13844104
 ]


Andreas Lehmkühler commented on PDFBOX-1792:
--------------------------------------------

The testcase you are talking about wasn't there in the first place. You added 
it when "disabling" it. Have a look at revision 1458423, from before your check-in:

http://svn.apache.org/viewvc/pdfbox/branches/1.8/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/TestPDDocumentInformation.java?revision=1458423&view=markup

The issue exists only in your local environment. Otherwise the Jenkins build 
would have failed, but it didn't.

IMO you should revert your changes. Once the issue with the other PDF and 
its parsing is solved, we should (re)add the testcase and the sample PDF as 
well. But let's do that in the trunk first.

> Different metadata extracted with NonSequentialPDFParser vs classic parser on 
> some documents
> --------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-1792
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1792
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.8.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: PDFBOX-1792.tar.gz, testPDF_acroForm2.pdf
>
>
> The traditional parser is able to extract metadata from a test document from 
> TIKA-738.  The NonSequentialPDFParser is not able to extract metadata from 
> that file.  Another file from the Tika test suite has metadata that can be 
> extracted by the NonSequentialPDFParser but not by classic. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
