[ 
https://issues.apache.org/jira/browse/TIKA-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204817#comment-14204817
 ] 

Tim Allison commented on TIKA-1467:
-----------------------------------

Thank you, [~tilman] and [~lehmi].  In r1637868, I've made the move and added 
two tests.  Until we upgrade to PDFBox 1.8.8 this won't work if the user is 
using the NonSequentialParser.  I'm going to leave this issue open until we 
upgrade.

> pdf:encrypted:false with encrypted pdf
> --------------------------------------
>
>                 Key: TIKA-1467
>                 URL: https://issues.apache.org/jira/browse/TIKA-1467
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.6
>         Environment: $java -version
> java version "1.6.0_25"
> Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
> Java HotSpot(TM) Client VM (build 20.0-b11, mixed mode, sharing)
>            Reporter: Thomas Ledoux
>
> When extracting metadata from the encryption_noprinting.pdf file found in the 
> pdfCabinetOfHorrors 
> (https://github.com/openplanets/format-corpus/tree/master/pdfCabinetOfHorrors)
> $java -jar tika-app-1.7-20141105.092424-471.jar -j encryption_noprinting.pdf
> we get an
> INFO - Document is encrypted
> message, but the resulting JSON has: "pdf:encrypted":"false"
> Looking at the PDFParser, it seems that the INFO message is logged while 
> reading the PDF, but by the time the metadata is retrieved the PDF is no 
> longer encrypted. The fact that the document was encrypted should be 
> retained and added to the metadata.
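
The ordering problem described in the report can be illustrated with a small, self-contained sketch. It does not use the real Tika or PDFBox classes: FakePdfDocument, extractBuggy and extractFixed are hypothetical names invented for this illustration, and the assumption (taken from the report) is that the document stops reporting itself as encrypted once it has been decrypted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EncryptionFlagSketch {

    /** Hypothetical stand-in for a PDF document; this is NOT the PDFBox API. */
    static class FakePdfDocument {
        private boolean encrypted;

        FakePdfDocument(boolean encrypted) {
            this.encrypted = encrypted;
        }

        boolean isEncrypted() {
            return encrypted;
        }

        /** Assumption from the report: after decryption the flag reads false. */
        void decrypt() {
            encrypted = false;
        }
    }

    /** Reads the flag after decryption, so the original state is lost. */
    static Map<String, String> extractBuggy(FakePdfDocument doc) {
        if (doc.isEncrypted()) {
            doc.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        metadata.put("pdf:encrypted", Boolean.toString(doc.isEncrypted()));
        return metadata;
    }

    /** Records the flag before decryption and reports the recorded value. */
    static Map<String, String> extractFixed(FakePdfDocument doc) {
        boolean wasEncrypted = doc.isEncrypted();
        if (wasEncrypted) {
            doc.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        metadata.put("pdf:encrypted", Boolean.toString(wasEncrypted));
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(extractBuggy(new FakePdfDocument(true)));  // {pdf:encrypted=false}
        System.out.println(extractFixed(new FakePdfDocument(true)));  // {pdf:encrypted=true}
    }
}
```

Whether the actual change in PDFParser takes this shape is not stated in this thread; the sketch only shows why reading the flag before decrypting preserves the information.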



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
