Thomas Ledoux created TIKA-1467:
-----------------------------------

             Summary: pdf:encrypted:false with encrypted pdf
                 Key: TIKA-1467
                 URL: https://issues.apache.org/jira/browse/TIKA-1467
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.6
         Environment: $java -version
java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) Client VM (build 20.0-b11, mixed mode, sharing)
            Reporter: Thomas Ledoux


When extracting metadata from the encryption_noprinting.pdf file found in the
pdfCabinetOfHorrors
(https://github.com/openplanets/format-corpus/tree/master/pdfCabinetOfHorrors)
with the following command:

$java -jar tika-app-1.7-20141105.092424-471.jar -j encryption_noprinting.pdf

we get the log message
INFO - Document is encrypted

but the resulting JSON contains: "pdf:encrypted":"false"

Looking at the PDFParser, it seems that the log message is emitted while the
PDF is being read, but by the time the metadata is retrieved the PDF is no
longer encrypted. The fact that the document was encrypted should be retained
so that it can be added to the metadata.
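
The suspected ordering problem can be sketched as follows. This is only a
minimal illustration and not the actual PDFParser code: FakeDocument and both
extractMetadata* methods are hypothetical stand-ins, and the assumption that
decryption clears the "encrypted" state comes from the observation above, not
from a confirmed reading of the PDFBox sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EncryptionFlagSketch {

    /** Hypothetical stand-in for a loaded PDF; NOT a PDFBox or Tika class. */
    public static class FakeDocument {
        private boolean encrypted;

        public FakeDocument(boolean encrypted) {
            this.encrypted = encrypted;
        }

        public boolean isEncrypted() {
            return encrypted;
        }

        /** Assumption: decrypting the document clears the encrypted state. */
        public void decrypt() {
            encrypted = false;
        }
    }

    /** Mirrors the reported behaviour: the flag is read after decryption. */
    public static Map<String, String> extractMetadataBuggy(FakeDocument doc) {
        if (doc.isEncrypted()) {
            System.out.println("INFO - Document is encrypted");
            doc.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        // Too late: the document no longer reports itself as encrypted.
        metadata.put("pdf:encrypted", Boolean.toString(doc.isEncrypted()));
        return metadata;
    }

    /** Suggested ordering: remember the flag before decrypting. */
    public static Map<String, String> extractMetadataFixed(FakeDocument doc) {
        boolean wasEncrypted = doc.isEncrypted();
        if (wasEncrypted) {
            System.out.println("INFO - Document is encrypted");
            doc.decrypt();
        }
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        // Use the value captured before decryption.
        metadata.put("pdf:encrypted", Boolean.toString(wasEncrypted));
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(extractMetadataBuggy(new FakeDocument(true)));
        System.out.println(extractMetadataFixed(new FakeDocument(true)));
    }
}
```

For an encrypted input, the first variant reports "pdf:encrypted" as "false",
matching the output above, while the second reports "true".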




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
