[ 
https://issues.apache.org/jira/browse/TIKA-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264112#comment-13264112
 ] 

Nick Burch commented on TIKA-876:
---------------------------------

We still can't help you very much without a (small) sample file. Any chance you 
could upload one?

If your PDFs really are wrapped in PKCS7, then we'll need something that 
unpacks the PKCS7 wrapper and, for signed files, triggers the recursing parser 
on the contents. (Signed files only initially, as there's no way to supply the 
private key for encrypted ones yet.) I think BouncyCastle might help with this, 
so it's worth a look to start with.

In r1331634 I've added some mime magic for PKCS7 files. I'm not sure if it's 
quite right or not, but it seems OK for a few files I've tried. It'll need 
someone who knows the PKCS format (or maybe just DER encoding?) to be sure, 
though. Ideally, we should distinguish between signed, encrypted and 
signed+encrypted, but I'm not sure how we do that...
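
One possible way to tell the variants apart, as a rough sketch only and not 
what r1331634 does: a binary PKCS7 file is a DER (or BER) ContentInfo, so it 
starts with a SEQUENCE tag followed by a contentType OID under the arc 
1.2.840.113549.1.7, and the last OID byte names the type (2 = signedData, 
3 = envelopedData, 4 = signedAndEnvelopedData). The class and method names 
below are made up for illustration and are not part of Tika. PEM-armoured 
files are not handled, and a file that was signed and then encrypted as two 
nested layers will only report its outer type.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Tika class): classify a PKCS7 ContentInfo by its OID. */
final class Pkcs7Sniffer {
    // Content bytes of the OID prefix 1.2.840.113549.1.7 (the PKCS7 arc),
    // i.e. without the OID tag and length bytes.
    private static final byte[] PKCS7_ARC = {
        0x2A, (byte) 0x86, 0x48, (byte) 0x86, (byte) 0xF7, 0x0D, 0x01, 0x07
    };

    /** Returns the PKCS7 content type name, or "unknown" if the header does not match. */
    static String sniff(byte[] der) {
        // The outer ContentInfo must be a SEQUENCE (tag 0x30).
        if (der == null || der.length < 2 || der[0] != 0x30) {
            return "unknown";
        }
        int pos = 1;
        int lenByte = der[pos++] & 0xFF;
        // Long-form length: the low 7 bits give the number of length octets to skip.
        // Short form (< 0x80) and indefinite form (0x80) have no extra octets.
        if (lenByte > 0x80) {
            pos += lenByte & 0x7F;
        }
        // Expect an OBJECT IDENTIFIER (tag 0x06) that is 9 bytes long.
        if (pos + 11 > der.length || der[pos] != 0x06 || der[pos + 1] != 0x09) {
            return "unknown";
        }
        byte[] arc = Arrays.copyOfRange(der, pos + 2, pos + 10);
        if (!Arrays.equals(arc, PKCS7_ARC)) {
            return "unknown";
        }
        // The final OID byte selects the content type within the PKCS7 arc.
        switch (der[pos + 10]) {
            case 1: return "data";
            case 2: return "signedData";
            case 3: return "envelopedData";
            case 4: return "signedAndEnvelopedData";
            case 5: return "digestedData";
            case 6: return "encryptedData";
            default: return "unknown";
        }
    }
}
```

Calling Pkcs7Sniffer.sniff() on the first few dozen bytes of a file should be 
enough, since the OID sits right after the outer length. Whether this can be 
expressed as mime magic patterns is a separate question, because the length 
field makes the OID offset vary.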
                
> Signed pdf parsing
> ------------------
>
>                 Key: TIKA-876
>                 URL: https://issues.apache.org/jira/browse/TIKA-876
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.0
>         Environment: Java 6.0, Ubuntu
>            Reporter: Fausto Cruzeiro de Moraes
>              Labels: features
>             Fix For: 1.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Is there an estimated date for implementing default parsing for signed 
> documents, like signed pdf files (pk7s format), for example?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
