Hi,
Does Tika support detecting whether a PDF has embedded files and, even better, 
returning an array of the file names?

I was forwarded a "signed" PDF from a vendor that apparently makes its own 
signing solution. The PDF doesn't contain any of the standard PAdES properties 
that trigger the signature panel in Acrobat, hasSignature:true, or any of the 
other signature properties in Tika.

The "signature" consists of seven HTML files with various technical info 
embedded in the PDF, as this excerpt from the raw content shows:

obj
<</Names[(Appendix 1 Evidence Quality Framework.html) 99 0 R (Appendix 2 
Service Description.html) 101 0 R (Appendix 3 Evidence Log.html) 105 0 R 
(Appendix 4 Evidence of Time.html) 107 0 R (Appendix 5 Evidence of Intent.html) 
109 0 R (Appendix 6 Digital Signature Documentation.html) 103 0 R (Evidence 
Quality of xxxxx E-signed Documents.html) 97 0 R]>>
endobj
112 0 obj
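
To make clear what I would like to get back, here is a rough Python sketch (not 
Tika code) that pulls the file name / object number pairs out of the fragment 
above with a regular expression. It assumes the names contain no escaped or 
nested parentheses and that the whole /Names array sits in one uncompressed 
object, which will not hold for PDFs in general. The function name 
embedded_names is only mine, for illustration.

```python
import re

# Raw fragment copied from the PDF; the line breaks come from email wrapping.
RAW = """<</Names[(Appendix 1 Evidence Quality Framework.html) 99 0 R (Appendix 2 
Service Description.html) 101 0 R (Appendix 3 Evidence Log.html) 105 0 R 
(Appendix 4 Evidence of Time.html) 107 0 R (Appendix 5 Evidence of Intent.html) 
109 0 R (Appendix 6 Digital Signature Documentation.html) 103 0 R (Evidence 
Quality of xxxxx E-signed Documents.html) 97 0 R]>>"""

# Matches "(name) objnum gen R".
# Simplification: names must not contain escaped or nested parentheses.
PAIR = re.compile(r"\(([^()]*)\)\s+(\d+)\s+(\d+)\s+R")


def embedded_names(raw):
    """Return (file name, object number) pairs found in a /Names array fragment."""
    pairs = []
    for name, obj_num, _gen in PAIR.findall(raw):
        # Collapse whitespace that line wrapping introduced inside a name.
        pairs.append((" ".join(name.split()), int(obj_num)))
    return pairs


if __name__ == "__main__":
    for name, obj_num in embedded_names(RAW):
        print(obj_num, name)
```

A list like the one this returns, exposed as a metadata property, is what I am 
hoping Tika can provide.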

From a security perspective, this would also be very useful when using Tika as 
a secure file gateway for file analysis and detecting malicious files.

Thanks,
Willy T. Koch
Norway
