Hi David,
I agree with you, but some functions such as find_parts() do not work when
there are no Content-Type headers, which makes it impossible to analyze some
attachments.
I am writing a plugin to detect suspicious PDFs.
Maybe there is a better way to analyze attachments than using find_parts()?
Thanks!
------PedroD


>You should not trust what the file's extension says the file is. Also,
>file(1) does not yet do a good enough job to be reliable for this.

>As for guessing, I think the best guess that could be applied would
>be a test of the file to see if, once decoded, it is a UTF-8, ASCII,
>or ISO-8859-X encoded text file. Failing that, I would assume it is
>either an MS doc/ppt/spreadsheet/etc., a PDF file, or pure binary. Then you
>could try trusting the file extension.
>Otherwise, it is a text file and could contain innocent HTML, an
>uncompressed PS file, or a dangerous JS infection program.
>Either way, I'd be really careful.
>
>What is your use case?
>What do you intend to do with a pdf file vs. an html one?
>
>Sincerely,
>David
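
For reference, here is a rough sketch of the guessing order David describes,
written in Python purely for illustration (an actual plugin would need the
same logic ported to its own language). Everything named here is an
assumption and not from the thread: the function classify_decoded(), the
labels it returns, and the small table of leading signatures. The signature
check is an extra step added on top of David's advice so that a PDF is
recognized even when no Content-Type header is present; it says nothing
about whether the file is safe.

```python
# Bytes below 0x20 that commonly appear in plain text: tab, LF, FF, CR.
_TEXT_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D}

# Well-known leading signatures (an addition, not part of the quoted advice).
_MAGIC = (
    (b"%PDF-", "pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ms-ole2"),  # legacy doc/xls/ppt
    (b"PK\x03\x04", "zip"),                            # also docx/xlsx/pptx
)


def classify_decoded(data: bytes) -> str:
    """Guess the kind of an already transfer-decoded attachment body."""
    if not data:
        return "empty"

    # Step 0 (assumption): look at the leading bytes before anything else.
    for magic, label in _MAGIC:
        if data.startswith(magic):
            return label

    # Unusual control characters mean this is not a text file.
    if any((b < 0x20 and b not in _TEXT_CONTROLS) or b == 0x7F for b in data):
        return "binary"

    # Step 1: pure ASCII.
    if all(b < 0x80 for b in data):
        return "ascii"

    # Step 2: strict UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Step 3: ISO-8859-X has no printable characters in 0x80-0x9F, so any
    # byte in that range rules it out. The exact part (-1, -15, ...) cannot
    # be determined this way.
    if not any(0x80 <= b <= 0x9F for b in data):
        return "iso8859-x"

    return "binary"
```

A result of "ascii", "utf-8" or "iso8859-x" only means the body looks like
text; as David notes, that text could still be HTML, PostScript or
JavaScript, so the label is a starting point for further checks and not a
verdict.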


   
