On Sat, Sep 27, 2008 at 7:01 PM, Chris Rebert <[EMAIL PROTECTED]> wrote:
> Looking at the docs for the mimetypes module, it just guesses based on
> the filename (and extension), not the actual contents of the file, so
> it doesn't really help the OP, who wants to make sure their program
> isn't misled by an inaccurate extension.

One other way to detect a PDF is to read the first 5 bytes of the file. Valid PDF files start with "%PDF-". Something similar can be done with Word docs, but I don't know what the magic bytes are. This is pretty similar to what the file command does, but reading the bytes yourself is probably the better approach if you have to support multiple platforms, since the file command may not be installed everywhere.

-mike

-- 
Michael E. Crute
http://mike.crute.org

God put me on this earth to accomplish a certain number of things. Right now I am so far behind that I will never die. --Bill Watterson
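
For reference, here is a rough sketch of the magic-byte check described above. The names looks_like_pdf and PDF_MAGIC are made up for illustration. It only tests whether the file begins with "%PDF-" and does not validate anything else about the file.

```python
# Hypothetical helper: checks only the leading magic bytes, nothing more.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path):
    """Return True if the file at `path` begins with the PDF magic bytes."""
    # Open in binary mode so the bytes are compared exactly as stored on disk.
    with open(path, "rb") as f:
        return f.read(len(PDF_MAGIC)) == PDF_MAGIC
```

Because it ignores the filename entirely, a PDF saved as "report.txt" still passes and a text file renamed to "report.pdf" does not. An empty or very short file simply returns False.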