On 31Aug2014 13:45, Tim Chase <python.l...@tim.thechases.com> wrote:
Tinkering around with a little script, I found myself with the need
to walk a directory tree and process mail messages found within.
Sometimes these end up being mbox files (with multiple messages
within), sometimes it's a Maildir structure with messages in each
individual file and extra holding directories, and sometimes it's a
MH directory.  To complicate matters, there's also the possibility of
non-{mbox,maildir,mh} files such as binary MUA caches appearing
alongside these messages.

Python knows how to handle each just fine as long as I tell it what
type of file to expect.  But is there a straightforward way to
distinguish them?  (FWIW, the *nix "file" utility is just reporting
"ASCII text", sometimes "with very long lines", and sometimes
erroneously flags them as C or C++ files‽).

All I need is "is it maildir, mbox, mh, or something else" (I don't
have to get more complex for the "something else") inside an os.walk
loop.

Here is my code for these tests:

    import os

    def ismhdir(path):
      ''' Test if `path` points at an MH directory.
      '''
      return os.path.isfile(os.path.join(path, '.mh_sequences'))

    def ismaildir(path):
      ''' Test if `path` points at a Maildir directory.
      '''
      for subdir in ('new', 'cur', 'tmp'):
        if not os.path.isdir(os.path.join(path,subdir)):
          return False
      return True

    def ismbox(path):
      ''' Open path and check that its first line begins with "From ".
      '''
      try:
        # Binary mode, so a binary MUA cache cannot raise a decode error.
        with open(path, 'rb') as fp:
          return fp.read(5) == b'From '
      except (IOError, OSError):
        # Unreadable, missing, or a directory: not an mbox.
        return False

I would use these in code somewhat like this (imagining your use case):

    if ismaildir(path):
      ...
    elif ismhdir(path):
      ...
    elif ismbox(path):
      ...
    else:
      # reject other known special files here
      # continue traversing downward otherwise
      ...
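
To make that skeleton concrete, here is a rough sketch of how it might sit inside an os.walk loop. The name `walk_mail` is made up for illustration, the three tests are restated compactly so the block runs by itself, and two choices are my own assumptions rather than anything your use case dictates: a Maildir is not descended into at all, and the plain files inside an MH directory are left to whatever reads the MH folder.

```python
import os

def ismhdir(path):
    ''' Test if `path` points at an MH directory. '''
    return os.path.isfile(os.path.join(path, '.mh_sequences'))

def ismaildir(path):
    ''' Test if `path` points at a Maildir directory. '''
    return all(os.path.isdir(os.path.join(path, subdir))
               for subdir in ('new', 'cur', 'tmp'))

def ismbox(path):
    ''' Test if the file at `path` starts with "From ". '''
    try:
        with open(path, 'rb') as fp:
            return fp.read(5) == b'From '
    except (IOError, OSError):
        return False

def walk_mail(top):
    ''' Hypothetical helper: yield (kind, path) for each mail store
        found beneath `top`, where kind is 'maildir', 'mh' or 'mbox'.
        Anything unrecognised is silently skipped.
    '''
    for dirpath, dirnames, filenames in os.walk(top):
        if ismaildir(dirpath):
            yield 'maildir', dirpath
            # Assumption: nothing of interest lives below a Maildir,
            # so prune the walk here (in-place edit of dirnames).
            dirnames[:] = []
            continue
        if ismhdir(dirpath):
            yield 'mh', dirpath
            # Assumption: the files here are MH messages; skip them,
            # but keep descending in case of nested folders.
            continue
        for name in filenames:
            filepath = os.path.join(dirpath, name)
            if ismbox(filepath):
                yield 'mbox', filepath

if __name__ == '__main__':
    import sys
    for kind, path in walk_mail(sys.argv[1] if len(sys.argv) > 1 else '.'):
        print(kind, path)
```

Each (kind, path) pair can then be handed to the matching class in the stdlib mailbox module (mailbox.Maildir, mailbox.MH or mailbox.mbox). If you have nested Maildir folders you care about, drop the pruning line.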

Cheers,
Cameron Simpson <c...@zip.com.au>

Gabriel Genellina: See PEP 234 http://www.python.org/dev/peps/pep-0234/
Angus Rodgers:
  You've got to love a language whose documentation contains sentences
  beginning like this:
    "Among its chief virtues are the following four -- no, five -- no,
    six -- points: [...]"
from python-list@python.org
