[ Dire-Warning: this is a proof-of-concept script; it works for me on my
system.  Beyond that I can't say, except that it requires a
development version of mutt and a patch that is labelled by its author
as "broken"... ]

I have about 50,000 messages in a couple of dozen NFS-mounted Maildirs
(all the old stuff is in mbox archives).  Searching all this mail is a
big drag, since mutt can't 'limit' across multiple mailboxes and
grepping through all this junk can take hours.

Michael Elkins' header cache patch is an experiment in speeding up
access to Maildirs by scattering db files everywhere.  I have the idea
that eventually these db files can speed up searching as well.

This script can search those 50,000 email headers and populate a
temporary results folder containing 3,000 messages in about 40 seconds.
It's an *extremely* blunt instrument, but I hope it will inspire
somebody else to write something a bit more elegant and precise.

It can be called with a macro, though you have to switch to the
'=search-results' folder to see the results:

    macro index \cL <shell-escape>'search-maildir.pl '
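
The results folder has to exist as a Maildir before the script runs,
since matches are linked into its cur/ directory.  A minimal sketch for
creating it, assuming nothing beyond core Perl; the helper name
make_maildir and the example path are my own inventions, not part of
the script or of mutt:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path);

# Hypothetical helper, not part of the original script: create the
# cur/new/tmp subdirectories that make a directory a Maildir.
sub make_maildir
{
    my ($dir) = @_;
    make_path( map { "$dir/$_" } qw(cur new tmp) );
    return $dir;
}

# Example call; the path is an assumption, adjust it to your own setup:
# make_maildir("$ENV{HOME}/Mail/search-results");
```

Calling it on a folder that already exists should be harmless, since
make_path leaves existing directories alone.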

It takes a single Perl regex argument, which it matches case-insensitively
against each message's whole entry in the header cache, something like these:

    [EMAIL PROTECTED]

    '(Newbie Question|unsubscribe|@yahoo.com|@hotmail.com|BIG5)'
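
To show what that matching amounts to, here is a rough sketch that
uses an ordinary hash in place of the tied hcache.db.  The ids and
header text are made up for illustration, and matching_ids is a
hypothetical name; the real cache entries will not look this tidy:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up stand-in for the tied hcache.db hash: message id => cached header.
my %headers = (
    '1043870000.1234_1.host' => "From: alice\@example.org\nSubject: Newbie Question",
    '1043870001.1234_2.host' => "From: bob\@yahoo.com\nSubject: Re: patches",
    '1043870002.1234_3.host' => "From: carol\@example.org\nSubject: header cache",
);

# Return the ids whose header text matches the pattern, ignoring case.
sub matching_ids
{
    my ( $pattern, $cache ) = @_;
    my $re  = qr/$pattern/i;
    my @ids = grep { $cache->{$_} =~ $re } keys %$cache;
    return sort @ids;
}

print "$_\n" for matching_ids( '(newbie question|@yahoo.com)', \%headers );
```

This prints the first two ids.  Because the pattern is applied to the
whole entry rather than to a named header field, there is no way to
say "only in the Subject", which is why I call it a blunt instrument.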

-- 
Bruno
#!/usr/bin/perl
use strict;
use Fcntl;
use DB_File;
use File::Find;

# This script requires the mutt header caching patch from
# http://www.sigpipe.org:8080/mutt/ NOTE: currently this patch is declared
# broken, so this will probably delete all your files, drink your beer and eat
# your cat.  Bruno Postle <[EMAIL PROTECTED]>

# Change these and make sure $resultdir exists as a Maildir, NOTE: this script
# will *delete* all existing files in $resultdir
my $mailfolder = "/home/bruno/Mail";
my $resultdir = "$mailfolder/search-results";

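my $pattern = $ARGV[0];
# An empty pattern would match (and link) every message, so insist on one
die "Usage: $0 'perl-regex'\n" unless defined $pattern and length $pattern;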
my ( %hash, @folders );

unlink <$resultdir/cur/*>;

find ( { wanted => \&maildirs }, "$mailfolder" );

# Collect every directory that holds a header cache, skipping $resultdir
sub maildirs
{
    return unless /^hcache\.db$/;
    # \Q..\E stops regex metacharacters in the path from being interpreted
    return if ( $File::Find::name =~ /^\Q$resultdir\E/ );
    push @folders, $File::Find::dir;
}

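foreach my $folder ( @folders )
{
    # Open the header cache read-only so a search can never modify it
    my $x = tie %hash, "DB_File", "$folder/hcache.db", O_RDONLY, 0644, $DB_HASH
        or die "Cannot open $folder/hcache.db: $!\n";
    find ( { wanted => \&messages }, $folder );
    undef $x;
    untie %hash;
}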

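# Link each message whose cached header matches $pattern into $resultdir
sub messages
{
    return unless /:/;
    # Look the message up by its filename minus the ":2,<flags>" suffix
    my $uid = $File::Find::name;
    $uid =~ s/.*\/([^\/]*):.*/${1}/;
    if ( defined $hash{$uid} and $hash{$uid} =~ /$pattern/i )
    {
        # Results are always linked in flagged as seen (S)
        link ( "$File::Find::name", "$resultdir/cur/$uid:2,S" )
            or warn "Cannot link $File::Find::name: $!\n";
    }
}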

1;

