Header cache patch and searching through multiple mailboxes
[ Dire-Warning: this is a proof-of-concept script; it works for me on my system. Other than that I can't say, except that it requires a development version of mutt and a patch that is labelled by its author as broken... ]

I have about 50,000 messages in a couple of dozen NFS-mounted Maildirs (all the old stuff is in mbox archives). Searching all this mail is a big drag, since mutt can't 'limit' across multiple mailboxes and grepping through all this junk can take hours.

Michael Elkins' header cache patch is an experiment in speeding up access to Maildirs by scattering db files everywhere - I have the idea that eventually these db files can speed up searching as well.

This script can search those 50,000 email headers and populate a temporary results folder containing 3,000 messages in about 40 seconds. It's an *extremely* blunt instrument, but I hope it will inspire somebody else to write something a bit more elegant and precise.

It can be called with a macro, though you have to switch to the '=search-results' folder to see the results:

    macro index \cL <shell-escape>'search-maildir.pl '

It takes a single perl-regex argument, which it matches against everything, sort-of like these:

    [EMAIL PROTECTED]
    '(Newbie Question|unsubscribe|@yahoo.com|@hotmail.com|BIG5)'

--
Bruno

#!/usr/bin/perl

use strict;
use Fcntl;
use DB_File;
use File::Find;

# This script requires the mutt header caching patch from
# http://www.sigpipe.org:8080/mutt/ NOTE: currently this patch is declared
# broken, so this will probably delete all your files, drink your beer and eat
# your cat.

# Bruno Postle [EMAIL PROTECTED]

# Change these and make sure $resultdir exists as a Maildir, NOTE: this script
# will *delete* all existing files in $resultdir

my $mailfolder = "/home/bruno/Mail";
my $resultdir  = "$mailfolder/search-results";

my $pattern = $ARGV[0];
die "Usage: $0 perl-regex\n" unless defined $pattern;

my ( %hash, @folders );

# Empty the results folder before each search
unlink glob "$resultdir/cur/*";

# First pass: collect every Maildir that has a header cache file
find ( { wanted => \&maildirs }, $mailfolder );

sub maildirs {
    return unless /hcache\.db/;
    return if ( $File::Find::name =~ /$resultdir/ );
    my $maildir = $File::Find::name;
    $maildir =~ s/\/hcache\.db//;
    push @folders, $maildir;
}

# Second pass: open each cache and test every message in that Maildir
foreach ( @folders ) {
    my $x = tie %hash, "DB_File", "$_/hcache.db"
        or die "Cannot open $_/hcache.db: $!\n";
    find ( { wanted => \&messages }, $_ );
    undef $x;
    untie %hash;
}

sub messages {
    # Maildir message files carry a ":2,FLAGS" suffix; skip everything else
    return unless /:/;
    my $uid = $File::Find::name;
    $uid =~ s/.*\/([^\/]*):.*/${1}/;
    # The cached header record is keyed by the file name without the flags
    if ( defined $hash{$uid} && $hash{$uid} =~ /$pattern/i ) {
        link ( $File::Find::name, "$resultdir/cur/$uid:2,S" );
    }
}

1;
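For comparison, the same "link the hits into a results Maildir" idea can be sketched without the header cache patch, at the cost of reading the header block of every message on each search (so it will be far slower than the 40 seconds quoted above). This is only a rough sketch: the function name `search_maildir_headers` is made up, the pattern is a POSIX extended regex rather than a Perl regex, and the results folder is assumed to sit under the mail root and to be given with the same path prefix.

```shell
# search_maildir_headers MAILROOT RESULTDIR REGEX   (hypothetical helper)
# Hard-link every message whose headers match REGEX into RESULTDIR/cur.
search_maildir_headers() {
    root=$1
    results=$2
    pattern=$3
    mkdir -p "$results/cur" "$results/new" "$results/tmp"
    # Like the Perl script, start from an empty results folder.
    rm -f "$results"/cur/*
    # Visit message files in cur/ and new/ only, skipping the results folder.
    find "$root" -type f \( -path '*/cur/*' -o -path '*/new/*' \) \
        ! -path "$results/*" |
    while IFS= read -r msg; do
        # sed stops at the first empty line, so grep only sees the headers.
        if sed '/^$/q' "$msg" | grep -Eiq -e "$pattern"; then
            base=${msg##*/}     # file name without its directory
            uid=${base%%:*}     # drop the ":2,FLAGS" suffix, if any
            # Fall back to a copy when a hard link is not possible.
            ln "$msg" "$results/cur/$uid:2,S" 2>/dev/null ||
                cp "$msg" "$results/cur/$uid:2,S"
        fi
    done
}
```

Something like `search_maildir_headers ~/Mail ~/Mail/search-results '(unsubscribe|@yahoo\.com)'` followed by opening '=search-results' in mutt would mirror the workflow described above. File names containing newlines are not handled.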
Re: Searching in multiple mailboxes
Big Brother tells me that Mark Weinem wrote:
> On Mon, 23 Oct 2000, Benjamin Korvemaker wrote:
> > See "grepm" and "grepmail"
>
> But are there no tools for Maildirs?

    cd Maildir; find . -type f | xargs fgrep -l searchstring

--
"Restore your inalienable human rights. Vote Libertarian."
Jack McKinney
http://www.lp.org http://www.lorentz.com http://www.harrybrowne2000.org
[EMAIL PROTECTED] 1024D/D68F2C07 4096g/38AEF076
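The one-liner trips over file names containing whitespace and also greps any stray files lying around in the Maildir tree. A slightly more defensive variant might look like the sketch below; it assumes a `find` that supports `-path` and `-exec ... {} +`, and the function name `maildir_grep` is invented for illustration.

```shell
# maildir_grep DIR STRING   (hypothetical helper)
# Print the names of message files under DIR that contain STRING,
# treated as a fixed string rather than a regex.
maildir_grep() {
    dir=$1
    needle=$2
    # Restrict the search to the cur/ and new/ subdirectories of each
    # Maildir. "-exec ... {} +" hands file names to grep in batches
    # without any word splitting, so odd file names are safe.
    find "$dir" -type f \( -path '*/cur/*' -o -path '*/new/*' \) \
        -exec grep -lF -e "$needle" {} +
}
```

Called as `maildir_grep ~/Maildir searchstring`, it prints one matching file per line. The exit status is non-zero when some batch contains no match, so it should not be relied on as a found/not-found signal.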
Re: Searching in multiple mailboxes
On Wed, 25 Oct 2000, Suresh Ramasubramanian wrote:
> I believe grepmail does maildirs rather well.

man grepmail: "[...] Mailboxes must be traditional, UNIX /bin/mail mailbox format [...]"

Ciao,
Mark
Re: Searching in multiple mailboxes
On Wed, 25 Oct 2000, Jack McKinney wrote:
> cd Maildir; find . -type f | xargs fgrep -l searchstring

Wow, what a comfortable search tool ;-)

Ciao,
Mark
Re: Searching in multiple mailboxes
On Wed, Oct 25, 2000 at 07:46:33PM +0200, Mark Weinem ([EMAIL PROTECTED]) wrote:
> On Wed, 25 Oct 2000, Suresh Ramasubramanian wrote:
> > I believe grepmail does maildirs rather well.
>
> man grepmail: "[...] Mailboxes must be traditional, UNIX /bin/mail mailbox format [...]"

Maybe I'm confused as to what maildir-format comprises, but wouldn't one search maildirs with plain old 'grep'?

-Rich

--
Rich Lafferty
Sysadmin/Programmer, Instructional and Information Technology Services
Concordia University, Montreal, QC
(514) 848-7625 - [EMAIL PROTECTED]
Re: Searching in multiple mailboxes
Big Brother tells me that Mark Weinem wrote:
> On Wed, 25 Oct 2000, Jack McKinney wrote:
> > cd Maildir; find . -type f | xargs fgrep -l searchstring
>
> Wow, what a comfortable search tool ;-)

For those who remember reading news this way, I thought you'd appreciate it. Sometimes simple solutions are best.

If one is using zsh, one could try this:

    mutt -f <(cat $(find . -type f | xargs fgrep -l searchstring))

This might not work due to the missing 'From ' line, but that can always be added:

    mutt -f <(for i in $(find . -type f | xargs fgrep -l searchstring) ; do ; grep '^From: ' $i | head -1 | sed s/From:/From/ ; cat $i ; echo ; done)

If one is using a lesser shell, something like this might work:

    for i in `find . -type f | xargs fgrep -l searchstring` ; do mutt -f $i ; done

None of these are tested, BTW.

I have been meaning to patch the mailindex package I posted about earlier to process maildirs (it would be a lot easier to write than the way it is currently written, which has to parse mailboxes).

--
Jack McKinney
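Since mutt generally wants a real, seekable file for `-f`, one way to try the "add a 'From ' line" idea is to write the hits to a temporary mbox first. The following is a minimal sketch, not a tested recipe: `maildir_hits_to_mbox` is a made-up name, and the separator line uses a placeholder sender and date rather than values taken from each message.

```shell
# maildir_hits_to_mbox STRING OUTFILE   (hypothetical helper)
# Run from inside a Maildir tree; appends every message containing
# STRING to OUTFILE in mbox form.
maildir_hits_to_mbox() {
    needle=$1
    out=$2
    : > "$out"                      # start with an empty mbox file
    find . -type f \( -path '*/cur/*' -o -path '*/new/*' \) \
        -exec grep -lF -e "$needle" {} + |
    while IFS= read -r msg; do
        # mbox needs a "From " separator before each message; sender
        # and date here are placeholders.
        printf 'From MAILER-DAEMON Thu Jan  1 00:00:00 1970\n' >> "$out"
        # Quote lines starting with "From " so they are not mistaken
        # for message separators.
        sed 's/^From />From /' "$msg" >> "$out"
        printf '\n' >> "$out"       # blank line between messages
    done
}
```

Usage would be along the lines of `cd ~/Maildir && maildir_hits_to_mbox searchstring /tmp/hits.mbox && mutt -f /tmp/hits.mbox`. Changes made in the temporary mbox do not propagate back to the Maildirs.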
Re: Searching in multiple mailboxes
On Mon, 23 Oct 2000, Benjamin Korvemaker wrote:
> See "grepm" and "grepmail"

But are there no tools for Maildirs?

Ciao,
Mark
Re: Searching in multiple mailboxes
Mark Weinem proclaimed on mutt-users that:
> On Mon, 23 Oct 2000, Benjamin Korvemaker wrote:
> > See "grepm" and "grepmail"
>
> But are there no tools for Maildirs?

I believe grepmail does maildirs rather well.

--
Suresh Ramasubramanian + Wallopus Malletus Indigenensis
mallet @ cluestick.org + Lumber Cartel of India, tinlcI
I never met a piece of chocolate I didn't like.
Searching in multiple mailboxes
There is something that would be nice to have in Mutt: Searching in multiple (or all) mailboxes.

Imagine this scenario: You are looking for a specified text in every message you have. It is only possible to search one mailbox, so this might be handy.

--
Wouter Verheijen [EMAIL PROTECTED]
Re: Searching in multiple mailboxes
Big Brother tells me that Wouter Verheijen wrote:
> There is something that would be nice to have in Mutt: Searching in
> multiple (or all) mailboxes. Imagine this scenario: You are looking for a
> specified text in every message you have. It is only possible to search
> one mailbox, so this might be handy.

Not as easy as one would hope. This could be VERY slow, depending on how much mail you have (I currently have 192MB, AFTER compression). A better solution is to index your mail. I wrote a perl/MySQL package to handle this a while back. It has a couple of bugs that still need to be worked out when I get a chance:

http://www.lorentz.com/mailindex.tar.gz

--
Jack McKinney
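To illustrate why an index helps, here is a toy flat-file version of the idea: the headers are extracted once, and later searches only touch the index. It is not related to the mailindex package's actual design. The function names and the tab-separated `path / From / Subject` layout are assumptions made for this sketch, and folded (multi-line) headers are ignored.

```shell
# build_mail_index MAILROOT INDEXFILE   (hypothetical helper)
# Write one line per message: path<TAB>From: line<TAB>Subject: line.
build_mail_index() {
    root=$1
    index=$2
    find "$root" -type f \( -path '*/cur/*' -o -path '*/new/*' \) |
    while IFS= read -r msg; do
        awk '
            /^$/ { exit }           # headers end at the first empty line
            !from && tolower($0) ~ /^from:/    { from = $0 }
            !subj && tolower($0) ~ /^subject:/ { subj = $0 }
            END { printf "%s\t%s\t%s\n", FILENAME, from, subj }
        ' "$msg"
    done > "$index"
}

# search_mail_index INDEXFILE REGEX   (hypothetical helper)
# Print the paths of messages whose index line matches REGEX,
# case-insensitively. The regex is also tested against the path itself.
search_mail_index() {
    grep -Ei -e "$2" "$1" | cut -f 1
}
```

After a one-off `build_mail_index ~/Mail ~/.mail-index.tsv`, a query such as `search_mail_index ~/.mail-index.tsv 'newbie question'` only reads the index file. The index has to be rebuilt, or updated incrementally, as mail arrives or flags change, since the flags are part of the file name.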
Re: Searching in multiple mailboxes
On Mon, Oct 23, 2000 at 08:58:06PM +0200, Wouter Verheijen wrote:
> There is something that would be nice to have in Mutt: Searching in
> multiple (or all) mailboxes. Imagine this scenario: You are looking for a
> specified text in every message you have. It is only possible to search
> one mailbox, so this might be handy.

See "grepm" and "grepmail"

http://privat.schlund.de/b/barsnick/sw/grepm.html
http://grepmail.sourceforge.net/

--
Benjamin Korvemaker [EMAIL PROTECTED]
Donkeys kill more people annually than plane crashes.