Sorry to continue this off-topic thread, but Volker Kuhlmann pointed
out a bug and some things to improve in the script I sent out, so I'm
sending out a new version for the record (i.e. for people searching the
list archives with Google).
> :0
> md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'
- You could probably save CPU cycles by using formail instead of perl
for this, but I haven't tried that.
> :0:$md5cache.lock
> * !? ! fgrep -q "$md5sum" "$md5cache" && echo "$md5sum" >> "$md5cache"
> dupes
- Use "$LOCKEXT" instead of ".lock".
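- Unfortunately, this local lock only protects delivery to "dupes", not
the test itself, so the fgrep and the echo run without any lock held
and concurrent deliveries can read and append to the cache at the same
time. This is a bug.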
- You can save CPU cycles by putting the fgrep and echo commands in
separate procmail tests so procmail can run the commands directly,
without a shell.
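To make the locking problem concrete: the lookup and the append have to
happen under one and the same lock, otherwise two deliveries of the same
message can both miss in the cache and both get through. Here is a rough
shell sketch of that idea, outside procmail. The function name
seen_before and the mkdir-based lock directory are my own inventions for
illustration; this is not how procmail implements its lockfiles, and it
makes no attempt to clean up a stale lock.

```shell
#!/bin/sh
# Hypothetical helper (not part of the procmail recipes): look up a
# checksum and record it if missing, all under a single lock.
# Returns 0 if the checksum was already cached, 1 if it was just added.
seen_before() {
    sum=$1
    cache=$2
    lock="$cache.lock.d"      # assumed lock name, for illustration only

    # mkdir either creates the directory or fails, atomically, so only
    # one process at a time gets past this loop.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1
    done

    if grep -qxF -- "$sum" "$cache" 2>/dev/null; then
        status=0              # duplicate
    else
        printf '%s\n' "$sum" >> "$cache"
        status=1              # first time seen, now recorded
    fi

    rmdir "$lock"             # release the lock
    return "$status"
}
```

Calling seen_before twice with the same checksum and cache file should
return 1 the first time and 0 the second, with only one line written to
the cache.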
Here's a new version of the script. I hope this is correct now!
SHELL=/bin/sh
MAILDIR=.../Mail
LOGFILE=$MAILDIR/Log
md5cache=$MAILDIR/MD5
:0
md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'
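# Hold a global lock from here on, so that the test and the append below
# cannot interleave with another delivery.
LOCKFILE=$md5cache$LOCKEXT
# Checksum already in the cache: file the message under dupes.
:0:
* ? fgrep -q "$md5sum" "$md5cache"
dupes
# Otherwise record the checksum. "c" lets processing continue afterwards;
# "i" ignores write errors, since echo never reads the message.
:0ic
| echo "$md5sum" >> "$md5cache"
# Unset LOCKFILE to release the global lock.
LOCKFILE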
... etc ...