Sorry to continue this off-topic thread, but Volker Kuhlmann pointed
out a bug and some possible improvements in the script I sent out, so
I'm sending out a new version for the record (i.e. for people searching
the list archives with Google).

> :0
> md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

- You could probably save CPU cycles by using formail instead of perl
for this, but I haven't tried that.

> :0:$md5cache.lock
> * !? ! fgrep -q "$md5sum" "$md5cache" && echo "$md5sum" >> "$md5cache"
> dupes

- Use "$LOCKEXT" instead of ".lock".

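- Unfortunately, this local lock only protects delivery, not the test
itself: procmail takes the lock only once the condition has matched, so
the fgrep and echo run unlocked and two procmail processes can check
and update the cache at the same time. This is a bug.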

- You can save CPU cycles by putting the fgrep and echo commands in
separate procmail tests so procmail can run the commands directly,
without a shell.
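
To make the locking point concrete, here is a minimal sketch in plain
shell of what the lock has to cover: the lookup and the append together,
as one critical section. The function name seen_before and the
noclobber-based lock are my own stand-ins, not what procmail does
internally, and the lock is not NFS-safe the way procmail's lockfiles
try to be.

```shell
#!/bin/sh
# seen_before DIGEST CACHE (hypothetical helper)
# Returns 0 if DIGEST is already in CACHE (a duplicate),
# 1 if it was new and has now been appended.
seen_before() {
    digest=$1 cache=$2 lock=$2.lock

    # With noclobber (set -C) the redirection fails if the lock file
    # already exists, so only one process at a time gets past this loop.
    # A stale lock left by a killed process would block here forever.
    until (set -C; : > "$lock") 2>/dev/null; do
        sleep 1
    done

    # Test and update both happen while the lock is held.
    if grep -F -q -x -- "$digest" "$cache" 2>/dev/null; then
        status=0
    else
        echo "$digest" >> "$cache"
        status=1
    fi

    rm -f "$lock"
    return $status
}
```

Called twice with the same digest and cache file, it should report "new"
the first time and "duplicate" the second, leaving a single line in the
cache.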

Here's a new version of the script. I hope this is correct now!


SHELL=/bin/sh

MAILDIR=.../Mail
LOGFILE=$MAILDIR/Log

md5cache=$MAILDIR/MD5

:0
md5sum=| perl -e 'while (($_ = <>) && !/^\r?\n$/) { if (/^[ \t]/) { next if $r; } else { if (/^(date|from|subject|to|cc)\s*:/i) { $r = 0; } else { $r = 1; next } } s/\r?\n$/\n/; print; } print "\n"; while (<>) { s/\r?\n$/\n/; print; }' | md5sum | sed -e 's/ .*//'

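# Global lock: held across both the test and the update below.
LOCKFILE=$md5cache$LOCKEXT

# Digest already cached: file the message as a duplicate and stop.
:0:
* ? fgrep -q "$md5sum" "$md5cache"
dupes

# New digest: record it and carry on with the same message.
:0ic
| echo "$md5sum" >> "$md5cache"

# Unset LOCKFILE to release the global lock.
LOCKFILE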

... etc ...
