Re: scoring top posters

David Champion Thu, 30 Jul 2015 09:23:39 -0700

Hi Matthias -

* On 30 Jul 2015, Matthias Apitz wrote: 
> 
> are some other text lines. Of course we need here a good regular
> expression because the line 'On 29 Jul 2015, Matthias Apitz wrote:'
> is highly configurable and language dependent.


That's why I wouldn't do it with anything regular-expression-based, like
mutt.  Here's an example procmail rule which I haven't tested.

toppostlines=`awk '/^>/ {exit;} /^ *$/ {next;} /^[^ ]*:/ {next;} {total += 1;}
END {print total + 0}'`
:0 f
| formail -i "X-Top-Post-Lines: $toppostlines"

This tells how many non-blank lines occur between the header and quoted
text, without regard to what's in those lines.  Two or three is probably
not top-posty.  You could go further and count unquoted lines AFTER the
first quoted line.  Then mutt can score on X-Top-Post-Lines.  
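
As a rough sketch of the "go further" idea, the awk part could be pulled
out into a shell function that reports two counts: unquoted lines before
the first quoted line, and unquoted lines after it. The function name
count_unquoted and the output format are made up for illustration, and
treating "-- " as the start of the signature is an assumption. Like the
rule above, it counts an attribution line ("On ..., X wrote:") as an
ordinary unquoted line.

```shell
# Hypothetical helper: reads one message on stdin and prints
# "<unquoted lines before first quote> <unquoted lines after it>".
count_unquoted() {
    awk '
        BEGIN            { inheader = 1 }
        inheader && /^$/ { inheader = 0; next }   # first blank line ends the header
        inheader         { next }                 # skip header lines
        /^[ \t]*$/       { next }                 # ignore blank body lines
        /^>/             { seenquote = 1; next }  # quoted text is not counted
        /^-- $/          { exit }                 # assumed signature separator
        seenquote        { after++; next }        # unquoted, after a quote
                         { before++ }             # unquoted, before any quote
        END              { print before + 0, after + 0 }
    '
}
```

A high first number with a low second number would suggest a top post; a
low first number with a high second number looks like an ordinary inline
or bottom reply. Either count could be written to its own header with
formail, and mutt's score command could then match it with a ~h pattern.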

This doesn't help with any encoded mail -- you'd need a smarter filter
for that.  Smart decoding and inability to run filters on your mail
service are the main reasons you would want to do it inside mutt, but
that seems very challenging at best (and impossible at worst).

Making this more general is left as an exercise.  But I wouldn't
recommend it really.  I find personally that downscoring top-posters is
a pretty poor way to judge content.  If it works for you, great, but you
must not exchange email with very many normal people. :)

-- 
David Champion • d...@bikeshed.us
