At 01:10 PM 8/16/02 -0700, you wrote:

>Hey, all. 
>
>Take a look at this --
>
><http://www.paulgraham.com/spam.html>
>
>It's a new technique for identifying spam. The more I look into the details,
>the more I think we have the "anti-spam killer app", because it tunes itself
>to the individual (or site), adapts as the anti-spammers adapt, and the
>technique used is fairly easy to implement and damn difficult for a spammer
>to avoid....
>
>Damn, I wish I'd thought of this.
>
>(I've dropped a pointer to it at
>http://www.chuqui.com/cgi-bin/mwf/topic_show.pl?tid=389)


Yeah, I read that... It really started the wheels turning in my head.  It would
have been a little easier if I'd coded in Lisp more recently than 20 years ago;
I hadn't taken stats any more recently either, and couldn't remember Bayesian
analysis to save my soul...

Of course, it still requires that you either accept someone else's starting
corpus, or have someone tag the spam as it arrives.

But it wouldn't be *that* hard to add to Mailman...  Add a button to the admindb
page for "this was spam".  Ideally, also add a button to the Pipermail interface
(yeah, I know, change Pipermail; ugh, feh) that the admin can use to say "This
message is spam.  Treat it as such".

Both would make the message go away and add it to the list's spam corpus,
correspondingly deleting it from the list's good corpus stats if it was in the
archives (why?  Because if it made it that far then it was considered valid
email, and we need to get its keywords *out* of that database).
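
To make the bookkeeping concrete, here's a rough sketch of what "this was spam"
could do to the per-list stats.  Everything in it is made up for illustration:
ListCorpus, tokenize and mark_as_spam are hypothetical names, not anything that
exists in Mailman, and the tokenizer only roughly follows the one described in
the article.

```python
import re
from collections import Counter

# Word-like tokens: letters, digits, dollar signs, apostrophes, dashes.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    """Split a message into lowercase word-like tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class ListCorpus:
    """Hypothetical per-list token statistics (not a Mailman class)."""

    def __init__(self):
        self.good = Counter()   # token counts from accepted messages
        self.spam = Counter()   # token counts from messages tagged as spam
        self.ngood = 0          # number of good messages counted
        self.nspam = 0          # number of spam messages counted

    def add_good(self, text):
        """Count a message that was accepted and archived."""
        self.good.update(tokenize(text))
        self.ngood += 1

    def mark_as_spam(self, text, was_archived=False):
        """Handle the admin pressing "this was spam"."""
        tokens = tokenize(text)
        if was_archived:
            # The message was counted as good when it was archived,
            # so take its keywords back *out* of the good stats.
            self.good.subtract(tokens)
            self.good = +self.good          # drop zero and negative counts
            self.ngood = max(0, self.ngood - 1)
        self.spam.update(tokens)
        self.nspam += 1
```

A message caught on the admindb page would call mark_as_spam(text); one found
later in the archives would call mark_as_spam(text, was_archived=True).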

We'd also archive the entire note for future reference.

At some point, when we had enough data to trust the corpus sample size, the list
admin would be given the option of turning on the spam filter, which could just
return the same results that the moderation checks do: discard, hold, pass, etc.
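
As a sketch of how that could look: the per-token formula below (good counts
doubled, tokens seen fewer than 5 times scored 0.4, probabilities clamped to
0.01..0.99, the 15 most interesting tokens combined) is my reading of the
article.  The function names, the min_msgs sample-size check and the hold_at
threshold are assumptions of mine, and the action strings merely stand in for
whatever the moderation checks really return.

```python
from math import prod


def token_prob(tok, good, spam, ngood, nspam):
    """Spam probability of one token; assumes ngood > 0 and nspam > 0."""
    g = 2 * good.get(tok, 0)    # good counts doubled to avoid false positives
    b = spam.get(tok, 0)
    if g + b < 5:
        return 0.4              # too rare to judge: mildly innocent default
    bad_rate = min(1.0, b / nspam)
    good_rate = min(1.0, g / ngood)
    return max(0.01, min(0.99, bad_rate / (good_rate + bad_rate)))


def spam_probability(tokens, good, spam, ngood, nspam, top=15):
    """Combine the `top` tokens whose probability is farthest from 0.5."""
    probs = [token_prob(t, good, spam, ngood, nspam) for t in set(tokens)]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top]
    num = prod(probs)
    inv = prod(1.0 - p for p in probs)
    return num / (num + inv)


def decide(tokens, good, spam, ngood, nspam,
           min_msgs=1000, hold_at=0.5, discard_at=0.9):
    """Return 'discard', 'hold' or 'pass' for a tokenized message."""
    if ngood < min_msgs or nspam < min_msgs:
        return "pass"           # corpus too small to trust yet
    p = spam_probability(tokens, good, spam, ngood, nspam)
    if p >= discard_at:
        return "discard"
    if p >= hold_at:
        return "hold"
    return "pass"
```

The "hold" band is there so borderline messages land in front of the moderator
instead of vanishing; a cautious admin could push discard_at up so nothing is
discarded unseen.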

Meanwhile, back at the ranch, the site owner could run a site-level filter if
they liked, independent of the list-level one, which could be used to seed new
lists with a better starting corpus.  (That's why we save the messages: it
allows the site admin to review messages tagged as spam and agree or disagree
on their inclusion, preventing the psycho-moderator syndrome.)
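
One way the site-level side might hang together, again purely as a sketch:
SiteCorpus and its methods are invented names, and a real version would work
from the saved messages instead of pre-split token lists.

```python
from collections import Counter


class SiteCorpus:
    """Hypothetical site-wide spam stats, fed only by reviewed messages."""

    def __init__(self):
        self.spam = Counter()   # token counts the site admin has agreed on
        self.nspam = 0          # number of spam messages accepted site-wide
        self.pending = []       # (list_name, tokens) awaiting site review

    def submit(self, list_name, tokens):
        """A list admin tagged a message as spam; queue it for review."""
        self.pending.append((list_name, list(tokens)))

    def review(self, index, agree):
        """Site admin agrees or disagrees with a list admin's tag."""
        _list_name, tokens = self.pending.pop(index)
        if agree:
            self.spam.update(tokens)
            self.nspam += 1
        return agree

    def seed_new_list(self):
        """Return a copy of the site stats as a new list's starting corpus."""
        return Counter(self.spam), self.nspam
```

Only tags the site admin agrees with reach the shared stats, so one
over-enthusiastic moderator can't poison the corpus handed to new lists.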


_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman-21/listinfo/mailman-developers
