On Thu, 15 Jan 2004 18:15:40 +0200 (EET) Nerijus Baliunas <[EMAIL PROTECTED]> wrote:
NB> On Thu, 15 Jan 2004 16:27:35 +0100 (Romance Standard Time) Vadim Zeitlin <[EMAIL PROTECTED]> wrote:
NB>
NB> VZ> XN> 3. Iterate with a script over all the messages in the current folder,
NB> VZ> XN> and potentially move them into another folder.
NB> VZ>
NB> VZ> Yes, this should be possible. I didn't try it but I did try other things
NB> VZ> with Python and at least the possibility to write custom filters is very
NB> VZ> useful and they do work. I used this to write a filter to catch all this
NB> VZ> gibberish spam with 3 lines of random words in each message. It is quite
NB> VZ> simple to detect in Python after getting the msg text using MailFolder
NB> VZ> methods
NB>
NB> Could you please post your script here?
Here it is, slightly improved (I had to export a new class to Python
to do this and it took me about 3 minutes in all, including testing on 2
platforms!).

import re
import Message
import MimePart
import MimeType
#import MDialogs

def isgibberish(msg_):
    "Detect if the message is a gibberish spam meant to foil Bayesian filters"
    msg = Message.MessagePtr(msg_)

    # apparently they also appear with other subjects but this form is the
    # most frequent one
    if not re.match("Re: [A-Z]{3,8},", msg.Subject()):
        #MDialogs.Status("Not the right subject form")
        return 0

    # they're also always multipart/alternative with text and html inside
    partTop = msg.GetTopMimePart()
    if partTop.GetType().GetFull() != "MULTIPART/ALTERNATIVE":
        #MDialogs.Status("Not MULTIPART/ALTERNATIVE")
        return 0

    # the text part comes first, as usual, but check for this
    partText = partTop.GetNested()
    if partText.GetType().GetFull() != "TEXT/PLAIN":
        #MDialogs.Status("Not TEXT/PLAIN")
        return 0

    # and they have exactly 3 lines of gibberish in the text part
    if partText.GetNumberOfLines() != 3:
        #MDialogs.Status("Not 3 lines")
        return 0

    # yes, it does look like spam
    return 1

You may decide to skip the subject test; the lines with MDialogs are
there only for debugging. To use this, just put it in a spam.py file somewhere
M can find it and add a filter test with kind == Python and
argument == "spam.isgibberish" (spam is the name of the .py file). The
action may be whatever you want, although tarring and feathering spammers is
unfortunately not supported by M yet :-/
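
If you want to try the same heuristic outside M first, here is a rough
sketch of the four checks using Python's standard email package instead of
M's Message/MimePart classes. The function name looks_like_gibberish and
the SAMPLE message are made up for illustration, and counting non-empty
lines of the raw text part is only my assumption about what
GetNumberOfLines() reports, so treat it as an approximation and not as the
filter M actually runs.

```python
import re
from email import message_from_string

# same subject form as the M filter: "Re: " plus 3 to 8 capitals and a comma
SUBJECT_RE = re.compile(r"Re: [A-Z]{3,8},")

def looks_like_gibberish(raw):
    """Apply the subject/MIME/line-count checks to a raw message string."""
    msg = message_from_string(raw)

    if not SUBJECT_RE.match(msg.get("Subject", "")):
        return False

    # top level must be multipart/alternative (stdlib reports it lowercase)
    if not msg.is_multipart() or \
            msg.get_content_type() != "multipart/alternative":
        return False

    # the first nested part must be the plain text one
    parts = msg.get_payload()
    if not parts or parts[0].get_content_type() != "text/plain":
        return False

    # assumption: "3 lines" means 3 non-empty lines in the text part
    lines = [l for l in parts[0].get_payload().splitlines() if l.strip()]
    return len(lines) == 3

# hypothetical sample message, not a real spam
SAMPLE = "\n".join([
    "Subject: Re: QWERT, hello",
    "MIME-Version: 1.0",
    'Content-Type: multipart/alternative; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "alpha bravo charlie",
    "delta echo foxtrot",
    "golf hotel india",
    "--XYZ",
    "Content-Type: text/html",
    "",
    "<p>whatever</p>",
    "--XYZ--",
    "",
])

if __name__ == "__main__":
    print(looks_like_gibberish(SAMPLE))
```

Returning True/False here is just for convenience in standalone testing; the
M filter above returns 1/0.
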
NB> VZ> and it doesn't make sense to add a test for this to C++ code as in
NB> VZ> a few months such spams probably will have disappeared anyhow.
NB>
NB> Well, who knows? They are designed to fool bayesian filters, so IMHO
NB> they will continue to evolve...
Yes, what I meant was that they were surely going to change, so it doesn't
make sense to put such tests in the main program permanently.
Regards,
VZ
_______________________________________________
Mahogany-Developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/mahogany-developers