I have a lot of messages with GIF spam, and am happy to try out any utilities etc. you have. I am finding that many are classified correctly, but many are not, so I retrain on them. I use mbox format, so all my spam is in one file, but I can separate out individual messages if required.

Peter Barker

On Friday 04 August 2006 18:23, Tony Meyer wrote:
> [Tim Stone]
>
> > I've got a bit of experience with the PIL, if I get a few spare
> > minutes, maybe I'll mess with something. Can someone send me a
> > typical spam image?
>
> Does the recent increase in messages about this indicate that other
> people are interested in trying things out? As mentioned previously,
> I've been working on these for a while now (when I have the time ;))
> and I can put up patches, utilities & result summaries if people are
> interested. I've generated tokens indicating the presence & type of
> images, sizes (width, height, total), and a variety of things using PIL.
>
> Nothing so far has helped more than hurt, and the closest made the
> size of the database balloon. I've been working with my own mail
> mostly, since many of the corpora around are old enough that these
> messages are pretty rare (I really ought to re-export my mail, I
> guess). I also put together a collection of tokenizations as one of
> my submissions for TREC2006 (it did ok, but not great, on the public
> corpora - still waiting on the private results).
>
> =Tony.Meyer
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
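
For anyone who wants to experiment with the kind of image tokens described above on an mbox file, here is a rough sketch using only the Python standard library. It is not Tony's code and not part of SpamBayes: the token names, the power-of-two size bucketing, and the function names are all made up for illustration, and it reads GIF dimensions straight from the file header rather than using PIL.

```python
import mailbox
import struct
from email.message import Message


def gif_dimensions(data: bytes):
    """Return (width, height) from a GIF header, or None if data is not a GIF."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return None
    # The logical screen width and height follow the 6-byte signature
    # as two little-endian unsigned 16-bit integers.
    return struct.unpack("<HH", data[6:10])


def image_tokens(msg: Message):
    """Yield illustrative (hypothetical) tokens for each image part of a message."""
    for part in msg.walk():
        if part.get_content_maintype() != "image":
            continue
        yield "image:present"
        yield "image:type:" + part.get_content_subtype()
        payload = part.get_payload(decode=True) or b""
        # Bucket the byte size by powers of two so the number of distinct
        # tokens stays small (an assumption, to limit database growth).
        yield "image:size:2**%d" % len(payload).bit_length()
        dims = gif_dimensions(payload)
        if dims:
            yield "image:width:%d" % dims[0]
            yield "image:height:%d" % dims[1]


def tokens_for_mbox(path):
    """Map each message in an existing mbox file to its list of image tokens."""
    box = mailbox.mbox(path, create=False)
    return {msg.get("Message-ID", str(key)): list(image_tokens(msg))
            for key, msg in box.iteritems()}
```

Calling tokens_for_mbox("spam.mbox") (the filename is only an example) gives one token list per message, so there is no need to split the mbox by hand first. Exact width and height tokens are probably too fine-grained and would likely need bucketing as well.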
