On 2/1/2011 5:44 AM, John Hardin wrote:
On Mon, 31 Jan 2011, Warren Togami Jr. wrote:

On 01/26/2011 06:48 PM, Daryl C. W. O'Shea wrote:

SPAM: 51330 (150000 required)

Joao Gouveia will soon be requesting an account to join the nightly
masscheck. He has a significant quantity of spam, and hopefully much
of it is in European languages, so it should add to our diversity.

I wonder how scoring will be affected if his corpus is >50k messages?

:)


Yikes. He receives over 1 million spam messages per day. He's figuring out a way to filter out duplicates and take a random sample of ~20k per day for 7 days. But still, that's going to skew us too much.
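
For illustration only, here is a minimal sketch of the deduplicate-then-sample step mentioned above. It is not Joao's actual filter, which the thread does not describe. The function name dedupe_and_sample, the use of a SHA-1 digest of the message body to detect exact duplicates, and the dict-of-days input shape are all assumptions made for the example; "~20k * 7 days" is read here as roughly 20,000 messages per day.

```python
import hashlib
import random


def dedupe_and_sample(messages_by_day, per_day=20000, seed=0):
    """Drop exact-duplicate bodies, then randomly sample up to per_day per day.

    messages_by_day: hypothetical mapping of day label -> list of message bodies.
    Returns a mapping of day label -> list of sampled bodies.
    """
    rng = random.Random(seed)   # seeded so the sample is reproducible
    seen = set()                # digests of bodies already kept, shared across days
    sampled = {}
    for day, bodies in messages_by_day.items():
        unique = []
        for body in bodies:
            digest = hashlib.sha1(body.encode("utf-8")).hexdigest()
            if digest not in seen:
                seen.add(digest)
                unique.append(body)
        # Never ask for more messages than survived deduplication.
        k = min(per_day, len(unique))
        sampled[day] = rng.sample(unique, k)
    return sampled
```

Hashing the exact body only catches byte-identical duplicates; near-duplicates with randomized tokens would need a fuzzier comparison, which this sketch does not attempt.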

Warren
