Re: [SAtalk] Spam Corpus

2003-07-28 Thread Bob Apthorpe
Hi, On Sun, 27 Jul 2003 15:53:40 -0700 John Rudd [EMAIL PROTECTED] wrote: On Sunday, Jul 27, 2003, at 12:27 US/Pacific, Nix wrote: On Wed, 23 Jul 2003, Daniel Carrera stipulated: On Thu, Jul 24, 2003 at 12:00:13AM +0100, Nix wrote: Spam actually seems to differ quite a lot between

Re: [SAtalk] Spam Corpus

2003-07-28 Thread Daniel Carrera
I get about 20 bits of spam a day and much more ham than that in mailing list and personal traffic; I can wait 10 days to collect enough spam to train SA (NB: 251 spams since 7/15.) If it takes you more than a week or two to collect enough spam and ham to train Bayes, you don't have much of

Re: [SAtalk] Spam Corpus

2003-07-28 Thread Yorkshire Dave
On Mon, 2003-07-28 at 08:52, Daniel Carrera wrote: What's UBE? I'm sure that the U stands for Un and the E for Email. What's the B for? Bulk or Boilerplate. The other two definitions you'll often see are UCE, where the C means Commercial, and UAE, where the A means Automated.

Re: [SAtalk] Spam Corpus

2003-07-28 Thread Kai MacTane
At 7/28/03 07:41 AM, Yorkshire Dave wrote: On Mon, 2003-07-28 at 08:52, Daniel Carrera wrote: What's UBE? I'm sure that the U stands for Un and the E for Email. What's the B for? Yorkshire Dave has defined the A, B and C. The U is actually Unsolicited. Just in case the above isn't a typo.

Re: [SAtalk] Spam Corpus

2003-07-27 Thread Nix
On Wed, 23 Jul 2003, Daniel Carrera stipulated: On Thu, Jul 24, 2003 at 12:00:13AM +0100, Nix wrote: Spam actually seems to differ quite a lot between individuals, Really? Why would that be the case? I think it depends which spammers' mailing lists you've landed up on.

Re: [SAtalk] Spam Corpus

2003-07-27 Thread John Rudd
On Sunday, Jul 27, 2003, at 12:27 US/Pacific, Nix wrote: On Wed, 23 Jul 2003, Daniel Carrera stipulated: On Thu, Jul 24, 2003 at 12:00:13AM +0100, Nix wrote: Spam actually seems to differ quite a lot between individuals, Really? Why would that be the case? I think it depends which spammers'

Re: [SAtalk] Spam Corpus

2003-07-24 Thread Derek Shaw - spamAssassin testing
Daniel Carrera wrote: On Thu, Jul 24, 2003 at 12:00:13AM +0100, Nix wrote: Spam actually seems to differ quite a lot between individuals, Really? Why would that be the case? The whole point of spam is that it's intended for no one in particular and they do no research to find out if

Re: [SAtalk] Spam Corpus

2003-07-23 Thread Nix
On Tue, 22 Jul 2003, Daniel Carrera stipulated: Hi all, Does anyone have a pool of spam they can lend me? ;) I'm trying to teach SA's Bayesian filter. I have no shortage of ham to give it. I brought the ham pool almost to 500 messages just today. But I only have 60 spams to give it.

Re: [SAtalk] Spam Corpus

2003-07-23 Thread Daniel Carrera
On Thu, Jul 24, 2003 at 12:00:13AM +0100, Nix wrote: Spam actually seems to differ quite a lot between individuals, Really? Why would that be the case? The whole point of spam is that it's intended for no one in particular and they do no research to find out if you are at all likely to

[SAtalk] Spam Corpus

2003-07-22 Thread Daniel Carrera
Hi all, Does anyone have a pool of spam they can lend me? ;) I'm trying to teach SA's Bayesian filter. I have no shortage of ham to give it. I brought the ham pool almost to 500 messages just today. But I only have 60 spams to give it. I