Re: SOUGHT 2.0

jdow Fri, 05 Dec 2014 12:37:51 -0800

On 2014-12-05 08:28, Axb wrote:

On 12/05/2014 05:20 PM, Kris Deugau wrote:

Axb wrote:

On 12/05/2014 01:15 AM, Ian Zimmerman wrote:

On Thu, 04 Dec 2014 22:41:13 +0100,
Axb wrote:


Axb> To be able to create usable rules, several times/day I need feeds
Axb> to spit *at least* +150k/day. As I don't have the data....

150k of what?  Bytes?  Emails?  Tokens?


Sorry, thought this was obvious...

SOUGHT-type rule generation extracts text strings from spam, so it means
150k+ spams/day.
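
To illustrate why volume matters for this kind of string extraction, here is
a minimal sketch in Python. It is not the actual SOUGHT generator: the
function names (shingles, candidate_phrases), the 4-word phrase length, and
the min_hits threshold are all assumptions made up for the example. It only
shows the general idea of keeping phrases that recur across spam and never
appear in ham.

```python
from collections import Counter


def shingles(text, n=4):
    """Return the set of n-word phrases found in one message body."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def candidate_phrases(spams, hams, n=4, min_hits=2):
    """Return (phrase, spam_count) pairs for phrases seen in at least
    min_hits spam messages and in no ham message (hypothetical helper)."""
    # Collect every phrase that occurs anywhere in the ham corpus.
    ham_seen = set()
    for body in hams:
        ham_seen |= shingles(body, n)

    # Count, per spam message, the phrases that never occur in ham.
    counts = Counter()
    for body in spams:
        counts.update(shingles(body, n) - ham_seen)

    return [(p, c) for p, c in counts.most_common() if c >= min_hits]


if __name__ == "__main__":
    spams = [
        "buy cheap pills online now today",
        "please buy cheap pills online now",
    ]
    hams = ["meeting notes for the team attached"]
    for phrase, hits in candidate_phrases(spams, hams):
        print(hits, phrase)
```

With a small corpus few phrases reach any sensible min_hits, and a thin ham
set lets more false-positive phrases slip through, which is roughly why a
larger feed gives more usable rules.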


It seems to work reasonably well for me with ~2-3K each ham and spam,
and even provides a handful of subrules even with ~225 spam subtype
messages.  (I generate a number of sets of rules with different subtypes
of spam.)

It's probably not nearly as *effective* as it could be with larger
working sets.


Agreed.

... I use about 5-15k from the last 8 hrs (amount varies dramatically) per rule
gen run *for local* use, but that's hardly representative for global coverage.

Add LKML to your large batch of training email and I bet you get
"interesting" results, at best.


And one must always remember that one person's spam is another person's ham.

{o.o}
