I was surprised to see that the "AWL: Auto-whitelist
adjustment" rule added 31.1 (thirty-one point one!) to
the score of the following email from this very list
server.
That was easily enough to mis-flag it as spam.
I'd appreciate it if someone would explain how an AWL
adjustment is supposed
Good point.
I hadn't considered the transient nature of
blacklists.
Having said that, it seems to me that the content of
spam also changes over time, yet the GA seems to cope
with that.
Perhaps it's just a matter of degree.
If the content of spam is consistent _enough_ to permit
GA scoring,
On Thu:10:01, Matt Sergeant wrote:
> Kingsley G. Morse Jr. wrote:
> > Good point. Combinations of some rules may be more
> > indicative of spam than others.
> >
> > It would be great if the GA could infer the boolean
> > logic, as well as the scores.
>
>
How about sampling the network checks, so that instead
of running 400,000 of them, you only run, say, 500?
It seems to me that sampling a few hundred network
checks would arrive at a better score for them than
hand coding.
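
To make the sampling idea concrete, here is a rough Python
sketch. It is not how SA's scoring scripts work. The names
estimate_hit_rates, corpus and check are hypothetical, and
check stands in for any slow network test. The point is only
that the slow test runs on the sample and not on the whole
corpus.

```python
import random


def estimate_hit_rates(corpus, check, sample_size=500, seed=0):
    """Estimate how often `check` fires on spam and on ham.

    corpus: list of (message, is_spam) pairs (assumed format)
    check:  callable(message) -> bool, a stand-in for a slow network test
    """
    rng = random.Random(seed)
    # Draw a simple random sample without replacement.
    sample = rng.sample(corpus, min(sample_size, len(corpus)))

    hits = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for message, is_spam in sample:
        totals[is_spam] += 1
        if check(message):  # the expensive call, made once per sampled message
            hits[is_spam] += 1

    spam_rate = hits[True] / totals[True] if totals[True] else 0.0
    ham_rate = hits[False] / totals[False] if totals[False] else 0.0
    return spam_rate, ham_rate
```

The two rates could then be turned into a score by whatever
method is used for the other rules. A sample of a few hundred
only gives a rough estimate, so a rule that fires rarely
would need a larger sample.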
My two cents,
Kingsley
Skip Montanaro <[EMAIL PROTECTED]> wrote:
On Wed:17:57, Craig R Hug
Good point. Combinations of some rules may be more
indicative of spam than others.
It would be great if the GA could infer the boolean
logic, as well as the scores.
Thanks,
Kingsley
On Wed:20:45, Tony L. Svanstrom wrote:
> On Wed, 29 May 2002 the voices made Kingsley G. Morse Jr. wr
On Wed:11:43, Rob Winters wrote:
[...]
> SA does not give any credit to the cumulative effect
[...]
It seems to me that properly weighted scores would
avoid this problem. I'd like to think that a good
optimization algorithm, such as a genetic algorithm,
could do the job.
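
As a toy illustration of what I mean, here is a minimal
Python sketch of a genetic algorithm that evolves rule
scores. It is an assumption-laden example, not SA's actual
GA: the threshold of 5.0, the message format (a tuple of rule
indexes plus a spam flag) and the names count_errors and
evolve_scores are all made up for this sketch.

```python
import random

THRESHOLD = 5.0  # assumed spam threshold for this sketch


def count_errors(scores, messages):
    """Count messages misclassified by summing the scores of the rules hit."""
    errors = 0
    for hits, is_spam in messages:
        total = sum(scores[i] for i in hits)
        if (total >= THRESHOLD) != is_spam:
            errors += 1
    return errors


def evolve_scores(messages, n_rules, pop_size=30, generations=40, seed=0):
    """Evolve one score per rule so that few messages are misclassified."""
    rng = random.Random(seed)
    # Start from the all-zero vector plus random candidates.
    population = [[0.0] * n_rules]
    population += [[rng.uniform(-5, 5) for _ in range(n_rules)]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=lambda s: count_errors(s, messages))
        survivors = population[:pop_size // 2]  # best half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: take each score from one parent or the other.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutate a single score with Gaussian noise.
            child[rng.randrange(n_rules)] += rng.gauss(0, 1)
            children.append(child)
        population = survivors + children

    return min(population, key=lambda s: count_errors(s, messages))
```

Because the best half survives each generation unchanged,
the best error count never gets worse from one generation to
the next. Whether it reaches zero depends on the data.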
Thanks,
Kingsley
Hi Craig,
Thanks for explaining why some scores aren't evolved.
I'm an old GA and optimization programmer, so I
naturally find SA's use of a GA pretty interesting.
You suggested using all network tests.
I've installed razor and made sure spamassassin isn't
called with the -L option.
However,
I installed SA 2.20 a few days ago and it's
mis-categorizing more emails than I'd like. I'll
*guess* that it's missing 10% of spams and
mislabelling 1% of my legitimate email as spam.
The obvious explanation is that I'm doing something
wrong, like not using razor or spamd.
However, I noticed th
Daniel,
As an old AI/GA programmer who just started using SA,
I find your post fascinating. Thanks for the update on
your research.
On Mon:22:07, Daniel Quinlan wrote:
[...]
> My only gripe is that having so many rules is somewhat clumsy in the
> scores file, even using arguments. What if spamass