Re: "bout u" campaign

David Jones Mon, 17 Jul 2017 10:39:06 -0700

On 07/17/2017 12:03 PM, Jesse Norell wrote:

This description:


On Thu, 2017-07-13 at 15:07 +0100, Martin Gregorie wrote:

I'm continuing to get good results from a multi-level approach:

I use two or more subrules with low scores (0.01 or so) that are
combined by an AND relation in a meta-rule that triggers a suitably
spammy score when all subrules get hits.

The subrules are typically automatically assembled lists of words or
phrases - automatically assembled because that makes maintenance vastly
easier. The list contents are typically words and phrases found in
spam, e.g. one list might be selling phrases such as "get you rocks off
with" that are unlikely to appear in personal or legit commercial mail
and another might be names or slang terms for less common
pharmaceuticals.



and what David Jones has been describing in this thread of identifying
specific combinations of rules (his based on reputation vs. content)
both remind me of the description of Marc Perkel's "evolution filter",
which from memory identified sets of rules which are very indicative of
ham/spam.   Both David and Martin are reporting good success, as did
Marc - maybe worth looking into implementing in spamassassin?

Does masscheck automate meta rule creation? (ie. not just generate
scores)  Not the full "evolution filter" idea which would have to run on
the endpoint, but that would benefit everyone via rule updates.

I have been working on rebuilding the SA project's server the past four months. The first priority was getting the spamassassin.org hidden DNS master active again. This was pretty easy. The second priority was the masscheck processing, which turned out to be pretty time intensive and still could have an open issue, so SA updates are currently on hold.

From what I can tell, the masscheck is only meant to dynamically update the rule scores in 72_scores.cf (manual scores are in 50_scores.cf) and help validate new rules added by the SA developers. It doesn't create new rules. It's not able to create new rules based on content, since the masscheck processing is run locally by each user. The email content is not uploaded to the SA server. Only a special log file showing all of the rules each ham and spam message hit is sent to the SA server.

It would be nice if there were a local tool, as part of the SA project, that would extend the masscheck processing and help build content and meta rules. This would create more interest in masschecking and get more people involved. (I also use my masscheck ham/spam to train my Bayes DB; otherwise it may not have been helpful enough for me to set it up and understand the value of it.) I suspect advanced SA users, like Kevin with his KAM.cf rules and a few others on this list, have something like this that they use to build custom rules in an automated way. Thankfully Kevin publishes his KAM.cf and allows public downloading.
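
A rough sketch of what such a local tool might look like, in Python. Everything here is an assumption made for illustration: the input is a simplified list of (is_spam, rule names) pairs rather than the real masscheck log format, the function names suggest_meta_rules and format_meta_rule are hypothetical, and the rule names, thresholds, and score are made up. It only counts which pairs of rules hit together in spam but not in ham and prints them as candidate meta rules for a human to review.

```python
from collections import Counter
from itertools import combinations


def suggest_meta_rules(messages, min_spam_hits=2, max_ham_hits=0):
    """Return rule pairs that co-occur in spam but (almost) never in ham.

    messages: iterable of (is_spam, rule_names) pairs. This is a simplified,
    assumed input shape, not the real masscheck log format.
    """
    spam_pairs = Counter()
    ham_pairs = Counter()
    for is_spam, rules in messages:
        target = spam_pairs if is_spam else ham_pairs
        # Sort so that (A, B) and (B, A) are counted as the same pair.
        for pair in combinations(sorted(set(rules)), 2):
            target[pair] += 1

    candidates = [
        (pair, hits)
        for pair, hits in spam_pairs.items()
        if hits >= min_spam_hits and ham_pairs[pair] <= max_ham_hits
    ]
    # Most frequent spam pairs first, then alphabetical for stable output.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates


def format_meta_rule(pair, score=3.0):
    """Render one candidate as SpamAssassin meta/score lines.

    The LOCAL_META_ prefix and the default score are illustrative only.
    """
    name = "LOCAL_META_" + "_".join(pair)
    return "meta {0} ({1} && {2})\nscore {0} {3}".format(
        name, pair[0], pair[1], score
    )


if __name__ == "__main__":
    # Made-up sample data: two spam and two ham messages.
    sample = [
        (True, ["SELL_PHRASE", "ODD_PHARMA"]),
        (True, ["SELL_PHRASE", "ODD_PHARMA", "HTML_MESSAGE"]),
        (False, ["HTML_MESSAGE", "SELL_PHRASE"]),
        (False, ["HTML_MESSAGE"]),
    ]
    for pair, hits in suggest_meta_rules(sample):
        print(format_meta_rule(pair))
```

Because it works only on rule names, something like this could run on the same rule-hit data that masscheck already produces locally, without the mail content leaving the machine. Any candidate would still need checking against a larger ham corpus before being given a real score.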

I know that Kevin wants to speed up rule development and SA updates (which could take up to ~40 hours today if they weren't currently on hold) to react faster to new spam, but it will never be fast enough to react to zero-hour spam like other technologies. The best thing you can do is selective greylisting, rate limiting, DCC, Razor, Pyzor, and hope the RBLs catch up quickly. I also have a local ruleset to which I add zero-hour spam so that it is shortcircuited as spam based on content. It does a pretty good job on most new spam and phishing, but some still gets through now and then from compromised accounts.


--
David Jones
