On 10/13, Adam Katz wrote:
PS: As an SA Committer, do I have access to those logs?
Don't think so, but you can just ask for a regular masscheck account if you
don't already have one, and with that account do:
rsync --exclude '*~' -vaz rsync.spamassassin.org::corpus ./
--
I'd rather be happy
On 10/10/2011 9:16 AM, dar...@chaosreigns.com wrote:
On 10/10, Marc Perkel wrote:
On 9/28/2011 8:02 AM, dar...@chaosreigns.com wrote:
On 09/28, Marc Perkel wrote:
You would only have to test the rule combinations that the message
actually triggered. So if it hit 10 rules then it would be
On 9/28/2011 8:02 AM, dar...@chaosreigns.com wrote:
You definitely have a good point that it would only be necessary to
track the combinations that actually show up in emails, however
1024 is only the possible combinations from one set of 10 rules.
The number of combinations in the actual
On 9/28/2011 8:02 AM, dar...@chaosreigns.com wrote:
On 09/28, Marc Perkel wrote:
You would only have to test the rule combinations that the message
actually triggered. So if it hit 10 rules then it would be 1024
combinations. Seems not to be unreasonable to me.
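To illustrate the count above: a message that hits n rules has 2^n subsets of those rules, so 10 hits gives 1024 (counting the empty set). A minimal Python sketch follows; the rule names are placeholders and `rule_combinations` is a hypothetical helper, not part of SpamAssassin:

```python
from itertools import combinations


def rule_combinations(hit_rules):
    """Yield every subset of the rules a message hit, including the empty set."""
    rules = sorted(hit_rules)
    for size in range(len(rules) + 1):
        for combo in combinations(rules, size):
            yield combo


# Ten placeholder rule names (assumed for illustration; not real rules).
hits = [f"RULE_{i}" for i in range(10)]
combo_count = sum(1 for _ in rule_combinations(hits))
print(combo_count)  # 2**10 = 1024
```

The count doubles with each extra rule hit, so this stays cheap per message only while the number of hits is small.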
You definitely have a good
On 10/10, Marc Perkel wrote:
On 9/28/2011 8:02 AM, dar...@chaosreigns.com wrote:
On 09/28, Marc Perkel wrote:
You would only have to test the rule combinations that the message
actually triggered. So if it hit 10 rules then it would be 1024
combinations. Seems not to be unreasonable to me.
On 9/27/2011 9:25 PM, dar...@chaosreigns.com wrote:
On 09/27, Marc Perkel wrote:
Here's the kind of thing I'm seeing. Spam talks about money - low
score. Spam talks about Jesus - low score. Spam talks about money
and Jesus and throw in a dear someone and it's spam. I'm hoping to
detect
On 09/28, Marc Perkel wrote:
You would only have to test the rule combinations that the message
actually triggered. So if it hit 10 rules then it would be 1024
combinations. Seems not to be unreasonable to me.
You definitely have a good point that it would only be necessary to track
the
On 09/28, dar...@chaosreigns.com wrote:
On 09/28, Marc Perkel wrote:
You would only have to test the rule combinations that the message
actually triggered. So if it hit 10 rules then it would be 1024
combinations. Seems not to be unreasonable to me.
combinations in the actual corpora
On 9/25/2011 5:37 PM, RW wrote:
On Sun, 25 Sep 2011 09:28:32 -0700
Marc Perkel wrote:
Here's what I'd like to be able to do. I'd like a program of some
sort where I could take word tokens - like names of rules that were
triggered - and look for rule combinations that indicate spam or ham.
For
On 09/27, Marc Perkel wrote:
Here's the kind of thing I'm seeing. Spam talks about money - low
score. Spam talks about Jesus - low score. Spam talks about money
and Jesus and throw in a dear someone and it's spam. I'm hoping to
detect combinations automatically.
You're not really talking about
Another possibility would be to generate meta rules from random sets of
three rules. Some (actually random) examples:
meta RANDOM_3_A (MPART_ALT_DIFF && GAPPY_SUBJECT && URI_UNSUBSCRIBE)
meta RANDOM_3_B (RCVD_IN_MAPS_OPS && WEIRD_PORT && FSL_FAKE_GMAIL_RCVD)
meta RANDOM_3_C (FB_CAN_LONGER
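Generating such random three-rule metas could be scripted along these lines. This is only a sketch: `random_meta_rules` is a hypothetical helper, the numeric suffix in the generated names is arbitrary, and joining the picked rules with `&&` assumes the meta should fire only when all three hit.

```python
import random


def random_meta_rules(rule_names, count, size=3, seed=None):
    """Return `count` meta-rule lines, each ANDing `size` randomly picked rules."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    lines = []
    for i in range(count):
        picked = rng.sample(rule_names, size)  # distinct rules, no repeats
        lines.append(f"meta RANDOM_{size}_{i} ({' && '.join(picked)})")
    return lines


# Rule names taken from the examples in this message.
names = [
    "MPART_ALT_DIFF", "GAPPY_SUBJECT", "URI_UNSUBSCRIBE",
    "RCVD_IN_MAPS_OPS", "WEIRD_PORT", "FSL_FAKE_GMAIL_RCVD",
]
for line in random_meta_rules(names, count=2, seed=0):
    print(line)
```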
Here's what I'd like to be able to do. I'd like a program of some sort
where I could take word tokens - like names of rules that were triggered -
and look for rule combinations that indicate spam or ham. For example, a
message triggers 4 rules A B C and D. These rules are combined as follows:
A
On Sun, 25 Sep 2011 09:28:32 -0700
Marc Perkel supp...@junkemailfilter.com wrote:
Each rule combo is then looked up for how often it occurs in spam and
how often it occurs in ham. Then the results are combined into some
sort of likelihood of being spam or ham.
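One way to picture that lookup-and-combine step is the Python sketch below. The thread does not say how the results would be combined, so everything here is an assumption: combos are capped at three rules, counts get add-one smoothing, and the per-combo spam/ham ratios are multiplied naive-Bayes style (which ignores that overlapping combos are not independent, and ignores class priors). All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def combos(hit_rules, max_size=3):
    """Yield every non-empty rule combination up to max_size rules."""
    rules = sorted(set(hit_rules))
    for size in range(1, min(max_size, len(rules)) + 1):
        yield from combinations(rules, size)


def train(messages, max_size=3):
    """messages: iterable of (hit_rules, is_spam). Returns (spam_counts, ham_counts)."""
    spam_counts, ham_counts = Counter(), Counter()
    for hit_rules, is_spam in messages:
        target = spam_counts if is_spam else ham_counts
        target.update(combos(hit_rules, max_size))
    return spam_counts, ham_counts


def spam_probability(hit_rules, spam_counts, ham_counts, max_size=3):
    """Combine per-combo spam/ham ratios into one probability via log odds."""
    log_odds = 0.0
    for combo in combos(hit_rules, max_size):
        s = spam_counts[combo] + 1  # add-one smoothing (assumption)
        h = ham_counts[combo] + 1
        log_odds += math.log(s / h)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Toy corpus with made-up rule names, echoing the money/Jesus example.
corpus = (
    [(["MONEY"], False)] * 3
    + [(["JESUS"], False)] * 3
    + [(["MONEY", "JESUS", "DEAR_SOMEONE"], True)] * 2
)
spam_counts, ham_counts = train(corpus)
p_combo = spam_probability(["MONEY", "JESUS", "DEAR_SOMEONE"], spam_counts, ham_counts)
p_single = spam_probability(["MONEY"], spam_counts, ham_counts)
print(p_combo > p_single)
```

On this toy data a single rule scores below 0.5 while the three-rule combination scores close to 1, which is the effect being described.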
We looked at (and even
On Sun, 25 Sep 2011 09:28:32 -0700, Marc Perkel wrote:
Hope you all understand what I'm saying here. How would someone do
something like that?
meta foo ((__a + __b + __c + __d) >= x)
where x is how many of the rules need to hit,
then define __a, __b, __c and __d as body, header, or whatever rules you like to scan for
On Sun, 25 Sep 2011 09:28:32 -0700
Marc Perkel wrote:
Here's what I'd like to be able to do. I'd like a program of some
sort where I could take word tokens - like names of rules that were
triggered - and look for rule combinations that indicate spam or ham.
For example, a message triggers 4