Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-23 Thread Eric Abrahamsen
Matus UHLAR - fantomas writes:

>>Eric Abrahamsen wrote:
>>> I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
>>> make it into my inbox otherwise. I see the scoring recently got bumped
>>> to -3.0, which makes false negatives even more likely.
>>>
>>> I'm not expert enough in the nature of spam to really understand why
>>> this test is so strong, nor to feel confident in simply whacking a few
>>> points off it without knowing more.
>>>
>>> In the year or so that I've been running my own mail server, I don't
>>> think I've seen a *single* false positive (at least not one that I
>>> noticed), but get maybe an average of two spam mails into my inbox every
>>> day. I've beefed up the BAYES scores, and that helped, but haven't
>>> tweaked anything else.
>>>
>>> Can anyone tell me why it's scored so heavily? Would it be a bad idea to
>>> just drop it down to -1.5 or something?
>
> On 23.11.16 10:29, Kris Deugau wrote:
>>This is a rule whose usefulness is likely to vary a lot more for your
>>mail stream.
>>
>>Locally, I found it was firing on enough of the reported false-negatives
>>that I squashed it down to a purely advisory -0.001 quite a while ago,
>>and I haven't seen any issues with doing so.
>>
>>I didn't disable it outright as some others do, since it's used in
>>several meta rules.
>
> meta rules should match __RP_MATCHES_RCVD which is exactly the same rule
> - blanking RP_MATCHES_RCVD should make no difference
>
> Thus I (again) recommend blanking it...

Thanks to all of you for the responses! I'll weaken the rule a bit and
see how it goes -- looking at total scores for the spam that makes it
past SA, just a point or two should do it.

It was helpful seeing everyone's thought-process here, thanks again.

E



Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-23 Thread Matus UHLAR - fantomas

>Eric Abrahamsen wrote:
>> I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
>> make it into my inbox otherwise. I see the scoring recently got bumped
>> to -3.0, which makes false negatives even more likely.
>>
>> I'm not expert enough in the nature of spam to really understand why
>> this test is so strong, nor to feel confident in simply whacking a few
>> points off it without knowing more.
>>
>> In the year or so that I've been running my own mail server, I don't
>> think I've seen a *single* false positive (at least not one that I
>> noticed), but get maybe an average of two spam mails into my inbox every
>> day. I've beefed up the BAYES scores, and that helped, but haven't
>> tweaked anything else.
>>
>> Can anyone tell me why it's scored so heavily? Would it be a bad idea to
>> just drop it down to -1.5 or something?

On 23.11.16 10:29, Kris Deugau wrote:
>This is a rule whose usefulness is likely to vary a lot more for your
>mail stream.
>
>Locally, I found it was firing on enough of the reported false-negatives
>that I squashed it down to a purely advisory -0.001 quite a while ago,
>and I haven't seen any issues with doing so.
>
>I didn't disable it outright as some others do, since it's used in
>several meta rules.


meta rules should match __RP_MATCHES_RCVD which is exactly the same rule
- blanking RP_MATCHES_RCVD should make no difference

Thus I (again) recommend blanking it...

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Warning (Slovak original, translated): I wish NOT to receive any advertising mail at this address.
Despite the cost of living, have you noticed how popular it remains? 


Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-23 Thread Kris Deugau
Eric Abrahamsen wrote:
> I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
> make it into my inbox otherwise. I see the scoring recently got bumped
> to -3.0, which makes false negatives even more likely.
> 
> I'm not expert enough in the nature of spam to really understand why
> this test is so strong, nor to feel confident in simply whacking a few
> points off it without knowing more.
> 
> In the year or so that I've been running my own mail server, I don't
> think I've seen a *single* false positive (at least not one that I
> noticed), but get maybe an average of two spam mails into my inbox every
> day. I've beefed up the BAYES scores, and that helped, but haven't
> tweaked anything else.
> 
> Can anyone tell me why it's scored so heavily? Would it be a bad idea to
> just drop it down to -1.5 or something?

This is a rule whose usefulness is likely to vary a lot more for your
mail stream.

Locally, I found it was firing on enough of the reported false-negatives
that I squashed it down to a purely advisory -0.001 quite a while ago,
and I haven't seen any issues with doing so.

I didn't disable it outright as some others do, since it's used in
several meta rules.

-kgd


Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-23 Thread Bill Cole

On 22 Nov 2016, at 17:54, Eric Abrahamsen wrote:


> I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
> make it into my inbox otherwise. I see the scoring recently got bumped
> to -3.0, which makes false negatives even more likely.
>
> I'm not expert enough in the nature of spam to really understand why
> this test is so strong, nor to feel confident in simply whacking a few
> points off it without knowing more.
>
> In the year or so that I've been running my own mail server, I don't
> think I've seen a *single* false positive (at least not one that I
> noticed), but get maybe an average of two spam mails into my inbox every
> day. I've beefed up the BAYES scores, and that helped, but haven't
> tweaked anything else.
>
> Can anyone tell me why it's scored so heavily?


Probably someone more intimate with the RuleQA process can explain it. 
To me it looks too noisy to be scored so strongly, and for years I've 
had it pegged for my systems at -0.3. I suspect that much of the 
non-matching spam is stuff that many sites exclude well ahead of SA, so 
it is not as indicative in production systems as it is in RuleQA.



> Would it be a bad idea to
> just drop it down to -1.5 or something?


In the past 2 years on multiple mail systems I have had no indication of 
any false positives which would have been cured by a stronger ham score 
for RP_MATCHES_RCVD. My reduction to -0.3 was based on the rule 
chronically redeeming a stream of snowshoe spam that was otherwise 
scoring in the ~6 range. Whether and how far you reduce its power should 
be based on your local circumstances, but -1.5 strikes me as probably a 
reasonable & prudent guess in the absence of careful analysis.
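
To make the arithmetic in the paragraph above concrete, here is a minimal Python sketch. It assumes the stock spam threshold of 5.0 (required_score) and a message whose other rules sum to 6.0, roughly the snowshoe case described above. The function names are illustrative only and are not part of SpamAssassin.

```python
# Assumption: SpamAssassin's stock required_score of 5.0; adjust to your setup.
REQUIRED_SCORE = 5.0


def is_spam(other_rules_total, rp_score, required=REQUIRED_SCORE):
    """Return True when the summed score reaches the spam threshold."""
    return other_rules_total + rp_score >= required


def verdicts(other_rules_total, candidate_scores):
    """Map each candidate RP_MATCHES_RCVD score to the resulting verdict."""
    return {score: is_spam(other_rules_total, score) for score in candidate_scores}


if __name__ == "__main__":
    base = 6.0  # hypothetical total of all other rules, as in the "~6 range" case
    for score, spam in verdicts(base, [-3.0, -1.5, -0.3, -0.001]).items():
        label = "spam" if spam else "ham"
        print(f"RP_MATCHES_RCVD {score:>7}: total {base + score:.3f} -> {label}")
```

With these assumed numbers, -1.5 still leaves a message at 6.0 just under the threshold, while -0.3 does not. Real totals vary per message, so the right value depends on the scores seen locally.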


Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-23 Thread @lbutlr
On Nov 22, 2016, at 3:54 PM, Eric Abrahamsen wrote:
> I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
> make it into my inbox otherwise. I see the scoring recently got bumped
> to -3.0, which makes false negatives even more likely.

I do see this in spam, but I see it so much more in ham that I’ve not changed 
the score. The spam that does hit it seems to score very highly in other areas 
(bayes_99 and bayes_999 especially). I see it in a lot of mail that is often 
tagged by the user as spam, but is not actually spam. For example, emails from 
Macy’s or Target which the user did sign up for, but is too lazy to unsubscribe.

But run it against your corpus and adjust the score as needed.
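
One rough way to do that, sketched in Python: compare how often the rule fires in hand-sorted ham versus spam. This assumes each message carries an X-Spam-Status header in the common "tests=RULE_A,RULE_B autolearn=..." layout and that rule names are upper-case. The helper names and the sample header values are hypothetical.

```python
import re

# Captures the tests=... list of an X-Spam-Status value, stopping at the next
# lowercase key=value field (e.g. autolearn=) or at the end of the string.
# Folded headers (newline + tab inside the list) are tolerated.
TESTS_RE = re.compile(r"tests=(.*?)(?=\s+[a-z]+=|$)", re.S)


def rule_names(status):
    """Return the set of rule names listed in one X-Spam-Status value."""
    match = TESTS_RE.search(status)
    if match is None:
        return set()
    return {name for name in re.split(r"[,\s]+", match.group(1)) if name}


def hit_rate(statuses, rule="RP_MATCHES_RCVD"):
    """Fraction of the given header values in which `rule` fired."""
    statuses = list(statuses)
    if not statuses:
        return 0.0
    hits = sum(1 for status in statuses if rule in rule_names(status))
    return hits / len(statuses)


if __name__ == "__main__":
    # Hypothetical header values; in practice, collect them from your own
    # sorted ham and spam folders.
    ham = ["No, score=-3.1 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=ham"]
    spam = [
        "No, score=3.0 required=5.0 tests=BAYES_99,\n\tRP_MATCHES_RCVD autolearn=no",
        "Yes, score=9.2 required=5.0 tests=BAYES_99,URIBL_BLACK autolearn=no",
    ]
    print("ham hit rate: ", hit_rate(ham))
    print("spam hit rate:", hit_rate(spam))
```

If the rule fires about as often in missed spam as in ham, it is not telling the two apart on that mail stream, which would argue for a weaker score.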




Re: Why is RP_MATCHES_RCVD so "heavy"?

2016-11-22 Thread Ian Zimmerman
On 2016-11-22 14:54, Eric Abrahamsen wrote:

> Can anyone tell me why it's scored so heavily? Would it be a bad idea
> to just drop it down to -1.5 or something?

I score it as 0, and I think a number of others on this list (with much
more expertise than me) do the same.

-- 
Please *no* private Cc: on mailing lists and newsgroups
Personal signed mail: please _encrypt_ and sign
Don't clear-text sign: http://cr.yp.to/smtp/8bitmime.html


Why is RP_MATCHES_RCVD so "heavy"?

2016-11-22 Thread Eric Abrahamsen
I get a lot of spam that passes the RP_MATCHES_RCVD test; it wouldn't
make it into my inbox otherwise. I see the scoring recently got bumped
to -3.0, which makes false negatives even more likely.

I'm not expert enough in the nature of spam to really understand why
this test is so strong, nor to feel confident in simply whacking a few
points off it without knowing more.

In the year or so that I've been running my own mail server, I don't
think I've seen a *single* false positive (at least not one that I
noticed), but get maybe an average of two spam mails into my inbox every
day. I've beefed up the BAYES scores, and that helped, but haven't
tweaked anything else.

Can anyone tell me why it's scored so heavily? Would it be a bad idea to
just drop it down to -1.5 or something?

Thanks,
Eric