Re: Increasing score based on membership to commercial whitelist

2011-09-26 Thread darxus
On 09/24, David Bennett wrote:
> It occurred to me that a sender that is paying their way into my inbox
> is almost certainly sending me junk mail.   A little research in my
> inbox and it turns out to be right on the money.  All stuff that I
> didn't want. 

I'm very curious what exactly your statistics looked like.  I'll point you
to the spamassassin Rule QA stats that are publicly available:

> # commercial buy-in whitelists (most likely junk)
> score RCVD_IN_BSP_TRUSTED 0.500
> score RCVD_IN_BSP_OTHER 0.500
> score RCVD_IN_BONDEDSENDER 0.500
> score HABEAS_ACCREDITED_COI 0 0.5 0 0.5
> score HABEAS_ACCREDITED_SOI 0 0.25 0 0.25
> score HABEAS_CHECKED 0 0.1 0 0.1

I don't see any of the above in the current spamassassin rules.  What
version of spamassassin are you running?  Anything before 3.3.0 is very
much not recommended.

Ah yes, all but RCVD_IN_BONDEDSENDER were replaced with
RCVD_IN_RP_CERTIFIED in version 3.3.0:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247

And it looks like RCVD_IN_BONDEDSENDER was replaced by RCVD_IN_BSP_OTHER
and RCVD_IN_BSP_TRUSTED some time over four years ago:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5476

I'm guessing you're not actually getting hits on any of these six, and just
added them based on an article that hasn't been updated in four years?

> score RCVD_IN_IADB_VOUCHED 0 0.2 0 0.2
> score RCVD_IN_IADB_DOPTIN 0 0.4 0 0.4
> score RCVD_IN_IADB_ML_DOPTIN 0 0.6 0 0.6

http://ruleqa.spamassassin.org/?daterev=20110924-r1175130-n&rule=%2FRCVD_IN_IADB
  MSECS   SPAM%      HAM%     S/O    RANK   SCORE  NAME                    WHO/AGE
      0   0          0.0117   0.000  0.46   0.00   RCVD_IN_IADB_VOUCHED
      0   0          0.7806   0.000  0.66   0.00   RCVD_IN_IADB_DOPTIN
      0   0          0        0.500  0.45   0.00   RCVD_IN_IADB_ML_DOPTIN

Hit ZERO out of 362,124 spams.  Also hit a pretty insignificant amount
of ham (non-spam).  

> score RCVD_IN_DNSWL_LOW 0 0.1 0 0.1
> score RCVD_IN_DNSWL_MED 0 0.4 0 0.4
> score RCVD_IN_DNSWL_HI 0 0.8 0 0.8

http://ruleqa.spamassassin.org/?daterev=20110924-r1175130-n&rule=%2FDNSWL
  MSECS   SPAM%      HAM%     S/O    RANK   SCORE  NAME                WHO/AGE
      0   0.0003    1.8893   0.000  0.75   0.00   RCVD_IN_DNSWL_HI
      0   0.0224   25.6371   0.001  0.86   0.00   RCVD_IN_DNSWL_MED
      0   0.0376   12.0356   0.003  0.79   0.00   RCVD_IN_DNSWL_LOW
      0   0.2090   21.8867   0.009  0.66   0.00   RCVD_IN_DNSWL_NONE

25.6% of ham hits RCVD_IN_DNSWL_MED.  So you're adding a score of 0.4
to a quarter of your ham, when that rule is only hitting 0.02% of spam
(81 out of 362,124 spams).  And that's just one of the three dnswl rules
you're scoring as bad.
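
To make the arithmetic behind those numbers explicit, here is a minimal
Python sketch. The SPAM% and HAM% figures and the 362,124 total are copied
from the Rule QA rows quoted above. The S/O formula, SPAM% / (SPAM% + HAM%),
is an assumption on my part; it happens to reproduce the S/O column of the
four DNSWL rows, but check the Rule QA documentation before relying on it.
The function names are made up for illustration.

```python
# Figures copied from the Rule QA rows quoted above.
TOTAL_SPAM = 362_124  # spam messages in this mass-check run

# rule name -> (SPAM%, HAM%)
RULES = {
    "RCVD_IN_DNSWL_HI": (0.0003, 1.8893),
    "RCVD_IN_DNSWL_MED": (0.0224, 25.6371),
    "RCVD_IN_DNSWL_LOW": (0.0376, 12.0356),
    "RCVD_IN_DNSWL_NONE": (0.2090, 21.8867),
}


def spam_hits(spam_pct, total=TOTAL_SPAM):
    """Approximate number of spam messages behind a SPAM% figure."""
    return round(total * spam_pct / 100)


def spam_over_overall(spam_pct, ham_pct):
    """Assumed S/O definition: SPAM% / (SPAM% + HAM%)."""
    return spam_pct / (spam_pct + ham_pct)


for name, (spam_pct, ham_pct) in RULES.items():
    hits = spam_hits(spam_pct)
    s_o = spam_over_overall(spam_pct, ham_pct)
    print(f"{name:20s} ~{hits:4d} spams  S/O={s_o:.3f}")
```

For RCVD_IN_DNSWL_MED this gives about 81 spams and an S/O of roughly 0.001,
i.e. nearly everything the rule hits is ham.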

I have pretty graphs of dnswl stats over time here:
http://www.chaosreigns.com/dnswl/
(Chrome renders that badly, firefox renders it well, the
non-standardization pains me.)
The two at the bottom are spam vs. ham numbers in the mass-check corpora,
not specific to dnswl.


I assure you, if there were a test that was causing spam to get through
and was not redeemed by the fact that the overwhelming majority of the
emails it hit were ham (which theoretically reduces false positives, and
that is more important than missing a few spams), the spamassassin
developers would be very interested to hear about it, and would remove it.

If you have that kind of information, please do provide it.

-- 
"If you are not paranoid... you may not be paying attention."
 - j...@creative-net.net, on an IDPA mailing list
http://www.ChaosReigns.com


Re: Increasing score based on membership to commercial whitelist

2011-09-26 Thread Simon Loewenthal
"David F. Skoll"  wrote:

> On Mon, 26 Sep 2011 13:49:36 -0400
> dar...@chaosreigns.com wrote:
>
> > On 09/24, David Bennett wrote:
> > > It occurred to me that a sender that is paying their way into my
> > > inbox is almost certainly sending me junk mail. A little research
> > > in my inbox and it turns out to be right on the money. All stuff
> > > that I didn't want.
> >
> > Disclaimer: I'm a dnswl.org admin, although haven't been active
> > lately. Also, dnswl.org (provider of the data used by RCVD_IN_DNSWL_*
> > rules) doesn't charge anybody for being listed. They only charge
> > very high volume users of the data for use of the data, like Spamhaus
> > and some other major blacklist providers.
>
> As someone listed on dnswl.org, I can confirm this. We did not pay to
> get our domain (roaringpenguin.com) or IP addresses listed. And I
> assume that if we spam, we will be delisted in a hurry.
>
> So please don't automatically assume that we're spammers just because
> we're on dnswl.org. :)
>
> Regards,
>
> David.


My domains are listed on dnswl.org, & I did not pay a penny, although if I had 
then it would have been a penny worth paying for :)

-- 
If you cannot beat them, try to côntrole them.

Re: Increasing score based on membership to commercial whitelist

2011-09-26 Thread Ned Slider

On 26/09/11 19:00, David F. Skoll wrote:

> On Mon, 26 Sep 2011 13:49:36 -0400
> dar...@chaosreigns.com wrote:
>
> > On 09/24, David Bennett wrote:
> > > It occurred to me that a sender that is paying their way into my
> > > inbox is almost certainly sending me junk mail.   A little research
> > > in my inbox and it turns out to be right on the money.  All stuff
> > > that I didn't want.
> >
> > Disclaimer:  I'm a dnswl.org admin, although haven't been active
> > lately. Also, dnswl.org (provider of the data used by RCVD_IN_DNSWL_*
> > rules) doesn't charge anybody for being listed.  They only charge
> > very high volume users of the data for use of the data, like Spamhaus
> > and some other major blacklist providers.
>
> As someone listed on dnswl.org, I can confirm this.  We did not pay to
> get our domain (roaringpenguin.com) or IP addresses listed.  And I
> assume that if we spam, we will be delisted in a hurry.
>
> So please don't automatically assume that we're spammers just because
> we're on dnswl.org. :)
>
> Regards,
>
> David.



Same here.

I'm not generally a big fan of whitelists (at least not someone else's 
whitelists), but the fact I bothered to sign up serves as a good 
indicator of the level of trust I have in dnswl.org - and that's trust 
that has been earned based on what I see within my own mail flow.


Each to their own of course, but I have no issue with dnswl.org nor its
default scoring in SA, which seems very reasonable to me.





Re: Increasing score based on membership to commercial whitelist

2011-09-26 Thread David F. Skoll
On Mon, 26 Sep 2011 13:49:36 -0400
dar...@chaosreigns.com wrote:

> On 09/24, David Bennett wrote:
> > It occurred to me that a sender that is paying their way into my
> > inbox is almost certainly sending me junk mail.   A little research
> > in my inbox and it turns out to be right on the money.  All stuff
> > that I didn't want.

> Disclaimer:  I'm a dnswl.org admin, although haven't been active
> lately. Also, dnswl.org (provider of the data used by RCVD_IN_DNSWL_*
> rules) doesn't charge anybody for being listed.  They only charge
> very high volume users of the data for use of the data, like Spamhaus
> and some other major blacklist providers.

As someone listed on dnswl.org, I can confirm this.  We did not pay to
get our domain (roaringpenguin.com) or IP addresses listed.  And I
assume that if we spam, we will be delisted in a hurry.

So please don't automatically assume that we're spammers just because
we're on dnswl.org. :)

Regards,

David.


Re: Increasing score based on membership to commercial whitelist

2011-09-26 Thread darxus
On 09/24, David Bennett wrote:
> It occurred to me that a sender that is paying their way into my inbox
> is almost certainly sending me junk mail.   A little research in my
> inbox and it turns out to be right on the money.  All stuff that I
> didn't want. 

Disclaimer:  I'm a dnswl.org admin, although haven't been active lately.
Also, dnswl.org (provider of the data used by RCVD_IN_DNSWL_* rules)
doesn't charge anybody for being listed.  They only charge very high
volume users of the data for use of the data, like Spamhaus and some
other major blacklist providers.

Most of what I want to say is that spamassassin generally assigns scores
very carefully calculated to give the most accurate results.  The mass-check
process takes lots of real-world spams and non-spams, and using that
calculates ideal scores for every rule.

The DNSWL rules, however, are among those which have hard-coded (not
"mutable") scores.  I believe this is because it's a problem to get the
mass-check contributors to set their trusted_networks reliably enough
to avoid really messing up the results.

It might help if the submitted mass-check data included a one-way salted
hash of the last untrusted IP, with a different salt per user, so we could
verify that a contributor isn't showing a bunch of spam all from the same
IP address, which would likely mean their trusted_networks is configured
poorly (that reported address is probably a trusted relay).
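
As a rough illustration of that idea (nothing like this exists in
mass-check as far as this thread says), here is a minimal Python sketch. It
assumes HMAC-SHA256 keyed with a per-contributor secret as the "salted
one-way hash"; the function names, the salt value and the sample addresses
(taken from documentation-only ranges) are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def hash_ip(ip: str, salt: bytes) -> str:
    """One-way keyed hash of an IP address (HMAC-SHA256, salt used as key)."""
    return hmac.new(salt, ip.encode("ascii"), hashlib.sha256).hexdigest()


def top_source_share(hashed_ips):
    """Fraction of messages that share the single most common hashed IP."""
    counts = Counter(hashed_ips)
    return counts.most_common(1)[0][1] / len(hashed_ips)


# Hypothetical per-contributor secret; each contributor would keep their own.
salt = b"per-contributor-secret"

# Last untrusted relay per spam message (documentation-only addresses).
last_untrusted = ["192.0.2.10"] * 8 + ["198.51.100.7", "198.51.100.8"]

hashed = [hash_ip(ip, salt) for ip in last_untrusted]
share = top_source_share(hashed)
print(f"largest single source: {share:.0%} of spam")
```

A very large share from one hashed address would be a hint that the address
is really one of the contributor's own relays. Because the salt differs per
user, the same IP hashes differently for different contributors, so the
uploads could not be correlated across users.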


As you can imagine, the mass-check process can always use more contributors
to increase the accuracy of spamassassin.  You are only uploading a list of
the tests each email hit, not the emails themselves, so it's not a privacy
concern.  To contribute:
http://wiki.apache.org/spamassassin/NightlyMassCheck
The dev@ mailing list occasionally requires some harassing to get new
accounts created, which also drives me nuts.


Also, as you can imagine, having your trusted_networks not configured
properly will also screw up the results of many of the network tests.

-- 
"Every normal man must be tempted at times to spit upon his hands,
hoist the black flag, and begin slitting throats."
 - Henry Louis Mencken (1880-1956)
http://www.ChaosReigns.com


Re: unsubscribe

2011-09-26 Thread Benny Pedersen

On Mon, 26 Sep 2011 13:24:36 +, Londen, Michael van wrote:


> PLEASE CONSIDER THE ENVIRONMENT BEFORE PRINTING THIS MESSAGE.


Save trees, don't post HTML :)

Send an email to users-unsubscr...@spamassassin.apache.org, reply to what
you get back, and then you are off the list.


Re:[SOLVED] Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

On 26.09.2011 15:53, Bowie Bailey wrote:
> There is nothing in that sample that would cause the rule to fire.  I
> downloaded it and ran it against my SA and did not get a match for
> MISSING_SUBJECT.  The only thing I can think of is that the headers end
> at the first blank line.  If there is a blank line somewhere in the
> headers, that will cause SA to treat everything below that line as part
> of the body rather than the header.

The email has no body, so there is nothing after the headers.

> Download your sample from pastebin and run it through SA to see if it
> still matches the rule for you.  You may have inadvertently fixed the
> problem when you munged the recipient address prior to uploading the sample.

I downloaded the email from pastebin and nothing changed.

Reason: PEBKAC. I had a redundant backslash in my own rules. I should use
--lint more often.

Sorry for the noise and thanks for your time!
Regards.


Re: Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

On 26.09.2011 15:52, Matus UHLAR - fantomas wrote:
> I don't see other X-Spam headers there. How are you running
> spamassassin? Aren't you using amavis or other software using just
> spamassassin libraries?
>
> Are you sure some 3rd party does not modify mail headers?

No, I don't use any third-party packages; I'm using exim+spamd.
Sorry, I didn't start spamd with the en_US locale, so some headers are
translated. All headers from SA are in X-Spam-Report; the X-Szczegoly
header contains the report of which rules the email hit.


Re: Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Bowie Bailey
On 9/26/2011 9:37 AM, Marcin Mirosław wrote:
> Hello!
> I'd like to ask whether this rule works correctly. I've sent email from
> Thunderbird and Roundcube, and in both cases this rule scores the email. Here
> is a sample email: http://pastebin.com/rVTwNp5X (with a slightly munged
> recipient).
> Rules are in version: 1162027, spamassassin-3.3.2
> Thanks for help.
> Regards.

There is nothing in that sample that would cause the rule to fire.  I
downloaded it and ran it against my SA and did not get a match for
MISSING_SUBJECT.  The only thing I can think of is that the headers end
at the first blank line.  If there is a blank line somewhere in the
headers, that will cause SA to treat everything below that line as part
of the body rather than the header.
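
To illustrate the blank-line point, here is a small Python sketch. It uses
Python's standard email parser, not SpamAssassin's, so it only demonstrates
the general rule that headers end at the first blank line; the two sample
messages are made up.

```python
from email import message_from_string

# Well-formed message: all headers come before the first blank line.
good = (
    "From: a@example.com\n"
    "Subject: hello\n"
    "To: b@example.com\n"
    "\n"
    "body text\n"
)

# Same message with a stray blank line after the first header.
broken = (
    "From: a@example.com\n"
    "\n"
    "Subject: hello\n"
    "To: b@example.com\n"
    "\n"
    "body text\n"
)

good_msg = message_from_string(good)
broken_msg = message_from_string(broken)

print(good_msg["Subject"])       # the Subject header is found
print(broken_msg["Subject"])     # no Subject header: it ended up in the body
print(broken_msg.get_payload())  # body now starts with "Subject: hello"
```

In the second message the parser never sees a Subject header, which is the
kind of input that could make a "missing subject" style check fire.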

Download your sample from pastebin and run it through SA to see if it
still matches the rule for you.  You may have inadvertently fixed the
problem when you munged the recipient address prior to uploading the sample.

-- 
Bowie


Re: Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Matus UHLAR - fantomas

On 26.09.11 15:37, Marcin Mirosław wrote:
> I'd like to ask whether this rule works correctly. I've sent email
> from Thunderbird and Roundcube, and in both cases this rule scores the
> email. Here is a sample email: http://pastebin.com/rVTwNp5X (with a
> slightly munged recipient).
>
> Rules are in version: 1162027, spamassassin-3.3.2

I don't see other X-Spam headers there. How are you running
spamassassin? Aren't you using amavis or other software that uses just
the spamassassin libraries?


Are you sure some 3rd party does not modify mail headers?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Fucking windows! Bring Bill Gates! (Southpark the movie)


Why rule "MISSING_SUBJECT" is fired?

2011-09-26 Thread Marcin Mirosław

Hello!
I'd like to ask whether this rule works correctly. I've sent email from
Thunderbird and Roundcube, and in both cases this rule scores the email. Here
is a sample email: http://pastebin.com/rVTwNp5X (with a slightly munged
recipient).

Rules are in version: 1162027, spamassassin-3.3.2
Thanks for help.
Regards.


unsubscribe

2011-09-26 Thread Londen, Michael van


Kind regards,
Michael van Londen
information & media technology
network administrator
T: +31356714900 (external) / 1234 option 2 (internal)
F: +31356714538
E: michael.vanlon...@akn.nl
W: http://www.akn.nl

Please consider the environment before printing this message.

charset in rules

2011-09-26 Thread Matus UHLAR - fantomas

Hello,

I was trying to write a rule that would lower the effect of the FRT_PENIS1
rule, since it often matches text in Czech/Slovak
(e.g. peníze == money).

I didn't want to zero the score of FRT_PENIS1, because it may still catch
some spam.

I expected that putting UTF-8 text into the body rule, like

body __PENI_NOPENIS  /pen[íě]\s?z/

(i.e. iacute, ecaron)

would help, but this rule does not match on the 3 mails I've
checked.

To my surprise, when I changed the character set used in the rule to
iso-8859-2, it matched (even a very badly formatted HTML mail with HTML
encoding):

body __PENI_NOPENIS  /pen[\xED\xEC]\s?z/

Is this expected behaviour?

My version of SA is 3.3.1 with perl 5.12.3, and LC_CTYPE is set to
sk_SK.utf-8.
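
For what it's worth, the difference can be reproduced outside SA with a
small Python sketch. This assumes the matching is done byte by byte, which
is only my guess at what is going on here and not something confirmed for
SA 3.3.1; the variable names are made up. In UTF-8, "í" and "ě" are two
bytes each, so a character class built from them matches only one byte at a
time.

```python
import re

word = "peníze"
utf8_text = word.encode("utf-8")         # b'pen\xc3\xadze'
latin2_text = word.encode("iso-8859-2")  # b'pen\xedze'

# Rule written with iso-8859-2 byte values (0xED = iacute, 0xEC = ecaron).
latin2_rule = re.compile(rb"pen[\xED\xEC]\s?z")

# Rule whose character class holds the raw UTF-8 bytes of "í" and "ě".
# Byte-wise, the class is [\xC3\xAD\xC4\x9B] and matches one byte only.
utf8_class_rule = re.compile("pen[íě]\\s?z".encode("utf-8"))

# Byte-safe variant: spell each UTF-8 sequence out as an alternative.
utf8_alt_rule = re.compile(rb"pen(?:\xC3\xAD|\xC4\x9B)\s?z")

print(bool(latin2_rule.search(latin2_text)))    # True
print(bool(latin2_rule.search(utf8_text)))      # False
print(bool(utf8_class_rule.search(utf8_text)))  # False
print(bool(utf8_alt_rule.search(utf8_text)))    # True
```

If the text being matched is iso-8859-2 bytes, only the \xED\xEC form can
match; if it is UTF-8 bytes, the multi-byte characters would have to be
written as alternatives rather than inside a character class.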

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Spam is for losers who can't get business any other way.