Re: Dealing with low scoring spam - tighter MTA integration

2009-03-05 Thread Kenneth Porter
--On Thursday, March 05, 2009 10:31 PM +0100 Andrzej Adam Filip wrote:



> I try hard to preach that the SA methodology of creating a "spam score"
> based on weighted tests *CAN* be applied at this point too.
> I would like to apply such tests in a milter (MIMEDefang) that uses SA
> anyway in my installation.


A cheap way of doing it would be to construct an artificial message from 
the information available. One would probably want to use a custom set of 
rules (ie. strip out most of the normal rules that assume a full set of 
headers and a regular body).
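
As an illustration of the "artificial message" idea above, here is a minimal
Python sketch. The function name build_envelope_message, the mta_name default
and the placeholder body are assumptions made for this example; they are not
part of SpamAssassin or MIMEDefang. The resulting text would still have to be
handed to a scanner running a reduced rule set.

```python
from email.message import EmailMessage
from email.utils import formatdate


def build_envelope_message(client_ip, helo, mail_from, rcpt_to,
                           mta_name="mx.example.org"):
    """Build a stand-in message from data known before the DATA command.

    All argument names and the mta_name default are hypothetical.
    """
    msg = EmailMessage()
    # Synthetic Received header so relay-based rules have an IP to inspect.
    msg["Received"] = (
        f"from {helo} ([{client_ip}]) by {mta_name} "
        f"for <{rcpt_to}>; {formatdate()}"
    )
    msg["Return-Path"] = f"<{mail_from}>"
    msg["From"] = mail_from
    msg["To"] = rcpt_to
    # No real body exists yet at this stage, so use a placeholder line.
    msg.set_content("(no body: message not yet transferred)\n")
    return msg.as_string()


if __name__ == "__main__":
    print(build_envelope_message("192.0.2.10", "mail.example.net",
                                 "sender@example.com", "user@example.org"))
```

Rules that expect real headers or a real body would misfire on such a
message, which is why a stripped-down rule set is suggested above.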



> At the "RCPT TO:" stage the following are available:
> * connecting client IP address (last mail hop),
>   so a large part of the DNSBL and DNSWL tests *CAN* be used
> * envelope sender for SPF-based tests
> * envelope sender and envelope recipient for auto white/black listing
>   (producing some kind of grey-listing for the first attempt from
>   a source of unknown reputation)


Instead of running all of SA, perhaps you could just invoke the individual 
plugins from their Perl entry points. I'm not familiar enough with SA's 
architecture to know how practical that is, though.


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-05 Thread Andrzej Adam Filip
James Wilkinson  wrote:

> Andrzej Adam Filip wrote:
>> At "RCPT TO:" stage there are available:
>> * connecting client IP address (last mail hop)
>>   so big part of DNSBL and DNSWL tests *CAN* be used
>> * envelope sender for SPF based tests
>> * envelope sender and envelope recipient for auto white/black listing
>>   (producing some kind of grey-listing based for first attempt from
>>   unknown reputation source)
>
> Are you thinking that it might be good to tie this in to the
> SpamAssassin AWL score? So a sender with an existing low AWL might be
> allowed through even if the sending host gets on one or two DNSBLs?

I want "a platform" allowing many people to contribute
"small improvements", e.g. white-listing based on a combination
of sender address and ASN (or routing prefix).
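
A minimal Python sketch of one such "small improvement": white-listing on the
combination of envelope sender domain and routing prefix. The WHITELIST
entries and the function name is_whitelisted are hypothetical, and a real
deployment would more likely look the prefix or ASN up in routing data than
keep a static table.

```python
import ipaddress

# Hypothetical entries: (envelope sender domain, routing prefix).
WHITELIST = [
    ("example.com", ipaddress.ip_network("192.0.2.0/24")),
    ("example.net", ipaddress.ip_network("2001:db8::/32")),
]


def is_whitelisted(mail_from, client_ip, whitelist=WHITELIST):
    """Return True only if BOTH the sender domain and the prefix match."""
    domain = mail_from.rpartition("@")[2].lower()
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(domain == dom and addr in net for dom, net in whitelist)


if __name__ == "__main__":
    print(is_whitelisted("alice@example.com", "192.0.2.55"))
    print(is_whitelisted("alice@example.com", "198.51.100.1"))
```

Requiring both parts to match means a forged sender address arriving from an
unrelated network gets no benefit from the entry.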

> And you’re missing the possibility of doing reverse DNS lookups, too.

I considered it an obvious derivative of the "connecting client IP address".

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
Seek simplicity -- and distrust it.
  -- Alfred North Whitehead


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-05 Thread James Wilkinson
Andrzej Adam Filip wrote:
> At "RCPT TO:" stage there are available:
> * connecting client IP address (last mail hop)
>   so big part of DNSBL and DNSWL tests *CAN* be used
> * envelope sender for SPF based tests
> * envelope sender and envelope recipient for auto white/black listing
>   (producing some kind of grey-listing based for first attempt from
>   unknown reputation source)

Are you thinking that it might be good to tie this in to the
SpamAssassin AWL score? So a sender with an existing low AWL might be
allowed through even if the sending host gets on one or two DNSBLs?

And you’re missing the possibility of doing reverse DNS lookups, too.

James.

-- 
E-mail: james@ | A: Because people don’t normally read bottom to top.
aprilcottage.co.uk | Q: Why is top-posting such a bad thing?
   | A: Top-posting.
   | Q: What is the most annoying thing in e-mail and usenet?


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-05 Thread Andrzej Adam Filip
Kenneth Porter  wrote:

> --On Thursday, March 05, 2009 7:43 AM +0100 Andrzej Adam Filip
>  wrote:
>
>> What I would like to see is a option to make spam assassin to produce
>> "weighted scores" based on subset of all tests capable to work on subset
>> of the "final data" available *before* message headers&body are
>> transfered in SMTP session.
>
> Before you get the DATA part, you only have the EHLO and envelope. 

At the "RCPT TO:" stage the following are available:
* connecting client IP address (last mail hop),
  so a large part of the DNSBL and DNSWL tests *CAN* be used
* envelope sender for SPF-based tests
* envelope sender and envelope recipient for auto white/black listing
  (producing some kind of grey-listing for the first attempt from
  a source of unknown reputation)
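
To show why the client IP address alone is enough for DNSBL/DNSWL tests, here
is a small Python sketch that builds the DNS name such a lookup queries. The
zone dnsbl.example.org and the function name dnsbl_query_name are
placeholders chosen for this example; the actual lookup (and the meaning of
the returned records) depends on the list being used.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone):
    """Return the DNS name to look up for client_ip in a DNSBL/DNSWL zone."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        # IPv4: reverse the four octets.
        labels = reversed(str(addr).split("."))
    else:
        # IPv6: reverse the 32 hex nibbles of the fully expanded address.
        labels = reversed(addr.exploded.replace(":", ""))
    return ".".join(labels) + "." + zone


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

An address record returned for that name would count as a hit; a
"no such name" answer would mean the address is not listed.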

> Not a real need for a full-blown SA scan at that point.

I try hard to preach that the SA methodology of creating a "spam score"
based on weighted tests *CAN* be applied at this point too.
I would like to apply such tests in a milter (MIMEDefang) that uses SA
anyway in my installation.
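
A minimal Python sketch of what "weighted tests" means at this stage: each
test that fires contributes its weight, and the sum is the score. The test
names and weights in RCPT_WEIGHTS are invented for this example and are not
taken from any SpamAssassin rule set.

```python
# Hypothetical test names and weights, for illustration only.
RCPT_WEIGHTS = {
    "DNSBL_HIT": 3.0,
    "DNSWL_HI": -8.0,
    "SPF_FAIL": 2.5,
    "NO_RDNS": 1.0,
}


def rcpt_score(hits, weights=RCPT_WEIGHTS):
    """Sum the weights of the tests that fired; unknown tests count as 0."""
    return sum(weights.get(name, 0.0) for name in hits)


if __name__ == "__main__":
    print(rcpt_score(["DNSBL_HIT", "SPF_FAIL"]))
    print(rcpt_score(["DNSWL_HI", "NO_RDNS"]))
```

Unlike a pass/fail DNSBL check, a single hit here only moves the score, and a
strong whitelist entry can outweigh it.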

> What rules would  you apply that couldn't be done with a simple Perl
> function?

Is SA not "a simple set of Perl functions"? ;-)

Delivering such functionality via SA would keep the weights in sync
with changing spamming patterns. Some spammers are smart, and many more
are smart enough to follow them, so the quality of the maintenance team
and of its maintenance methodology does make a difference.

> (For lurkers, MIMEDefang allows one to write a Sendmail milter in
> Perl, by providing a C-to-Perl translation layer.)

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
You can't have everything.  Where would you put it?
  -- Steven Wright


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-05 Thread Kenneth Porter
--On Thursday, March 05, 2009 7:43 AM +0100 Andrzej Adam Filip wrote:



> What I would like to see is an option to make SpamAssassin produce
> "weighted scores" based on the subset of all tests that can work on the
> subset of the "final data" available *before* the message headers and body
> are transferred in the SMTP session.


Before you get the DATA part, you only have the EHLO and envelope. Not a 
real need for a full-blown SA scan at that point. What rules would you 
apply that couldn't be done with a simple Perl function? (For lurkers, 
MIMEDefang allows one to write a Sendmail milter in Perl, by providing a 
C-to-Perl translation layer.)





Re: Dealing with low scoring spam - tighter MTA integration

2009-03-04 Thread Andrzej Adam Filip
Kenneth Porter  wrote:

> --On Wednesday, March 04, 2009 4:02 PM +0100 Andrzej Adam Filip
>  wrote:
>
>> May be spamassassin should create set of tests intended for use before
>> replying "RCPT TO:" in SMTP session?
>
> Check out 
>
> MIMEDefang includes SA integration.

I know MIMEDefang and I use it on one installation.

What I would like to see is an option to make SpamAssassin produce
"weighted scores" based on the subset of all tests that can work on the
subset of the "final data" available *before* the message headers and body
are transferred in the SMTP session.

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
Treaties are like roses and young girls -- they last while they last.
  -- Charles DeGaulle


Re: Dealing with low scoring spam - tighter MTA integration [was: 2 + 2 != 4 - Spamassassin needs a new paradigm]

2009-03-04 Thread Kenneth Porter
--On Wednesday, March 04, 2009 4:02 PM +0100 Andrzej Adam Filip wrote:



> Maybe SpamAssassin should create a set of tests intended for use before
> replying to "RCPT TO:" in the SMTP session?


Check out 

MIMEDefang includes SA integration.




Re: Dealing with low scoring spam - tighter MTA integration [was: 2 + 2 != 4 - Spamassassin needs a new paradigm]

2009-03-04 Thread SM

At 07:02 04-03-2009, Andrzej Adam Filip wrote:

> Maybe SpamAssassin should create a set of tests intended for use before
> replying to "RCPT TO:" in the SMTP session?
> [ tests based on: sending IP address, envelope sender, envelope
> recipient, and name in HELO/EHLO ]


SpamAssassin processes the message and returns the result.  The way 
it is designed, it can be integrated in different environments as it 
is MTA agnostic.  The change you propose could be done by introducing 
a new command in the protocol to evaluate the envelope information only.


It would be easier to do all that through a milter as there is less 
overhead.  The downside is that you will get more false positives.


Regards,
-sm 



Re: Dealing with low scoring spam - tighter MTA integration

2009-03-04 Thread Andrzej Adam Filip
John Hardin  wrote:

> On Wed, 4 Mar 2009, Andrzej Adam Filip wrote:
>
>>> This would be an entirely different application, not SA, wouldn't it?
>>
>> It can be developed using the same "spam score" logic, based subset of
>> all tests requiring only the subset of "final data" available during
>> "classic run".
>
> So in other words something like SMTP-time DNSBL tests that score
> points towards rejection rather than being pass/fail? That sounds like
> a good idea.
>
>> I do think that promoting tools that encourage postmaster to care very
>> much about mail server (IP address) reputation can make real difference
>> e.g. caring to be above reputation "none" in DNSWL to avoid grey-listing.
>
> Agreed. But, performing major redesign of SA to achieve this pre-RCPT
> is going to be a tough sell.
>
>>> Well, this probably could be done in SA using a multi-level protocol
>>> capable of returning values at different stages. However, this seems
>>> perfectly suited for a lightweight tool, rather than a hog that is
>>> designed to scan and process entire messages. :)
>>
>> During initial tests/deployment *much* simpler implementation can be
>> used with recommended action based on spam score:
>>
>> It would require redesign of 50_scores.cf structure.
>>  e.g. instead of
>>score RCVD_IN_DNSWL_HI 0 -8 0 -8
>>  something like that
>># N - Network, B - Bayes, nX - no X, R - "RCPT TO:"
>>score RCVD_IN_DNSWL_HI nNnB=0 NnB=-8 nNB=0 NB=-8 R=-8
>>  or shorter
>>score RCVD_IN_DNSWL_HI N=-8 R=-8
>
> Why would SA be served by _major_ modifications like this, rather than
> writing a new milter that focuses on determining the reputation of an
> IP? Are you really willing to break _all_ existing SA installations
> for this?
>
> Please don't try to make SA a "do everything" tool, you'll likely
> weaken what it does an outstanding job of today.

0) Such a _major_ modification means introducing it in the next _major_
   SpamAssassin release, unless it can be made backward compatible,
   e.g. by using a *separate* "score file" for the "RCPT TO:" tests.

   Where there's a will, there's a way.

1) I want milters (MIMEDefang's filtering script in Perl) to use
   SpamAssassin in such a role. I personally prefer such tools from teams
   with a well-established "maintenance reputation". I also believe that
   the SA "score tuning methodology" would fit very well too.
2) In any case, limiting scores to *only* four cases *SHOULD NOT* "stay forever".

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
"All the people are so happy now, their heads are caving in.
I'm glad they are a snowman with protective rubber skin"
  -- They Might Be Giants


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-04 Thread John Hardin

On Wed, 4 Mar 2009, Andrzej Adam Filip wrote:


>> This would be an entirely different application, not SA, wouldn't it?


> It can be developed using the same "spam score" logic, based on the
> subset of all tests that require only the subset of the "final data"
> available during a "classic run".


So in other words something like SMTP-time DNSBL tests that score points 
towards rejection rather than being pass/fail? That sounds like a good 
idea.



> I do think that promoting tools that encourage postmasters to care very
> much about mail server (IP address) reputation can make a real difference,
> e.g. caring to be above reputation "none" in DNSWL to avoid grey-listing.


Agreed. But, performing major redesign of SA to achieve this pre-RCPT is 
going to be a tough sell.



>> Well, this probably could be done in SA using a multi-level protocol
>> capable of returning values at different stages. However, this seems
>> perfectly suited for a lightweight tool, rather than a hog that is
>> designed to scan and process entire messages. :)


> During initial tests/deployment a *much* simpler implementation can be
> used, with a recommended action based on the spam score:
>
> It would require a redesign of the 50_scores.cf structure,
>  e.g. instead of
>    score RCVD_IN_DNSWL_HI 0 -8 0 -8
>  something like this
>    # N - Network, B - Bayes, nX - no X, R - "RCPT TO:"
>    score RCVD_IN_DNSWL_HI nNnB=0 NnB=-8 nNB=0 NB=-8 R=-8
>  or shorter
>    score RCVD_IN_DNSWL_HI N=-8 R=-8


Why would SA be served by _major_ modifications like this, rather than 
writing a new milter that focuses on determining the reputation of an IP? 
Are you really willing to break _all_ existing SA installations for this?


Please don't try to make SA a "do everything" tool, you'll likely weaken 
what it does an outstanding job of today.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Failure to plan ahead on someone else's part does not constitute
  an emergency on my part. -- David W. Barts in a.s.r
---
 4 days until Daylight Saving Time begins in U.S. - Spring Forward


Re: Dealing with low scoring spam - tighter MTA integration

2009-03-04 Thread Andrzej Adam Filip
Karsten Bräckelmann  wrote:

> On Wed, 2009-03-04 at 16:02 +0100, Andrzej Adam Filip wrote:
>> Karsten Bräckelmann  wrote:
>
>> > About 98-99% of my spam in-stream scores as high, that any such proposal
>> > results in a useless increase of the score.
>> >
>> > The problem lies with the LOW scoring spam. Alas, these do not tend to
>> > trigger on a solid subset or meta as you proposed. In particular, RBL
>> > hits are quite rare, even more so for multiple hits. The few rules hit
>> > by low scorers are quite diverse, which complicates this.
>> 
>> May be spamassassin should create set of tests intended for use before
>> replying "RCPT TO:" in SMTP session?
>> [ test based on: sending IP address, envelope sender, envelope
>> recipient, and name in helo/ehlo ]
>
> This would be an entirely different application, not SA, wouldn't it?

It can be developed using the same "spam score" logic, based on the
subset of all tests that require only the subset of the "final data"
available during a "classic run".

I do think that promoting tools that encourage postmasters to care very
much about mail server (IP address) reputation can make a real difference,
e.g. caring to be above reputation "none" in DNSWL to avoid grey-listing.

> Well, this probably could be done in SA using a multi-level protocol
> capable of returning values at different stages. However, this seems
> perfectly suited for a lightweight tool, rather than a hog that is
> designed to scan and process entire messages. :)

During initial tests/deployment a *much* simpler implementation can be
used, with a recommended action based on the spam score:

It would require a redesign of the 50_scores.cf structure,
  e.g. instead of
score RCVD_IN_DNSWL_HI 0 -8 0 -8
  something like this
# N - Network, B - Bayes, nX - no X, R - "RCPT TO:"
score RCVD_IN_DNSWL_HI nNnB=0 NnB=-8 nNB=0 NB=-8 R=-8
  or shorter
score RCVD_IN_DNSWL_HI N=-8 R=-8
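
To make the proposed syntax concrete, here is a Python sketch of a parser that
accepts both the existing four-column form and the keyed form suggested above.
This is only an illustration of the proposal: the keyed syntax does not exist
in SpamAssassin, and the names parse_score_line and LEGACY_KEYS are made up
for this example.

```python
# Labels for the four existing score columns, in file order
# (nX = feature X off; N = network tests, B = Bayes).
LEGACY_KEYS = ("nNnB", "NnB", "nNB", "NB")


def parse_score_line(line):
    """Parse one 'score' line into (rule_name, {scoreset_key: value})."""
    fields = line.split("#", 1)[0].split()  # drop any trailing comment
    if len(fields) < 3 or fields[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    name, values = fields[1], fields[2:]
    if all("=" in value for value in values):
        # Proposed keyed form, e.g. "N=-8 R=-8".
        pairs = (value.split("=", 1) for value in values)
        return name, {key: float(num) for key, num in pairs}
    if len(values) == 1:
        values = values * 4  # a single score applies to all four sets
    if len(values) != 4:
        raise ValueError(f"expected 1 or 4 scores: {line!r}")
    return name, dict(zip(LEGACY_KEYS, map(float, values)))


if __name__ == "__main__":
    print(parse_score_line("score RCVD_IN_DNSWL_HI 0 -8 0 -8"))
    print(parse_score_line("score RCVD_IN_DNSWL_HI N=-8 R=-8"))
```

Accepting both forms side by side is one way the change could stay backward
compatible with existing score files.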

>> Possible "recommended actions":  accept, temporary reject, permanent
>> reject - with choice based on "spam score" *AND* mail source reputation.
>> 
>> Temporary reject in SMTP session should increase chances of DNSBL hits
>> by reducing "blind spot" period of newly created spam sources.
>
> Experience with grey-listing, tempfail or whatever varies wildly given
> the posts to this list. Some do report, that the zombies won't retry
> anyway after being tempfailed once. So a later DNSBL hit after the list
> catching up and DNS propagation may be even irrelevant.

There are "DUL zombies" that effectively do frequent "IP address hopping"
and "static NAT zombies". The former are bigger in number, the latter
produce higher spam volume (IMHO).

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
All the taxes paid over a lifetime by the average American are spent by
the government in less than a second.
  -- Jim Fiebig


Re: Dealing with low scoring spam - tighter MTA integration [was: 2 + 2 != 4 - Spamassassin needs a new paradigm]

2009-03-04 Thread Karsten Bräckelmann
On Wed, 2009-03-04 at 16:02 +0100, Andrzej Adam Filip wrote:
> Karsten Bräckelmann  wrote:

> > About 98-99% of my spam in-stream scores as high, that any such proposal
> > results in a useless increase of the score.
> >
> > The problem lies with the LOW scoring spam. Alas, these do not tend to
> > trigger on a solid subset or meta as you proposed. In particular, RBL
> > hits are quite rare, even more so for multiple hits. The few rules hit
> > by low scorers are quite diverse, which complicates this.
> 
> May be spamassassin should create set of tests intended for use before
> replying "RCPT TO:" in SMTP session?
> [ test based on: sending IP address, envelope sender, envelope
> recipient, and name in helo/ehlo ]

This would be an entirely different application, not SA, wouldn't it?

Well, this probably could be done in SA using a multi-level protocol
capable of returning values at different stages. However, this seems
perfectly suited for a lightweight tool, rather than a hog that is
designed to scan and process entire messages. :)


> Possible "recommended actions":  accept, temporary reject, permanent
> reject - with choice based on "spam score" *AND* mail source reputation.
> 
> Temporary reject in SMTP session should increase chances of DNSBL hits
> by reducing "blind spot" period of newly created spam sources.

Experience with grey-listing, tempfail or whatever varies wildly given
the posts to this list. Some do report, that the zombies won't retry
anyway after being tempfailed once. So a later DNSBL hit after the list
catching up and DNS propagation may be even irrelevant.
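
For readers following the thread, the three "recommended actions" quoted
above could be sketched in Python as follows. The thresholds and the
known_good_source flag are assumptions made for this example; nobody in the
thread proposed specific values.

```python
def recommend_action(score, known_good_source,
                     reject_at=10.0, tempfail_at=3.0):
    """Map an envelope-stage score plus source reputation to an action.

    Threshold defaults are hypothetical and would need local tuning.
    """
    if score >= reject_at:
        return "permanent reject"   # e.g. an SMTP 5xx reply
    if score >= tempfail_at and not known_good_source:
        return "temporary reject"   # e.g. an SMTP 4xx reply, retry later
    return "accept"


if __name__ == "__main__":
    print(recommend_action(4.0, known_good_source=False))
    print(recommend_action(4.0, known_good_source=True))
```

Whether the temporary reject actually helps depends on the retry behaviour
discussed above: a source that never retries is simply dropped, while one
that does retry may have been listed in the meantime.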


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}