from:"mike"

Does SpamAssassin even have facilities to do that? Don't all rules run 
all the time? SpamAssassin still needs to run all the rules because MTAs 
might have different spam mark / spam delete /etc thresholds than the 
one set in SA.

The number of cycles you're talking about is the same as an RBL lookup 
so I really don't see it as being significant. The DNS service does all 
the heavy lifting and I'm planning to make it public.

On 3/1/2019 5:09 PM, Rupert Gallagher wrote:

Case study:

example.com bans any e-mail sent from its third levels up, and does it 
by spf.

spf-banned.example.com sent mail, and my SA at server.com adds a big 
fat penalty, high enough to bounch it.

Suppose I do not bounch it, and use your filter to check for its 
websites. It turns out that both example.com and 
spf-banned.example.com have a website. Was it worth it to spend cycles 
on it? I guess not. The spf is an accepted rfc and it should have 
priority. So, I recommend the website test to first read the result of 
the SPF test, quit when positive, continue otherwise.

--- ruga

On 3/1/2019 5:09 PM, Rupert Gallagher wrote:

Case study:

example.com bans any e-mail sent from its third levels up, and does it 
by spf.

spf-banned.example.com sent mail, and my SA at server.com adds a big 
fat penalty, high enough to bounch it.

Suppose I do not bounch it, and use your filter to check for its 
websites. It turns out that both example.com and 
spf-banned.example.com have a website. Was it worth it to spend cycles 
on it? I guess not. The spf is an accepted rfc and it should have 
priority. So, I recommend the website test to first read the result of 
the SPF test, quit when positive, continue otherwise.

--- ruga

On Fri, Mar 1, 2019 at 22:31, Grant Taylor <mailto:gtay...@tnetconsulting.net>> wrote:

On 02/28/2019 09:39 PM, Mike Marynowski wrote:
> I modified it so it checks the root domain and all subdomains up to the
> email domain.

:-)

> As for your question - if afraid.org has a website then you are 
correct,

> all subdomains of afraid.org will not flag this rule, but if lots of
> afraid.org subdomains are sending spam then I imagine other spam
> detection methods will have a good chance of catching it.

ACK

afraid.org is much like DynDNS in that one entity (afaid.org themselves
or DynDNS) provide DNS services for other entities.

I don't see a good way to differentiate between the sets of entities.

> I'm not sure what you mean by "working up the tree" - if afraid.org has
> a website and I work my way up the tree then either way eventually I'll
> hit afraid.org and get a valid website, no?

True.

I wonder if there is any value in detecting zone boundaries via not
going any higher up the tree past the zone that's containing the email
domain(s).

Perhaps something like that would enable differentiation between Afraid
& DynDNS and the entities that they are hosting DNS services for.
(Assuming that there are separate zones.

> My current implementation fires off concurrent HTTP requests to the 
root

> domain and all subdomains up to the email domain and waits for a valid
> answer from any of them.

ACK

s/up to/down to/

I don't grok the value of doing this as well as you do. But I think
your use case is enough different than mine such that I can't make an
objective value estimate.

That being said, I do find the idea technically interesting, even if I
think I'll not utilize it.

--
Grant. . . .
unix || die

Re: Spam rule for HTTP/HTTPS request to sender's root domain




On 3/1/2019 4:31 PM, Grant Taylor wrote:
afraid.org is much like DynDNS in that one entity (afaid.org 
themselves or DynDNS) provide DNS services for other entities.


I don't see a good way to differentiate between the sets of entities.


I haven't come across any notable amount of spam that's punched through 
all the other detection methods in place with a reply-to/from email 
address subdomain on a service like that. I'm sure it happens though and 
in that case this filter simply won't add any value.

Re: Spam rule for HTTP/HTTPS request to sender's root domain


On 3/1/2019 1:07 PM, RW wrote:

Sure, but had it turned-out that most of these domains didn't have the A
record necessary for your HTTP test, it wouldn't have been worth doing
anything more complicated.


I've noticed a lot of the spam domains appear to point to actual web 
servers but throw 403 or 503 errors, which A records wouldn't help with 
and has been taken into account here. As for being "more complicated" - 
it's basically done and running in my test environment for final 
tweaking haha, so bit late now :P It was only a day's work to put 
everything together including the DNS service and caching layer, so meh. 
Unless you mean complicated in the sense that it's more technically 
complicated as opposed to effort wise.



You don't need an A record for email. The last time I looked it just
tests that there's enough DNS for a bounce to be received, so an A or
MX for the sender domain.


I'm confusing different tests here, you can disregard my previous message.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Sorry, I meant I thought it was doing those checks because I know I was 
playing with checking A records before and figured the rules would have 
it enabled by default...I tried to find the rules after I sent that 
message and realized that was related to sender domain A record checks 
done in my MTA.


On 3/1/2019 2:26 PM, Antony Stone wrote:

On Friday 01 March 2019 at 17:37:18, Mike Marynowski wrote:


Quick sampling of 10 emails: 8 of them have valid A records on the email
domain. I presumed SpamAssassin was already doing simple checks like that.

That doesn't sound like a good idea to me (presuming, I mean).


Antony.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Looking for an A record on what - just the email address domain or the 
chain of parent domains as well? If the latter, well a lack of A record 
will cause this to fail so it's kind of embedded in.


Quick sampling of 10 emails: 8 of them have valid A records on the email 
domain. I presumed SpamAssassin was already doing simple checks like that.


On 3/1/2019 10:23 AM, RW wrote:

On Wed, 27 Feb 2019 12:16:20 -0500
Mike Marynowski wrote:

Almost all of the spam emails that are
coming through do not have a working website at the room domain of
the sender.

Did you establish what fraction of this spam could be caught just by
looking for an A record?

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Changing up the algorithm a bit. Once a domain has been added to the 
cache, the DNS service will perform HTTP checks in the background 
automatically on a much more aggressive schedule for invalid domains so 
that temporary website problems are much less of an issue and invalid 
domains don't delay mail delivery threads for up to 15s after TTL 
expirations during the initial test period with progressively increasing 
TTLs - queries can always return instantly after the first one, as long 
as the domain has been queried in the last 30 days and is still in cache.


Domains deemed to have "invalid" websites will be rechecked much more 
aggressively in the background to ensure newly queried domains with 
temporary website issues stop tripping this filter as soon as possible. 
There will be a "sliding window" of a few days where temporary website 
issues during the window won't cause the filter to trip, it just needs 
to provide a valid response sometime during the sliding window to stay 
in good standing.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

For anyone who wants to play around with this, the DNS service has been 
posted. You can test the existence of a website on a domain or any of 
its parent domains by making DNS queries as follows:


subdomain.domain.com.httpcheck.singulink.com

So, if you wanted to check if mail1.mx.google.com or any of its parent 
domains have a website, you would do a DNS query with a 30 second 
timeout for:


mail1.mx.google.com.httpcheck.singulink.com

This will check the following domains for a valid HTTP response within 
15 seconds:


mail1.mx.google.com
mx.google.com
google.com

If a valid HTTP response comes back then the DNS query will return 
NXDOMAIN with a 7 day TTL. If no valid HTTP response comes back then the 
DNS query will return 127.0.0.1 with progressively increasing TTLs:


#1: 2 mins
#2: 4 mins
#3: 6 mins
#4: 8 mins
#5: 10 mins
#6: 20 mins
#7: 30 mins
#8: 40 mins
#9: 50 mins
#10: 1 hour
#11: 2 hours
#12+: add 2 hours extra for each attempt up to 24h max

As long as an invalid domain has been queried in the last 7 days, it 
will remain cached and any further invalid attempts will continue to 
progressively increase the TTL according to the rules above. If a domain 
doesn't get queried for 7 days then it drops out of the cache and its 
invalid attempt counter is reset. A valid HTTP response will reset the 
domains invalid counter and a 7 day TTL is returned. Once a domain is in 
the cache, responses are immediate until the TTL runs out and the domain 
is rechecked again.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

You'll be able to decide how you want to prioritize the fields - I've 
implemented it as a DNS server, so which domain you decide to send to 
the DNS server is entirely up to you.


On 2/28/2019 10:23 PM, Grant Taylor wrote:

On 2/28/19 9:33 AM, Mike Marynowski wrote:
I'm doing grabs the first available address in this order: reply-to, 
from, sender.


That sounds like it might be possible to game things by playing with 
the order.


I'm not sure what sorts of validations are applied to the Sender: 
header.  (I don't remember if DMARC checks the Sender: header or not.)


How would your filter respond if the MAIL FROM: and the From: header 
were set to something that didn't have a website, yet had a Sender: 
header with @gmail.com listed before the Reply-To: and 
From: headers?

Re: Spam rule for HTTP/HTTPS request to sender's root domain

I modified it so it checks the root domain and all subdomains up to the 
email domain.


As for your question - if afraid.org has a website then you are correct, 
all subdomains of afraid.org will not flag this rule, but if lots of 
afraid.org subdomains are sending spam then I imagine other spam 
detection methods will have a good chance of catching it.


I'm not sure what you mean by "working up the tree" - if afraid.org has 
a website and I work my way up the tree then either way eventually I'll 
hit afraid.org and get a valid website, no?


My current implementation fires off concurrent HTTP requests to the root 
domain and all subdomains up to the email domain and waits for a valid 
answer from any of them.


On 2/28/2019 10:27 PM, Grant Taylor wrote:

What about domains that have many client subdomains?

afraid.org (et al) come to mind.

You might end up allowing email from spammer.afraid.org who doesn't 
have a website because the parent afraid.org does have a website.


I would think that checking from the child and working up the tree 
would be more accurate, even if it may take longer.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

I'm pretty sure the way I ended up implementing it everything is working 
fine and it's nice and simple and clean but maybe there's some edge case 
that doesn't work properly. If there is I haven't found it yet, so if 
you can think of one let me know.


Since I'm sending an HTTP request to all subdomains simultaneously it 
doesn't really matter if I go one further than the actual root domain. A 
"co.uk" request will come back with no website so there's no need to 
special handle it. For example, if the email address being tested is 
b...@mail1.mx.stuff.co.uk, an HTTP request goes out to:


mail1.mx.stuff.co.uk
mx.stuff.co.uk
stuff.co.uk
co.uk

The last one will always be cached from a previous .co.uk address lookup 
so it won't actually be sent out anyway. If any of them respond with a 
valid website then an OK result is returned.


On 2/28/2019 3:24 PM, Luis E. Muñoz wrote:

This is more complicated than it seems. I have the t-shirt to prove it.

I suggest you look at the Mozilla Public Suffix List at 
https://publicsuffix.org/ — it was created for different purposes, but 
I believe it maps well enough to my understanding of your use case. 
You'll be able to pad the gaps using a custom list.


Best regards

-lem

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Thunderbird normally shows reply-to in normal messages...is this 
something that some MUAs ignore just on mailing list emails or all 
emails? Because I see reply-to on plenty of other emails.


On 2/28/2019 3:44 PM, Bill Cole wrote:

On 28 Feb 2019, at 14:29, Mike Marynowski wrote:

Unfortunately I don't see a reply-to header on your messages. What do 
you have it set to? I thought mailing lists see who is in the "to" 
section of a reply so that 2 copies aren't sent out. The "mailing 
list ethics" guide I read said to always use "reply all" and the 
mailing list system takes care of not sending duplicate replies.


I removed your direct email from this reply and only kept the mailing 
list address, but for the record I don't see any reply-to headers.


But it's right there in the copy that the list delivered to me:

From: "Bill Cole" 
To: users@spamassassin.apache.org
Subject: Re: Spam rule for HTTP/HTTPS request to sender's root domain
Date: Thu, 28 Feb 2019 14:21:41 -0500
Reply-To: users@spamassassin.apache.org

Whether you see it is a function of how your MUA (TBird, it seems... ) 
displays messages. Unfortunately, it has become common for MUAs simply 
ignore Reply-To. I didn't think TBird was in that class.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

There are many ways to determine what the root domain is. One way is 
analyzing the DNS response from the query to realize it's actually a 
root domain, or you can just grab the ICANN TLD list and use that to 
make a determination.


What I'm probably going to do now that I'm building this as a cached DNS 
service is just walk up the subdomains until I hit the root domain and 
if any of them have a website then it's fine.


On 2/28/2019 2:39 PM, Antony Stone wrote:

On Thursday 28 February 2019 at 20:33:42, Mike Marynowski wrote:


But scconsult.com does in fact have a website so I'm not sure what you
mean. This method checks the *root* domain, not the subdomain.

How do you identify the root domain, given an email address?

For example, for many years in the UK, it was possible to get something.co.uk
or something.org.uk (and maybe something.net.uk), but now it is also possible
to get something.uk

So, I'm just wondering how you determine what the "root" domain for a given
email address is.


Antony.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

But scconsult.com does in fact have a website so I'm not sure what you 
mean. This method checks the *root* domain, not the subdomain.


Even if this wasn't the case well, it is what it is. Emails from this 
mailing list (and most well configured lists) come in at a spam score of 
-6, so they are no risk of being blocked even if a non-website domain 
triggers this particular rule.


On 2/28/2019 2:25 PM, Bill Cole wrote:

On 28 Feb 2019, at 13:43, Mike Marynowski wrote:


On 2/28/2019 12:41 PM, Bill Cole wrote:
You should probably put the envelope sender (i.e. the SA 
"EnvelopeFrom" pseudo-header) into that list, maybe even first. That 
will make many messages sent via discussion mailing lists (such as 
this one) pass your test where a test of real header domains would 
fail, while it it is more likely to cause commercial bulk mail to 
fail where it would usually pass based on real standard headers. 
(That's based on a hunch, not testing.)
Can you clarify why you think my currently proposed headers would 
fail with the mailing list? As far as I can tell, all the messages 
I've received from this mailing list would pass just fine. As an 
example from the emails in this list, which header value specifically 
would cause it to fail?


If I did not explicitly set the Reply-To header, this message would be 
delivered without one. The domain part of the From header on messages 
I post to this and other mailing lists has no website and never will.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Unfortunately I don't see a reply-to header on your messages. What do 
you have it set to? I thought mailing lists see who is in the "to" 
section of a reply so that 2 copies aren't sent out. The "mailing list 
ethics" guide I read said to always use "reply all" and the mailing list 
system takes care of not sending duplicate replies.


I removed your direct email from this reply and only kept the mailing 
list address, but for the record I don't see any reply-to headers.


On 2/28/2019 2:21 PM, Bill Cole wrote:
Please respect my consciously set Reply-To header. I don't ever need 2 
copies of a message posted to a mailing list, and ignoring that header 
is rude.


On 28 Feb 2019, at 13:28, Mike Marynowski wrote:


On 2/28/2019 12:41 PM, Bill Cole wrote:
You should probably put the envelope sender (i.e. the SA 
"EnvelopeFrom" pseudo-header) into that list, maybe even first. That 
will make many messages sent via discussion mailing lists (such as 
this one) pass your test where a test of real header domains would 
fail, while it it is more likely to cause commercial bulk mail to 
fail where it would usually pass based on real standard headers. 
(That's based on a hunch, not testing.)


Hmmm. I'll have to give some more thought into the exact headers it 
decides to test. I'm not sure if my MTA puts in envelope info into 
the SA request or not. For sake of simplicity right now I might just 
ignore mailing lists, I don't know. What I do know is that in the 
spam messages I'm reviewing right now, the reply-to / from headers 
set often don't have websites at those domains and none of them are 
masquerading as mailing lists. I haven't thought through the 
situation with mailing lists yet.


I'm new to this whole SA plugin dev process - can you suggest the 
best way to log the full requests that SA receives so I can see what 
info it is getting and what I have to work with?


The best way to see far too much information about what SA is doing is 
to add a "-D all" to the invocation of the spamassassin script. You 
can also add that to the flags used by spamd, if you want to punish 
your logging subsystem

Re: Spam rule for HTTP/HTTPS request to sender's root domain


On 2/28/2019 12:41 PM, Bill Cole wrote:
You should probably put the envelope sender (i.e. the SA 
"EnvelopeFrom" pseudo-header) into that list, maybe even first. That 
will make many messages sent via discussion mailing lists (such as 
this one) pass your test where a test of real header domains would 
fail, while it it is more likely to cause commercial bulk mail to fail 
where it would usually pass based on real standard headers. (That's 
based on a hunch, not testing.)
Can you clarify why you think my currently proposed headers would fail 
with the mailing list? As far as I can tell, all the messages I've 
received from this mailing list would pass just fine. As an example from 
the emails in this list, which header value specifically would cause it 
to fail?

Re: Spam rule for HTTP/HTTPS request to sender's root domain


On 2/28/2019 12:41 PM, Bill Cole wrote:
You should probably put the envelope sender (i.e. the SA 
"EnvelopeFrom" pseudo-header) into that list, maybe even first. That 
will make many messages sent via discussion mailing lists (such as 
this one) pass your test where a test of real header domains would 
fail, while it it is more likely to cause commercial bulk mail to fail 
where it would usually pass based on real standard headers. (That's 
based on a hunch, not testing.)


Hmmm. I'll have to give some more thought into the exact headers it 
decides to test. I'm not sure if my MTA puts in envelope info into the 
SA request or not. For sake of simplicity right now I might just ignore 
mailing lists, I don't know. What I do know is that in the spam messages 
I'm reviewing right now, the reply-to / from headers set often don't 
have websites at those domains and none of them are masquerading as 
mailing lists. I haven't thought through the situation with mailing 
lists yet.


I'm new to this whole SA plugin dev process - can you suggest the best 
way to log the full requests that SA receives so I can see what info it 
is getting and what I have to work with?

Re: Spam rule for HTTP/HTTPS request to sender's root domain

You know what I mean. *Many (not all) of the rules (rDNS verification, 
hostname check, SPF records, etc) are easy to circumvent but we still 
check all that. Those simple checks still manage to catch a surprising 
amount of spam.


I could just not publish this and keep it for myself and I'm sure that 
would make it more effective long term for me, but I figured I would 
contribute it so that others can gain some benefit from it.


If it doesn't become widespread and SpamAssassin isn't interested in 
embedding it directly into their rule checks then that's fine by me, I'm 
not going to cry about it...more spam catching for me and whoever 
decides to install the plugin on their own servers. If it does become 
widespread and some spammers adapt then I'll take solace in knowing I 
helped a lot of people stop at least some of their spam.

* Mike Marynowski:


Everything we test for is easily compromised on its own.

That's quite a sweeping statement, and I disagree. IP-based real time
blacklists, anyone? Also, "we" is too unspecific. In addition to the
stock rules, I happen to maintain a set of custom tests which are
neither published nor easily circumvented. They have proven pretty
effective for us.

-Ralph

Re: Spam rule for HTTP/HTTPS request to sender's root domain





Why even use a test for something that is so easily compromised?
-Ralph


Everything we test for is easily compromised on its own.

Re: Spam rule for HTTP/HTTPS request to sender's root domain


And the cat and mouse game continues :)

That said, all the big obvious "email-only domains" that send out 
newsletters and notifications and such that I've come across in my 
sampling already have placeholder websites or redirects to their main 
websites configured. I'm sure that's not always the case but the data I 
have indicates that's the exception and not the rule.


On 2/28/2019 11:37 AM, Ralph Seichter wrote:

* Antony Stone:


Each to their own.

Of course. Alas, if this gets widely adopted, we'll probably have to set
up placeholder websites (as will spammers, I'm sure).

-Ralph

Re: Spam rule for HTTP/HTTPS request to sender's root domain





I would not do it at all, caching or no caching. Personally, I don't see
a benefit trying to correlate email with a website, as mentioned before,
based on how we utilise email-only-domains.

-Ralph


Fair enough. Based on the sampling I've done and the way I intend to use 
this, I still see this as a net benefit. If you're running an email-only 
domain then you're probably doing some pretty email intensive stuff and 
you should be well-configured enough to the point where a nudge in the 
score shouldn't put you over the spam threshold. If you're a spammer 
just trying to make quick use of a domain and the spam score is already 
quite high but not quite over then this can tip the score over into 
marking it as spam.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Question though - what is your reply-to address set to in the emails 
coming from your email-only domain?


The domain checking I'm doing grabs the first available address in this 
order: reply-to, from, sender. It's not using the domain of the SMTP 
server. I did come across some email-only domain SENDERS in my sampling, 
but the overwhelming majority of reply-to addresses pointed to emails 
with HTTP servers on their domains.


On 2/28/2019 11:14 AM, Ralph Seichter wrote:

* Grant Taylor:


Why would you do it per email? I would think that you would do the
test and cache the results for some amount of time.

I would not do it at all, caching or no caching. Personally, I don't see
a benefit trying to correlate email with a website, as mentioned before,
based on how we utilise email-only-domains.

-Ralph

Re: Spam rule for HTTP/HTTPS request to sender's root domain

Just one more note - I've excluded .email domains from the check as I've 
noticed several organizations using that as email only domains.


Right now the test plugin I've built makes a single HTTP request for 
each email while I evaluate this but I'll be building a DNS query 
endpoint or a local domain cache to make it more efficient before 
putting it into production.

Re: Spam rule for HTTP/HTTPS request to sender's root domain

I've tested this with good results and I'm actually not creating any 
HTTPS connections - what I've found is a single HTTP request with zero 
redirections is enough. If it returns a status code >= 400 then you 
treat it like no valid website, and if you get a < 400 result (i.e. a 
301/302 redirect or a 200 ok) then you can treat it like a valid 
website. You don't even need to receive the body of the HTTP result, you 
can quit after seeing the status.


And yes, as a 100% ban rule this is obviously a bad idea. As a score 
modifier I think it would be highly effective.


I found several "email only" domains in my sampling but all the big ones 
still had landing pages at the root domain saying "this domain is only 
used for serving email" or similar. I'm sure there are exceptions and 
some people will have email only domains, but that's why we don't put 
100% confidence into any one rule.


On 2/27/2019 7:57 PM, Grant Taylor wrote:

On 02/27/2019 03:25 PM, Ralph Seichter wrote:
We use some of our domains specifically for email, with no associated 
website.


I agree that /requiring/ a website at one of the parent domains 
(stopping before traversing into the Public Suffix List) is 
problematic and prone to false positives.


There /may/ be some value to /some/ people in doing such a check and 
altering the spam score.  (See below.)


Besides, I think the overhead to establish a HTTPS connection for 
every incoming email would be prohibitive.


Why would you do it per email?  I would think that you would do the 
test and cache the results for some amount of time.


There is a reason most whitelist/blacklist services use "cheap" DNS 
queries instead.
I wonder if there is a way to hack DNS into doing this for us. I.e. a 
custom DNS ""server (BIND's DLZ comes to mind) that can perform the 
test(s) and fabricate an answer that could then be cached.  ""Publish 
these answers in a new zone / domain name, and treat it like another RBL.


Meaning a query goes to the new RBL server, which does the necessary 
$MAGIC to return an answer (possibly NXDOMAIN if there is a site and 
127.0.0.1 if there is no site) which can be cached by standard local / 
recursive DNS servers.

Spam rule for HTTP/HTTPS request to sender's root domain

2019-02-27 Thread Mike Marynowski


Hi everyone,

I haven't been able to find any existing spam rules or checks that do 
this, but from my analysis of ham/spam I'm getting I think this would be 
a really great addition. Almost all of the spam emails that are coming 
through do not have a working website at the room domain of the sender. 
Of the 100 last legitimate email domains that have sent me mail, 100% of 
them have working websites at the root domain.


As far as I can tell there isn't currently a way to build a rule that 
does this and a Perl plugin would have to be created. Is this an 
accurate assessment? Can you recommend some good resources for building 
a SpamAssassin plugin if this is the case?


Thanks!

Re: Which Net::DSN for SpamAssassin-3.4.1

2016-12-12 Thread Mike Grau


> 
> Net::DNS has had some very good but rather weakly-controlled improvement
> recently, including an API change that got rolled back, so the latest
> (1.06) is probably the best choice (it's what I use.) However, all
> recent versions cause a problem with the released version of SA. The
> patch for that is minor:
> 
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7223
> https://svn.apache.org/viewvc?view=revision&revision=1691992
> 
> --- lib/Mail/SpamAssassin/DnsResolver.pm.orig2015/07/20 18:23:18   
> 1691991
> +++ lib/Mail/SpamAssassin/DnsResolver.pm2015/07/20 18:24:481691992
> @@ -592,6 +592,9 @@
>};
> 
>if ($packet) {
> +# RD flag needs to be set explicitly since Net::DNS 1.01, Bug 7223   
> +$packet->header->rd(1);
> +
># my $udp_payload_size = $self->{res}->udppacketsize;
>  my $udp_payload_size = $self->{conf}->{dns_options}->{edns};
>  if ($udp_payload_size && $udp_payload_size > 512) {


Okay, thanks for the info! -- Mike

Which Net::DSN for SpamAssassin-3.4.1

2016-12-09 Thread Mike Grau

Hello all

I'm confused ... what is the "recommended" version of Net::DNS to use
with an unpatched SpamAssassin-3.4.1? Or are there patches I ought to
apply for, say, Net::DNS 1.06?

Thanks! -- Mike G.

Re: SA From header checks

2016-08-11 Thread Mike Ray

- Original Message -
> On 08/11/2016 06:03 PM, Mike Ray wrote:
> <.snip.>
> 
> >
> >
> > However, after I had sent that message, I decided to play around a
> > bit. I had rearranged existing rules in the file yesterday to make
> > sure that my new rules weren't somehow silently destroying file
> > parsing, but I had never added a new rule that I would have expected
> > work (e.g. rawbody). I added one, ran my same update procedure and
> > found that my new rawbody rule was not working, but my gmail rule
> > was! At this point, I started to work off of Martin's idea that I had
> > screwed up the restart process. I manually started restarting
> > processes and found eventually that I do not need to restart
> > spamassassin, but need to restart amavis instead.
> >
> >
> > At this point, I'm wandering outside of SA territory, but I'll ask
> > anyway. Postfix talks to amavis which uses spamassassin (and clamav).
> > I'd be less surprised if I had to restart both amavis and
> > spamassassin, but it seems very weird that I only have to restart
> > amavis for new rules to start working. Perhaps amavis internally
> > restarts spamassassin? Or perhaps spamassassin is already configured
> > to check local.cf for changes? Anyone have an idea about this?
> 
> Amavis uses SA libraries and doesn't need spamd/spamassassin
> (see Amavis docs)
> 
> If you change any SA file you need to reload via Amavis - anything else
> will be ignored.
> 
> 
> 
> 

That would explain it.

Thanks for the help all!

Re: SA From header checks

2016-08-11 Thread Mike Ray

- Original Message -
> On Wed, 2016-08-10 at 17:04 -0500, Mike Ray wrote:
> > Hello all-
> > 
> > Must be doing something stupid here, but could use a second set of
> > eyes and persons more knowledgeable than myself.
> > 
> > None of my header checks that operate on "From" seem to be working.
> > 
> > SA version 3.4.0-1ubuntu2.1
> > "spamassassin --lint" does not throw any errors
> > "spamassassin --lint -D" shows the rule being parsed (I gave it no
> > description and see the warning).
> > 
> > Rawbody rules or rules on other headers (e.g. Subject) work just
> > fine.
> > 
> > Here is a sample one that I stripped down to the basics just to get
> > it to work, based on a very similar one in the documentation (https:/
> > /wiki.apache.org/spamassassin/WritingRules):
> > 
> > header  PREF_T1  From =~ /gmail\.com/i
> > score   PREF_T1  0.1
> > 
> > I've tried adding a description, setting the score to an integer,
> > removing the regex modifier and adding ".*" to match the whole
> > address with no success.
> > 
> > Anyone see what I'm missing?
> > 
> How is it being executed when its run against a message?
> Where is the file defining it relative to local.cf and what is it
> called?
> 
> Why those questions?
> 
> Here's why: I do all rule development on a different machine to my
> production SA setup. On the development machine I use a call to
> 'spamassassin' to do lint checks, but move the *.cf files etc. to a
> conventional spamd setup on the development system to run tests against
> test messages because:
> (a) that's very similar to my live setup. It uses spamc to submit
>     messages from my spam corpus
> (b) this arrangement gives me better indications of how this rule
>     set will perform on the live system.
> 
> Periodically, I see exactly the same problem you're reporting, but it
> is invariably due to one of two reasons:
> (1) I've not uploaded the new .cf files to where the development spamd
>     expects to find them.
> (2) I did upload the files, but didn't restart the development spamd
>     after doing the upload.
> 
> Under short (< 10 message) test runs spamd will be started by the test
> script and will be stoped when it ends, so the second situation won't
> happen, but if I'm doing something else while a much longer whole-
> corpus test is running and I miss the 'sudo' prompt the test script
> issues when it needs to stop spamd at the end of the test run, sudo
> times out and the test script exits leaving spamd running.
> 
> If I don't notice this and just upload modified .cf file(s) before
> starting another test, spamd won't see any revised rules because its
> still running. This causes more or less exactly the effect you're
> you're seeing: changes to rule(s) seem to be silently ignored.
> 
> 
> Martin
> 
> 
> 

I inadvertently sent Martin a direct message, so I include that here:

"The rules are being put directly in /etc/spamassassin/local.cf, which 
documentation indicates is the proper place for custom rules. I justify it as 
"safe enough" to mutate that "live" rules since I assign such low scores while 
debugging. I am using ansible to manage that file and have it hooked into a 
handler that restarts spamassassin if that file changes, so I am confident that 
is not the issue."

However, after I had sent that message, I decided to play around a bit. I had 
rearranged existing rules in the file yesterday to make sure that my new rules 
weren't somehow silently destroying file parsing, but I had never added a new 
rule that I would have expected work (e.g. rawbody). I added one, ran my same 
update procedure and found that my new rawbody rule was not working, but my 
gmail rule was! At this point, I started to work off of Martin's idea that I 
had screwed up the restart process. I manually started restarting processes and 
found eventually that I do not need to restart spamassassin, but need to 
restart amavis instead. 

At this point, I'm wandering outside of SA territory, but I'll ask anyway. 
Postfix talks to amavis which uses spamassassin (and clamav). I'd be less 
surprised if I had to restart both amavis and spamassassin, but it seems very 
weird that I only have to restart amavis for new rules to start working. 
Perhaps amavis internally restarts spamassassin? Or perhaps spamassassin is 
already configured to check local.cf for changes? Anyone have an idea about 
this?

SA From header checks

2016-08-10 Thread Mike Ray

Hello all-

Must be doing something stupid here, but could use a second set of eyes and 
persons more knowledgeable than myself.

None of my header checks that operate on "From" seem to be working.

SA version 3.4.0-1ubuntu2.1
"spamassassin --lint" does not throw any errors
"spamassassin --lint -D" shows the rule being parsed (I gave it no description 
and see the warning).

Rawbody rules or rules on other headers (e.g. Subject) work just fine.

Here is a sample one that I stripped down to the basics just to get it to work, 
based on a very similar one in the documentation 
(https://wiki.apache.org/spamassassin/WritingRules):

header  PREF_T1  From =~ /gmail\.com/i
score   PREF_T1  0.1

I've tried adding a description, setting the score to an integer, removing the 
regex modifier and adding ".*" to match the whole address with no success.

Anyone see what I'm missing?

Thanks,

Mike Ray

Re: RBL/SPF if header exists

2015-03-31 Thread Mike Cardwell

* on the Tue, Mar 31, 2015 at 12:15:31PM -0400, Joe Quinn wrote:

>>> You can fairly easily write a meta that reverses the score of each RBL
>>> and SPF rule if your condition fires.

>> Any chance you could point me to an example of how to do this?

> Here's an example from when Yahoo's internal Received headers were
> hitting RCVD_ILLEGAL_IP, taken from here:
> http://www.pccc.com/downloads/SpamAssassin/contrib/KAM.cf
> 
> header __KAM_YAHOO_MISTAKE1 From =~ /\@yahoo\./i
> 
> meta KAM_YAHOO_MISTAKE (SPF_PASS && __KAM_YAHOO_MISTAKE1 &&
> RCVD_ILLEGAL_IP)
> describe KAM_YAHOO_MISTAKE Reversing score for some idiotic Yahoo
> received headers
> scoreKAM_YAHOO_MISTAKE -3.0
> 
> This rule undoes RCVD_ILLEGAL_IP, which has a score of 3.0.

Thanks for the example. The only problem with the above is that I believe
I would have to write a rule for every single RBL and keep those rules
up to date whenever a new RBL is added or score updated by upstream.
Is there any way of avoiding that?

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4


signature.asc
Description: Digital signature

Re: RBL/SPF if header exists

2015-03-31 Thread Mike Cardwell

* on the Tue, Mar 31, 2015 at 11:59:39AM -0400, Joe Quinn wrote:

>> Is it possible to enable or disable RBL and/or SPF checks according to
>> the existence or lack of a header?
>> 
>> Without going into too many details, I need a way of transmitting to
>> SpamAssassin at scan-time that it should not run SPF or RBL checks on
>> a particular message, which isn't based on a hardcoded per user or
>> IP setting.

> Do you need the actual testing disabled, or just the score?

Ideally I'd like to disable the tests, but if I can just remove the
score, that would be sufficient.

> You can fairly easily write a meta that reverses the score of each RBL
> and SPF rule if your condition fires.

Any chance you could point me to an example of how to do this?

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4


signature.asc
Description: Digital signature

RBL/SPF if header exists

2015-03-31 Thread Mike Cardwell

Is it possible to enable or disable RBL and/or SPF checks according to
the existence or lack of a header?

Without going into too many details, I need a way of transmitting to
SpamAssassin at scan-time that it should not run SPF or RBL checks on
a particular message, which isn't based on a hardcoded per user or
IP setting.

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4


signature.asc
Description: Digital signature

Re: ancient perl versions

2014-12-05 Thread Mike Grau

On 12/05/2014 09:38 AM, Noel Butler wrote:
> pffft
> 
> I see no problem, as like most developers if you cant reproduce it, then
> its nothing to bother about, after all this time 2 ppl dont like a font
> or whatever, your pissing up the wrong tree if you think I have a care
> factor about changing things when i cant reproduce it. time to move
> along ...

You're reproducing it for me ... e-mails from you have a hard-to-read
small font here also. Not from anyone else - just you.

Re: Hacked Wordpress sites & Cryptolocker

2014-09-05 Thread Mike Grau

>> I'm also getting WP phishing urls that end in "/", like so:
>>
>> ... /wp-includes/logs/
> 
> spample plz?
> 

http://pastebin.com/yBLqTrYP

Re: Hacked Wordpress sites & Cryptolocker

2014-09-05 Thread Mike Grau


> I'm testing versions that insist on .php and am getting very good
> results.  Thanks to the OP for pointing this out!

I'm also getting WP phishing urls that end in "/", like so:

 ... /wp-includes/logs/

Presumably, this is the equivalent of /wp-includes/logs/index.php?

-- Mike G

Re: refusing to untaint

2014-02-27 Thread Mike Grau


> 
>> Please open a new bug.  I'll try and make it a blocker for 3.4.1 if you
>> open it ASAP.
> 
> Done. <https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7020>
> 

For the list - the error appears to have been caused from an old .pre
file that was left in /etc/mail/spamassassin. Removing the .pre files
and re-installing SA eliminated the warning.

No bug. A configuration issue here.

-- Mike G.

Re: refusing to untaint

2014-02-27 Thread Mike Grau


> Please open a new bug.  I'll try and make it a blocker for 3.4.1 if you
> open it ASAP.

Done.

Re: refusing to untaint

2014-02-26 Thread Mike Grau


> Any chance you can try the very small patch in
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7015 and see if
> it's related?

Still the same error after patching:
Feb 26 15:24:07.130 [20964] warn: util: refusing to untaint suspicious
path: "${exec_prefix}/lib"

refusing to untaint

2014-02-26 Thread Mike Grau

Hello,

I've installed SpamAssassin-3.4.0 on a couple of machines via the
tarball and

  perl Makefile.PL
  make
  make test
  make install

When I run a message through spamassassin -t it gives this warning from
Util.pm

  Feb 26 12:19:27.028 [17527] warn: util: refusing to untaint
  suspicious path: "${exec_prefix}/lib"

This is perl 5, version 18, subversion 1 (v5.18.1) built for
x86_64-linux-thread-multi

I'm guessing that the variable ${exec_prefix} should already have been
evaluated? Can someone tell me what might be the problem?

Thanks!
-- Mike

Re: Huh? Variable length lookbehind?

2013-12-27 Thread Mike Grau

> 
> The 'st' is apparently equivalent to some ligature or some other UTF-8
> character, so you end up with an alternation of two different lengths,
> which can't be used for look-behinds.
> 
> Use a character set modifier /a to restrict the matching to ASCII rules.
> Search for "Character set modifiers" in the perlre man page.
> 
> So something like:
>   /(? should do with perl >= 5.14 .
> 
> Better yet, avoid lookbehinds.
> 
>   Mark
> 

Many thanks Marc - both suggestions solve my problem.

-- Mike

Huh? Variable length lookbehind?

2013-12-27 Thread Mike Grau


I have a problem with a rule on a newly installed relay on which --lint
throws a warning on an old local rule, squawking about "Variable length
lookbehind  not implemented". I simplified the rule trying to discover
the problem and it seems to be with the /i modifier:

This rule does _not_ provoke the warning:

  /(?

Re: Help eliminate false positive for Google Code notifications

2013-07-17 Thread Mike Brown

Benny Pedersen wrote:
> its was good since to many still use it :)

In my case it was that the old rulesets were left behind long after the 
updates stopped; they kept getting transferred over through upgrades of 
SpamAssassin and Perl. Once I deleted them, all was well. Well, except that 
more spam started getting through. :)

Re: Site Training via Redirect to a spam and/or ham mailbox

2013-07-11 Thread Mike Brown

W T Riker wrote:
> I suspect someone has already done this somewhere but I can't seem to
> come up with the right key words in my search. I'd like to set up spam
> and ham mailboxes to which all my users can redirect/bounce errors for
> Bayes training for the site. Then I can run sa-learn via cron against a
> single mailbox. Can someone point me to some info on this? Thanks.

You want to know how to configure "aliases" in your MTA, i.e. sendmail, 
postfix, exim, or qmail. They all support files as the destination.
So I suggest doing a web search for the name of your MTA plus "aliases".

See also https://wiki.apache.org/spamassassin/SiteWideBayesFeedback -
at the end it points to a postfix-specific recipe.

Re: Help eliminate false positive for Google Code notifications

2013-07-11 Thread Mike Brown

Axb wrote:
> SARE rules are obsolete/unsupported/ancient/history/etc and shouldn't be 
> used.
> Do yourself a favour and remove those files - will save you CPU cycles, 
> memory and lots of headaches.

Heh, even easier than I thought.

I think I had assumed that if I stopped fetching them, I wouldn't have them 
anymore, especially after upgrading Spamassassin. But they stayed and got 
copied over from upgrade to upgrade.

Thanks!

Help eliminate false positive for Google Code notifications

2013-07-11 Thread Mike Brown

Google Code sends out notifications from @googlecode.com. These 
notifications have Message-ID headers that start with two digits and a dash, 
triggering this rule:

SARE_MSGID_DDDASH Message-ID has ratware pattern (9-, 9$, 99-)

The rule was proposed in 2004:
https://mail-archives.apache.org/mod_mbox/spamassassin-users/200402.mbox/%3c20040204190450.9b96217...@jmason.org%3E

A sample Message-ID (I have an issue starred in the Android project):
<46-1531741276455824-7215198307142895543-android=googlecode@googlecode.com>

Complete mbox message at http://pastebin.com/W5cN4DFd

The false positive is not contributing much to the score (1.666), but I don't 
like it, so I'd like to avoid triggering the rule altogether if I can. I want 
to do it in the preferred way, if there is a preferred way. Any solution I 
would come up with would be pretty kludgy. So, suggestions appreciated! Thanks.

Re: sa-update: MIRRORED.BY is 404 for any channel

2013-06-12 Thread Mike Brown

Martin wrote:
> Do you have a MIRRORED.BY file in you spamassassin update directory? It 
> looks like it doesn't have the file with the mirrors in and instead is using 
> the file name.
> 
> If so you could copy it over from your other box that's working.
> 

Thanks; your suggestion worked.

The way MIRRORED.BY files get used and updated(?) is a complete mystery to me. 

There was nothing in the old file that was wrong, syntax-wise. It was the same 
as the current, working one, but with the last line (the secnap mirror) 
commented out by me after seeing something about that site having problems a 
while back. How this difference results in the 404s I was seeing is not at all 
obvious.

Re: sa-update: MIRRORED.BY is 404 for any channel

2013-06-11 Thread Mike Brown

John Wilcock wrote:
> > Jun 11 00:05:07.327 [43091] dbg: http: GET 
> > http://spamassassin.apache.org/updates/MIRRORED.BY"; request failed, 
> > retrying: 404 Not Found:  
> >  404 Not Found  Not Found 
> > The requested URL /updates/MIRRORED.BY" was not found on this 
> > server.  Apache/2.4.4 (Unix) OpenSSL/1.0.1e Server at 
> > spamassassin.apache.org Port 80 
> > (repeat 3X)
> 
> Note the trailing quote marks on those two URLs. I've no idea where they 
> came from, but it could well be a simple config error...

Whoa, you're right. I'm so used to treating quotes as not part of a URL, I 
didn't even see they're part of what's being requested.

I note they appear in the bugzilla report I referred to, as well:
https://issues.apache.org/SpamAssassin/show_bug.cgi?format=multiple&id=6914

MIRRORED.BY isn't something I fetch myself; I just run sa-update with various 
options set. So where are the quotes coming from?

sa-update: MIRRORED.BY is 404 for any channel

2013-06-11 Thread Mike Brown

I'm running 3.3.2 on two FreeBSD 8.3 systems on different networks. Both
systems are configured roughly identically with regard to SpamAssassin. One
system runs Perl 5.16 (not sure if that matters) and can run sa-update without
error, but the other runs Perl 5.12 and gets 404s when it tries to update
MIRRORED.BY for any channel. Well, updates.spamassassin.org or
sought.rules.yerp.org are the ones I tried, at least:

Jun 11 00:35:14.769 [53689] dbg: http: GET http://yerp.org/rules/MIRRORED.BY";
request failed, retrying: 404 Not Found: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";>
http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
404 - Not Found 404 - Not Found

(repeat 3X)

Jun 11 00:05:07.327 [43091] dbg: http: GET
http://spamassassin.apache.org/updates/MIRRORED.BY"; request failed, retrying:
404 Not Found:
404 Not Found Not Found The requested
URL /updates/MIRRORED.BY" was not found on this server.
Apache/2.4.4 (Unix) OpenSSL/1.0.1e Server at spamassassin.apache.org
Port 80
(repeat 3X)

Same thing happens even if I try only updates.spamassassin.org without gpg.

These 404 errors look just like those in bug 6914, but neither of my systems
are IPv6-only. IPv6 isn't routing on either system (ping6 -c3 ::1 is the only
thing that works).

Bug 6838 was no help, either, because the mirrors in question are alive and
like I said, the one system can sa-update without any issues.

On both systems I get the same DNS results, nothing out of the ordinary. And
when running curl or whatever, I can (via IPv4) fetch the MIRRORED.BY file
from the same URLs just fine, as well as talk to the mirror sites listed
therein. So connecting to the sites shouldn't be an issue.

Sorry if this is a FAQ, user error, or typical newbie configuration oversight,
but I searched for answers for quite a while before resorting to posting here.
Any assistance would be much appreciated. How can I further diagnose this?

Re: X-Relay-Countries

2013-02-12 Thread Mike Grau


> 
> Hmm  I would do something like this (untested):
> 
> header RELAY_NOT_US X-Relay-Countries =~ /\b(?!US)[A-Z]{2}\b/
> 

I've had to use, IIRC.
X-Relay-Countries =~ /\b(?!US|XX)([A-Z]{2})\b/

Re: Question about TRACKER_ID

2013-02-08 Thread Mike Grau

Martin Gregorie wrote:
> On Fri, 2013-02-08 at 13:26 -0600, Mike Grau wrote:
>> Hello folks.
>>
>> In 20_body_tests.cf (SA 3.3.2) there is this rule:
>>
>> body TRACKER_ID   /^[a-z0-9]{6,24}[-_a-z0-9]{12,36}[a-z0-9{6,24}\s*\z/is
>>
>> What is the "\z" in the regex?
>>
> According to the O'Reilly Camel Book, "Programming Perl", \z matches the
> last character in a string but offers no additional help. Many

Thanks Martin,

But, is that not "\Z" (upper case)?

Question about TRACKER_ID

2013-02-08 Thread Mike Grau

Hello folks.

In 20_body_tests.cf (SA 3.3.2) there is this rule:

body TRACKER_ID   /^[a-z0-9]{6,24}[-_a-z0-9]{12,36}[a-z0-9{6,24}\s*\z/is

What is the "\z" in the regex?

This rule matches "". Is that as
intended?

Thanks!
-- Mike

Re: KB_FAKED_THE_BAT

2012-05-14 Thread Mike Grau


>>
>> # grep Date: HEADERS | od -a
>> 000   D   a   t   e   :  sp  ht   T   h   u   ,  sp   3  sp   M   a
>> 020   y  sp   2   0   1   2  sp   1   6   :   5   3   :   5   9  sp
>> 040   +   0   7   0   0  nl
>> 046vi H*
>>
>> This has been Russian language spam (charset koi8-r) with various
>> flavors of X-Mailer: The Bat!
> 
> What version of SpamAssassin are you running?  Here's a note from that
> rule's definition (rulesrc/sandbox/kb/20_header.cf):
> 
> # NOTE  Depends on some header rule code fixes for 3.3.x to remove
> #   the leading space that was showing up in header rules.  For
> #   3.2.x releases the pattern must be changed to /^ \t/.
> 
> Karsten:  Maybe change it to   /^ ?\t/   as a workaround?
> (Yes, I know we've stopped supporting sa3.2.x)

In 3.3.2
/var/lib/spamassassin/3.003002/updates_spamassassin_org
# grep  __KB_DATE_CONTAINS_TAB 72_active.cf

header   __KB_DATE_CONTAINS_TAB  Date:raw =~ /^\t

KB_FAKED_THE_BAT

2012-05-03 Thread Mike Grau

Hello all,

Just an FYI ...

The meta rule in 72_active.cf "KB_FAKED_THE_BAT" is getting circumvented
here because the meta rule component

 header   __KB_DATE_CONTAINS_TAB  Date:raw =~ /^\t

is being evaded by spam that now has a space character before the tab:

# grep Date: HEADERS | od -a
000   D   a   t   e   :  sp  ht   T   h   u   ,  sp   3  sp   M   a
020   y  sp   2   0   1   2  sp   1   6   :   5   3   :   5   9  sp
040   +   0   7   0   0  nl
046vi H*

This has been Russian language spam (charset koi8-r) with various
flavors of X-Mailer: The Bat!

-- Mike G.

How to get spam score by Windows command-line

2011-11-14 Thread Mike Koleszar

Hi all, I would like to put together a script that will show me the spam 
score of emails that come in. I was hoping that someone could help push 
me in the right direction to do this. I'm hoping there is a simple way 
to do this, using one of the SpamAssassin exe files over command-line. I 
appreciate any advice or suggestions. Thank you.


- Mike

Re: RP_MATCHES_RCVD

2011-07-28 Thread Mike Grau


On 07/28/2011 09:28 AM the voices made RW write:

There seems to be a consensus that SPF and DKIM passes aren't worth
significant scores. So how is it that RP_MATCHES_RCVD, scores -1.2 when
it just a circumstantial version of what SPF does explicitly.

For me it's hitting more spam that ham, and what's worse, it's mostly
hitting low-scoring freemail spam. Is it just me that's seeing this, or
is there maybe  some kind of bias the test corpora?




+1

RP_MATCHES_RCVD hits tons of (snowshoe?) spam here. Different senders 
different IPs, but often the same /16 or /24 networks. I had some local 
meta rules that used T_RP_MATCHES_RCVD, but evidently the name was 
changed to RP_MATCHES_RCVD and the spam started flying in.

Excessive junk mail even after upgrade/update

2011-01-04 Thread Mike Gibson

I have recently inherited a web server with roughly 50 clients.  Last week I 
started getting complaints about excessive amounts of junk mail being 
delivered.  I upgraded my SpamAssassin Rules, Clam AV, MailScanner, and 
SpamAssassin Engine (3.2.5 à 3.3.1), in that order.  At first, this seemed to 
work.  Customers reported only receiving a few junk mail the following day; 
however, since then, they have begun receiving hundreds of junk messages again 
each day.  My server is running RHEL5 and the Ensim Pro 10.3 Interface, if that 
makes a difference.

 

My first question is, how can I verify that I installed SpamAssassin correctly 
and that it is scanning and marking messages properly.  The server shows the 
service as running, but this is all I have found so far to indicate whether it 
is working or not.

 

Second, are there any steps that I should have taken after the installs, or 
perhaps did I install things in the wrong order?  

 

Lastly, I have to manually start the service from the console each time the 
server is rebooted.  It used to start automatically, and allowed me to restart 
the service from the Web GUI if needed.  How can I get it to start 
automatically again?  If I try to start it from the GUI, it returns an error of 
"Unknown option" and provides a list of switch options.  I know it's trying to 
start the correct service from the options listed, but what might be causing it 
not to starting?

 

I am still relatively new to web administration; so, any help is greatly 
appreciated.

 

Thank you,

 

Mike Gibson

Sr. Network Engineer

Select Tel Systems, Inc.

229.434.0540

 

 

Select Tel Systems
On Time, Done Right, Guaranteed!
(229) 434-0540

some custom rules query

2010-11-16 Thread Mike Bro

Hello,

Just wanted to create (or use if they already exist) some rules:
1. Email body contains more than 10 newliners without any text between them
2. Email body contains less than 4 characters
3. Email body contains only a line of text and a line with some URL

Any help appreciated.
Regards,
Mike

Re: Checking envelope sender

2010-09-08 Thread Mike Bro

Hi Bowie,

You wrote:
> The .qf file is not visible to SpamAssassin.  SA only looks at the email
> and headers.  If you want to reject/score based on the envelope sender,
> you will need to either do it at the MTA level or find out if sendmail
> puts the information into a header that SA can see.

Thanks for this information. I just wasn't completely sure that's the case.
Any idea how I can politely (in .mc) ask sendmail to put the whole
MAIL FROM line into
a header?

Re: Checking envelope sender

2010-09-08 Thread Mike Bro

Thanks for your interest in this topic. The part of mail.log and the
qf file is at:
http://pastebin.com/0QzqLxs1

This particular example has been marked as spam, but the sender's
information didn't play a role in this classification.

Re: Joseph Brennan:
> Why doesn't sendmail reject it like it does here? (..) .. Domain name 
> required for sender address
I cannot afford rejecting all null senders as those could be
legitimate Delivery Status Notification messages.

What I am looking is a pattern for line:
MAIL FROM: <"do not mock at your poetenncy - bujyj vjaqrra ppislls" <>>
while I want to allow:
MAIL FROM: <>

So any ideas are appreciated whether on sendmail or spamassassin level.

Checking envelope sender

2010-09-07 Thread Mike Bro

Hello,

Enviroment:
latest sendmail and latest spamassassin

I am just trying to fight with spammer that used to send too many emails.
The pattern I discovered is that during smtp communication with my
incoming mail server in from field he puts something like:
MAIL FROM: <"some rubbish words" <>>

That results in my qf... file as line:
S<"some rubbish words" <>>

Any idea how I could write a rule in spamassassin to test this line?

Thanks in advance,
Mike

Re: Calling SpamAssassin from a Perl Web Form

2010-08-12 Thread Mike Tonks

> I've yet to hear anyone implementing SA for forms in a sensible manner..

Thanks for the feedback.  If people have tried before it's unlikely
I'll do much better :)

> It would make much more sense to me to just apply well known form spam
> specific checks into your code. The standard captcha, too-many-links, bad
> keywords etc. Lots of information for that around. I have a hard time
> believing SA default rules would catch anything serious.

One of the main attractions is the Bayes Learning stuff, which seems
to be nicely implemented (and non trivial) and the URI Blacklist
lookup, plus bad word lists & the general scoring system which is
really cool, plus the ability to tweak the setting and add rules via
plugins, etc.

It seems that SA is fairly mature and has a good user base, so it
would seem a nice idea to hook into this rather than reinventing these
components.

Any suggestions how to achieve this either via SA or otherwise
(existing CPAN modules?) would be much appreciated, if anyone here is
knowledgeable in this area.

Yes, there will be some captcha stuff too but I see that as separate
to the actual 'identifying spam content' issue.

cheers for any help.

mike

On 5 August 2010 15:38, Henrik K  wrote:

>
>
> You can easily make SA-like network checks, just parse URIs from messages
> and check few URIBLs. Check sender IP from applicable lists (not dial-up
> ones) etc.. at simplest it's some regexp and gethostbyname calls.
>
>

Calling SpamAssassin from a Perl Web Form

2010-08-05 Thread Mike Tonks

Hi folks,

I'm looking into hooking the Mail::SpamAssassin module into a perl
processor for a couple of web forms - contact us, comments form, and
publish an article form (open publishing).

The main barrier seems to be the need for a message format rather than
just a plain text body.

I tried two approaces:

1) I tried just passing the body text to SA but this triggers a load
of missing header rules.

2) I tried to 'spoof' the headers to get SA to process each post like
it's a normal mail, but found my spoof headers caused various issues
as well not least long delays - I'm guessing while the spoof email
domais (always the same) are checked via the network.

Also, I probably need to disable a bunch of rules that aren't really
appropriate, e.g. looking up the sender email address since I'm just
using a dummy one anyway (except for the contact form).  Seems like
mainly the header rules would need to be disabled, and the body rules
given more weighting.  Is there an easy way to do this?

Alternatively, perhaps I should just identify particular rules that
are relevant and call the directly.  Is this possible?

Thanks for any help.

mike

Re: FPs on FH_FAKE_RCVD_LINE_B

2010-06-29 Thread Mike Grau


> 
> I believe the issue is that there are no brackets around the IP.  The
> line should look like this:
> 
> Received: from [68.103.178.110] by webmail.east.cox.net; Mon, 28 Jun 2010 
> 18:02:23 -0400
> 
> 

Ah, right! Thanks!

( Drat, sorry about the reply to poster rather than list. )

FPs on FH_FAKE_RCVD_LINE_B

2010-06-29 Thread Mike Grau

Hello,

I'm getting a lot of FPs from FH_FAKE_RCVD_LINE_B RCVD line looks faked
(B) since the default score for this rule is a whopping 4.000.

It's matching on this header:

Received: from 68.103.178.110 by webmail.east.cox.net; Mon, 28 Jun 2010
18:02:23 -0400

This rule matches the ISP Cox Communication residential customers using
their webmail service. For now I've made a rule negating
FH_FAKE_RCVD_LINE_B RCVD for Cox, but will someone educate me as to what
it is that makes this header look faked?

For reference, here's the (probably wrapped) rule:
Received =~
/from\s*\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s*by\s*[a-z0-9.]{4,24}\.[a-z0-9.]{4,36}\.(?:com|net|org|biz);\s*[SMTWF].{2},\s*\d{1,2}\s*[JFMASOND].{2,5}\s*\d{4}\s*\d{2}:\d{2}:\d{2}\s*[-+]\d{4}/i

Thanks!
-- Mike

Re: Blacklists Compared 17 October 2009

2010-04-07 Thread Mike Cardwell


On 07/04/2010 12:01, corpus.defero wrote:


During the last year I don't think I've seen a single FP hit against
barracuda :surprised: That said, I still haven't found the confidence to
implement it at the smtp stage for outright rejection but the numbers
I'm seeing do tend towards telling me the list is of generally high quality.


In reality I make use of Barracuda first at SMTP time, Spamhaus after
and have done so since 2008. I've never seen a FP from Barracuda in that
time.


b.barracudacentral.org is amongst my top three lists these days. Along 
with zen.spamhaus.org and bl.spamcop.net. There is no noticable 
difference in the FP rate between them here and all three hit on a *lot* 
of spam.


--
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -  https://secure.grepular.com/
Follow me on Twitter -   http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell

Re: [sa] Re: Yahoo/URL spam

2010-03-24 Thread Mike Grau


On 3/23/2010 2:49 PM the voices made Charles Gregory write:

On Tue, 23 Mar 2010, Alex wrote:

This is what I have:
/^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^
]{0,20}[a-z]{0,10}$/msi


My bad. I got an option wrong. Please remove the 'm' above.
I always get it backwards. According to 'man perlre' (the definitive
resource for SA regexes!) the 'm' makes '^' match every newline!
We want it to only match the beginning of the body.

So just remove it, and, as noted by others, add the '^' that was
missing... like so

... ]{0,20}[^a-z]{0,10}$/si


Hello,

You might want to change  (\w+\.)+  to  ([\w-]+\.)+  to account for 
domains like polster-jj.de


-- MG

Re: Zen.spamhous.org score for spam assassin...

2010-03-08 Thread Mike Cardwell


On 08/03/2010 12:34, Brian wrote:


Is zen.spamhous.org new? Personally I'd check your spelling ;-)


m...@haven:~$ host 1.0.0.127.zen.spamhous.org
1.0.0.127.zen.spamhous.org  A   208.73.210.27
m...@haven:~$ host 1.2.3.4.zen.spamhous.org
1.2.3.4.zen.spamhous.orgA   208.73.210.27
m...@haven:~$

Wonder how many people that has tripped up in its time.

--
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -  https://secure.grepular.com/
Follow me on Twitter -   http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell

Re: How to find where email server has been blacklisted

2010-03-08 Thread Mike Cardwell


On 08/03/2010 00:24, Rops wrote:


I'm trying to figure out why some emails get lost, which most likely is due
to emails killed by ISP spam filter due to high spam score these lost email
have.

How to find out if some mail server is blacklisted and where?
Is there any central database for queries from all different blacklists?
Also IP based search is required and data when and why.


IP based search may be needed, as server under question has it's mailbox
hosted with ISP, but I believe that still the virtual server can be
blacklisted separately based on it's static IP and not the whole ISP mail
server.

Additional side effect is that emails sent inside company get lost more
often - I believe because  they virtual server is blacklisted somewhere and
therefore emails sent always gather higher spam score.
So the question is to find out where it's blacklisted?

Thanks for any help and guidelines how and where to continue!


I wrote a Perl app a while ago to do lots of DNSBL lookups - 
https://secure.grepular.com/projects/DNSBLSearch


Example usage:

m...@haven:~$ dnsblsearch.pl 92.48.122.147 94.76.192.48/29
m...@haven:~$

If I look up 127.0.0.2 it should be listed by most DNSBLs as it's a test IP:

m...@haven:~$ dnsblsearch.pl 127.0.0.2
127.0.0.2 is listed on dnsbl-2.uceprotect.net 127.0.0.2
127.0.0.2 is listed on blackholes.five-ten-sg.com 127.0.0.2
127.0.0.2 is listed on combined.njabl.org 127.0.0.2, 127.0.0.6
127.0.0.2 is listed on bl.spamcop.net 127.0.0.2
127.0.0.2 is listed on list.dnswl.org 127.0.10.0
127.0.0.2 is listed on b.barracudacentral.org 127.0.0.2
127.0.0.2 is listed on ix.dnsbl.manitu.net 127.0.0.2
127.0.0.2 is listed on psbl.surriel.com 127.0.0.2
127.0.0.2 is listed on hostkarma.junkemailfilter.com 127.0.0.4, 
127.0.0.5, 127.0.1.1, 127.0.1.2, 127.0.1.3, 127.0.2.3, 127.0.0.1, 
127.0.0.2, 127.0.0.3
127.0.0.2 is listed on bl.spameatingmonkey.net 127.0.0.8, 127.0.0.10, 
127.0.0.2, 127.0.0.3, 127.0.0.4

127.0.0.2 is listed on spamguard.leadmon.net 127.0.0.2
127.0.0.2 is listed on spamsources.fabel.dk 127.0.0.2
127.0.0.2 is listed on dnsbl-1.uceprotect.net 127.0.0.2
127.0.0.2 is listed on dnsbl.sorbs.net 127.0.0.4, 127.0.0.5, 127.0.0.6, 
127.0.0.7, 127.0.0.8, 127.0.0.9, 127.0.0.10, 127.0.0.2, 127.0.0.3

127.0.0.2 is listed on ubl.unsubscore.com 127.0.0.2
127.0.0.2 is listed on zen.spamhaus.org 127.0.0.2, 127.0.0.4, 127.0.0.10
127.0.0.2 is listed on no-more-funn.moensted.dk 127.0.0.2
127.0.0.2 is listed on ips.backscatterer.org 127.0.0.2
127.0.0.2 is listed on dnsbl-3.uceprotect.net 127.0.0.2
m...@haven:~$

It does the lookups concurrantly so it's quite quick.

--
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -  https://secure.grepular.com/
Follow me on Twitter -   http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell

Re: is this right? uribl_dbl seems to have a very odd number

2010-03-03 Thread Mike Cardwell

On 03/03/2010 21:45, Bill Landry wrote:

>>> tracking down some FP's on Sa 3.3.0, they all hit URIBL_DBL.
>>> (every email hits that rule)
>>>
>>> # DBL, http://www.spamhaus.org/dbl/ .  Note that hits return 127.0.1.x
>>> # A records, so we use a 32-bit mask to match that /24 range.
>>> uridnssub   URIBL_DBL   dbl.spamhaus.org.   A   2130706688
>>
>> Yeah. You shouldn't be using it like that on 3.3.0. Go to
>> http://www.spamhaus.org/dbl and look for SpamAssassin on the FAQ page.
> 
> The DBL entries were added via sa-update yesterday, not added manually -
> at least for me.

That sounds like a big problem to me.

-- 
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -  https://secure.grepular.com/
Follow me on Twitter -   http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell

Re: is this right? uribl_dbl seems to have a very odd number

2010-03-03 Thread Mike Cardwell

On 03/03/2010 21:32, Michael Scheidell wrote:

> tracking down some FP's on Sa 3.3.0, they all hit URIBL_DBL.
> (every email hits that rule)
> 
> # DBL, http://www.spamhaus.org/dbl/ .  Note that hits return 127.0.1.x
> # A records, so we use a 32-bit mask to match that /24 range.
> uridnssub   URIBL_DBL   dbl.spamhaus.org.   A   2130706688

Yeah. You shouldn't be using it like that on 3.3.0. Go to
http://www.spamhaus.org/dbl and look for SpamAssassin on the FAQ page.

-- 
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -  https://secure.grepular.com/
Follow me on Twitter -   http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell

Re: UPS Delivery problem

2010-03-03 Thread Mike Cardwell


On 03/03/2010 13:22, twofers wrote:


I have 52 of these sitting in my inbox this morning when I came in to
work. this is just the beginning. I get literally hundreds of these a
day and Spamassassin does not even check them.


Suggest you configure SpamAssassin to check them then.

--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Off Topic - SPF - What a Disaster

2010-02-26 Thread Mike Cardwell


On 26/02/2010 14:20, LuKreme wrote:

On 26-Feb-2010, at 07:13, LuKreme wrote:

SPF_PASS 0.001
SPF_fail 5.0


whitelist_from_spf *...@ebay.com
whitelist_from_spf *...@paypal.com


You forgot "whitelist_from_spf *...@*.apache.org"

--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Off Topic - SPF - What a Disaster

2010-02-26 Thread Mike Cardwell


On 25/02/2010 23:31, Marc Perkel wrote:


As someone who forwards email what I see is this.

Sender has restrictive SPF.
Recipient server enforces SPF.
Mail coming through me bounces.

Then they call me to complain and I say, I didn't bounce it. Get rid of
your SPF nd your email will be received.


In your scenario, there are two broken systems. Neither of which are SPF.

The first broken system is your user. They've applied SPF to their 
domain. They've set up mail forwarding from your service. Yet they still 
apply SPF checking against your servers? That is stupid. They've 
misconfigured their mail service. They should either remove SPF, get rid 
of the forwarding, or change the forwarding provider to one which 
rewrites the envelope sender.


The second broken system is your forwarding system. It's is forging the 
envelope sender instead of correctly rewriting it. Fix that, or continue 
to offer a sub-standard forwarding service. Your choice.


It's not SPF's fault that your clueless user can't receive some email. 
It's a combination of your broken forwarding configuration, and your 
clueless users misconfiguration of their email.


It's too late anyway. Your opinion is no longer relevant. SPF is 
absolutely here to stay. It is supported by *many* large providers, and 
a large proporition of ham is already using SPF.


Ideally, 100% of Ham and 100% of Spam would use SPF. You don't seem to 
get this though. You think SPF is only useful if 100% of Ham uses it and 
0% of Spam uses it. That's a flaw in your understanding of what it's 
there for.


If a Spam comes from "example.com" and it's SPF protected, then you know 
the domain hasn't been forged, and it's safe to blacklist it. If it 
*isn't* SPF protected, then for all you know it has been forged and 
blacklisting it might cause collateral damage.


The positive aspects of *any* mail being "signed" with SPF, ham *or* 
spam, are so damn obvious, I don't know how you manage to mis-represent 
them so blatantly and so poorly.


--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Bogus Dollar Amounts

2010-02-25 Thread Mike Cardwell


On 25/02/2010 12:01, ram wrote:


I have been seeing a few spam mails slip past that talk about being
able to get bogus dollar amounts.  What I mean by that is it will
give a large value in the e-mail but where there should be a comma
it puts a period.

I put an example of one of these messages at:

http://pastebin.com/SXuGELUS

Are there any rules that can detect this?  The only rules this hit
on mine are:

1.900   DCC_CHECK
1.449   RCVD_IN_BRBL_LASTEXT
1.000   RCVD_IN_BRBL
-0.001  SPF_PASS
-0.010  T_RP_MATCHES_RCVD
-1.900  BAYES_00

http://pastebin.com/6c9sEEn9
even recently i installed new qmail server
i still see lot of junk mail coming with different charecters, i do not
even read them clearly
how can i stop those kind of emails
Ram


I repasted that at http://spamalyser.com/v/gcrvcnbm/mime in order to get 
the benefit of mime parsing and decoding.


You could score on the "koi8-r" charset. You could score on the fact the 
email came from South Korea. You could use the TextCat language plugin.


--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

RE: Off Topic - SPF - What a Disaster

2010-02-23 Thread Mike Hutchinson

Hello,

My company attempted to adopt SPF before I started working here. I recall it
was a recent event when I joined, and I looked into what went wrong (as I
became the mail administrator not long after). Basically the exact same
experience was encountered. Customers could not understand the system, which
is basically what killed it. Some Admin's of remote systems sending our
customers important E-Mail did not understand the system, or even want to
deal with it - leaving us without the resources to fix all SPF related
problems. 

Adoption of SPF was dropped after 3 days, and we're never going back.

Same result, SPF is a good idea, but we certainly cannot afford to train
other site's administrators, nor all of our customers, on SPF.

Cheers,
Mike,


-Original Message-
From: Jeff Koch [mailto:jeffk...@intersessions.com] 
Sent: Wednesday, 24 February 2010 9:38 a.m.
To: users@spamassassin.apache.org
Subject: Off Topic - SPF - What a Disaster


In an effort to reduce spam further we tried implementing SPF enforcement. 
Within three days we turned it off. What we found was that:

- domain owners are allowing SPF records to be added to their zone files 
without understanding the implications or that are just not correct
- domain owners and their employees regularly send email from mailservers 
that violate their SPF.
- our customers were unable to receive email from important business
contacts
- our customers were unable to understand why we would be enforcing a 
system that prevented
   them from getting important email.
- our customers couldn't understand what SPF does.
- our customers could not explain SPF to their business contacts who would 
have had to contact their IT people to correct the SPF records.

Our assessment is that SPF is a good idea but pretty much unworkable for an 
ISP/host without a major education program which we neither have the time 
or money to do. Since we like our customers and they pay the bills it is 
now a dead issue.

Any other experiences? I love to hear.



Best Regards,

Jeff Koch, Intersessions

Re: Newest spammer trick - non-blank subject lines?

On 11/02/2010 19:52, Bernd Petrovitsch wrote:

>> I want you to describe a scenario where the sender or recipient are
>> actually worse off because of the particular two features I've
> The point is: The sender is worse off because he needs to invest time
> for the workaround which is caused by the receiver. The receiver is
> worse off because some senders plain simply give up when they are
> expected to pass a Turing test.
> No, I don't have numbers. But I'm pretty sure I'm not the only one.
> 
>> described. You've failed to even attempt that so far.
> You see only two alternatives:
> - keep these two features (and tell the senders that they should
>   actually be happy that they can invest time and effort to work around
>   FPs caused by the receiver spam).
> - or deactivate it.
> I proposed the 3rd solution:
> - repair your spam-detection (change weight/limits, use Bayes,
>   greylistung, etc.) to not generate so many FPs that you actually need
>   an additional workaround.
>   That would actually remove the cause and not fiddle with the symptoms.
> 
>> I know this system works well because I've been using it for a long time.
> To use your own words: Ridiculous. Just because someone uses something
> for a long time doesn't make it good (or is even an indication for it).
> Apart from that: You very probably don't know how many FPs didn't come
> through because of people disliking Turing tests.

Your assuming that my false positive rate is bad. I would be surprised
if it was worse than the average on this list. It's very good. But if my
additions knock 0.1% more off the rate, then I'm happy. Out.

-- 
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?

On 11/02/2010 19:29, Ted Mittelstaedt wrote:

> Secondly with regards to this reject-but-save system that Mike is
> expounding on - it is an instance of a system that only works because
> a few people (or one person) is doing it.  It is totally worthless as
> anything that can be scaled to multiple sites for a very simple reason.
> 
> Right now one of the constants in the e-mail universe is that an error
> 5xx means you failed to deliver your mail.
> 
> If many people deploy "reject-but-save-a-copy" then this breaks that
> assumption and the spammers response is extremely predictable - they
> will simply assume that EVERY error 5xx carries with it a chance for
> a successful delivery - so they will then program their spambots to
> continually retry no matter what the error.
> 
> Right now if their spambot gets an error 5xx it schedules the victim
> address for removal - because the spambot only has a limited amount of
> time it can do things on whatever host system it has hijacked, and it
> can't spend time sending to addresses that are rejected when there are
> so many more out there that will accept the spam.
> 
> If you remove that assumption by having a lot of sites deploy this
> hack of Mike's, then the spambots will simply continually send to
> millions of nonexistent e-mail addresses on your server - because
> of the chance that your running the Mike Hack and those nonexistent
> addresses your telling the spambot that are nonexistent are really
> existing.
> 
> The spammers don't care that their spam is delivered to a junk mail
> folder.  If the user isn't automatically deleting their junkmail unread
> (in which case there's no point in the Mike Hack in the first place)
> then they ARE having to periodically read at least the subject lines
> of messages in the Junk Mail folder.  In short, the Mike Hack only has
> value if the users are periodically opening up and reading the subject
> lines of messages in the Junk Mail folder.
> 
> And the spammers thought is that their spam is so attractive that
> all the user has to do is read the subject line and they will open
> it.  They aren't thinking "my spam got delivered to someone's junk
> mail folder, boo hoo"  They are thinking "Zowie, my mail got delivered
> to someone's folder - it's just going to be a few more weeks and I'll
> be rich, yipee yipee!!"  Spammers are the most optimistic people you
> will ever meet.  Only an optimist would think that the sewage they send
> out is something that people want to read.
> 
> Mike I'm not sure why you think this hack of yours is so clever.  It's
> just a cheap hack.  I can think of a dozen more for filtering spam,
> some I've read other people expounding on over the years as the
> greatest thing since sliced bread, all of which work - and all of
> which are totally unscalable.
> 
> If you want to write a clever spam filter than write one that everyone
> can use.  Otherwise the more you defend this, the more you look like
> an inexperienced mail admin who knows just enough to be dangerous.

All I can see above is a long list of dubious predictions of what
spammers would do if everybody used the same system as me. I can't be
bothered with this thread anymore. Feel free to make dubious assumptions
of why that may be. Out.

-- 
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Spam filtering similar to SPF, less breakage

On 11/02/2010 18:52, Matus UHLAR - fantomas wrote:

>>> Imho, SPF does NOT break forwarding. 
> 
> On 11.02.10 19:37, Per Jessen wrote:
>> Hmm, the SRS people seem to disagree:
>>
>> http://www.openspf.org/SRS :  SPF "breaks" email forwarding.
> 
> I think those quotes say it all. SRS is a way to create correct and
> trackable forwarding, SPF or not.
> 
> The forwardind without changing sender is imho already broken, however the
> breakage gets visible with SPF adoption.

Yes. A more accurate statement than, "SPF breaks forwarding," would be,
"Broken forwarding is incompatible with SPF."

-- 
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?

On 11/02/2010 17:08, Bernd Petrovitsch wrote:

>> Let me explain this in simple terms.
>>
>> Normal behaviour:
>>
>> Spam filtering causes a 5xx rejection. You get an NDR. You either 
>> contact the user some other way or not at all.
> Spam filtering rejects valid non-spam because it mis-identified it as
> "spam".

Yes. It's called a "false positive".

>> Behaviour on my system:
>>
>> Spam filtering causes a 5xx rejection. You get an NDR. You either 
>> contact the user some other way or not at all. But ... the recipient can 
> Spam filtering rejects valid non-spam because it mis-identified it as
> "spam". Now *I* have to do something to work around *Your* buggy/screwed
> spamcheck.

No different to a normal situation where the features I've described
aren't in place.

> You just have to hope that I´m really, really that interested to my mail
> through. If it's an answer per PM to e.g. typical ML mails (like this
> here), you would loose.

No different to a normal situation where the features I've described
aren't in place.

>> still access the email if it's something they were expecting, *and* if 
>> the sender still wants to contact the recipient they now have an *extra* 
>> option to make their life easier - they can click a URL and fill in a 
>> captcha.
>>
>> So ... my system provides 2 extra little features which makes some 
>> senders and some recipients lives more easy.
> No, you are pushing effort from your side out to others. If you want to
> do something for the (valid) sender, fix the FP rate by changing
> whatever it needs so that my on-spam mail gets through.

Ridiculous claim. In a normal situation the effort relies on the sender
to get their mail through after a false positive occurs, or it wont get
through at all.

With the 2 features I described, the sender is provided with an extra
simple option to get the mail through, and the recipient has also been
provided with an option to get the mail through.

>> Neither sender nor recipient would benefit from me removing those 
>> features from my system.
> Of course anyone can do as they think it´s best. But that doesn´t imply
> that other think the same.

I want you to describe a scenario where the sender or recipient are
actually worse off because of the particular two features I've
described. You've failed to even attempt that so far.

I know this system works well because I've been using it for a long time.

-- 
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?


On 11/02/2010 16:23, David Morton wrote:


On this system, not much. On the scale of about 6,000 messages a day.


Very light duty then. :)


Even if SpamAssassin isn't used during SMTP, there's nothing stopping
somebody who wants to DOS you from just setting their DOS tool to hold
open connections and spend lots of time waiting between issuing SMTP
commands... It could even go straight through to the DATA phase and send
a 10MB email at a speed of 1 byte per second.


True, though most MTA's have some defenses built for this, but waiting
to scan for spam by nature takes time, and so these defenses must be
lowered to allow it.


I don't think moving SpamAssassin to after the SMTP transaction has
finished would help prevent someone from performing a DOS.

If you *can* do SMTP time spam scanning, then that's the best place for it.


- From experience with larger ISP settings, and some large enterprise
settings, it doesn't take a malicious attempt - normal traffic can be
bursty and bring a system to its knees.  From a practical standpoint,
it's just a whole lot easier to have the front line smtpd servers
swallow the email as fast as possible (some quick rbl or greylisting
aside) and then you can process in batches behind the lines.

It's scary when email starts piling up faster than all your scanners can
chew... but most admins I've met would prefer that to other mail servers
getting connection errors and possible bouncing or sending problem
reports back to the sender.


I must admit, I have seen this several times before. Looking at the logs 
on our servers at work we've rejected on average 151 emails per minute 
for the past week. We do SpamAssassin scanning during SMTP here as well 
and the vast majority of the time it's fine, but it does cause problems 
during spikes.


To me this just says that we don't have enough servers to deal with the 
spikes, but it happens infrequently enough that it's not worth 
investing. I still think SMTP time scanning is both practical and desirable.


--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?


On 11/02/2010 15:49, David Morton wrote:


At SMTP time I return a 5xx code during the "DATA" phase for messages
classified as Spam. However, I also deliver the message into a read only


What kind of mail load do you service?


On this system, not much. On the scale of about 6,000 messages a day.


It takes a significant amount of
time for spamassassin to process a message, and holding the connection
open during that time can easily allow for a DoS by flooding your mail
server with connections.  This is why amavisd* variants always accept
the mail and then process - it helps keep the incoming smtpd process
from jamming.


SpamAssassin seems to average about 5 seconds per message here. I have 
other light weight checks which take place before spamassassin as well.


Even if SpamAssassin isn't used during SMTP, there's nothing stopping 
somebody who wants to DOS you from just setting their DOS tool to hold 
open connections and spend lots of time waiting between issuing SMTP 
commands... It could even go straight through to the DATA phase and send 
a 10MB email at a speed of 1 byte per second.


I don't think moving SpamAssassin to after the SMTP transaction has 
finished would help prevent someone from performing a DOS.


If you *can* do SMTP time spam scanning, then that's the best place for it.

--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?


On 11/02/2010 11:26, Bernd Petrovitsch wrote:


At SMTP time I return a 5xx code during the "DATA" phase for messages classified as Spam. 
However, I also deliver the message into a read only "Junk E-Mail" folder for the user,


This is just wrong. Either accept the message, or reject the

message. Rejecting the message while secretly accepting it is just
completely wrong.

[...]

Let's say your filter catches a legitimate message to

u...@yourdomain.tld from b...@mydomain.tld.  Bob gets an erro saying
the message was spammy and didn't go through, so he goes to his gmail
account and sends it again, hoping for better results. This time it
goes through.

Bob could also have just clicked the link in the NDR.

Some people - e.g. /me - do not try to pass Turing tests. Obviously you
are not interested in my mails anyway 


If you email somebody and the spam filtering rejects the message, you 
assume they don't want your mail and don't bother trying to contact them 
again? Not even if it's obviously beneficial for you to do so?



Apart from that why should I decode captchas from some random site?
After all, they could come from a third site so that people solve them
to the the other can log automatically into the third one ...


It's not some random email from a "random site". It's an NDR to an email 
that you yourself sent.


Let me explain this in simple terms.

Normal behaviour:

Spam filtering causes a 5xx rejection. You get an NDR. You either 
contact the user some other way or not at all.


Behaviour on my system:

Spam filtering causes a 5xx rejection. You get an NDR. You either 
contact the user some other way or not at all. But ... the recipient can 
still access the email if it's something they were expecting, *and* if 
the sender still wants to contact the recipient they now have an *extra* 
option to make their life easier - they can click a URL and fill in a 
captcha.


So ... my system provides 2 extra little features which makes some 
senders and some recipients lives more easy.


Neither sender nor recipient would benefit from me removing those 
features from my system.


--
Mike Cardwell: UK based IT Consultant, Perl developer, Linux admin
Cardwell IT Ltd. : UK Company - http://cardwellit.com/   #06920226
Technical Blog   : Tech Blog  - https://secure.grepular.com/
Spamalyser   : Spam Tool  - http://spamalyser.com/

Re: Newest spammer trick - non-blank subject lines?