Re: Filtering THIS list (Re: Breaking up the Bot army - we need a plan)

2006-12-13 Thread Andreas Pettersson

Michele Neylon :: Blacknight wrote:

Maybe they're better suited to one of the other lists such as spam-l? 

 


May I suggest news.admin.net-abuse.email

--
Andreas




Re: Writing a new DNSBL rule

2006-11-08 Thread Andreas Pettersson

D.J. wrote:

Hi all.  So I've got a DNSBL I want to use with SpamAssassin that 
wasn't included in the stock install.  My question (and there's an 
alarming lack of anything useful in this area... wiki anyone on the SA 
site?) is if my syntax and placement are correct for what I've done.  
In my local.cf file, I've added the following lines:


(see the code at http://www.daringone.net/salines.txt - the list 
bounced this message for spam for some reason with the lines added)


It looks like all the other ones, but I'm not entirely sure what 
everything exactly does in the coding... so I took an educated guess.  
Thanks for everyone's input.


- D.J. 



Try this instead

header __NEWDNSBL  eval:check_rbl('newdnsbl', 'dnsbl.newdnsbl.com.')
tflags __NEWDNSBL  net

header RCVD_IN_NEWDNSBL  eval:check_rbl_sub('newdnsbl', '127.0.0.2')
describe RCVD_IN_NEWDNSBL  NEWDNSBL: Received via a relay in NEWDNSBL
tflags RCVD_IN_NEWDNSBL  net
score RCVD_IN_NEWDNSBL 1.5
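
For reference, a DNSBL lookup of this kind boils down to reversing the octets of the relay's IPv4 address and putting them in front of the zone name; an A record in the answer (127.0.0.2 in the sub-rule above) means the address is listed. Below is a minimal Python sketch of the name construction only. The function name is made up for illustration, and dnsbl.newdnsbl.com. is just the placeholder zone from the rule above.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)              # raises ValueError on bad input
    reversed_octets = reversed(str(addr).split("."))
    return ".".join(reversed_octets) + "." + zone

# 192.0.2.1 is a documentation-only address, used here as an example.
print(dnsbl_query_name("192.0.2.1", "dnsbl.newdnsbl.com."))
```

The resulting name is what gets resolved; the rule names and score above decide what a hit is worth.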

--
Andreas




Re: R: R: R: Relay Checker Plugin (code review please?)

2006-11-01 Thread Andreas Pettersson

Steven Dickenson wrote:


On Oct 31, 2006, at 6:09 AM, John Rudd wrote:

I've considered the exact opposite (adding "static" to the check for
keywords).  My rules are really looking more for "is this a _client_
host", not "is this a dynamic host".  That one check looks for
"dynamic", but I'm not interested in exempting anyone because
they're static.  They've still got a hostname that looks like an
end-client, and an end-client shouldn't be connecting to other
people's mail servers.  Any end-client that connects to someone
else's email server should be treated like it's a spam/virus zombie.



I can't agree with this.  Many small businesses in the US get just  
these kind of static connections from broadband ISPs.  Comcast, for  
example, has all of their static customers using rDNS that would fail  
your tests, and they refuse to set up a custom PTR record or delegate  
the record to someone else. 



I disagree with your disagreement. This is my opinion: If you don't have
control over your rDNS, do NOT run any mail server, unless you relay all
outbound mail through a server at your ISP.


Most of these static customers are  legitimate business networks 
running their own mail server, and have  neither the need nor desire 
to relay their mail through Comcast's  SMTP servers.  I think your 
general idea is very good, but you're  reaching a little too far with 
this one.



'No need nor desire' is not really a good excuse. Use a relay or
find your mail rejected, I'd say.


--
Andreas




Re: Age of a domain name - a new test?

2006-10-31 Thread Andreas Pettersson

Jeff Chan wrote:


Generally speaking, whois queries are a poor way to determine
domain age, at least for client applications.  The whois
infrastructure is simply not designed to support the volume of
queries required, even if locally cached.



Perhaps CRISP is part of the answer to this problem.
http://www.completewhois.com/other_projects.htm

--
Andreas




Re: Psst!

2006-10-20 Thread Andreas Pettersson

Chris Santerre wrote:

Just curious, but how many people see spam being sent to usernames
with the first letter dropped? I see a ton in my logs. I believe
spammers figure [EMAIL PROTECTED] will also have a [EMAIL PROTECTED]  Too bad
for them...they do not. :)



Same here. I've also had lots of spam to addresses with various amounts 
of trailing d or n in local part. Like [EMAIL PROTECTED]

Seems to be fewer of these today though.

--
Andreas




Re: Psst!

2006-10-20 Thread Andreas Pettersson

Andreas Pettersson wrote:

Same here. I've also had lots of spam to addresses with various 
amounts of trailing d or n in local part. Like [EMAIL PROTECTED]

Seems to be fewer of these today though.



I meant tailing.

--
Andreas




Re: Is there any way to score this?

2006-10-13 Thread Andreas Pettersson

Robert Swan wrote:

Is there any way to get points added if the sending mail server has no
PTR record *(unknown [196.211.162.65])?*


I am using Redhat Fedora and Spamassassin 3.1.2 and Postfix




I was looking for the same thing some time ago, but I couldn't easily 
find a way to do that in SA.
Instead I use the MTA (Exim) to add a header if the PTR is missing, and 
then I use SA to check against that header.
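
A rough Python sketch of this two-step idea follows. It is not the actual Exim or SA configuration, only an illustration: the header name X-PTR-Missing and both function names are hypothetical, and the result of the reverse lookup is passed in as a plain argument.

```python
from email.message import EmailMessage

PTR_HEADER = "X-PTR-Missing"  # hypothetical header name, not a standard one

def tag_missing_ptr(msg, ptr_name):
    """MTA side: mark the message when the reverse lookup gave no name."""
    del msg[PTR_HEADER]          # drop any copy supplied by the sender
    if not ptr_name:
        msg[PTR_HEADER] = "yes"
    return msg

def ptr_penalty(msg, points=1.0):
    """Filter side: add points when the MTA tagged the message."""
    tagged = msg.get(PTR_HEADER, "").strip().lower() == "yes"
    return points if tagged else 0.0

msg = EmailMessage()
msg["Subject"] = "test"
print(ptr_penalty(tag_missing_ptr(msg, None)))
```

Removing a sender-supplied copy of the header first matters, since otherwise anyone could set or forge the tag.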


Perhaps there are better ways to do it.

--
Andreas



Re: Having issue with a type of spam I havn't seen before

2006-10-13 Thread Andreas Pettersson

Thomas Lindell wrote:

but what is the CID?  Is that some sort of alternate notation for an
IP address?



It's a reference to an attached image.

--
Andreas



Re: Having issue with a type of spam I havn't seen before

2006-10-13 Thread Andreas Pettersson

Thomas Lindell wrote:

I don't see anything attached to the message though.  

Even when I view the source I don't see a mime attachment. 
 



Well, the attachment is missing then.
Come to think of it, that would make an excellent rule :-]

--
Andreas



Re: sometimes no bayesian filtering?

2006-10-08 Thread Andreas Pettersson

Paul29 wrote:


Hi all,

in the last days there were more and more SPAM mails where I found no
bayesian scoring in the header. This lets me guess it did not take place at
all. Is that conclusion right?
I have not been able to find a common property in these mails to tell which
mails are scanned and which not. What could be the reason? Where would you
start to check?


The spamd log file is a good place to start. Any errors at all?

--
Andreas




Re: What's the best method to use SA?

2006-10-04 Thread Andreas Pettersson
I use Exim with the integrated SA ACL.
I'm really pleased with how it works.

http://www.exim.org/exim-html-4.62/doc/html/spec_html/ch40.html


/Andreas



Re: Stock spam in images

2006-10-02 Thread Andreas Pettersson

Stuart Johnston wrote:


Theo Van Dinter wrote:


On Mon, Oct 02, 2006 at 03:18:58PM +0100, Randal, Phil wrote:

undetected). Wouldn't it be better to inject the detected text back
to SA? There should be enough variants of spam words to let SA
fuzzily catch the ones from images.


I think so.  Some of the words would be perfectly legitimate in the
text of emails but rarely found in attached legitimate images.

Quite apart from the fact that Spamassassin isn't designed for
reinjection.



FWIW, 3.2 adds in support to have rendering of non-text parts.  So a
plugin could, for instance, OCR text from an image, and then the normal
body rules and such would be able to use that information.



Would it also be possible to create a rule that matches on text
rendered specifically from a non-text part and not the whole body?
That way you could get the benefit of Bayes and existing body rules in
the general case while still taking advantage of the fact that certain
words in an image have more spammy-weight than the same words in text.




Or perhaps:

tflags   RULE_NAME   ocr


/Andreas



Re: SA gone mad, times out and stucks

2006-09-30 Thread Andreas Pettersson

Jürgen Herz wrote:


What I still get and don't understand is
warn: bayes: cannot open bayes databases
/var/spool/exim4/.spamassassin/bayes_* R/W: lock failed: File exists
 



Make sure the file permissions haven't changed when you ran the manual
expire.


Regards,
Andreas



Re: TQMcube Geo Zone config files

2006-09-30 Thread Andreas Pettersson

Andreas Pettersson wrote:

In case anybody is interested, I've compiled a config file for the
geo zone at TQM http://tqmcube.com/worldzone.php
It might not be of great use, but it is interesting to gather some
statistics of where the mails come from.


Files found here
http://anp.ath.cx/tqmcube/



I have updated tqmcube_world.cf with the -lastexternal setting on the 
set name, so that only the connecting IP address is checked instead of 
the whole chain of relays.


Regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-29 Thread Andreas Pettersson

Bret Miller wrote:


I used to have problems with bayes locking and journaling. When it
finally corrupted the database, I decided it was time to put it into a
real SQL database instead of using DB_File. Haven't had a single problem
with bayes CPU or locking since.

Maybe it's time you consider using MySQL?

Bret
 



I have now simply put an end to the misery by wiping the DB :)
And the issue is of course solved. I'll be looking into MySQL in the 
very near future, I think.


Thanks to everyone who has answered!

Best Regards,
Andreas



[OT] Re: Fw: failure notice / spaassassin.apache.org

2006-09-29 Thread Andreas Pettersson

Ken A wrote:

It looks like you are listed in spamcop and apparently Comcast is 
either using spamcop or they have their own list that is blocking you.



Comcast themselves are using a spam filter?
(Let me taste that line one more time...)
Comcast themselves are using a spam filter?
Then why aren't they using one to block their own customers from 
spamming the rest of the world?


/Andreas



Re: bayes sync is hogging cpu

2006-09-26 Thread Andreas Pettersson

Bret Miller wrote:


I used to have problems with bayes locking and journaling. When it
finally corrupted the database, I decided it was time to put it into a
real SQL database instead of using DB_File. Haven't had a single problem
with bayes CPU or locking since.

Maybe it's time you consider using MySQL?

Bret



 


Well, if it solves the problem I'm ready to try almost anything. :)
The way you put your words tells me that the problem IS a corrupt database.
Can we be certain? And is there any way to fix it until I can get MySQL
up 'n running?
   



If the database is corrupted, it should say so. In my case, it wouldn't
expire, learn, sync, or use the db_file database because it ended up
corrupted somehow. I could have restored it from backup, but chose to
simply delete it and start over with SQL. 


...

Bret
 



Well, I've let sa-learn --force-expire --showdots run for 19 hours now 
(even on a separate machine), 100% cpu util all the time, and not a 
single dot has appeared on the screen.

If I can't get to understand how to use db_recover, wiping is the next step.

Regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson
Me again. Since I'm not getting any responses I'd better keep posting more
information, as I've done some more investigating today.


Sometimes when I run sa-learn --force-expire I get this response almost 
immediately:

Bus error (core dumped)
When I run again the process just hogs until I break it after about 15 
minutes.


I have also changed bayes_learn_to_journal back to 0 and lock_method to 
flock.


Now I get these in spamd.log:
Mon Sep 25 17:05:18 2006 [8853] warn: bayes: cannot open bayes databases 
/usr/local/share/spamassassin/bayes/bayes_* R/W: lock failed: 
Interrupted system call


I also lowered --max-children from 8 to 6 with this result:
Mon Sep 25 17:11:03 2006 [6702] info: prefork: server reached 
--max-children setting, consider raising it


Here's some top output of a typical situation:
  PID USERNAME PRI NICE   SIZE    RES STATE  TIME   WCPU    CPU COMMAND
 8287 spamd    132    0 48056K 44220K RUN    8:00 88.43% 88.43% perl5.8.7
 8853 spamd     20    0 40416K 38356K lockf  0:11  1.32%  1.32% perl5.8.7
 9128 spamd     20    0 38592K 36544K lockf  0:03  0.63%  0.63% perl5.8.7
 8879 spamd     20    0 40804K 38484K lockf  0:08  0.59%  0.59% perl5.8.7
 9103 spamd     20    0 39728K 37736K lockf  0:04  0.54%  0.54% perl5.8.7

-rw-------  1 spamd  wheel        45 Sep 25 17:04 bayes.mutex
-rw-------  1 spamd  wheel    240024 Sep 25 17:15 bayes_journal
-rw-------  1 spamd  wheel   1039920 Sep 25 17:04 bayes_journal.old
-rw-r--r--  1 spamd  wheel  83787776 Sep 25 16:09 bayes_seen
-rw-------  1 spamd  wheel  85901312 Sep 25 17:04 bayes_toks

# cat bayes.mutex
8287
6708
6708
6708
6708
6708
6708
6708
6708


What is wrong?! What is making spamd go *kaboom* several times an hour?
Is it something with expiring tokens that's not working correctly?
Is it normal to have a bayes_journal.old lying around?
What more can I do to find the cause?

If the core dump (22 MB) is of any interest, I'll upload it somewhere.



Best regards,
Andreas





Andreas Pettersson wrote:


Ok, more information here.

I found in spamd.log this line when the problem started:
Fri Sep 22 19:55:22 2006 [74581] warn: bayes: expire_old_tokens: child 
processing timeout at /usr/local/bin/spamd line 1082


which was followed by lots of these:
Fri Sep 22 19:55:52 2006 [74581] warn: bayes: cannot open bayes
databases /usr/local/share/spamassassin/bayes/bayes_* R/W: lock failed:
File exists

In an attempt to find what's wrong I changed bayes_learn_to_journal to
1. It didn't help, but at least I got rid of the 'lock failed: File
exists' error messages in spamd.log and bayes also keeps working. For
the moment I have a script that checks for the existence of bayes.lock,
kills the hogging process and removes the lock file. It runs every
minute.



I have tried changing lock_method to flock; the problem is still there
(but with a new lock file name).
I also tried a sa-learn --force-expire. It took about 30 sec to
complete. It didn't solve my problem either.



Any ideas of what might be wrong?

Regards,
Andreas






Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson

Here's an interesting observation.
I set bayes_auto_expire to 0 as a temporary solution, or so I thought, and
restarted spamd. The hogging occurs at least as often as before. Am I
looking in the wrong direction, or shouldn't this have helped?


Another observation:
# sa-learn --dump magic:
bayes: cannot open bayes databases 
/usr/local/share/spamassassin/bayes/bayes_* R/W: lock failed: 
Interrupted system call

0.000          0          3          0  non-token data: bayes db version
0.000          0     437041          0  non-token data: nspam
0.000          0     253396          0  non-token data: nham
0.000          0    4616765          0  non-token data: ntokens
0.000          0 1156977303          0  non-token data: oldest atime
0.000          0 1159200779          0  non-token data: newest atime
0.000          0 1159199860          0  non-token data: last journal sync atime
0.000          0 1158904222          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count


The last expiry atime converts to September 22, the same day my problems
started. But if the hogging continues even with bayes_auto_expire set to
0, then where should I be looking instead?
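
The atime values in the dump are Unix epoch seconds, so the conversion can be checked with a few lines of Python. This is only a sketch; the helper name is made up, and the times are printed in UTC rather than local time.

```python
from datetime import datetime, timezone

def atime_to_utc(epoch_seconds):
    """Convert a Bayes atime (Unix epoch seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Values copied from the "sa-learn --dump magic" output above.
for label, value in [
    ("oldest atime", 1156977303),
    ("newest atime", 1159200779),
    ("last expiry atime", 1158904222),
]:
    print(f"{label:18} {atime_to_utc(value).isoformat()}")
```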


Regards,
Andreas












Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson

Bret Miller wrote:


I used to have problems with bayes locking and journaling. When it
finally corrupted the database, I decided it was time to put it into a
real SQL database instead of using DB_File. Haven't had a single problem
with bayes CPU or locking since.

Maybe it's time you consider using MySQL?

Bret

 



Well, if it solves the problem I'm ready to try almost anything. :)
The way you put your words tells me that the problem IS a corrupt database.
Can we be certain? And is there any way to fix it until I can get MySQL
up 'n running?


Best regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson

Jonas Eckerman wrote:


Andreas Pettersson wrote:


Bus error (core dumped)



This *can* be the symptom of a hardware problem, such as bad memory
or a bad disk.


If you have a disk that's going bad, the symptoms often are corrupt
files and extremely slow writes (because the disk controller retries
the write operation (marking sections as bad) until it either succeeds
or gives up).


/Jonas


The 'hardware' is VMware ESX 2.5
I think bad hard-hardware would show up in ESX rather than the guest OS..?
But I'm not throwing any ideas away. Let me move the bayes files to 
another area on the disk and have a try.


*momento*

Same Bus error (core dumped) as before when running manual expire.
When I make another try it hogs, and is still doing so after 5 minutes. 
But this time I'll wait at least 30 minutes, just to make sure.
And just to make it clear; the spamd daemon is not running while I do 
manual expire.



Regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson

Bret Miller wrote:


Are you sure you have enough RAM to handle the number of threads you are
running? 
 


Yes, I'm pretty sure 512MB is enough.
No swapping going on, and I only scan msgs smaller than 500 KB.
Avg scan time is about 3-4 sec and I scan less than 1 a day.

Regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-25 Thread Andreas Pettersson

Logan Shaw wrote:


One thing you could try is running db4_recover (or db_recover,
depending on how it's installed) on the Bayes database.



Seems like something to try. But I don't understand the utility:
usage: db_recover [-ceVv] [-h home] [-P password] [-t [[CC]YY]MMDDhhmm[.SS]]
How can I specify my bayes dbs with -h? Just feeding with the path to 
the files gives nothing.

I'm running FreeBSD 5.4.

Regards,
Andreas



Re: bayes sync is hogging cpu

2006-09-24 Thread Andreas Pettersson

Ok, more information here.

I found in spamd.log this line when the problem started:
Fri Sep 22 19:55:22 2006 [74581] warn: bayes: expire_old_tokens: child 
processing timeout at /usr/local/bin/spamd line 1082


which was followed by lots of these:
Fri Sep 22 19:55:52 2006 [74581] warn: bayes: cannot open bayes
databases /usr/local/share/spamassassin/bayes/bayes_* R/W: lock failed:
File exists

In an attempt to find what's wrong I changed bayes_learn_to_journal to
1. It didn't help, but at least I got rid of the 'lock failed: File
exists' error messages in spamd.log and bayes also keeps working. For the
moment I have a script that checks for the existence of bayes.lock, kills
the hogging process and removes the lock file. It runs every minute.



I have tried changing lock_method to flock; the problem is still there
(but with a new lock file name).
I also tried a sa-learn --force-expire. It took about 30 sec to
complete. It didn't solve my problem either.



Any ideas of what might be wrong?

Regards,
Andreas



Some mail seems to hog spamd process

2006-09-23 Thread Andreas Pettersson

Hi.

Since yesterday I have been having a problem with spamd processes hogging
the CPU. All is fine until suddenly spamd keeps using 95% CPU forever. I
noticed that bayes.lock also contains the pid of the hogging process. After
some minutes I kill the pid and remove bayes.lock by hand, but it only takes
a few minutes until the situation is the same again. I tailed the log
trying to find some answers but only found


Sat Sep 23 12:50:25 2006 [13787] info: spamd: connection from localhost 
[127.0.0.1] at port 52807
Sat Sep 23 12:50:25 2006 [13787] info: spamd: checking message 
[EMAIL PROTECTED] for nobody:58


Does anyone have an idea on how to solve this?

Regards,
Andreas



Re: Some mail seems to hog spamd process

2006-09-23 Thread Andreas Pettersson
I have completely missed the recent thread "SA increasing load average a
lot and spams getting through", which seems to reflect exactly the same
problem I'm having.
For completeness, I use SA 3.1.5 and haven't changed any cf files in the
last few days.
There's absolutely no high volume of mail. Plenty of time to process
one mail at a time.


Regards,
Andreas


Andreas Pettersson wrote:


Hi.

Since yesterday I have been having a problem with spamd processes hogging
the CPU. All is fine until suddenly spamd keeps using 95% CPU forever. I
noticed that bayes.lock also contains the pid of the hogging process.
After some minutes I kill the pid and remove bayes.lock by hand, but
it only takes a few minutes until the situation is the same again. I
tailed the log trying to find some answers but only found


Sat Sep 23 12:50:25 2006 [13787] info: spamd: connection from 
localhost [127.0.0.1] at port 52807
Sat Sep 23 12:50:25 2006 [13787] info: spamd: checking message 
[EMAIL PROTECTED] for nobody:58


Does anyone have an idea on how to solve this?

Regards,
Andreas






Re: bayes sync is hogging cpu (was: Some mail seems to hog spamd process)

2006-09-23 Thread Andreas Pettersson

Hi, me again ;)

I'm pretty confident that the hogging occurs when SA is trying to sync 
the bayes. The bayes_journal is cleared exactly when the hogging begins. 
And when I run sa-learn --sync I get the very same hogging effect.


The permissions seem OK, don't they?

-rw-------  1 spamd  wheel        20 Sep 23 13:28 bayes.lock
-rw-------  1 spamd  wheel      2760 Sep 23 13:28 bayes_journal
-rw-r--r--  1 spamd  wheel  83755008 Sep 23 13:28 bayes_seen
-rw-------  1 spamd  wheel  83853312 Sep 23 13:28 bayes_toks


Regards,
Andreas



Re: Fishing

2006-09-13 Thread Andreas Pettersson

Steve Thomas wrote:


/htt(?:p|ps):\/\/.*?\/.*\.com$/i

 



Why not /https?:\/\/.*?\/.*\.com$/i
?
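
A quick way to convince yourself that the two forms behave the same is to run both against a few strings. The sketch below uses Python's re module rather than Perl, and the sample URLs are invented, but for a pattern this simple the two regex dialects agree.

```python
import re

# The slashes need no escaping here because Python has no /.../ delimiters.
original = re.compile(r"htt(?:p|ps)://.*?/.*\.com$", re.IGNORECASE)
shorter = re.compile(r"https?://.*?/.*\.com$", re.IGNORECASE)

def same_verdict(text):
    """True when both patterns agree on whether the text matches."""
    return bool(original.search(text)) == bool(shorter.search(text))

samples = [
    "http://example.org/login.example.com",
    "HTTPS://example.org/path/bank.com",
    "ftp://example.org/file.com",
    "http://example.com",
]
print(all(same_verdict(s) for s in samples))
```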



Re: TQMcube Geo Zone config files

2006-09-10 Thread Andreas Pettersson

mouss wrote:


How does/would this compare to using RELAY_COUNTRY?
are they similar (so one should only use one of them) or complementary?




I don't know. I haven't used RELAY_COUNTRY, but now that I'm aware of
its existence I'll have a look at it :)



Regards,
Andreas




Re: TQMcube Geo Zone config files

2006-09-10 Thread Andreas Pettersson

Andreas Pettersson wrote:



I don't know. I haven't used RELAY_COUNTRY, but now that I'm aware of
its existence I'll have a look at it :)




Ok, I've had a quick look now. RelayCountry presents the country code of
the last relay either as a separate header, or as the _RELAYCOUNTRY_
header markup. When looking at only one mail it wouldn't make any
difference using TQM or RelayCountry, but I'm fond of statistics, and
since I already have tools for counting how much ham and spam each
rule has triggered on, my vote falls on TQM.


rule           spam   ham
TQMCUBE_W_US  18427   428
TQMCUBE_W_FR   5605    52
TQMCUBE_W_ES   5040     7
TQMCUBE_W_KR   3794     2
TQMCUBE_W_CN   3600     4
TQMCUBE_W_PL   3235     3
TQMCUBE_W_BR   2582     1
TQMCUBE_W_DE   2184   149
TQMCUBE_W_IT   2165     9
...

Regards,
Andreas



TQMcube Geo Zone config files

2006-09-09 Thread Andreas Pettersson
In case anybody is interested, I've compiled a config file for the geo
zone at TQM http://tqmcube.com/worldzone.php
It might not be of great use, but it is interesting to gather some
statistics of where the mails come from.


Files found here
http://anp.ath.cx/tqmcube/


Regards,
Andreas



Live Messenger Invitation with forged Received header?

2006-09-03 Thread Andreas Pettersson
I need some help with understanding why some of the below rules 
triggered on these headers..



Received: from baym-sm1.msgr.hotmail.com ([207.46.1.190])
   by mail.mydomain.com with esmtp
   (envelope-from [EMAIL PROTECTED])
   id 1GJcP7-00063q-JH
   for [EMAIL PROTECTED]; Sat, 02 Sep 2006 22:47:53 +0200
Received: from mail pickup service by baym-sm1.msgr.hotmail.com with
   Microsoft SMTPSVC;
   Sat, 2 Sep 2006 13:47:45 -0700
MIME-Version: 1.0
Content-Type: multipart/alternative;
   boundary=_=_NextPart_001_2QAIHCIKEOG.9E6CG57B

Date: Sat, 02 Sep 2006 13:41:39 Pacific Daylight Time
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
X-MSMessengerInvitationMailTemplateVersion: 2.9.12.5.0.02
Message-ID: [EMAIL PROTECTED]

   2.2 INVALID_DATE         Invalid Date: header (not RFC 2822)
   0.8 DATE_IN_PAST_06_12   Date: is 6 to 12 hours before Received: date
   2.3 FORGED_HOTMAIL_RCVD  Forged hotmail.com 'Received:' header found
   0.3 MIME_BOUND_NEXTPART  Spam tool pattern in MIME boundary


Why does SA 3.1.3 think that the hotmail.com Received header is forged?
As far as I can see it seems alright.
"Pacific Daylight Time" is perhaps not the right way to describe the
timezone, or is it?

And "Spam tool pattern in MIME boundary", what's that by the way?
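
On the timezone question: RFC 2822 wants the zone as a numeric offset such as -0700, and only tolerates a short list of obsolete abbreviations (UT, GMT, EST, PDT and so on). A spelled-out name is in neither group. The Python sketch below checks just that trailing zone; it is not the regular expression SA uses for INVALID_DATE, the function name is made up, and the obsolete single-letter military zones are left out for brevity.

```python
import re

# Obsolete named zones that RFC 2822 still accepts (military letters omitted).
OBS_ZONES = {"UT", "GMT", "EST", "EDT", "CST", "CDT", "MST", "MDT", "PST", "PDT"}

def zone_looks_valid(date_value):
    """Check only the zone that follows the time-of-day in a Date: value."""
    tail = re.search(r"\d{2}:\d{2}(?::\d{2})?\s+(.*)$", date_value.strip())
    if not tail:
        return False
    # Ignore a trailing comment such as "(PDT)".
    zone = re.sub(r"\s*\([^()]*\)\s*$", "", tail.group(1).strip())
    return bool(re.fullmatch(r"[+-]\d{4}", zone)) or zone.upper() in OBS_ZONES

print(zone_looks_valid("Sat, 02 Sep 2006 13:41:39 Pacific Daylight Time"))
print(zone_looks_valid("Sat, 2 Sep 2006 13:47:45 -0700"))
```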


Regards,
Andreas



Invalid date header

2006-09-02 Thread Andreas Pettersson

Hi. I got a mail with this Date header:
Date: Mon, 28 Aug 2006 09:23:11 +0200

which triggered this rule:
2.2 INVALID_DATE  Invalid Date: header (not RFC 2822)

What's wrong with it?  The  ?


Regards,
Andreas



Re: AWL confusion..

2006-08-29 Thread Andreas Pettersson

Anders Norrbring wrote:


I just got ridiculously confused..

I sent a mail to myself, testing some stuff, and of course it's in the 
same domain and network as the server.


I got:
9.6 AWL  AWL: From: address is in the auto white-list

Shouldn't mail in the AWL get a *negative* score? Or did I just mess 
my mind up?



http://wiki.apache.org/spamassassin/AwlWrongWay
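
In short, the AWL is a score averager rather than a whitelist: it pushes each message's score towards the sender's historical mean, so the adjustment is positive whenever this message scored lower than that sender's average. As far as I understand the AWL documentation, the adjustment is (mean - score) * factor, with auto_whitelist_factor defaulting to 0.5. The Python sketch below just does that arithmetic; the function name is made up, and the sample score and mean are invented numbers chosen to reproduce a 9.6 adjustment.

```python
def awl_adjustment(score, sender_mean, factor=0.5):
    """Points the AWL adds: moves `score` towards the sender's mean."""
    return (sender_mean - score) * factor

# Hypothetical case: this message scored -9.2 before AWL, while earlier
# mail from the same address and network averaged 10.0.
print(round(awl_adjustment(-9.2, 10.0), 1))
```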

Regards,
Andreas



Re: [Sare-users] (no subject)

2006-08-22 Thread Andreas Pettersson

SysAdmin wrote:

I wrote the following rule in an attempt to catch these but I've 
obviously made some error.  Can someone give me a little guidance as 
to where I went awry?


rawbody SWF_r_AMPGFX1   /\.(com|net)/\w+/\?90\amp/i



The forward slashes need to be escaped as well.
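
To illustrate: inside a /.../ delimited pattern every literal / has to be written as \/, otherwise the pattern ends at the first slash. The small Python helper below is hypothetical and only shows the mechanical change; it leaves already-escaped characters alone and does not touch anything else in the rule.

```python
def escape_slashes(pattern):
    """Backslash-escape every unescaped '/' in a regex body."""
    out = []
    escaped = False
    for ch in pattern:
        if escaped:                 # character after a backslash: copy as-is
            out.append(ch)
            escaped = False
        elif ch == "\\":
            out.append(ch)
            escaped = True
        elif ch == "/":
            out.append("\\/")
        else:
            out.append(ch)
    return "".join(out)

body = r"\.(com|net)/\w+/\?90\amp"   # the part between the delimiters
print("/" + escape_slashes(body) + "/i")
```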

Regards,
Andreas



Re: [Sare-users] (no subject)

2006-08-22 Thread Andreas Pettersson

Andreas Pettersson wrote:


SysAdmin wrote:

I wrote the following rule in an attempt to catch these but I've 
obviously made some error.  Can someone give me a little guidance as 
to where I went awry?


rawbody SWF_r_AMPGFX1   /\.(com|net)/\w+/\?90\amp/i



The forward slashes need to be escaped as well.

Regards,
Andreas



Sorry, this went to the wrong list..

Regards, Andreas



sa-update and VirusScannerTypeUpdates

2006-08-21 Thread Andreas Pettersson

Hi.

I keep seeing suggestions to use sa-update quite often on this list, but 
I thought it was no use doing so between releases according to this page:

http://wiki.apache.org/spamassassin/VirusScannerTypeUpdates
with these exact words in the end:

Daily and/or weekly updates aren't practical, because it takes weeks to 
evolve a scoreset for a release.


So, how often are there new rules available via sa-update?


Regards,
Andreas



Re: sa-update and VirusScannerTypeUpdates

2006-08-21 Thread Andreas Pettersson

Theo Van Dinter wrote:


On Mon, Aug 21, 2006 at 05:46:19PM +0200, Andreas Pettersson wrote:
 

I keep seeing suggestions to use sa-update quite often on this list, but 
I thought it was no use doing so between releases according to this page:

http://wiki.apache.org/spamassassin/VirusScannerTypeUpdates
with these exact words in the end:

Daily and/or weekly updates aren't practical, because it takes weeks to 
evolve a scoreset for a release.


So, how often are there new rules available via sa-update?
   


...

 Generally speaking, updates could occur every 15m w/ the current config, but
 realistically it may be once a week or so.  If setting up a cronjob to do
 updates, I'd probably go daily for now.



Thank you very much. Now I have a better idea of what to expect from 
sa-update.


Regards,
Andreas



Re: Using SA to prevent bouncing spam?

2006-08-15 Thread Andreas Pettersson

Ole Nomann Thomsen wrote:

I run a qmail frontend for a FirstClass system. The qmail accepts mail
for about 500 domains, hosted on the FirstClass system, and scans them
with SA. It then injects them into FirstClass. If the domain is known,
but the user is wrong (as in [EMAIL PROTECTED]) the mail is rejected on
smtp-level by FirstClass. Qmail then generates a bounce back to the
original sender. In case of spam, the original sender is faked and we
have backscatter.


I know qmail-ldap could be of some use here, but I have no way of setting
up an ldap-server that knows legitimate FirstClass addresses (FirstClass
itself could do it, but it is running at 99% capacity most of the time,
so no go. Exporting addresses from FirstClass won't do either, as there
are forum-addresses that won't export). This is a classic MTA frontend
problem, but I'm afraid I'm stuck with it.



While I don't really see why ldap isn't an option, even with a 99%
load, callout might be the solution.

However, I don't run qmail but here's how it works with exim

http://www.exim.org/exim-html-4.62/doc/html/spec_html/ch39.html#SECTcallver
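
Independent of Exim, the idea behind a callout can be sketched in Python: before accepting a message, the frontend opens an SMTP session to the backend, issues RCPT TO for the recipient and looks at the reply code, without ever sending message data. This is only an illustration under assumptions. The function names are made up, the backend host is whatever you pass in, and a real setup would cache the results.

```python
import smtplib

def recipient_accepted(conn, recipient):
    """Ask an already-connected SMTP server whether it would take `recipient`.

    `conn` is an smtplib.SMTP-like object; no message data is sent.
    """
    conn.helo()
    conn.mail("")                       # null sender, as used for bounces
    code, _reply = conn.rcpt(recipient)
    conn.rset()                         # abandon the transaction
    return 200 <= code < 300

def callout(host, recipient, timeout=30):
    """Open a connection to `host` (the backend) and verify one recipient."""
    with smtplib.SMTP(host, 25, timeout=timeout) as conn:
        return recipient_accepted(conn, recipient)
```

If the backend answers 5xx to RCPT TO, the frontend can reject at SMTP time instead of accepting and bouncing later.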


Regards,
Andreas



Re: Using SA to prevent bouncing spam?

2006-08-15 Thread Andreas Pettersson

Ole Nomann Thomsen wrote:


On 15.08.2006 at 12:01, Andreas Pettersson [EMAIL PROTECTED] wrote:

While I don't really see why ldap isn't an option, even with a 99%
load, callout might be the solution.

However, I don't run qmail but here's how it works with exim

http://www.exim.org/exim-html-4.62/doc/html/spec_html/ch39.html#SECTcallver 




Yeah, that is pretty neat. But the Firstclass system is running at 99%
capacity on the E-mail injection too. I mean, we are really pumping it
in, trying to level the peak period and everything.

Performing callouts will probably cause it to emit strange noises and
smoke.



Why would it?
It would generate the same amount of connect attempts to FC as it 
already does today, but the spam gets rejected instead of accepted and 
then bounced.



Regards,
Andreas



SPF softfail when mail has been forwarded from another domain

2006-08-13 Thread Andreas Pettersson

Hi all.

I've noticed a problem. We receive a few legit mails that have travelled
through a forwarder. That causes some problems for the SPF check.
Since the mail claiming to be from hotmail clearly doesn't arrive
directly from one of the machines listed in hotmail's spf record, the
SPF_SOFTFAIL kicks in another 1.4 points.
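
The effect can be sketched in Python. SPF compares the IP address of the host that connects to us with the networks the envelope sender's domain has published, so a forwarder in the middle fails the check even though the mail is genuine. The networks and addresses below are documentation ranges invented for the example, not hotmail's real record, and the function name is made up.

```python
import ipaddress

def spf_ip_authorized(connecting_ip, authorized_networks):
    """True if the connecting IP falls inside any published network."""
    ip = ipaddress.ip_address(connecting_ip)
    return any(ip in ipaddress.ip_network(net) for net in authorized_networks)

SENDER_NETWORKS = ["192.0.2.0/24"]     # hypothetical published SPF networks
DIRECT_RELAY = "192.0.2.10"            # one of the sender's own servers
FORWARDER = "198.51.100.25"            # the forwarding domain's server

print(spf_ip_authorized(DIRECT_RELAY, SENDER_NETWORKS))
print(spf_ip_authorized(FORWARDER, SENDER_NETWORKS))
```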


What can I do to prevent this from happening?
Is there any generic solution, or am I bound to know from which servers
I might receive forwarded mails?


I'm running SA 3.1.3 on FreeBSD.
Below is a snip of a mail that got hit by softfail because of forwarding.


Regards,
Andreas




Received: from mail.forwardingdomain.com
 by mail.mydomain.com with smtp
 (envelope-from [EMAIL PROTECTED])
 for [EMAIL PROTECTED]; Fri, 11 Aug 2006 14:54:13 +0200
Received: (qmail 13341 invoked by uid 729); 11 Aug 2006 12:54:00 -
Delivered-To: [EMAIL PROTECTED]
Received: (qmail 13326 invoked from network); 11 Aug 2006 12:53:59 -
Received: from bay0-omc3-s32.bay0.hotmail.com
 by mail.forwardingdomain.com with SMTP; 11 Aug 2006 12:53:59 -
Received: from hotmail.com by bay0-omc3-s32.bay0.hotmail.com;
 Fri, 11 Aug 2006 05:53:57 -0700
Received: from mail pickup service by hotmail.com;
 Fri, 11 Aug 2006 05:53:57 -0700
Received: from 64.4.19.200 by by109fd.bay109.hotmail.msn.com with HTTP;
 Fri, 11 Aug 2006 12:53:54 GMT
X-Originating-IP: [zz.zz.zz.zz]
X-Originating-Email: [EMAIL PROTECTED]
X-Sender: [EMAIL PROTECTED]
From: User [EMAIL PROTECTED]
To: [EMAIL PROTECTED]




Re: SPF softfail when mail has been forwarded from another domain

2006-08-13 Thread Andreas Pettersson

Loren Wilton wrote:

I've noticed a problem. We receive a few legit mails that have
travelled through a forwarder. That causes some problems for the SPF
check.
Since the mail claiming to be from hotmail clearly doesn't arrive
directly from one of the machines listed in hotmail's spf record, the
SPF_SOFTFAIL kicks in another 1.4 points.


What can I do to prevent this from happening?



What you've described is the basic problem with SPF.  It works fine as
long as things don't get forwarded, or otherwise come from
unauthorized sources - like the salesman closing a deal down at the
corner wireless hotspot and sending the deal in directly from his laptop.


There are only three things you can do if this is causing you a problem:
1 Disable SPF checks
2 Reduce the score on some or all of the SPF checks
3 Whitelist or otherwise provide a positive adjustment for specific 
senders.


None of those are particularly attractive things to do.  However, you 
might have to do one of them.


Now, there is another consideration.  The SPF check is only adding 1.4 
points.  If your limit is the default 5 points, then you need to hit a 
few other rules before the mail becomes a spam.  If you have taken the 
threshold down to something like 2.0 - well, there's your problem.  
The SPF rules (and all the rules) were scored for a threshold of 5 
points.  If you are using a lower threshold you should reduce all of 
the rule scores proportionally. Since that is a big job, it is simpler 
to just leave the threshold at 5.
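
As a rough illustration of what scaling proportionally means, here is the arithmetic in Python. The helper name is made up, and the 2.0 threshold is just the example figure from the paragraph above.

```python
def rescale_score(score, new_threshold, stock_threshold=5.0):
    """Scale a rule score tuned for the stock threshold to a different one."""
    return score * new_threshold / stock_threshold

# SPF_SOFTFAIL at 1.4 points, rescaled for a threshold of 2.0 instead of 5.0.
print(round(rescale_score(1.4, 2.0), 2))
```

Every rule would need the same treatment, which is why leaving the threshold alone is the easier route.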


   Loren


Thanks for an excellent answer, Loren.
I have kept the limit at 5 points, so there's still a pretty comfortable
margin, but as long as users continue to write subjects with caps and
exclamation marks (like IMPORTANT!!!), together with some html-only,
rfc-ignorant hits and gif attachments, there's also the risk of FP.


Looking at the 3rd option, what would be an effective way to whitelist 
(or subtract some score from) specific relays?



Regards,
Andreas