Re: Reporting scams to fraudwatchinternational

2005-05-02 Thread Chris
On Sunday 01 May 2005 07:46 am, Chris wrote:


 Nope, I finally managed to get an email off to the tech-support contact to
 whom the domain is registered. I'll have to see what happens from there.

 Chris

Found out the entire fraudwatchinternational site had been down for over 
32 hours.  It appears to be mostly back up now; I've just forwarded a PayPal 
phish to them and will see what happens.

-- 
Chris
Registered Linux User 283774 http://counter.li.org
20:07:30 up 4 days, 14:09, 1 user, load average: 0.28, 0.25, 0.19
Mandriva Linux 10.1 Official, kernel 2.6.8.1-12mdk

If you can lead it to water and force it to drink, it isn't a horse.



Re: Folder redirection

2005-05-02 Thread Loren Wilton
 Apologies if I missed this rather simple question in the FAQ's, but I
really did look.

Good reason for you to miss it, it isn't there.

SA doesn't route messages, it only filters them.  Something else looks at
the filtering and decides what to do with the message.

As someone else suggested, this may be procmail on your system.  Depending on
exactly what you are using for mail, it could also be one of several
other things.  You may have to look around in your mail system to find
where the routing is really occurring.

Loren
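
To illustrate the split Loren describes (SpamAssassin only tags the message; a separate delivery step acts on the tag), here is a minimal Python sketch. It is not procmail syntax and not part of SpamAssassin. The function name and folder names are hypothetical; the only thing taken from SpamAssassin is the X-Spam-Flag header it adds to messages it considers spam.

```python
from email import message_from_string


def choose_folder(raw_message: str, spam_folder: str = "Junk", inbox: str = "INBOX") -> str:
    """Return the folder a delivery agent might pick for an already-filtered message.

    Assumes the filter has added an "X-Spam-Flag: YES" header to spam.
    Folder names are placeholders for illustration only.
    """
    msg = message_from_string(raw_message)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    return spam_folder if flag == "YES" else inbox
```

In a real setup this decision lives in whatever delivers the mail (for example a .procmailrc), which is why a leftover per-user file can redirect mail even when the system-wide config looks right.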



Re: Folder redirection

2005-05-02 Thread Mark Harwood




Loren, Greg,

Thanks. The problem was that I had set up a single-user config prior
to the system-wide config. Greg, my system-wide config looks almost
exactly as you described. Loren, your comment about procmail doing the
move reminded me that the single-user set up included a .procmailrc in
$HOME. That file, of course, had the redirection in it. I'm set.

Thanks again,
Mark

Loren Wilton wrote:

 Apologies if I missed this rather simple question in the FAQ's, but I
 really did look.

Good reason for you to miss it, it isn't there.

SA doesn't route messages, it only filters them.  Something else looks at
the filtering and decides what to do with the message.

As someone else suggested, this may be procmail on your system.  Depending on
exactly what you are using for mail, it could also be one of several
other things.  You may have to look around in your mail system to find
where the routing is really occurring.

Loren


-- 
Mark Harwood

www.MarkHarwood.com





Re: The highest score?

2005-05-02 Thread jdow
I cheat. I have a couple personal rules guaranteed to hit spam and no
ham whatsoever. They hit 100. MOM Agent is guaranteed spam. It seems
to hit 200. So it's not fair. I have, however, seen over 100 with pure
SARE rule sets because so many of them were hit.

{^_-}
- Original Message - 
From: Roman Serbski [EMAIL PROTECTED]


Hi all,

What was the highest score you've ever seen? I received a message
yesterday that was scored with 51.9(!). =)

SA in action: ;-)

Sat, 30 Apr 2005 19:45:21 KGST:80593: SA: REPORT hits = 51.9/3.5

4.1 MIME_BOUND_DD_DIGITS Spam tool pattern in MIME boundary
1.2 SUBJ_HAS_SPACES Subject contains lots of white space
3.5 HELO_DYNAMIC_IPADDR2 Relay HELO'd using suspicious hostname (IP addr 2)
3.8 MSGID_SPAM_CAPS Spam tool Message-Id: (caps variant)
0.1 RCVD_BY_IP Received by mail server with no name
0.0 FROM_ILLEGAL_CHARS From contains too many raw illegal characters
2.9 SUBJ_ILLEGAL_CHARS Subject contains too many raw illegal characters
2.1 HEAD_ILLEGAL_CHARS Header contains too many raw illegal characters
0.5 HTTP_ESCAPED_HOST URI: Uses %-escapes inside a URL's hostname
0.2 HTTP_EXCESSIVE_ESCAPES URI: Completely unnecessary %-escapes inside a
URL
2.0 HTML_TAG_EXIST_MARQUEE BODY: HTML has marquee tag
0.0 HTML_TEXT_AFTER_HTML BODY: HTML contains text after HTML close tag
0.1 HTML_TEXT_AFTER_BODY BODY: HTML contains text after BODY close tag
0.0 HTML_MESSAGE BODY: HTML included in message
0.0 HTML_FONT_FACE_BAD BODY: HTML font face is not a word
0.1 HTML_FONT_BIG BODY: HTML tag for a big font size
0.8 HTML_FONT_LOW_CONTRAST BODY: HTML font color similar to background
0.1 MPART_ALT_DIFF BODY: HTML and text parts are different
0.0 HTML_SHOUTING3 BODY: HTML has very strong shouting markup
0.1 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence level above
50% [cf: 100]
0.0 HTML_NONELEMENT_00_10 BODY: 0% to 10% of HTML elements are non-standard
1.9 BAYES_99 BODY: Bayesian spam probability is 99 to 100% [score: 1.]
0.2 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
0.5 HTML_EVENT_UNSAFE BODY: HTML contains unsafe auto-executing code
0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
1.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
0.0 RCVD_IN_SORBS_HTTP RBL: SORBS: sender is open HTTP proxy server
[200.89.154.29 listed in dnsbl.sorbs.net]
0.4 RCVD_IN_NJABL_PROXY RBL: NJABL: sender is an open proxy
[200.89.154.29 listed in combined.njabl.org]
3.1 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
[200.89.154.29 listed in sbl-xbl.spamhaus.org]
2.0 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP
address [200.89.154.29 listed in dnsbl.sorbs.net]
3.8 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
[http://dsbl.org/listing?200.89.154.29]
0.1 RCVD_IN_NJABL_DUL RBL: NJABL: dialup sender did non-local SMTP
[200.89.154.29 listed in combined.njabl.org]
1.0 URIBL_SBL Contains an URL listed in the SBL blocklist [URIs: ourk2.com]
1.5 URIBL_WS_SURBL Contains an URL listed in the WS SURBL blocklist
[URIs: ourk2.com]
3.2 URIBL_OB_SURBL Contains an URL listed in the OB SURBL blocklist
[URIs: ourk2.com]
4.1 RCVD_DOUBLE_IP_SPAM Bulk email fingerprint (double IP) found
0.6 FORGED_OUTLOOK_HTML Outlook can't send HTML message only
2.4 MIME_HTML_ONLY_MULTI Multipart message only has text/html MIME parts
0.0 UPPERCASE_25_50 message body is 25-50% uppercase
0.0 MISSING_MIMEOLE Message has X-MSMail-Priority, but no X-MimeOLE
3.9 FORGED_MUA_OUTLOOK Forged mail pretending to be from MS Outlook

Sat, 30 Apr 2005 19:45:21 KGST:80593: SA: yup, this smells like SPAM -
hits=51.9 - rejecting message...




Re: SA + SQL + per-user prefs

2005-05-02 Thread Arvinn Løkkebakken

Gerald V. Livingston II wrote:
OK, this is probably just an over-cautious MySQL question.
All of the examples I look at for setting up per-user prefs using SQL show
creating a table that looks like:
username  pref  value
So, if I want to allow users to control 5 values I would have a table that
looks like this:
user1  pref1  value1
user1  pref2  value2
user1  pref3  value3
user1  pref4  value4
user1  pref5  value5
user2  pref1  value1
user2  pref2  value2
user2  pref3  value3
user2  pref4  value4
user2  pref5  value5
user3 . etc.
When talking about importing a userbase of 6000+ that's gonna be a TALL
table really fast.
 

30,000 rows (5 * 6,000) isn't a tall SQL table at all, IMHO.
Arvinn
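
To put the row count in perspective, here is a rough Python sketch that builds the "tall" layout for 6,000 users with 5 preferences each and looks up one user's settings. It uses an in-memory SQLite database rather than MySQL purely so the example is self-contained; the table name, column names, index name, and helper functions are assumptions modelled on the username/pref/value layout quoted above, not SpamAssassin's shipped schema.

```python
import sqlite3


def build_prefs(n_users: int = 6000, n_prefs: int = 5) -> sqlite3.Connection:
    """Create an in-memory 'tall' preference table: one row per (user, pref)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userpref ("
        " username TEXT NOT NULL,"
        " pref     TEXT NOT NULL,"
        " value    TEXT NOT NULL)"
    )
    # An index on username keeps per-user lookups cheap as the table grows.
    conn.execute("CREATE INDEX idx_userpref_username ON userpref (username)")
    rows = (
        (f"user{u}", f"pref{p}", f"value{p}")
        for u in range(1, n_users + 1)
        for p in range(1, n_prefs + 1)
    )
    conn.executemany("INSERT INTO userpref VALUES (?, ?, ?)", rows)
    return conn


def prefs_for(conn: sqlite3.Connection, username: str) -> dict:
    """Return all preferences of one user as a {pref: value} dict."""
    cur = conn.execute(
        "SELECT pref, value FROM userpref WHERE username = ?", (username,)
    )
    return dict(cur.fetchall())
```

With the index in place, a lookup touches only the handful of rows belonging to that user, so the total height of the table matters much less than it might first appear.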


system-wide AWL in SQL?

2005-05-02 Thread Arvinn Løkkebakken
Hi. Is it possible to keep my system-wide AWL setup when using 'spamc -u 
recipient' with user preference and AWL stored in SQL?
I currently have a system-wide AWL of more than a million rows that it 
would be a pity to lose.
... or should I switch to personal AWLs anyway?

Arvinn


Re: Question about Bayes training - mozilla specifically

2005-05-02 Thread Jo
Bookworm wrote:
I've read through the archives several times, and hoped that over the 
last year or so someone would build the functionality, or at least 
mention it one way or another - I haven't seen it.

Is there any way to take an already trained Mozilla bayes structure 
and hand it directly off to SpamAssassin?  For me, at least, that 
would eliminate almost all of the spam my server is receiving - 
Mozilla spots it instantly, but SpamAssassin is missing at least half.

Troy Belding
Bookworm Computing

Mozilla stores its mail in mbox format, so you can simply use your good 
folders (one mbox each) for training HAM and your Junk folders for 
training SPAM. Just go and have a look in the file system, where Mozilla 
stores its files. mbox-files typically don't have an extension.

Jo
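
Before feeding a Mozilla folder to the trainer, it can help to confirm that the file really parses as an mbox. The following Python sketch does only that: it opens a folder file with the standard-library mailbox module and lists the subjects it finds. The function name is hypothetical and the path you pass in depends on where your Mozilla profile lives; this is a sanity check, not a replacement for sa-learn.

```python
import mailbox


def list_subjects(mbox_path: str) -> list:
    """Return the Subject header of every message in an mbox file.

    create=False makes a mistyped path raise an error instead of
    silently creating an empty mailbox.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [msg.get("Subject", "") for msg in box]
    finally:
        box.close()
```

If the list looks like your good mail (or your junk), the same file should be usable as one mbox per folder for ham or spam training, as Jo suggests.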


Raising the score...

2005-05-02 Thread Kevin Morwood
Hello,
I have an old email address that a few contacts still use to reach me. 
I've tried to get everyone up to date on the new address but no luck. 
That's not really the issue though...

The reason I changed addresses was that the spam that was coming in was 
all addressed to the old address.  I see that SA has a concept of 
'blacklist_to' but that will probably be overkill...right?

If I set up whitelists for the people who I know...and who still use my 
old address...and blacklist all other mail that is addressed to this 
address...will that work?

Is there a better way...besides begging these contacts to finally update 
their address books?  :)

TIA,
Kevin


RE: system-wide AWL in SQL?

2005-05-02 Thread Philipp Snizek
 

 Hi. Is it possible to keep my system-wide AWL setup when 
 using 'spamc -u recipient' with user preference and AWL stored in
SQL?

Yes. Follow the instructions in the README files.
For user prefs and MySQL, see
http://wiki.apache.org/spamassassin/UsingSQL

Philipp



Observation on secondary MX

2005-05-02 Thread Kevin Peuhkurinen
About a month ago, there was a discussion on the list about how spammers 
specifically target secondary MX records.   After reading I verified 
that indeed 99% of the mail that flowed through my store-and-forward 
secondary mail server was spam.   So, I removed the second MX record 
from my DNS zone, but did not actually decommission the server itself.

The interesting thing is that now, about a month later, I'm still seeing 
spam going to that server!   I wonder if the spammers have cached the 
old MX entry or if they have some database of mail server addresses and 
what domains they will accept email for.



Re: Observation on secondary MX

2005-05-02 Thread Niek
On 5/2/2005 1:48 PM +0200, Kevin Peuhkurinen wrote:
spam going to that server!   I wonder if the spammers have cached the 
old MX entry
Yup.
Niek


Web based helpdesk tool for SA?

2005-05-02 Thread Kevin Peuhkurinen
Lately I've been thinking that something that would really be useful for 
SA is a web based helpdesk tool.   The idea is to help out companies 
that use SA as a proxy in front of their Notes/Exchange/Groupwise 
servers (such as my own).   The MTA on the SpamAssassin box would 
quarantine spam on the server.   Then, when an end user complained about 
not getting an email, the internal helpdesk could use this tool to 
search through the quarantine for the false positive and have it 
delivered (and possibly even run through sa-learn --ham as well) with a 
click or two.

I don't think that this would be hard to do but before I go dusting off 
my PHP for Dummies book, does anyone know if something like this 
already exists?

Thanks,
Kevin


Re: system-wide AWL in SQL?

2005-05-02 Thread Arvinn Løkkebakken

Philipp Snizek wrote:

 

Hi. Is it possible to keep my system-wide AWL setup when 
using 'spamc -u recipient' with user preference and AWL stored in SQL?

Yes. Follow the instructions in the readme files.
for user prefs and mysql see
http://wiki.apache.org/spamassassin/UsingSQL

Philipp
 

Already read it. System-wide AWL is not discussed there, is it?
Arvinn


RE: Web based helpdesk tool for SA?

2005-05-02 Thread Chris Santerre


-Original Message-
From: Kevin Peuhkurinen [mailto:[EMAIL PROTECTED]
Sent: Monday, May 02, 2005 7:59 AM
To: users@spamassassin.apache.org
Subject: Web based helpdesk tool for SA?


Lately I've been thinking that something that would really be useful for 
SA is a web based helpdesk tool.   The idea is to help out companies 
that use SA as a proxy in front of their Notes/Exchange/Groupwise 
servers (such as my own).   The MTA on the SpamAssassin box would 
quarantine spam on the server.   Then, when an end user complained about 
not getting an email, the internal helpdesk could use this tool to 
search through the quarantine for the false positive and have it 
delivered (and possibly even run through sa-learn --ham as well) with a 
click or two.

I don't think that this would be hard to do but before I go dusting off 
my PHP for Dummies book, does anyone know if something like this 
already exists?

IMHO, this is something that is needed. However, it would have to be
separate user-based quarantines. Otherwise there would be privacy issues with
one big quarantine that everyone could sift through. 

This does exist. Many people have done it, and slapped a commercial price
tag on it :) I was hoping someone would create it for SA under a GPL license. :)

--Chris 


RE: The highest score?

2005-05-02 Thread Chris Santerre


-Original Message-
From: jdow [mailto:[EMAIL PROTECTED]
Sent: Monday, May 02, 2005 1:41 AM
To: users@spamassassin.apache.org
Subject: Re: The highest score?


I cheat. I have a couple personal rules guaranteed to hit spam and no
ham whatsoever. They hit 100. MOM Agent is guaranteed spam. It seems
to hit 200. So it's not fair. I have, however, seen over 100 with pure
SARE rule sets because so many of them were hit.

{^_-}
- Original Message - 
From: Roman Serbski [EMAIL PROTECTED]


Hi all,

What was the highest score you've ever seen? I received a message
yesterday that was scored with 51.9(!). =)

SA in action: ;-)

Yeah, I'm running SARE, plus beta rules, plus all the URIBL lists (including
those not officially announced), and some personal rules. I think my average
spam score is something like 25 now. 

--Chris


Raising the score...

2005-05-02 Thread Kevin Morwood
Hello,
sorry for the repost...it ended up as a reply to something else...a SUE 
on my part...

I have an old email address that a few contacts still use to reach me. 
I've tried to get everyone up to date on the new address but no luck. 
That's not really the issue though...

The reason I changed addresses was that the spam that was coming in was 
all addressed to the old address.  I see that SA has a concept of 
'blacklist_to' but that will probably be overkill...right?

If I set up whitelists for the people who I know...and who still use my 
old address...and blacklist all other mail that is addressed to this 
address...will that work?

Is there a better way...besides begging these contacts to finally update 
their address books?  :)

TIA,
Kevin


Re: Web based helpdesk tool for SA?

2005-05-02 Thread Kevin Peuhkurinen
Chris Santerre wrote:
IMHO, this is something that is needed. However, it would have to be
separate user-based quarantines. Otherwise there would be privacy issues with
one big quarantine that everyone could sift through. 

 

I believe that most commercial anti-spam systems provide a means for 
administrators to look at a subset of the information about messages in 
the quarantine but not the text of the email itself.   I agree that 
there is definitely still a privacy issue with this, but there are all 
kinds of issues with allowing end users to manage their own personal 
quarantines as well.   Corporations need to decide which issues are 
most important to them and then have the tools necessary to implement 
solutions that work for them.   Therefore, a product that could allow 
both individual end-user access to quarantines as well as admin access 
to entire quarantines (but not message contents) would probably be of 
the greatest value.

Someone pointed me to MailWatch, which looks like a good starting point 
but is specifically tied to MailScanner.   This hypothetical 
product would need to be modular in order to accommodate a range of 
configurations.




RE: INVALID_MSGID hitting improperly?

2005-05-02 Thread Ring, John C
So, looking at:

/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham/O=A
lcate l Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/@MHS

-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham-O=A
lcate l Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-@MHS

Looking at the rule, I'm surprised they aren't BOTH declared invalid.
[RFC quoting deleted on why a space isn't legal in msg-id]

Ok, I buy that.  And as another poster pointed out, they both were ruled
that way for him.  You see, we run SpamAssassin on our perimeter MTA so that
we can reject messages that score a 10 or higher at SMTP time, while messages
scoring 5 or higher are marked as spam but still delivered.  All the specific rules for
rejected messages are logged, but not for accepted messages.  I'd assumed
that the past message log I looked at, since it wasn't even marked as spam,
wouldn't have had enough of a negative score to overcome the 20 I'd put the
INVALID_MSGID rule at.  But I see that assumption must have been
incorrect...

[Different poster] BTW, why have *any* single rule scored at 20? Especially
this one.

To be able to not accept obvious spam at the perimeter, this machine is
our incoming SMTP gateway.  However, after it accepts a message for
delivery, it still must pass the message off to our Internet firewall for
delivery.  The firewall, as configured from the vendor, has a rule to reject
e-mail with invalid message IDs.  Assuming both would reject/accept
identically for a given msg-id, it made sense to reject it right away,
rather than accepting it for delivery and then having the firewall end up
trying to deliver an NDA message to the sender, which is what did occur
frequently before I raised that rule to 20.

However, it seems the rule on the firewall doesn't mind spaces in the
msg-id, as it did let the message in once I restored the normal score to
INVALID_MSGID.  Which makes sense from a firewall perspective, I suppose.
To them, they're not trying to prevent spam, but possible malicious headers
which might cause internal e-mail machines to be compromised by such things
as buffer overflows when processing the e-mail.  In that light, it's hard to
imagine a space character causing much of an issue with any MTA.
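
As a rough illustration of the "no spaces in a msg-id" point, here is a deliberately simplified Python check. It is not the INVALID_MSGID rule from SpamAssassin and it does not implement the full RFC grammar; the pattern and function name are assumptions made for this sketch, and the sample IDs in the usage note are made up.

```python
import re

# Simplified shape: <left@right>, with no whitespace, angle brackets,
# or extra "@" characters on either side. Real msg-id syntax is richer.
_MSGID_RE = re.compile(r"^<[^\s<>@]+@[^\s<>@]+>$")


def looks_like_valid_msgid(value: str) -> bool:
    """Return True if value has the simple <left@right> shape without whitespace."""
    return bool(_MSGID_RE.match(value.strip()))
```

For example, looks_like_valid_msgid("<abc123@example.com>") is True, while looks_like_valid_msgid("<abc 123@example.com>") is False because of the embedded space. A firewall that only cares about dangerous header content could reasonably apply a looser test than this.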


Re: system-wide AWL in SQL?

2005-05-02 Thread Michael Parker
On Mon, May 02, 2005 at 02:54:31PM +0200, Arvinn Løkkebakken wrote:
 
 Yes. Follow the instructions in the readme files.
 for user prefs and mysql see
 http://wiki.apache.org/spamassassin/UsingSQL
  
 
 Already read it. System-wide AWL is not discussed there, is it?
 

Probably not, there is a concept in 3.1 that allows you to do
systemwide or groupwide AWL dbs in SQL, similar to how you can
currently do it in Bayes (via override_username).

Michael




syntax error

2005-05-02 Thread jj-ml
Hi,

I have just installed the latest release of SpamAssassin (3.0.3 from the
tarball) on a Debian system.
Everything seems to work fine, but I get a syntax error during the URIDNSBL
test when I run spamd -D:
...
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8496a28)
implements 'parsed_metadata'
debug: dns_available set to yes in config file, skipping test
debug: decoding: no encoding detected
debug: URIDNSBL: domains to query:
debug: is Net::DNS::Resolver available? yes
debug: Net::DNS version: 0.49




debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8496a28)
implements 'check_post_dnsbl'
debug: running meta tests; score so far=2.553
Failed to run meta SpamAssassin tests, skipping some: syntax error at (eval
48) line 356, near ) {
syntax error at (eval 48) line 365, near ;
}
debug: running header regexp tests; score so far=2.553
debug: running body-text per-line regexp tests; score so far=2.553
debug: running uri tests; score so far=2.553
debug: running raw-body-text per-line regexp tests; score so far=2.553
debug: running full-text regexp tests; score so far=2.553
debug: Running tests for priority: 1000
debug: running meta tests; score so far=2.553
debug: running header regexp tests; score so far=2.553

I don't know which file to check.
If someone could give me a hint, it would be appreciated.

Thanks in advance.
Julien




Re: Question about Bayes training - mozilla specifically

2005-05-02 Thread Bookworm
Jo wrote:
Bookworm wrote:
I've read through the archives several times, and hoped that over the 
last year or so someone would build the functionality, or at least 
mention it one way or another - I haven't seen it.

Is there any way to take an already trained Mozilla bayes structure 
and hand it directly off to SpamAssassin?  For me, at least, that 
would eliminate almost all of the spam my server is receiving - 
Mozilla spots it instantly, but SpamAssassin is missing at least half.

Troy Belding
Bookworm Computing

Mozilla stores its mail in mbox format, so you can simply use your 
good folders (one mbox each) for training HAM and your Junk folders 
for training SPAM. Just go and have a look in the file system, where 
Mozilla stores its files. mbox-files typically don't have an extension.

Jo

The issue is not so much that - I've dumped all my ham/spam through
spamassassin - it's still not as good.  The only thing I can see that's
different is that Mozilla MUST have its own bayes database that isn't
dependent upon the actual email folders themselves. (I stopped storing
all the junk mail when I reached about 15,000).  I have no clue where
that is, but I thought maybe someone here did, and knew how to convert
it to something that spamassassin could use.
Oh well - I'll try the mbox deal later.  I only have about 80,000 emails
I could process through..
Thanks!
Troy



Re: Observation on secondary MX

2005-05-02 Thread List Mail User
...

About a month ago, there was a discussion on the list about how spammers 
specifically target secondary MX records.   After reading I verified 
that indeed 99% of the mail that flowed through my store-and-forward 
secondary mail server was spam.   So, I removed the second MX record 
from my DNS zone, but did not actually decommission the server itself.

The interesting thing is that now, about a month later, I'm still seeing 
spam going to that server!   I wonder if the spammers have cached the 
old MX entry or if they have some database of mail server addresses and 
what domains they will accept email for.

Yes and yes.  I still receive (and trap as spam) email sent for
domains I used to secondary for, but haven't in some cases for almost a
year.  They *must* keep databases, if not for the domains, at least for
the email accounts themselves.  (100% of the email sent to these domains
and/or accounts is spam).

Paul Shupak
[EMAIL PROTECTED]


Re: Web based helpdesk tool for SA?

2005-05-02 Thread Andy Jezierski

Kevin Peuhkurinen [EMAIL PROTECTED]
wrote on 05/02/2005 06:59:01 AM:

 Lately I've been thinking that something that would really be useful for
 SA is a web based helpdesk tool.  The idea is to help out companies
 that use SA as a proxy in front of their Notes/Exchange/Groupwise
 servers (such as my own).  The MTA on the SpamAssassin box would
 quarantine spam on the server.  Then, when an end user complained about
 not getting an email, the internal helpdesk could use this tool to
 search through the quarantine for the false positive and have it
 delivered (and possibly even run through sa-learn --ham as well) with a
 click or two.
 
 I don't think that this would be hard to do but before I go dusting off
 my PHP for Dummies book, does anyone know if something like this
 already exists?
 
 Thanks,
 Kevin
 
 
Take a look at Maia Mailguard. It's a pretty slick tool, and there's a lot
you can do with it. Users can control their own quarantine, and I believe
an admin can also do things for a user. It's been a while since I've looked
at it, so I don't recall all the features.

Andy

RE: Blacklist Not Working

2005-05-02 Thread Ron Shuck
First, Thanks for the help.

Craig noticed that the rule ALL_TRUSTED was matched. There was a
potential issue with Trusted Path if trusted_networks was not
configured. I tried that. The final mail server is Exchange, and I am
having a hard time getting the headers back from the users.

I posted a link to the Trusted Path issue in my response to Craig.

Thanks again,


Ron Shuck, CISSP, GCIA, CCSE - Managing Consultant
Buchanan Associates - People. Process. Technology.

-Original Message-
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 29, 2005 11:56 AM
To: Ron Shuck
Cc: Craig McLean; users@spamassassin.apache.org
Subject: Re: Blacklist Not Working

Ron Shuck wrote:

Here is the log. I don't have the message, but as you can see it did 
not match the blacklist.

---log--
Apr 24 04:39:43 mail postfix/smtpd[25746]: connect from castile.calmra.com[72.11.146.117]
Apr 24 04:39:44 mail postfix/smtpd[25746]: AE20883C: client=castile.calmra.com[72.11.146.117]
Apr 24 04:39:45 mail postfix/cleanup[26437]: AE20883C: message-id=[EMAIL PROTECTED]
Apr 24 04:39:45 mail postfix/qmgr[4304]: AE20883C: from=[EMAIL PROTECTED], size=2034, nrcpt=1 (queue active)
Apr 24 04:39:45 mail spamd[14218]: connection from localhost.localdomain [127.0.0.1] at port 48918
Apr 24 04:39:45 mail spamd[14218]: info: setuid to filter succeeded
Apr 24 04:39:45 mail spamd[14218]: processing message [EMAIL PROTECTED] for filter:501.
Apr 24 04:39:46 mail spamd[14218]: clean message (4.8/5.0) for filter:501 in 1.2 seconds, 2000 bytes.
Apr 24 04:39:46 mail spamd[14218]: result: .  4 - ALL_TRUSTED,AWL,BAYES_20,DNS_FROM_AHBL_RHSBL,HTML_50_60,HTML_IMAGE_ONLY_12,HTML_IMAGE_RATIO_02,HTML_MESSAGE,MIME_HTML_MOSTLY,MPART_ALT_DIFF,URIBL_OB_SURBL,URIBL_SBL,URIBL_WS_SURBL scantime=1.2,size=2000,mid=[EMAIL PROTECTED],bayes=0.0627053670923895,autolearn=no

local.cf snippet
blacklist_from  [EMAIL PROTECTED]
  

snip

Ok, now what did the headers in the message look like? The from quoted
in your logfile is the envelope, which might not have been present in
the message at the time SA saw it.

SA doesn't get the envelope directly, so that from is completely
irrelevant unless your MTA or MDA inserted it into a Return-Path: header
before SpamAssassin got called.
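
The difference between the envelope sender and what is actually in the message can be checked by looking at the headers directly. This Python sketch parses a raw message and reports the address in From: and the address in Return-Path: (if the MTA or MDA added one). The function name and the sample addresses in the usage note are made up for illustration; it does not reproduce how SpamAssassin itself evaluates blacklist_from.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_addresses(raw_message: str) -> dict:
    """Extract the bare addresses from the From: and Return-Path: headers.

    A missing header yields an empty string, which is the case Matt
    describes when the envelope sender was never written into the message.
    """
    msg = message_from_string(raw_message)
    header_from = parseaddr(msg.get("From", ""))[1].lower()
    return_path = parseaddr(msg.get("Return-Path", ""))[1].lower()
    return {"from": header_from, "return_path": return_path}
```

For a message with `Return-Path: <bounce@sender.example>` and `From: "News" <news@brand.example>`, the two values differ, so a blacklist entry written against the address seen in the MTA log may never match what the filter sees in the headers.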





spamd log error

2005-05-02 Thread Derril Hedk
Hello,

I am receiving the following errors every time mail is processed by
spamd. Any ideas on a solution or what the problem is?

Derril H

May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 321, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 322, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 322, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 321, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 322, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
hash element at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 322, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
pattern match (m//) at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 210, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
pattern match (m//) at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 212, GEN2 line 542.
May  2 08:04:52 admin2 spamd[19328]: Use of uninitialized value in
concatenation (.) or string at
/usr/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Message/Metadata/Received.pm
line 213, GEN2 line 542.
May  2 08:04:53 admin2 spamd[19328]: error: Can't locate
Net/DNS/RR/A.pm in @INC (@INC contains: ../lib
/usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.1
/usr/lib/perl5/5.8.1/i386-linux-thread-multi /usr/lib/perl5/5.8.1
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.1
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl) at (eval
48) line 3, GEN2 line 542._ No such file or directory, continuing


Re: Upgrading to 3.0.3 - *CPAN* indexes stale?

2005-05-02 Thread Tom Q. Citizen
Dan O'Brien wrote:
Now that I'm trying to update my production server from 3.0.2 to 3.0.3 
(since Friday eve), every CPAN mirror I try results in the following 
messages

Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
 Database was generated on Sat, 19 Mar 2005 21:41:38 GMT
CPAN: HTTP::Date loaded ok
Warning: This index file is 42 days old.
 Please check the host you chose as your CPAN mirror for staleness.
 I'll continue but problems seem likely to happen.
CPAN then says that Mail::SpamAssassin [version 3.0.2] is up to date.
Is CPAN acting goofy for anyone else?
 

I was having the same problem last week BUT this morning, the CPAN 
mirrors I used were updated and I was able to upgrade via CPAN just fine.

Try it again this week and see if it works for you now.  :)
Peac...
Tom


Re: SA + SQL + per-user prefs

2005-05-02 Thread Mike Grice
On Mon, 2005-05-02 at 09:34 +0200, Arvinn Løkkebakken wrote:
 
 Gerald V. Livingston II wrote:
 
 OK, this is probably just an over-cautious MySQL question.
 
 All of the examples I look at for setting up per-user prefs using SQL show
 creating a table that looks like:
 
 username  pref  value
 
 So, if I want to allow users to control 5 values I would have a table that
 looks like this:
 
 user1  pref1  value1
 user1  pref2  value2
 user1  pref3  value3
 user1  pref4  value4
 user1  pref5  value5
 user2  pref1  value1
 user2  pref2  value2
 user2  pref3  value3
 user2  pref4  value4
 user2  pref5  value5
 user3 . etc.
 
 When talking about importing a userbase of 6000+ that's gonna be a TALL
 table really fast.
 
   
 
 30,000 rows (5 * 6,000) isn't a tall SQL table at all, IMHO.

Nope, but think of how it would scale.  The design above is bad because
there is no unique data in there, so the table will get slow.  A better
design would be this:

1.  A table with just users on there, each with their unique user ID,
eg:

UsersTable

UID Friendlyname
1   bob
2   joe

2.  A table for each preference, linked back by the UID in the first
table:

pref1Table

UID Value
1   10

SA can then join the tables based on the UID, and the application only
needs to be passed the UID to get all the values.  You can also gain
efficiencies with these smaller tables because you can optimise what
fields are in there (e.g., your SpamCutoffTable will only have integer
and tinyint as field types).  Your only problem would be perhaps passing
to the application which values the user has customised, but you
could fix that up in two (four) ways, which would alter the number of
select statements needed:

UsersPrefsTable

UID Preferences
1   pref1, pref2, pref3

A different way of doing this is multiple fields with booleans:

UsersPrefsTable

UID pref1   pref2   pref3
1   1   1   0

Or you can build it into your original users table:

UsersTable

UID Friendlyname    Preferences
1   bob             pref1, pref2, pref3

The other way:

UsersTable

UID Friendlyname    pref1   pref2   pref3   pref4
1   bob             1       0       0       1
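
The users-table-plus-per-preference-table layout described above can be sketched with an in-memory SQLite database in Python. Everything here is illustrative: the table and column names (users, spam_cutoff, uid, friendlyname) and the helper function are assumptions based on the example tables in this message, SQLite stands in for MySQL, and this is not how SpamAssassin itself reads preferences.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        uid          INTEGER PRIMARY KEY,
        friendlyname TEXT NOT NULL UNIQUE
    );
    -- One small table per preference, keyed by the same uid.
    CREATE TABLE spam_cutoff (
        uid   INTEGER PRIMARY KEY REFERENCES users(uid),
        value INTEGER NOT NULL
    );
    INSERT INTO users VALUES (1, 'bob');
    INSERT INTO users VALUES (2, 'joe');
    INSERT INTO spam_cutoff VALUES (1, 10);
    """
)


def cutoff_for(db: sqlite3.Connection, name: str):
    """Return the user's custom cutoff, or None if the user has not set one."""
    row = db.execute(
        "SELECT c.value FROM users u"
        " LEFT JOIN spam_cutoff c ON c.uid = u.uid"
        " WHERE u.friendlyname = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None
```

The LEFT JOIN sidesteps the "which values has the user customised" question for a single preference: a user without a row simply comes back as None, at the cost of one join per preference table.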

I'm looking into integrating user prefs this quarter where I work, and I
do have some concerns on how it will scale (e.g., with mysql replication
you need to send writes to a different machine from reads if you need to
have separate databases, like one on each machine for reads and a master
for writes).  I wish more apps could be more db-aware :)

Cheers
Mike

-- 
| Mike Grice  Broadband Solutions for
| Systems Engineer  Home  Business @
| PlusNet plc.   www.plus.net
+ - PlusNet - The smarter way to broadband --




Re: SA + SQL + per-user prefs

2005-05-02 Thread Michael Parker
On Mon, May 02, 2005 at 04:33:28PM +0100, Mike Grice wrote:
 
 Nope, but think of how it would scale.  The design above is bad because
 there is no unique data in there, so the table will get slow.  A better
 design would be this:
 

Howdy,

SpamAssassin is an open source project that welcomes contributions
from the community.  If you see a particular itch that you would like
to scratch, I highly encourage you to scratch it.  Once you've got some
working code, feel free to post it here or on the wiki to get feedback
from folks.  If it's a widespread and useful feature then it may
eventually make its way into the source base.

This is exactly how I got my start working with SpamAssassin: I wanted
to be able to store the bayes and AWL data in SQL.  I spent many, many
months working on and perfecting the code, and now it's in widespread
use by many SpamAssassin users.

Michael




Re: SpamAssassin 3.0.3 Released

2005-05-02 Thread Matias Lopez Bergero
Hello,
Am I the only one having problems checking the PGP sig of the tarball?
BR,
Matías.



Re: SpamAssassin 3.0.3 Released

2005-05-02 Thread Michael Parker
On Mon, May 02, 2005 at 12:57:54PM -0300, Matias Lopez Bergero wrote:
 
 Am I the only one having problems checking the PGP sig of the tarball?
 

Are you by chance using GPG 1.4.x?  There is this note in the release
announcement:

Note:  GnuPG 1.4.0, and possibly 1.3.x versions, seem to have problems
verifying certain signature files, including the type as used for
SpamAssassin releases. If you are running an affected version, please
verify the code using both MD5 and SHA1 sum values instead.
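
For anyone who needs that checksum fallback, here is a rough sketch of computing both digests in Python and comparing them with the published values. The function names are made up for this example, and the published sums themselves have to come from the release announcement.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at path, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_published(path, published_md5, published_sha1):
    """True only if BOTH digests match the published values."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == published_md5.strip().lower()
            and sha1_hex == published_sha1.strip().lower())
```

Checking both sums, as the note suggests, means a mismatch in either one is treated as a failed verification.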


Michael




Re: system-wide AWL in SQL?

2005-05-02 Thread Arvinn Løkkebakken
Michael Parker wrote:
 Probably not, there is a concept in 3.1 that allows you to do
 systemwide or groupwide AWL dbs in SQL, similar to how you can
 currently do it in Bayes (via override_username).
 Michael

Thanks. This shouldn't be all that many changes. Is there a patch for 
getting this in 3.0.3?

But maybe I should consider per-user AWL anyway. Sounds a little awkward 
to me though, as I have a setup with system-wide bayes.
What are the opinions?

Arvinn


RE: regexp: exclude a string

2005-05-02 Thread Chris Santerre


-----Original Message-----
From: wolfgang [mailto:[EMAIL PROTECTED]
Sent: Sunday, May 01, 2005 2:03 PM
To: users@spamassassin.apache.org
Subject: Re: regexp: exclude a string


In an older episode (Sunday 01 May 2005 12:49), Loren Wilton wrote:
  /p(?:0|o)rtf(?:0|o)(?:\||l)i(?:0|o)/
  but not portfolio
 
 /(?!portfolio)p(?:0|o)rtf(?:0|o)(?:\||l)i(?:0|o)/

thanks, works fine.

wolfgang


I believe Loren has also told me to use this if you want to negate more
than one:

(?!portfolio|portfoil)

But NOT to use:

(?!p(ortfolio|ortfoil))

This has been your funky regex tip of the day :-)
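
The lookahead trick can be checked quickly outside SpamAssassin. The sketch below uses Python's re module, whose syntax for these particular constructs is the same as Perl's; the helper name hits and the test strings are just for illustration.

```python
import re

# The obfuscation-tolerant pattern from the thread, without any exclusion.
OBFUSCATED = r"p(?:0|o)rtf(?:0|o)(?:\||l)i(?:0|o)"

# Same pattern, but refusing to start a match on the plain word.
RULE = re.compile(r"(?!portfolio)" + OBFUSCATED)

# Variant that excludes more than one literal via alternation in the lookahead.
RULE_MULTI = re.compile(r"(?!portfolio|portfoil)" + OBFUSCATED)


def hits(pattern, text):
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None
```

With these definitions, hits(RULE, "p0rtfolio") is true while hits(RULE, "my portfolio") is false, because the negative lookahead rejects the only position where the rest of the pattern could begin.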

--Chris 


Re: Reporting scams to fraudwatchinternational

2005-05-02 Thread Kris Deugau
John Andersen wrote:
 If you use a competent email client you will be offered the option
 of keeping a local copy, which saves the redundant recipient.

Some people deliberately turn this off.  I'm not sure why.  (I can
*sort* of understand it for mailing list mail, but not for direct
mail.)

 Further, you should never assume that other recipients do not
 see BCCs.  That is entirely up to the settings of the recipient's
 email client.

If your MUA is actually adding a real header with BCC: information,
it's broken.  BCC isn't supposed to be a header in the usual sense; 
it's a way to tell your mail client to add extra SMTP RCPT TO: commands
when sending the message.  The recipients should NEVER see those extra
recipients.

The only way someone might find out about BCC'ed recipients is if they
are the server admin (or have access to the mail logs) and are willing
to spend the effort to wade through the logs tracking the message ID to
see who got a copy.  And that only applies in the case where the
sender's SMTP server is also the destination;  and partially applies if
there are multiple recipients at a remote domain.  If a remote domain
only has one recipient in the list, they will NOT see any information
regarding other recipients.
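
To make the header/envelope distinction concrete, here is a small sketch using Python's standard email package. The function name and all addresses are invented for the example; the point is only that BCC recipients end up in the envelope recipient list (what becomes the RCPT TO: commands) and never in the message headers.

```python
from email.message import EmailMessage


def build_message(sender, to_addrs, bcc_addrs, subject, body):
    """Return (message, envelope_recipients); BCC addresses go only in the envelope."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to_addrs)
    msg["Subject"] = subject
    msg.set_content(body)
    # No Bcc header is ever added to msg; the extra recipients exist
    # only in the list handed to the SMTP client.
    envelope = list(to_addrs) + list(bcc_addrs)
    return msg, envelope
```

The envelope list would then be passed as the to_addrs argument of smtplib.SMTP.send_message, so each address gets its own RCPT TO: while the visible headers list only the To recipients.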

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!


Re: SpamAssassin 3.0.3 Released

2005-05-02 Thread Matias Lopez Bergero
Michael Parker wrote:
 On Mon, May 02, 2005 at 12:57:54PM -0300, Matias Lopez Bergero wrote:
  Am I the only one having problems checking the PGP sig of the tarball?

 Are you by chance using GPG 1.4.x?  There is this note in the release
 announcement:

 Note:  GnuPG 1.4.0, and possibly 1.3.x versions, seem to have problems
 verifying certain signature files, including the type as used for
 SpamAssassin releases. If you are running an affected version, please
 verify the code using both MD5 and SHA1 sum values instead.

Yes. I was using 1.4.0, I tried with 1.2.1 and got the OK :)
Thanks, Michael

BR,
Matías.



Re: OT: The highest score?

2005-05-02 Thread Kris Deugau
Roman Serbski wrote:
 What was the highest score you've ever seen? I received a message
 yesterday that was scored with 51.9(!). =)

Bah.  I've seen a few that scored ~55 with stock 2.64 scores.  With
SpamCopURI, and custom scores, they jumped to ~80.

I *think* I found one that scored ~80 on the stock 2.64 scores once, but
I'm not certain.

One weekend while I was particularly bored, I started putting together
an uberspam that would trip as many stock 2.64 rules as possible.  I got
about a third of the way through the rules before stopping, and the
score was pushing 300.  <g>

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!


Re: Observation on secondary MX

2005-05-02 Thread Justin Mason

Niek writes:
 On 5/2/2005 1:48 PM +0200, Kevin Peuhkurinen wrote:
  spam going to that server!   I wonder if the spammers have cached the 
  old MX entry
 
 Jup.

BTW I've seen a few discussions recently where people rediscover
(sorry Kevin) these behaviours.   It might be worthwhile maintaining
some kind of spammer tactics knowledge base, on the wiki maybe?

--j.



Re: system-wide AWL in SQL?

2005-05-02 Thread Michael Parker
On Mon, May 02, 2005 at 06:09:32PM +0200, Arvinn Løkkebakken wrote:
 
 Thanks. This shouldn't be all that many changes. Is there a patch for 
 getting this in 3.0.3?

Search bugzilla, there was a review patch for the 3.0 tree but it
never got enough votes to go in so I dropped it.  I think it should
still apply cleanly to 3.0.3.

 But maybe I should consider per-user AWL anyway. Sounds a little awkward 
 to me though, as I have a setup with system-wide bayes.
 What are the opinions?

I'm a firm believer in per-user dbs.  It's pretty unlikely that anyone
else's mailstream is going to exactly match yours, so you do your users
a disservice in trying to make them all fit the same mold.

Michael




Re: Reporting scams to fraudwatchinternational

2005-05-02 Thread Jay Lee
Kris Deugau said:
 If you use a competent email client you will be offered the option
 of keeping a local copy, which saves the redundant recipient.

 Some people deliberately turn this off.  I'm not sure why.  (I can
 *sort* of understand it for mailing list mail, but not for direct
 mail.)

 Further, you should never assume that other recipients do not
 see BCCs.  That is entirely up to the settings of the recipient's email
 client.

 If your MUA is actually adding a real header with BCC: information,
 it's broken.  BCC isn't supposed to be a header in the usual sense; it's a
 way to tell your mail client to add extra SMTP RCPT TO: commands when
 sending the message.  The recipients should NEVER see those extra
 recipients.

 The only way someone might find out about BCC'ed recipients is if they
 are the server admin (or have access to the mail logs) and are willing to
 spend the effort to wade through the logs tracking the message ID to see
 who got a copy.  And that only applies in the case where the sender's SMTP
 server is also the destination;  and partially applies if there are
 multiple recipients at a remote domain.  If a remote domain only has one
 recipient in the list, they will NOT see any information regarding other
 recipients.

I've also seen broken mail servers that add headers based on the RCPT
TO:, so you should assume that recipients on the same remote server,
BCC'ed or not, may be able to discover each other.  But if you're
confident your mail server/client isn't doing something stupid, then
there should be no way for [EMAIL PROTECTED] to discover the message was
BCCed to [EMAIL PROTECTED]

Jay
-- 
Jay Lee
Network / Systems Administrator
Information Technology Dept.
Philadelphia Biblical University
--


Re: bayes problem

2005-05-02 Thread Matt Kettler
Payal Rathod wrote:

 Hi,
 I am looking after a friend's email server till he returns from his 
 vacation. In his local.cf (SA 2.61 and yes I know it is time for 
 upgrade) file he has,
 bayes_path /etc/mail/spamassassin/bayes
 use_bayes  1
 score BAYES_50 0.001

 Also bayes is well trained with,
 -rw-------    1 root root  5263360 May  2 01:58 bayes_seen
 -rw-------    1 root root  4210688 May  2 01:58 bayes_toks

 All the spam mails are forwarded to an account 'spam'.
 Lately his users had started complaining that they received more spam 
 than ever, so I checked his spam folder and grepped for bayes in 
 headers. Surprisingly, out of 500 mails none showed bayes in headers.  
 Does that mean bayes has stopped working?

Almost certainly. Or, it might only be working for root.

How is SA called? from procmail, or something else?

One major problem I see is that the bayes files have permissions of 400,
but the bayes DB is site-wide. You generally need to use bayes_file_mode
0777 when you specify a bayes_path in your local.cf. (If all users are
to use the same bayes DB, they all must be able to read/write the files
and have rwx to directories. Since these are deleted/recreated by SA
constantly you can't just use chmod)

If any non-root userID is used when invoking spamassassin, then the
bayes DB will not be accessible.

If he's using an MTA layer tool that always scans as root, this shouldn't
be a problem. However, if he's letting the user's procmailrc call
spamassassin or spamc this could be very troublesome. It's also trouble
if his MTA layer tool deprivileges itself to a non-root userid.

As for receiving more spam than ever. Well, you're using SA 2.61, which
IS massively outdated. Spam is a moving target, and SpamAssassin does
require reasonably frequent updates to keep abreast of changing trends.

I'll admit I'm using 2.64, but I'm also using the Mail::SpamCopURI
addon, and extensive custom rule tuning to keep up with it. Using an
out-of-the-box 2.61 setup, even with bayes, the hit rate is going to
suffer.



Re: bayes problem

2005-05-02 Thread Matt Kettler
Payal Rathod wrote:

 On Mon, May 02, 2005 at 02:11:19PM -0400, Matt Kettler wrote:

  How is SA called? from procmail, or something else?

 For .qmail file with a script ifspamh

  One major problem I see is that the bayes files have permissions of 400,
  but the bayes DB is site-wide. You generally need to use bayes_file_mode

  [...]

 Right. Do I need 777 or just 744?

In general 777. All users that need to access the bayes DB need to be
able to write to it, and create/delete temporary files and lock files.

This happens most extensively in the event of opportunistic expiry or
autolearning.

In your case I might do 744, just because the box isn't yours and the
admin might not want world-writable files (in which case he shouldn't be
using a global bayes DB).

However, 744  is really a half-baked solution and won't eliminate bayes
problems.
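
To see why 744 only goes halfway, it helps to decode the write bits of each mode. This is a small illustrative helper (the function name is made up for this example) using the constants from Python's stat module:

```python
import stat


def who_can_write(mode):
    """Return which permission classes may write a file with the given mode."""
    return {
        "user": bool(mode & stat.S_IWUSR),
        "group": bool(mode & stat.S_IWGRP),
        "other": bool(mode & stat.S_IWOTH),
    }
```

For 0o744 only the owner has the write bit, so other userids can read the database but cannot update or expire it; for 0o777 all three classes can write.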

  

  As for receiving more spam than ever. Well, you're using SA 2.61,
  which IS massively outdated. Spam is a moving target, and SpamAssassin
  does require reasonably frequent updates to keep abreast of changing
  trends.

 How safe is it to change to the new version? His is a live server and we 
 don't want to risk anything at all.

I wouldn't be doing extensive upgrades on a box you don't normally
administer. However, you should let him know that all versions from 2.60
through 2.63 are vulnerable to a DoS attack if a person sends you a
maliciously crafted email (it's a bug in the MIME decoder which was
fixed in 2.64, as well as 3.0.0).




Re: autolearn=ham

2005-05-02 Thread Andy Jezierski

Robert Swan [EMAIL PROTECTED] wrote on 05/02/2005 02:15:45 PM:

 How do I clear, or unlearn the bayes filter it seems that it is 
 picking up wrong. E-mail that is SPAM has autolearn=ham in the 
 header and this is wrong.

 I am Running SPAMASSASSIN 3.0.3 on a Linux Red Hat 9 server. (just 
 upgraded) did this in version 3.0.2 also, unrelated I know.

 Thanks in advance,

 Robert

If it's a single message try: sa-learn --forget original.message.to.unlearn

If on the other hand you want to clear out the entire bayes db because
you think it's corrupted then use: sa-learn --clear

See man sa-learn for more info.

Andy

Re: autolearn=ham

2005-05-02 Thread Matt Kettler
Robert Swan wrote:

 How do I clear, or unlearn the bayes filter it seems that it is picking
 up wrong. E-mail that is SPAM has autolearn=ham in the header and this
 is wrong.

Is it?

The autolearner uses the score the message would have gotten if bayes
was disabled, all userconf (ie: white/blacklist) rules were disabled,
and the AWL was disabled.

Post an X-Spam-Status header for the message in question and we can give
you some more specific advice, but just because the final score
indicated spam, it doesn't mean the autolearner can't decide it's ham.
This is particularly true for messages that got heavily hit on a
blacklist or AWL rule.

IMHO, the default ham learning threshold in current versions of SA is
begging for problems like this. I keep mine set at a tiny negative
score, but also have a collection of nonspam rules with tiny negative
scores. This way, autolearning as ham must be earned by hitting one of
the negative scoring rules, but the negative scoring rules can't be
abused by spammers as they collectively add up to less than -1.0.
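
As a rough illustration of why a message can score as spam overall and still be autolearned as ham, here is a deliberately simplified model in Python. It is not SpamAssassin's actual code: the set of excluded rule names and the threshold values (0.1 for ham, 12.0 for spam) are assumptions made for this sketch, and the real autolearner applies further conditions that are left out here.

```python
# Rules ignored when computing the autolearn score in this toy model
# (assumed names, chosen to mirror the bayes/userconf/AWL exclusions above).
EXCLUDED_PREFIXES = ("BAYES_",)
EXCLUDED_NAMES = {"AWL", "USER_IN_BLACKLIST", "USER_IN_WHITELIST"}


def autolearn_score(hits):
    """Sum the rule scores, skipping bayes, AWL and user white/blacklist rules."""
    return sum(
        score
        for name, score in hits.items()
        if name not in EXCLUDED_NAMES and not name.startswith(EXCLUDED_PREFIXES)
    )


def autolearn_decision(hits, ham_threshold=0.1, spam_threshold=12.0):
    """Return 'ham', 'spam' or 'no' based on the reduced score only."""
    score = autolearn_score(hits)
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return "no"
```

In this model a message whose only significant hit is USER_IN_BLACKLIST at +100 has a final score above 100 but a reduced score near zero, so it falls under the ham threshold. Setting ham_threshold to a small negative number, as described above, means ham learning has to be earned by at least one negative-scoring rule.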




Re: OT: The highest score?

2005-05-02 Thread Matt Kettler
Roman Serbski wrote:

 Hi all,

 What was the highest score you've ever seen? I received a message
 yesterday that was scored with 51.9(!). =)


I hate to say it, but I've seen scores over 1000.0. All you need to do
is include a GTUBE :)

USER_IN_BLACKLIST will also jack it up quite a bit with a +100 score.

GTUBE and blacklists aside, my highest spam score in recent history
(past 4 weeks) was 45.74:

score=45.74, required 5, autolearn=spam, AB_URI_RBL 1.00, BAYES_99 5.40,
DCC_CHECK 1.00, DRUGS_ERECTILE 1.00, HTML_70_80 0.10,
HTML_IMAGE_ONLY_04 1.00, HTML_MESSAGE 0.10,
INFO_GREYLIST_NOTDELAYED -0.01, JP_URI_RBL 1.00, LOCAL_RCVD_HELO_XIP
1.50, MIME_HTML_ONLY 0.32, MIME_HTML_ONLY_MULTI 1.10, NO_DNS_FOR_FROM
1.65, OB_URI_RBL 2.10, RAZOR2_CF_RANGE_51_100 0.20, RAZOR2_CHECK 1.05,
RCVD_IN_CHINA_KR 2.50, RCVD_IN_DSBL 0.71, RCVD_IN_NJABL_PROXY 2.34,
RCVD_IN_SORBS_MISC 0.00, RCVD_IN_XBL 4.92, SARE_RAND_2V 1.50,
SPAMCOP_URI_RBL 3.00, SUBJ_VIAGRA 4.10, VIAGRA_ONLINE 4.06, WS_URI_RBL
2.10, X_MESSAGE_INFO 2.00

But I tend to lean towards lowering rule scores from their defaults. I
tend to find some SARE rules, etc are a bit overly aggressive in scoring
for my tastes.




Re: autolearn=ham

2005-05-02 Thread James R
Robert Swan wrote:
 How do I clear, or unlearn the bayes filter it seems that it is picking 
 up wrong. E-mail that is SPAM has autolearn=ham in the header and this 
 is wrong.

 I am Running SPAMASSASSIN 3.0.3 on a Linux Red Hat 9 server. (just 
 upgraded) did this in version 3.0.2 also, unrelated I know.

 Thanks in advance,

 Robert

 Peace he would say instead of goodbyepeace my brother.

Remove the bayes db. What are you using? File based? SQL based? Need 
more info about that. Also in your case, you may either A) turn off 
autolearn or B) change the thresholds for spam/ham so this is unlikely
to happen again.

--
Thanks,
James


Re: autolearn=ham

2005-05-02 Thread Kelson
Matt Kettler wrote:
 Robert Swan wrote:
  How do I clear, or unlearn the bayes filter it seems that it is picking
  up wrong. E-mail that is SPAM has autolearn=ham in the header and this
  is wrong.

 Is it?

If it's spam being learned as ham, then yes, it is wrong.  Autolearn may 
be doing what it's supposed to, but it's still a false negative.  An 
expected one, but a misclassification nonetheless.

Robert: just running sa-learn --spam will unlearn the message, then 
re-learn it as spam.

--
Kelson Vibber
SpeedGate Communications www.speed.net


RE: autolearn=ham

2005-05-02 Thread Robert Swan
Hello all, I am using file based bayes DB and do Not have autolearn
enabled, I do manual learning using IMAP Spam Begone.

Robert

Peace he would say instead of goodbyepeace my brother.

-----Original Message-----
From: James R [mailto:[EMAIL PROTECTED] 
Sent: Monday, May 02, 2005 3:24 PM
To: users@spamassassin.apache.org
Subject: Re: autolearn=ham

Robert Swan wrote:
 How do I clear, or unlearn the bayes filter it seems that it is picking 
 up wrong. E-mail that is SPAM has autolearn=ham in the header and this
 is wrong.

 I am Running SPAMASSASSIN 3.0.3 on a Linux Red Hat 9 server. (just 
 upgraded) did this in version 3.0.2 also, unrelated I know.

 Thanks in advance,

 Robert

 Peace he would say instead of goodbyepeace my brother.

Remove the bayes db. What are you using? File based? SQL based? Need 
more info about that. Also in your case, you may either A) turn off 
autolearn B) change thresholds for spam/ham so this is unlikely to 
happen again.

-- 
Thanks,
James



Re: Adding addresses to blacklist manually

2005-05-02 Thread Matt Kettler
Gregory P. Ennis wrote:

 Everyone,

 I installed 3.03 this afternoon and everything looks good.  

 I finally decided to set up a user alias e-mail address to take
 advantage of the following command:

 spamassassin --add-to-blacklist  /tmp/$FILENAME

 When I run this command from root I get a response of 1 message
 examined.  

 I can not figure out where or what blacklist this command is adding the
 address; I would like to be able to check on the results in order to
 make sure it is working.  Any help is appreciated.

 Thanks,

 Greg

The --add-to-blacklist command only manipulates the AWL statistics for
that sender.  It biases that sender's AWL statistics by pretending they
sent a message that scored +100 and recording it in the AWL db. This
effect is somewhat temporary, as over time further emails from the
sender will dilute its impact. It's largely intended for correcting
errors in the AWL, and not intended to be used as a real blacklist
mechanism.
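
The "somewhat temporary" effect is easy to see with a little arithmetic. The sketch below is a toy model, not SpamAssassin's AWL code: it assumes the AWL keeps a running total and a message count per sender, and the sample history (four messages averaging 2.0) is invented.

```python
def awl_mean(total_score, count):
    """Average historical score for a sender."""
    return total_score / count


def add_to_blacklist(total_score, count, penalty=100.0):
    """Pretend the sender sent one more message that scored `penalty`."""
    return total_score + penalty, count + 1


def record_message(total_score, count, score):
    """Fold one real message's score into the sender's history."""
    return total_score + score, count + 1


# Sender history: 4 messages totalling 8.0 points (mean 2.0).
total, count = add_to_blacklist(8.0, 4)
mean_after_blacklist = awl_mean(total, count)   # (8 + 100) / 5

# Twenty more ordinary messages at 2.0 each dilute the penalty.
for _ in range(20):
    total, count = record_message(total, count, 2.0)
mean_after_more_mail = awl_mean(total, count)   # (108 + 40) / 25
```

Under these assumptions the mean jumps from 2.0 to 21.6 right after the blacklist entry and drifts back down to 5.92 after twenty further messages, which is why a static blacklist_from entry is the better tool for a permanent block.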

There are no command line options that actually blacklist a sender with
a static blacklist_from command.

If you want to truly blacklist an address, you have to do it using a
blacklist_from command in your /etc/mail/spamassassin/local.cf or
similar config file.




Re: autolearn=ham

2005-05-02 Thread Matt Kettler
Kelson wrote:

 Matt Kettler wrote:

  Robert Swan wrote:

   How do I clear, or unlearn the bayes filter it seems that it is picking
   up wrong. E-mail that is SPAM has autolearn=ham in the header and this
   is wrong.

  Is it?

 If it's spam being learned as ham, then yes, it is wrong.  Autolearn
 may be doing what it's supposed to, but it's still a false negative. 
 An expected one, but a misclassification nonetheless.


True. I mis-read Robert's message as implying that the SA autolearn
mechanism was going haywire and randomly learning spam as ham for no
clear reason. Hence my answer.

Sorry for any confusion it may have created.

(The rest of the message is generally correct, albeit topically
misdirected. The facts about how the autolearner works in my message are
correct, albeit some details are omitted for simplicity. Opinions about
the threshold are my personal opinions, but they are my actual opinions.)



Re: Question about Bayes training - mozilla specifically

2005-05-02 Thread Stuart Johnston
Bookworm wrote:
 I've read through the archives several times, and hoped that over the 
 last year or so someone would build the functionality, or at least 
 mention it one way or another - I haven't seen it.

 Is there any way to take an already trained Mozilla bayes structure and 
 hand it directly off to SpamAssassin?  For me, at least, that would 
 eliminate almost all of the spam my server is receiving - Mozilla spots 
 it instantly, but SpamAssassin is missing at least half.

Here is a project that will export the Mozilla Bayes tokens which would 
at least be the first step.  I'm not sure how hard it would be to then 
import them into SA.

http://bayesjunktool.mozdev.org/


Re: OT: The highest score?

2005-05-02 Thread Kelson
Roman Serbski wrote:
 What was the highest score you've ever seen? I received a message
 yesterday that was scored with 51.9(!). =)

Unfortunately I just purged the spamtraps, but that's what log files are 
for.  Here's the highest one from this week:

Score: 63.173
BAYES_99
BIZ_TLD
DOMAIN_RATIO
FORGED_IMS_HTML
FORGED_IMS_TAGS
FORGED_MUA_IMS
FORGED_YAHOO_RCVD
FROM_ILLEGAL_CHARS
HEAD_ILLEGAL_CHARS
HTML_90_100
HTML_FORMACTION_MAILTO
HTML_IMAGE_ONLY_20
HTML_IMAGE_RATIO_02
HTML_MESSAGE
LOCAL_SURBL_MULTI
MIME_HTML_ONLY
MIME_HTML_ONLY_MULTI
MISSING_MIMEOLE
MPART_ALT_DIFF
MSGID_SPAM_CAPS
MSGID_YAHOO_CAPS
RAZOR2_CF_RANGE_51_100
RAZOR2_CHECK
RCVD_BY_IP
RCVD_DOUBLE_IP_SPAM
RCVD_HELO_IP_MISMATCH
RCVD_IN_DSBL
RCVD_IN_NJABL_PROXY
RCVD_IN_NJABL_RELAY
RCVD_IN_SORBS_HTTP
RCVD_NUMERIC_HELO
SUBJ_ILLEGAL_CHARS
URIBL_OB_SURBL
URIBL_SBL
URIBL_WS_SURBL

The only custom rule in there is LOCAL_SURBL_MULTI, which adds an extra 
3 points if 3 or more SURBLs fire.  So technically this should only have 
been 60.173.

--
Kelson Vibber
SpeedGate Communications www.speed.net


Re: Question about Bayes training - mozilla specifically

2005-05-02 Thread Michael Parker
On Mon, May 02, 2005 at 03:44:25PM -0500, Stuart Johnston wrote:
 Bookworm wrote:
 I've read through the archives several times, and hoped that over the 
 last year or so someone would build the functionality, or at least 
 mention it one way or another - I haven't seen it.
 
 Is there any way to take an already trained Mozilla bayes structure and 
 hand it directly off to SpamAssassin?  For me, at least, that would 
 eliminate almost all of the spam my server is receiving - Mozilla spots 
 it instantly, but SpamAssassin is missing at least half.
 
 Here is a project that will export the Mozilla Bayes tokens which would 
 at least be the first step.  I'm not sure how hard it would be to then 
 import them into SA.
 
 http://bayesjunktool.mozdev.org/
 

The bayes backup/restore format is fairly stable and it is pretty easy
to create a restore file from alternate sources (that is one of the
reasons it was written).  It's possibly not documented as well as it
should be, but no one has ever asked before so

You will need the following bits of information:

1) The Raw Token (which needs to be turned into an SHA1 and then into
a hex representation, which is probably too simple of an explanation
for what is actually going on, so probably needs some more detail and
maybe a helper function in the SA code for those that might want to
attempt such a thing, not to mention a period in this sentence
somewhere.)

2) The atime value for that token - SA bayes works off access times
   for tokens, so you need to know the last time it was useful, in a
   pinch you can use current time but it is not optimal.

3) The ham count for the token

4) The spam count for the token

5) Number of spam msgs learned

6) Number of ham msgs learned

7) List of msg ids and if they were learned as ham or spam (this can
   be optional but is not optimal since it would allow for re-learning
   of msgs which could throw off your spam/ham counts)

Once you have all that, you throw it into a formatted restore file and
then run sa-learn --restore and you are all set.
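
As a starting point for the per-token items in the list above, here is a hedged sketch of collecting them in Python. It follows only the simple description given in item 1 (SHA1 of the raw token, rendered as hex), which that item itself flags as an oversimplification of what SA really does. The function names and dict keys are invented, and this does not attempt to write the actual restore file format.

```python
import hashlib
import time


def token_sha1_hex(raw_token):
    """SHA1 of the raw token as a 40-character hex string (simplified view)."""
    return hashlib.sha1(raw_token.encode("utf-8")).hexdigest()


def make_token_record(raw_token, ham_count, spam_count, atime=None):
    """Gather items 1-4 for one token; atime defaults to 'now' in a pinch."""
    return {
        "token": token_sha1_hex(raw_token),
        "atime": int(time.time()) if atime is None else int(atime),
        "ham": ham_count,
        "spam": spam_count,
    }
```

Anyone building a real importer would still need to confirm how SA derives its stored token value and what the restore file layout is before feeding anything to sa-learn --restore.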

If someone has a dump of one of these files, and it's got all the
required information I'd be happy to take a look to see how feasible
it would be.

Michael




Re: Question about Bayes training - mozilla specifically

2005-05-02 Thread Stuart Johnston
Michael Parker wrote:
 On Mon, May 02, 2005 at 03:44:25PM -0500, Stuart Johnston wrote:
  Bookworm wrote:
   I've read through the archives several times, and hoped that over the 
   last year or so someone would build the functionality, or at least 
   mention it one way or another - I haven't seen it.

   Is there any way to take an already trained Mozilla bayes structure and 
   hand it directly off to SpamAssassin?  For me, at least, that would 
   eliminate almost all of the spam my server is receiving - Mozilla spots 
   it instantly, but SpamAssassin is missing at least half.

  Here is a project that will export the Mozilla Bayes tokens which would 
  at least be the first step.  I'm not sure how hard it would be to then 
  import them into SA.

  http://bayesjunktool.mozdev.org/

 The bayes backup/restore format is fairly stable and it is pretty easy
 to create a restore file from alternate sources (that is one of the
 reasons it was written).  It's possibly not documented as well as it
 should be, but no one has ever asked before so

 You will need the following bits of information:

 1) The Raw Token (which needs to be turned into an SHA1 and then into
 a hex representation, which is probably too simple of an explanation
 for what is actually going on, so probably needs some more detail and
 maybe a helper function in the SA code for those that might want to
 attempt such a thing, not to mention a period in this sentence
 somewhere.)

 2) The atime value for that token - SA bayes works off access times
    for tokens, so you need to know the last time it was useful, in a
    pinch you can use current time but it is not optimal.

 3) The ham count for the token

 4) The spam count for the token

 5) Number of spam msgs learned

 6) Number of ham msgs learned

 7) List of msg ids and if they were learned as ham or spam (this can
    be optional but is not optimal since it would allow for re-learning
    of msgs which could throw off your spam/ham counts)

 Once you have all that, you throw it into a formatted restore file and
 then run sa-learn --restore and you are all set.

 If someone has a dump of one of these files, and it's got all the
 required information I'd be happy to take a look to see how feasible
 it would be.

There are some examples in XML format here:
http://bayesjunktool.mozdev.org/installation.html

Here's a sample:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE tokenfile SYSTEM "trainer_xml.dtd"><tokenfile>
<good_msgs>38</good_msgs>
<bad_msgs>320</bad_msgs>
<token>
<name>$</name>
<good>4</good>
<bad>18</bad>
</token>
...

atimes and msgids are not included.
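
For what it's worth, pulling the counts out of such a dump looks straightforward. The following is a sketch only, assuming the export is well-formed XML that uses the element names appearing in the sample (tokenfile, good_msgs, bad_msgs, token, name, good, bad). The trimmed SAMPLE string, the function name, and the mapping of good/bad onto ham/spam are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for a real export (XML declaration and DOCTYPE omitted).
SAMPLE = """<tokenfile>
<good_msgs>38</good_msgs>
<bad_msgs>320</bad_msgs>
<token>
<name>$</name>
<good>4</good>
<bad>18</bad>
</token>
</tokenfile>"""


def parse_token_dump(xml_text):
    """Return (totals, tokens) with 'good' mapped to ham and 'bad' to spam."""
    root = ET.fromstring(xml_text)
    totals = {
        "ham_msgs": int(root.findtext("good_msgs")),
        "spam_msgs": int(root.findtext("bad_msgs")),
    }
    tokens = [
        {
            "name": tok.findtext("name"),
            "ham": int(tok.findtext("good")),
            "spam": int(tok.findtext("bad")),
        }
        for tok in root.findall("token")
    ]
    return totals, tokens
```

That covers the ham/spam counts per token and the message totals. Since the dump has no atimes or msgids, those would have to be filled in some other way, for example with the current time.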


Re: My first rule

2005-05-02 Thread Joe Kletch
On May 2, 2005, at 4:39 PM, [EMAIL PROTECTED] wrote:

 Joe Kletch wrote:
  So excited--I created my first rule.

 Congratulations!

  It ran through lint with no
  errors and seems to be achieving the requested outcome: move messages
  from this sender to the recipients Spam folder, but do not reject (I
  have simscan rejecting scores over 18 points). The Threshold for
  moving to the Spam folder is 4.0 points.
  Could the list take a peek at this and make sure I didn't create a
  rule that will screw up everything from AOL or advise a better way to
  handle this. Thanks for your help.
  mail spamassassin $ cat move_to_spam.cf
  header    SLOWHND67 From:addr =~ /[EMAIL PROTECTED]/i
  describe  SLOWHND67 Marissa's mail buddy into Spam folder
  score     SLOWHND67 3.0

 You're guilty of a breach-of-etiquette here - publishing an email 
 address without permission.

 You anchor the string at the end with a $ - which is fine - but you 
 don't anchor it at the beginning with a ^, which could help.  If 
 anyone ever emails Marissa from an address that ends in 
 [EMAIL PROTECTED] - for example, [EMAIL PROTECTED] - this rule 
 will false-positive.

 To fix, use this regexp: /[EMAIL PROTECTED]/i

 As an aside - does SpamAssassin allow things like
 lc From:addr eq '[EMAIL PROTECTED]'
 ?  Probably not...

 Are you trying to silently discard all mail from this person?  If so, 
 I think a client-side rule would be most appropriate.  That way you 
 could claim plausible deniability if he found out. ;-)
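
The anchoring point can be demonstrated with a made-up address, since the real one is redacted here. This sketch uses Python's re module, where ^ and $ behave the same way for this purpose; buddy@example.com and the two pattern names are placeholders invented for the example.

```python
import re

# Anchored only at the end: any address merely ENDING in the string matches.
END_ONLY = re.compile(r"buddy@example\.com$", re.IGNORECASE)

# Anchored at both ends: only the exact address matches.
BOTH_ENDS = re.compile(r"^buddy@example\.com$", re.IGNORECASE)
```

With these, END_ONLY also matches notbuddy@example.com, which is the false positive described above, while BOTH_ENDS matches only the exact address (in any letter case, because of the /i equivalent).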

Sorry about the breach, I should have known better. I'll consider this 
my first foray into regexp as well--been promising to get a handle on 
this for some time now but alas can't seem to make it happen. Not 
trying to discard--just let the user deal with the hassle of finding 
her joke buddy's email in the spam folder as a hint. Client is out of 
control in terms of the joke and personal emails from the CEO down and 
the IT Director is slowly trying to get some point across to the users 
in his own way.

Joe Kletch