from:"David B Funk"

Re: [SAtalk] 'spamassassin -d' not stripping SA reports from email

2004-01-29 Thread David B Funk

On Wed, 21 Jan 2004, Matt Kettler wrote:

 At 10:41 PM 1/20/04 -0600, C. Bensend wrote:
 Is the problem that I'm _forwarding_ the tagged emails from one host
 to the other?  I don't have the capability to bounce, I can only forward.

 A forwarded message is a brand new message. That brand new message is NOT
 SA-tagged, even though it may contain some SA markup because the other
 message was tagged.

 Once you've forwarded a message, there's generally no way to reconstruct the
 original.

 All new headers are created, MIME sections are changed, the body is
 modified with things like "forwarded message from", your mail client may wind
 up re-encoding the HTML, etc. To a reader, it looks much the same, but to
 a mailer, it bears little resemblance to the original.

On the other hand, if you forward the message (complete with full
headers) as an attachment, and then extract that attachment at the
other end, you should be able to use that (i.e., use the new message
as a 'carrier' for the one that you want to re-learn).

There is a MIME-type Message/RFC822 which is intended precisely
for this kind of job.

This does depend upon the sending mail client being able to generate a
proper MIME attachment forward. You should be able to use a tool
like metamail to automate the extraction at the receiving end.
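
As a rough illustration of the extraction step, here is a minimal sketch in
Python, assuming the carrier arrives as raw bytes and the original was attached
as a Message/RFC822 part. The function names and the sample messages are
hypothetical; only the standard-library email package is used. Note that the
embedded message is re-serialized, so it may not be byte-for-byte identical to
what was attached, although its headers and body survive.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attached_messages(raw_carrier: bytes) -> list:
    """Return every message/rfc822 attachment in the carrier as raw bytes."""
    carrier = message_from_bytes(raw_carrier, policy=policy.default)
    originals = []
    for part in carrier.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part holds exactly one embedded message.
            embedded = part.get_payload(0)
            originals.append(embedded.as_bytes())
    return originals


def build_sample_carrier() -> bytes:
    """Build a hypothetical carrier message, for demonstration only."""
    original = EmailMessage()
    original["Subject"] = "Cheap meds"
    original["X-Spam-Status"] = "Yes"
    original.set_content("spam body\n")

    carrier = EmailMessage()
    carrier["Subject"] = "Fwd: please re-learn"
    carrier.set_content("Original attached.\n")
    carrier.add_attachment(original)  # attached as message/rfc822
    return carrier.as_bytes()


if __name__ == "__main__":
    for raw in extract_attached_messages(build_sample_carrier()):
        print(message_from_bytes(raw, policy=policy.default)["Subject"])
```

Each extracted message could then be saved to a file and handed to whatever
learning or un-tagging step you run at the receiving end.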


-- 
Dave Funk                                University of Iowa
dbfunk (at) engineering.uiowa.edu        College of Engineering
319/335-5751   FAX: 319/384-0549         1256 Seamans Center
Sys_admin/Postmaster/cell_admin          Iowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{


---
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk

Re: [SAtalk] too much spam...

2004-01-28 Thread David B Funk

On Mon, 26 Jan 2004, Paul Diaguila wrote:

 No Bayes db yet, but I would think the one rule would score it a 5

 Paul

 Covington, Chris wrote:

 Your Bayes must be hosed if what you think is spam gets BAYES_00.
 
 Chris
[snip..]
 Greetings
 
 Using SA Ver. 2.63 with Mimedefang, and still quite a bit of spam is
 getting through.  Have all the current BigEvil, etc.   As an example,
 a rule is in place in local.cf
 
 header   SUBJECT_ENCODED_MY_TEST  Subject:raw =~ /=\?.*\?=/i
 describe SUBJECT_ENCODED_MY_TEST  Subject begins with =?
 score    SUBJECT_ENCODED_MY_TEST  5.0
 
 When a message comes in:
 
 Subject:
 =?ISO-8859-1?b?V2UgaGF2ZSB3aGF0IHlvdSBuZWVkIC0gQ2hlYXBlc3QgcHJlc2NyaXB0a
 W8vbnMgb24gdGhlIGludGVybmV0?=
 Content-Type: multipart/alternative;
 boundary==_NextPart_000_0CAC_A6ABA171.138272BD
 X-Spam-Score: 3.422
 BAYES_00,FORGED_OUTLOOK_TAGS,HTML_50_60,HTML_IMAGE_ONLY_02,HTML_MESSAGE,
 HTML_TAG_BALANCE_BODY,RM_rb_ANCHOR,RM_rb_BODY,RM_rb_HTML,SUBJECT_ENCODED
 _MY_TEST
 X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang)
 
 ???
 
 thanks...
 
 Paul

Paul,
Look at the 'X-Spam-Score' report line.
If you read it you will see 'SUBJECT_ENCODED_MY_TEST', so that rule -did-
hit. You will -also- see 'BAYES_00'. So there is a Bayes db somewhere, and
it said "this is not spam, so I will score this message at -5.6", which
nullified your SUBJECT_ENCODED_MY_TEST score.

Bottom line is you do have Bayes running somewhere on your system
and it is not properly trained. Rather than writing more rules, I'd
suggest that you concentrate on getting it trained, and I think that
you'll be happier.
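
To make the arithmetic concrete, here is a small sketch in Python using only
the figures quoted in this thread (the 3.422 total, the 5.0 custom rule, and
the -5.6 Bayes score mentioned above). The helper name is hypothetical, and the
"other rules" figure is simply what is left over; the individual scores of
those rules are not given here.

```python
# Figures quoted in the thread.
TOTAL_SCORE = 3.422        # from the X-Spam-Score header
CUSTOM_RULE_SCORE = 5.0    # SUBJECT_ENCODED_MY_TEST in local.cf
BAYES_00_SCORE = -5.6      # value stated in the reply above


def remaining_score(total, *known_scores):
    """Return the part of the total not explained by the known rule scores."""
    return round(total - sum(known_scores), 3)


# Net effect of the custom rule once BAYES_00 is applied.
net_of_two_rules = round(CUSTOM_RULE_SCORE + BAYES_00_SCORE, 3)

# Combined contribution of all the other rules that hit.
other_rules = remaining_score(TOTAL_SCORE, CUSTOM_RULE_SCORE, BAYES_00_SCORE)

if __name__ == "__main__":
    print("custom rule + BAYES_00:", net_of_two_rules)
    print("all other rules:", other_rules)
```

In other words, the 5-point rule is more than cancelled by the Bayes result,
and the remaining rules on their own do not reach a typical spam threshold.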



RE: [SAtalk] Rules Du Jour v 1.07b

2004-01-27 Thread David B Funk

On Fri, 23 Jan 2004, Smart,Dan wrote:

 Humm

 This command works every time from command line, but not passed as a param
 from SA_RESTART.
 postfix stop ; sleep 15 ; /etc/init.d/spamassassin restart ; postfix start

 It runs the postfix stop and then quits.  Any idea why?  I can create a sed
 that patches the rules_du_jour each time putting the commands in one at a
 time in the restart if block, which does work, but passing it as the
 SA_RESTART parameter would be really nice.

 Dan


Run it in a sub-shell, put the whole thing in parens:

(postfix stop ; sleep 15 ; /etc/init.d/spamassassin restart ; postfix start)

Depending upon how that is parsed by your command processor, you may have
to escape the parens, e.g.:

\(postfix stop ; sleep 15 ; /etc/init.d/spamassassin restart ; postfix start \)

This is basically a shell-scripting issue.
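
For illustration only, here is a minimal Python sketch of handing a whole
compound command line to a shell so that every step runs. The echo commands
are harmless stand-ins for the postfix and spamassassin commands above, and
the function name is hypothetical; how rules_du_jour itself evaluates
SA_RESTART is not shown here.

```python
import subprocess


def run_compound(command: str) -> str:
    """Run a compound command line through sh and return its standard output."""
    result = subprocess.run(
        ["sh", "-c", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Stand-in for: (postfix stop ; sleep 15 ; ... restart ; postfix start)
RESTART_COMMAND = "(echo stop ; sleep 0 ; echo restart ; echo start)"

if __name__ == "__main__":
    print(run_compound(RESTART_COMMAND), end="")
```

The point is that the semicolons only act as command separators when a shell
parses the full string; the parens group the steps into one sub-shell command.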


Re: [SAtalk] Using Mail::SpamAssassin to clean a message

2004-01-23 Thread David B Funk

On Wed, 21 Jan 2004 [EMAIL PROTECTED] wrote:

 I want to be able to take an email message that may contain MIME and HTML
 and to strip it down to basically nothing but text. (I know that
 SpamAssassin already does this in large part so that it can analyze the
 message properly.) So I'm not actually using SpamAssassin to detect spam.
 Instead, I just want access to a relatively clean text message to scan for
 another purpose.

 Here is what I have tried, without success:
[snip..]
 I just want access to SpamAssassin's email cleaning code that will make it
 easy for me to access the text of any arbitrary email message (e.g., a
 message with MIME parts).

Why not use a tool specifically designed for that purpose? (MIMEDefang).

SA is usually configured to skip messages over a certain size so as
not to waste time on large messages (most spam is under 200k bytes in size
and SA processing time goes up rapidly with increasing message size).

In an arbitrary MIME message it is not unusual to find large attachments
(images, video, sounds, etc) which would cause them to bypass SA.
Thus SA may not be the best tool for this job.



Re: FW: [SAtalk] How to stop this kind of stuff?

2004-01-23 Thread David B Funk

 |-Original Message-
 |From: Evan Platt [mailto:[EMAIL PROTECTED]
 |Sent: Wednesday, January 21, 2004 15:03
 |To: SpamAssassin
 |Subject: Re: [SAtalk] How to stop this kind of stuff?

Real easy, this is a predictable spamhaus, Empire Towers.
Go check the records on this outfit at http://www.spamhaus.org/rokso

Just add two custom rules to your SA kit:

One that hits any URI reference to opt-u1.biz hard (I give it an 8.0).

The other looks for any of optny?.us or optny?.biz (where the ? is a
digit) in the 'From' header and hits that hard too.

uri      L_SPAMHAUS1  /\b(?:opt-u1\.biz|certrewards\.tc|funding-advisors\.com)\b/i
describe L_SPAMHAUS1  Contains URI referring to Empire Towers Spamhaus
score    L_SPAMHAUS1  8.0

header   L_SPAMHAUS2  From =~ /\boptny\d\.(?:us|biz)\b/i
describe L_SPAMHAUS2  Sent from Empire Towers Spamhaus
score    L_SPAMHAUS2  8.0
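
If you want to sanity-check patterns like these before deploying them, a quick
sketch along the following lines can help. It uses Python's re module purely
as a stand-in for the Perl regex engine that SA uses (the constructs in these
two patterns behave the same way in both), and the sample URIs and addresses
are made up for the test.

```python
import re

# Patterns copied from the two rules above.
URI_PATTERN = re.compile(
    r"\b(?:opt-u1\.biz|certrewards\.tc|funding-advisors\.com)\b", re.IGNORECASE
)
FROM_PATTERN = re.compile(r"\boptny\d\.(?:us|biz)\b", re.IGNORECASE)


def hits_uri_rule(uri: str) -> bool:
    """True if the URI would match the L_SPAMHAUS1 pattern."""
    return URI_PATTERN.search(uri) is not None


def hits_from_rule(from_header: str) -> bool:
    """True if the From header would match the L_SPAMHAUS2 pattern."""
    return FROM_PATTERN.search(from_header) is not None


if __name__ == "__main__":
    # Hypothetical samples, not taken from real mail.
    print(hits_uri_rule("http://www.opt-u1.biz/offer"))
    print(hits_from_rule("Deals <news@optny3.biz>"))
    print(hits_from_rule("news@optny.biz"))
```

Note that the From pattern requires exactly one digit after "optny", so an
address at optny.biz or optny12.us would not be caught by it.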

Dave


Re: [SAtalk] Selectivly disabling DYNABLOCK

2004-01-22 Thread David B Funk

On Thu, 22 Jan 2004, Peter McGarvey wrote:

 Greetings all,

 I have a mailserver which handles all my incoming and outgoing mail.

 Outgoing mail (stuff I send) is passed to the server via ASMTP.
 Incoming mail (stuff sent to me) comes in via SMTP.  There is
 absolutely no way my server will relay mail unless it arrives via ASMTP.

 However, when I send mail to another account on my box, the global
 SpamAssassin (2.61) adds 2.546 to my mail score because I happen to
 catch the RCVD_IN_DYNABLOCK rule.  Now, I'm happy for all SMTP mail to
 face up to this rule, but I'd rather my ASMTP mail didn't.

 Other than obfuscating my Received: headers to omit the IP from mail
 which is sent via ASMTP, is there any way to get RCVD_IN_DYNABLOCK to
 back off?

 I did think about adding trusted networks.  But I don't necessarily know
 my IP.  It also blunts SA somewhat if I add vast tracts of IP space.

Silly question; if it's something that you're sending out, why
SA scan it at all?

I configured our MTA to recognize locally generated messages and not
bother SA scanning them. (reduces LLuser complaints about message
sending delays ;).

Dave


Re: [WL] Re: [SAtalk] More obfuscation

2004-01-21 Thread David B Funk

On Tue, 20 Jan 2004, Charles Gregory wrote:

 Right now, there would be no statistics, because the text obfu has just
 started. But as a side note, we don't have the disk space to run Bayes for
 all our users though I'm getting awfully tempted to talk the boss into
 an extra disk or two. So for now, no Bayes here :-(

A site-wide Bayes isn't quite as effective as a per-user Bayes,
but still worthwhile. It will help to catch the obfu junk (and
only takes 10~20 Mbytes of disk).

So now you have no excuse, get Bayes-ing.  ;)



Re: [SAtalk] trusted_networks being ignored at times?

2004-01-21 Thread David B Funk

On Wed, 21 Jan 2004, Justin Mason wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 Will McCutcheon writes:
 I am running SpamAssassin 2.61 with Sendmail 8.12.8 using Procmail 3.22.
[snip..]
 A's IP as being in an RBL of dynamic IP's, despite my setting in
 /etc/mail/spamassassin/local.cf instructing it to trust that IP.  The
 documentation for Mail::SpamAssassin::Conf seems to pretty clearly say
 that RBL checks will never be performed on any trusted IP's, but it
 certainly appears to be occurring here.

 Yep -- this is a case where it will occur.  This is because the
 mail has gone *outside* of the trusted zone -- and the untrusted
 host B could be under the control of a spammer who just forged a
 Received header to make it look like it came from the trusted
 host A.  We can't trust that.

 Same for the next case btw.

 - --j.

Thus, if you -can- trust server 'B' add it to your trusted_networks
list, so the whole chain will be trusted and all will be well. ;)



RE: [SAtalk] Turning off Habeas?

2004-01-20 Thread David B Funk

On Tue, 20 Jan 2004, Terry Shows wrote:

Maybe it is good for -16, but in every case I looked at that passed through
with habeas set, none of them set the violator, and every single one was
flagrantly spam.
[snip..]

The way it is now, it is just another header that can be added by a spammer,
and as long as nobody turns him in to habeas.net, he is guaranteed an easier
path for his junk. (if I am missing something, please feel free to educate
me. Just be nice when you do it)

Terry.

True, have you been turning in those violators that you found?
I've turned in every one that I found and a recheck of the HIL shows
that they were listed by Habeas.

To paraphrase a theologian:

The only thing necessary for evil (spammers) to succeed is for good
to do nothing.

One quick rule hack that has worked wonders for me for this issue:

uri L_FAKE_MED_SITE /\b(?:valuepointmeds\.biz|pharmacourt\.biz|pharmawharehouse\.biz|mypillsource\.com|gowebrx\.com|rxsourceonline\.com|getwebrx\.com)\b/i
describe L_FAKE_MED_SITE Web site of fake meds sellers
score L_FAKE_MED_SITE 3.0
meta L_FAKE_MED_ABUSER ( L_FAKE_MED_SITE && HABEAS_SWE )
describe L_FAKE_MED_ABUSER Site selling fake meds and abusing habeas
score L_FAKE_MED_ABUSER 10.0

The first part is an adaptation of BigEvil, the second part nullifies
their abuse of habeas.


Re: [SAtalk] List moderation and spam removal

2004-01-20 Thread David B Funk

On Tue, 20 Jan 2004, Sean McCrohan wrote:

[snip..]
 The problem is that the moderation request the list sends to me gets
 wrapped in MIME, and SA (as currently installed) doesn't do a very good
 job of analyzing it, in part because there's a set of instructions stuck
 on the front that are the same regardless of whether the message is spam
 or ham. What can I do to point SA at the right parts of the message to
 pay attention to?

 --Sean

Use MIMEDefang to unwrap it and then feed just the message part to SA.


Re: [SAtalk] spamassassin on Gateway server (MX)

2004-01-16 Thread David B Funk

On Fri, 16 Jan 2004, Ross Vandegrift wrote:

 On Thu, Jan 15, 2004 at 02:20:16AM -0600, David B Funk wrote:
  If you SMTP reject the spam, it never hits your queue, so no problem
  with the garbage piling up and no bombarding poor innocent 'joe-job'
  victims. It's better than auto-deleting spam, as a legit message
  that is accidentally mis-identified as spam gets returned to the
  true sender and they can remedy the situation rather than wondering
  what happened to their message.

 I hope you see the contradiction in the above paragraph.

 no bombarding poor innocent 'joe-job' victims
 gets returned to the true sender

 Easily forged, easily joe-jobbed, better to never reject.

No contradiction if you are -truly- using SMTP reject on your
incoming mail gateway.
To make sure that you understand what I'm talking about, I'll explain
what happens in a true SMTP reject.

The remote client connects to my incoming SMTP gateway machine.
If you read RFC-2821, you'll see the SMTP protocol steps that
the remote machine and my (or any) SMTP server go thru. During
that sequence, if my server returns a '550' response code, the
transaction is terminated, the remote machine is left with the
message and my server has no responsibility for it. (IE the
message never even gets past my 'front door').

This means that it is -NOT- in my mail queue, there is nothing
for my server to even think about trying to return to somebody.

If I reject a message, the sender and recipient addresses (forged
or real) are of no consequence to my server. Think of it like having a
'smart' ip-filter on your mail server. The remote machine is prevented
from handing the message to you at all.

In the case of a legit message that FPs above my reject threshold,
the sender's mail server is left holding the message and it has the
responsibility of returning it to them, so they get their message back.


There is one special case where this does not work: that of the
filtering machine not being the border gateway, i.e., if you (or some
other party acting on your behalf, such as your ISP) have some
external server that accepts everything addressed to you and then
forwards the messages in to your filter machine. Then if your filter
machine rejects the trash, the external server is stuck with the
garbage and it will try to bounce it back.
However, the OP was asking about an incoming filter machine and that
was the case that I was addressing.


 I had spamass-milter setup to reject for a while, and I found that my
 queues were always full.  As I looked into it, I realized they were
 full of errors, just waiting to be crapflooded onto some poor-sod-like-me's
 mail server.

If you find crap messages (ones above your reject threshold) in your
queue on the filter machine, then you are -not- doing true SMTP rejects
(regardless of what your configuration claims).

Now I have my milter set to tag spam at 6.0 and reject at 20.0.
So there is a range of spam (6-20) that will be in my queue to
be delivered. (I do that in case of FPs.)
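
As a rough sketch of that two-threshold policy, here is the decision expressed
in Python. The 6.0 and 20.0 thresholds and the 550 reject code come from this
message; 250 is the ordinary SMTP success reply. The function name, the action
labels, and the choice to treat a score exactly at a threshold as having
reached it are assumptions for illustration, not the milter's actual code.

```python
TAG_THRESHOLD = 6.0      # tag as spam, but still accept and deliver
REJECT_THRESHOLD = 20.0  # refuse during the SMTP transaction


def smtp_disposition(score: float) -> tuple:
    """Map a spam score to an (SMTP reply code, action) pair."""
    if score >= REJECT_THRESHOLD:
        # The message never enters the local queue; the sending
        # server keeps responsibility for it.
        return (550, "reject")
    if score >= TAG_THRESHOLD:
        # Accepted and queued, delivered with spam markup in case of FPs.
        return (250, "accept-and-tag")
    return (250, "accept")


if __name__ == "__main__":
    for sample in (1.2, 6.0, 19.9, 20.0, 35.0):
        print(sample, smtp_disposition(sample))
```

Only messages that get the 250 reply ever sit in the local queue, which is why
a true SMTP reject leaves nothing behind to bounce.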

I've been hacking sendmail stuff for almost 20 years now, started
doing anti-spam filtering in the mid-90's. At that time, doing SMTP
rejects was the only thing that was easy to do, so it is second
nature to me.

Dave


Re: [SAtalk] Acronym Update

2004-01-16 Thread David B Funk

 --On Friday, January 16, 2004 12:13:21 -0600 Carl Chipman
 [EMAIL PROTECTED] wrote:

  For the new people on the list, I was wondering what the following
  acronyms mean:
 
  LART
 Luser Attitude Readjustment Tool
 Reporting the offending user to [EMAIL PROTECTED]
  UBE/UCE
 Unsolicited Bulk Email/Unsolicited Commercial Email
 (SPAM).

 
  Are the acronyms in the FAQ?

The defs of these and -many- other arcane net-talk terms and
abbreviations can be found in The Jargon File (proper name please)
(AKA The New Hacker's Dictionary). A work-in-progress for over
15 years with a cast of thousands. ;)

It's available in print and on line in many forms.
Home page: http://www.jargon.org/

Googleable directly with the search modifier of site:www.catb.org
EG to search for the def of 'LART' do a google search of:

  LART+site:www.catb.org

Its beginnings are lost in the mists of time (some say at Stanford
SAIL lab, others claim MIT) but for the last 14 years or so it's pretty
much been the project of Eric S. Raymond (a major net persona in his own
right, check out: http://www.faqs.org/docs/artu/ ).


Re: [SAtalk] spamassassin on Gateway server (MX)

2004-01-15 Thread David B Funk

On Wed, 14 Jan 2004, Matt Kettler wrote:

 At 09:56 AM 1/15/04 +0545, Pankaj wrote:
 I feel I am being a bit misunderstood. I simply need to configure my MX to
 have SpamAssassin running. I do not need any antivirus.
 How do I do it ? Running RedHat Linux 8.1 and Sendmail 8.12.10 in it.

[snip..]
 Thus, you NEED another tool to integrate it into the MTA layer.. most MTA
 layer integrations are capable of calling a variety of spam and virus
 scanners, but you can just neglect the virus scanner part.

 However, the answer of using amavis or mailscanner to handle the
 integration is good advice.. I personally use mailscanner.

 Some people also use spamass-milter, which installs itself as a sendmail
 milter so it's more tightly integrated, but I've heard more complaints
 about it than praise.

However a sendmail milter -does- let you do something that you cannot
do with mailscanner. You can do real-time SMTP rejects of spam, not
just delete or bounce it.

If you SMTP reject the spam, it never hits your queue, so no problem
with the garbage piling up and no bombarding poor innocent 'joe-job'
victims. It's better than auto-deleting spam, as a legit message
that is accidentally mis-identified as spam gets returned to the
true sender and they can remedy the situation rather than wondering
what happened to their message.

Dave

---
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk

Re: [SAtalk] no subject

2004-01-15 Thread David B Funk

On Wed, 14 Jan 2004, Christopher Tarricone wrote:

 The permissions on my bayes_journal and bayes_toks files keep changing. Has
 anyone else encountered this problem?
[snip..]
 I look in /usr/share/spamassassin/db/ and behold! The permissions are:

 [EMAIL PROTECTED] db]# ll
 total 23012
 -rw-------    1 root     root         3057 Jan 14 16:49 bayes_journal
 -rw-------    1 vpopmail vchkpw   21061632 Jan 14 16:50 bayes_seen
 -rw-------    1 vpopmail vchkpw   10211328 Jan 14 16:50 bayes_toks
 [EMAIL PROTECTED] db]#

 If I do a processes listing
 [EMAIL PROTECTED] db]# ps -aux |grep spam
 vpopmail  3215  0.1  5.3 31940 27244 ?        S    Jan13   1:49
 /usr/bin/spamd -m 20 -s -Q -q -x -d -v -u vpopmail -F 0
 root      1445  0.0  0.1  3760   568 pts/3    S    16:53   0:00 grep spam
 [EMAIL PROTECTED] db]#

 It seems to me that SpamAssassin is running as the user vpopmail so I am
 not sure how the permissions are getting changed so often.

Are you doing any 'sa-learn' runs as root? (either by hand or
via some kind of automated batch/cron job)?
If the bayes_journal file rolls over during a 'sa-learn' run
it will create a new one, as root.
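
One way to catch this early is a periodic check for files in the Bayes
directory that are not owned by the expected user. The following is only a
sketch in Python for a Unix-like system; the function name is hypothetical,
and you would pass in the directory and the numeric uid that apply on your
own machine.

```python
import os


def files_not_owned_by(directory: str, expected_uid: int) -> list:
    """Return names of regular files in directory whose owner is not expected_uid."""
    mismatched = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_uid != expected_uid:
                mismatched.append(entry.name)
    return sorted(mismatched)


if __name__ == "__main__":
    # Check the current directory against the invoking user, as a demo.
    print(files_not_owned_by(".", os.getuid()))
```

A non-empty result right after a learning run would point at that run having
recreated one of the files under a different account.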



Re: [SAtalk] Bayes NFS safe?

2004-01-15 Thread David B Funk

On 15 Jan 2004, Rocky Olsen wrote:

 I too would greatly appreciate any information - as we have 9 boxes
 doing Spam scanning. Anyone tried this?


 On Thu, 2004-01-15 at 13:31, Mike Jackson wrote:
  If you have multiple SA filtering boxes, is it safe to NFS-mount a partition
  with a system-wide Bayesian database and share it across all the boxes? In
  our setup, we have three boxes dedicated to doing SA filtering, all running
  the same version of FreeBSD, and it sure would be nice to be able to do this
  because the SQL-based Bayesian filtering doesn't quite look ready for prime
  time.
 
  Mike Jackson
  Technical Manager, efn.org
  www.efn.org


I posted a note about this precise question a few months ago,
please check the archives.

Short answer: doable, provided that you honor the Berkeley DB requirements
for access over NFS (see the Sleepycat site for exact details), have all
the clients only read the DB, use journaling, and do the updates on the
machine that actually has the DB stored on it.

Dave


Re: [SAtalk] The CAN-SPAM act....

2004-01-15 Thread David B Funk

On Wed, 14 Jan 2004, Jonathan Nichols wrote:

 Did the CAN-SPAM act really take away a citizen's right to sue spammers?
 I'd like to write to this marketing company and have them provide me
 with absolute proof that I signed up for *anything* at all. (they won't
 be able to) I think the whole "Write us to unsubscribe" business is just
 a big sham.

No, the spammers would -love- to hear from you. Then they have verified
proof that that particular e-mail address goes to a human who reads it
and thus is valuable fodder for selling on address lists.

I have a couple of spam-trap addresses that have never been used anywhere
-except- plugged into unsubscribe links and messages. It's rather
amusing to see where they end up getting spammed from.

Dave


RE: [SAtalk] How to find values assigned to different tests?

2004-01-13 Thread David B Funk

On Tue, 13 Jan 2004, Mitch (WebCob) wrote:

 I thought there was a patch that added the score to the headers... then you
 didn't have to go looking - has anyone seen it lately?

 m/

Depending upon how you have SA integrated into your mail system,
it may only require a change to your 'local.cf', no patch needed.

In my local.cf I have:

# Customize the report that 2.60 introduced in all messages
#
clear_report_template
report Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on _HOSTNAME_
report Content analysis details:   (_HITS_ points, _REQD_ required, autolearn=_AUTOLEARN_)
report
report  pts rule name  description
report   
report _SUMMARY_

This adds a header to each message that can look like:

X-Spam-Report: Checker-Version SpamAssassin 2.60 (1.212-2003-09-23-exp) on hostname.icaen.uiowa.edu
Content analysis details:   (5.2 points, 6.0 required, autolearn=no)
 pts rule name              description
 
 2.1 BAYES_90               BODY: Bayesian spam probability is 90 to 99%
                            [score: 0.9569]
 0.1 BIZ_TLD                URI: Contains a URL in the BIZ top-level domain
 3.0 L_EvilList_29          URI: Generated L_EvilList_29

So you can see which rules hit and what their contribution to the
score was.
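
If you want to tally those per-rule contributions mechanically, something
like the following Python sketch would do it. It assumes the report lines look
like the sample above (a score, then the rule name, then the description); the
regular expression and function name are my own and not part of SA.

```python
import re

# The three rule lines from the sample report above, plus its detail line.
SAMPLE_REPORT = """\
 2.1 BAYES_90 BODY: Bayesian spam probability is 90 to 99%
 [score: 0.9569]
 0.1 BIZ_TLD URI: Contains a URL in the BIZ top-level domain
 3.0 L_EvilList_29 URI: Generated L_EvilList_29
"""

# A rule line starts with a (possibly negative) number followed by a rule name.
RULE_LINE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+(\w+)\s")


def rule_scores(report: str) -> dict:
    """Return a mapping of rule name to score for each rule line in the report."""
    scores = {}
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            scores[match.group(2)] = float(match.group(1))
    return scores


if __name__ == "__main__":
    scores = rule_scores(SAMPLE_REPORT)
    print(scores)
    print("total:", round(sum(scores.values()), 1))
```

For the sample, the three rule scores add up to the 5.2 points shown on the
"Content analysis details" line.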

Note that Mike Leone is using amavisd-new, which has its own method
of adding headers to messages and will ignore the SA report template
config. Thus for amavisd-new users, there may need to be some kind
of modification to that code (but that isn't a SA issue).

Dave


Re: [SAtalk] Configuring SA to be more aggressive..

2004-01-13 Thread David B Funk

On Tue, 13 Jan 2004, Mail Monitor wrote:

 Hi,

 We have installed SA 2.6 on linux RH 9.0 on a mail
 gateway. The total mail transaction/day through this
 server is 75,000 and spam mails caught by SA is around
 10-15%. But spam mails are still getting through, we
 have not implemented razor, bayesian rules etc. The
 number of spamc processes forked is high(above 200).
 Because of this milter goes into error state bypassing
 SA.

 Please do let us know if SA could be made stable
 for this heavy load and also ways to make it more
 aggressive. Any help/suggestions would be appreciated.

First thing, beef up your hardware configuration.
To handle 75,000 messages/day, you need an average processing rate of
roughly 1 second per message (86,400 seconds/day divided by 75,000
messages is about 1.15 seconds each).
If you aren't running any network checks (no DSBL, razor, dcc, etc.)
and not running any bayes, this should be easily achievable with
reasonable hardware (1 GHz PIII or faster, 512 MB RAM).
My bet is that you don't have enough RAM in that box.
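
The per-message figure is just a day's worth of seconds divided by the message count. A quick Python sketch of that arithmetic follows; the `workers` parameter is my own addition to show how scanning several messages in parallel stretches the budget, and the whole thing assumes mail arrives evenly over the day, which real traffic won't.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def seconds_per_message(messages_per_day, workers=1):
    """Average time budget per message if `workers` messages are scanned in parallel."""
    return SECONDS_PER_DAY * workers / messages_per_day

budget = seconds_per_message(75_000)
print(f"{budget:.3f} seconds per message with one worker")
print(f"{seconds_per_message(75_000, workers=4):.3f} seconds per message with four workers")
```

Peak-hour load will be well above the daily average, so treat the result as an upper bound on what each scan can afford.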

Once you've got a stable base config that can keep up with
your load, add Bayes (get it stable), then turn on net checks.

Bayes will add CPU/RAM demands but it will be a deterministic
increase. Net checks don't take much additional CPU/RAM but they
introduce non-deterministic delays due to network problems and
load on the remote servers, etc.

When running with no net checks, I usually see sub-second processing
time (base SA + Bayes) but with net checks on, times vary from
sub-second to 30 seconds per message solely due to remote
delay factors (30K messages/day).


Limitation in SA (Re: [SAtalk] Obfusticated URI?)

2004-01-13 Thread David B Funk

On Mon, 12 Jan 2004, Larry Starr wrote:

 Just noticed a message with an encoded URL, that misses, the BIZ_TLD rule,
 etc.

 The message body contains:
 a href=3dhttp://gf=2eclearmath=2ebiz/jsimp/index=2ehtml;font
 face=3darialscored /fontthis way=2e
   brimg src=3dhttp://K=2eclearmath=2ebiz/images/js02=2ejpg; border=3d=
 0
 /a

 I know this wraps a bit ugly, when pasted into my mailer but, as you can see,
 the punctuation, in the URI, is all hex encoded. =2e, instead of ..

 I have a local rule, in the form of bigevil.cf, with the following
 sub-expression, that catches the above, but there has got to be a simpler way
 to do this.

 uri   uri MyEvilList_001 ( /\b(?:=2e){0,1}clearmath(?:\.|=2e)biz)\b\i

 Does anyone know of a ruleset that handles this sort of thing, perhaps code
 that decodes the =xx expressions prior to the URI matches?

Actually that is a bastardized quoted-printable (QP) encoding of a URL.
In QP the character sequence '=2E' is an encoded period; that spam-tool
is generating '=2e', intending it to be interpreted as a period.

SA is supposed to decode QP before running the various 'body' and 'uri'
rules but there's a limitation in its decoding engine. If the QP
encoding uses lower-case hex digits instead of CAPS hex digits, it
does not recognize them as QP and fails to decode them.

Strictly speaking, RFC 2045 demands the use of uppercase hex digits in
QP (see section 6.7), and the lowercase form should be considered
illegal.
However many popular mail clients will decode the bastardized
lowercase version and display the message to the user as the
spammer intends (section 6.7, note (1) permits this).

I can see two different ways to handle this, either make SA more
flexible and decode the bastardized QP so normal rules will hit
or write a rule that hits such bastardized QP coding as a spam-tool
signature.

Does anybody know if there are real (albeit brain-damaged) mail
clients that generate such bastardized QP encoding?
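
To make the two options concrete, here is a small Python sketch (not SA code, and the function names are invented for illustration). One function decodes QP escapes regardless of hex-digit case, so that ordinary URI patterns could then match; the other flags the lowercase escapes themselves as a possible spam-tool signature. It assumes the text is already known to be a QP-encoded body part.

```python
import re

# "=XX" escape with hex digits in either case.
QP_ESCAPE = re.compile(r"=([0-9A-Fa-f]{2})")

def decode_qp_lenient(text):
    """Decode QP escapes, tolerating lowercase hex digits."""
    text = re.sub(r"=\r?\n", "", text)  # drop soft line breaks first
    return QP_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def has_lowercase_qp(text):
    """True if any escape uses a lowercase hex digit (not allowed by a strict reading of RFC 2045)."""
    return any(m.group(1) != m.group(1).upper() for m in QP_ESCAPE.finditer(text))

sample = "a href=3dhttp://gf=2eclearmath=2ebiz/jsimp/index=2ehtml"
decoded = decode_qp_lenient(sample)
print(decoded)
print(has_lowercase_qp(sample), has_lowercase_qp("index=2Ehtml"))
```

Whether the second check is safe as a scored rule depends on the answer to the question above: if legitimate clients emit lowercase escapes too, it would need a low score.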


Re: Limitation in SA (Re: [SAtalk] Obfusticated URI?)

2004-01-13 Thread David B Funk

On Tue, 13 Jan 2004, Justin Mason wrote:

 David B Funk writes:
 
 I can see two different ways to handle this, either make SA more
 flexible and decode the bastardized QP so normal rules will hit
 or write a rule that hits such bastardized QP coding as a spam-tool
 signature.

 Are you sure about this?  If it's the case, we do need to
 decode it, and it would be great to have it reported as a bug.

 - --j.

I've not dissected the SA code but empirical testing indicates it.
(Take Larry's snippet, change those '=2e' to '=2E' and watch
SA properly parse it).

I had noticed the same phenomenon before but was too lazy to track
it down. ;) With Larry's query as a prod, I took the time to test
it and look up the relevant RFC.

Dave


Re: [SAtalk] Scoring the Habeas header ...

2004-01-13 Thread David B Funk

On Tue, 13 Jan 2004, Rich Puhek wrote:

[snip..]
 Be patient. Use additional rules/tools to catch the latest spammers
 (clue: most come from spam zombie processes). Report the Habeas
 violators (more $$$ out of the spammers pockets!). Let's keep the Habeas
 marks as a tempting target for the spammers (strong negative score), so
 that they keep chomping at the bait.

 --Rich

Also note that Habeas has an RBL listing all reported sources of
forged Habeas-mark messages (the Habeas Infringers List). SA
automatically queries this RBL and will ignore SWE signatures from
those sources.

Thus it is in our best interest to report all such violations, as it
negates the spammers' attempts to abuse SWE and will help dig a
deeper hole for them. ;)

Dave


Re: [SAtalk] Sendmail/Milter/ProcMail/Spamassassin

2004-01-12 Thread David B Funk

On Mon, 12 Jan 2004, Mike Carlson wrote:

 Right now I am using spamass-milter to send all the email into spamassassin
 but I would like to implement a deletion process where the email gets deleted
 if it gets certain score. As it stands I cannot do that right now with my
 setup.

Read the documentation on the spamass-milter '-r' flag. That will do
what you want. Actually, it won't delete the spam, but it will do
something better: it will reject it at SMTP time.

Dave


Re: [SAtalk] Oh Joy, another abusable URI redirector

2004-01-09 Thread David B Funk

On Fri, 9 Jan 2004, David B Funk wrote:

 Oh Joy, another abusable URI redirector. Saw this in a
 recent spam:

   http://www.google.com/url?q=http://cardtraffic.com

 Proposed rule:

 uri L_URI_REDIR3        /http:\/\/www\.google\.com\/url?q=http:/i
 describe L_URI_REDIR3   open URI redirector #3
 score L_URI_REDIR3  1.5

My bad, I forgot a whack (the backslash in front of the '?').

uri L_URI_REDIR3        /http:\/\/www\.google\.com\/url\?q=http:/i
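
The difference is easy to check outside SA. The sketch below uses Python's `re` module rather than Perl, on the assumption that the two treat this particular construct the same way: an unescaped '?' makes the preceding 'l' optional instead of matching a literal question mark.

```python
import re

url = "http://www.google.com/url?q=http://cardtraffic.com"

# First attempt: the bare '?' is a quantifier on 'l', so the pattern wants "urq=" or "urlq=".
broken = re.compile(r"http://www\.google\.com/url?q=http:", re.IGNORECASE)
# Corrected: '\?' matches the literal question mark in the URL.
fixed = re.compile(r"http://www\.google\.com/url\?q=http:", re.IGNORECASE)

broken_hit = broken.search(url) is not None
fixed_hit = fixed.search(url) is not None
print("broken pattern matches:", broken_hit)
print("fixed pattern matches: ", fixed_hit)
```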

Dave


Re: [SAtalk] Re: Fresh WhoIs data (emails, phones, etc.) on sale!

2004-01-09 Thread David B Funk

On Fri, 9 Jan 2004, Steve Thomas wrote:

 Yay. Yet another a-hole blatantly disregarding the various WHOIS directorys' terms 
 of use and raping it for marketing purposes. Gee, I can't wait to get three more 
 copies of the same spam for every domain I own...

Awww, Gee, I thought that he was being polite and offering us
fodder for practicing our rule writing skills. ;)


Re: [SAtalk] send mail and spamassasin must be on the same machime

2004-01-08 Thread David B Funk

On Thu, 8 Jan 2004, Douglas Kirkland wrote:

 On Thursday 08 January 2004 09:32, Ceva wrote:
  hi everybody,
  does sendmail and spamassassin must be on the same machine, or they can be
 on diferent machines?

 They can be on different machines.  You will have to call spamassassin with
 spamc to get to the spamd daemon.

 Douglas

Yes and no. To have sendmail & spamassassin be on different machines you
do need to use some kind of intermediary agent to connect them, but it
does not have to be spamc.

There are several sendmail milter daemons that can be used to connect
sendmail to spamd. They can use Unix domain sockets for direct local
connections or they can use TCP sockets to connect to other machines.

The milter daemon effectively replaces spamc: it acts as a bridge,
speaking the sendmail milter protocol on the front side and the SA spamd
protocol on the back side to connect the two systems.
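
For a feel of what the spamd side of such a bridge involves, here is a rough Python sketch of building a CHECK request and reading the verdict line. The wire format shown (a "CHECK SPAMC/1.2" request line, a Content-length header, and a "Spam: True ; score / threshold" response line) is written from my recollection of the spamd protocol notes, so treat it as an assumption and check it against the PROTOCOL file shipped with your SA version. The function names are invented for illustration.

```python
import re

def build_check_request(message: bytes) -> bytes:
    """Wrap a raw mail message in a spamd CHECK request (format assumed, see lead-in)."""
    header = b"CHECK SPAMC/1.2\r\n" + b"Content-length: %d\r\n\r\n" % len(message)
    return header + message

# Verdict line, e.g. "Spam: True ; 15.0 / 5.0"
VERDICT = re.compile(rb"^Spam:\s*(True|False)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)", re.MULTILINE)

def parse_check_response(response: bytes):
    """Return (is_spam, score, threshold) from a spamd response."""
    match = VERDICT.search(response)
    if match is None:
        raise ValueError("no Spam: line in spamd response")
    return match.group(1) == b"True", float(match.group(2)), float(match.group(3))

request = build_check_request(b"From: bill\r\n\r\nhi\r\n")
verdict = parse_check_response(b"SPAMD/1.1 0 EX_OK\r\nSpam: True ; 15.0 / 5.0\r\n\r\n")
print(request)
print(verdict)
```

A real bridge would send the request over a TCP or Unix domain socket to spamd and read the reply from the same connection; that part is left out here so the sketch runs without a daemon.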

Check out 'http://spamlinks.net/filter-server-addon.htm#sendmail'
for a list of sendmail anti-spam milters. Some of them are for
other kinds of anti-spam filtering engines but some of them are
for connecting to SA.

Dave


Re: [SAtalk] New rule? Based on domain registry

2004-01-05 Thread David B Funk

On Sun, 4 Jan 2004, oj wrote:

 Hello,

 Recentry i have had problem with spam that consist of html and one image only.
 The image is fetched from different domains each time. The domains have one
 thing in common though. They are all registered by the same registry:
Whois Server: whois.paycenter.com.cn
Referral URL: http://www.paycenter.com.cn
 And they are an accreddited ICCAN registry. It seams like the spammers have
 got their own registry. Thus they can come up with new domains all the time at
 low cost. Is there any rules based on registry of domains urls?

 Maybe this is a bit far fetched, but i really want this spammer gone.

 regards, Ove Jansson

Yet another instance of globalization, spam-haus out-sourcing to China. ;(

I think that if you check the DNS NS records for all those spam domains
registered through paycenter.com.cn, you'll find that they all have the
same DNS name server hosting (at least all the ones that I've seen so
far do).
It would be much easier to do an NS record lookup within SA than to do a
whois check. It would probably take a custom eval rule, but should be doable.

Dave




---
This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills.  Sign up for IBM's
Free Linux Tutorials.  Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278alloc_id=3371op=click
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk

RE: [SAtalk] BigEvil.cf

2004-01-05 Thread David B Funk

On Mon, 5 Jan 2004, Ed Kasky wrote:

 At 08:56 AM Monday, 1/5/2004, Tom Meunier wrote -=

 With bigevil.cf in /etc/mail/spamassassin, all I see that remotely relates
 to the file is the following:

   spamd[22495]: debug: using /usr/share/spamassassin for default rules dir
   spamd[22495]: debug: using /etc/mail/spamassassin for site rules dir

 Any other way to monitor the file's effectiveness?

 Ed

Create a simple test mail message that contains a URL fabricated using
any one of the hosts listed in BigEvil.cf. Feed the test message to
spamc -R and look to see if the output report hits a BigEvil rule.

For example, I created a test message called 'test.txt' containing:

  From: bill
  To: bob

  http://bigmoneymarketing.com

Now I feed it to spamc:
  spamc -R < test.txt
  5.6/6.0
  Checker-Version SpamAssassin 2.60 (1.212-2003-09-23-exp) on server15.icaen.uiowa.edu
  Content analysis details:   (5.6 points, 6.0 required, autolearn=no)

   pts rule name              description
  ---- ---------------------- --------------------------------------------------
   1.9 DATE_MISSING           Missing Date: header
   0.6 TO_MALFORMED           To: has a malformed address
   3.0 BigEvilList_37         URI: Generated BigEvilList_37

Note that hit on BigEvilList_37

To see how a given rule hits, feed it to spamassassin with rule
debugging turned on.

  spamassassin -D rulesrun=255 < test.txt
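
If you test rule files often, the same check can be scripted. This is a minimal Python sketch with invented helper names: it builds the test message, and `run_spamc_report` pipes it through `spamc -R` (which needs a running spamd, so it is defined here but not called). Matching on the rule-name prefix in the report text is my own shortcut.

```python
import subprocess

def build_test_message(host, sender="bill", recipient="bob"):
    """Minimal message whose body is a URL on the host you want to test."""
    return f"From: {sender}\nTo: {recipient}\n\nhttp://{host}\n"

def report_mentions_rule(report, rule_prefix):
    """True if any whitespace-separated token in the report starts with rule_prefix."""
    return any(token.startswith(rule_prefix) for token in report.split())

def run_spamc_report(message):
    """Feed the message to `spamc -R` and return the report text (requires spamd)."""
    result = subprocess.run(["spamc", "-R"], input=message,
                            capture_output=True, text=True, check=False)
    return result.stdout

message = build_test_message("bigmoneymarketing.com")
sample_report = " 3.0 BigEvilList_37         URI: Generated BigEvilList_37"
print(message)
print(report_mentions_rule(sample_report, "BigEvilList_"))
```

On a box with spamd running, `report_mentions_rule(run_spamc_report(message), "BigEvilList_")` would tell you whether the rule file is being loaded.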

Dave


Re: [SAtalk] Assistance with bigevil.cf

2004-01-05 Thread David B Funk

On Mon, 5 Jan 2004, SAtalk Mail User wrote:

 Hello all,

 I am needing some assistance in regards to the output below, I have added what
 I think should get parsed out of the bigevil.cf file in /etc/mail/spamassassin
 directory.

 Added for testing ---
 uri BigEvilList_193 /\b(?:hotmail)\.com\b/i
 describe BigEvilList_193Generated BigEvilList_193
 score BigEvilList_193   10.0

[snip..]
 Assistance would be great here.  I am also using MySQL to hold the preferences
 rather than the local.cf file for easy of use.  So the only thing is local.cf
 is the access to my MySQL server and account information.

 What I did was wget the bigevil.cf and placed it into /etc/mail/spamassassin
 and in /home/spam/.spamassassin directories and then restarted the spamd
 process - at the end of this email is the processes that are currently running
 on my system here.

 This is the output of maillog from when I receive an email message.

 Jan  5 15:54:54 elmo sendmail[26923]: i05LsrU6026923: from=[EMAIL PROTECTED], 
 size=11575, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], bodytype=7BIT, 
 proto=ESMTP, daemon=MTA, relay=xx.xxx.com []
 Jan  5 15:54:54 elmo spamd[26916]: logmsg: connection from localhost.localdomain 
 [127.0.0.1] at port 54468
 Jan  5 15:54:54 elmo spamd[26916]: connection from localhost.localdomain [127.0.0.1] 
 at port 54468
 Jan  5 15:54:54 elmo spamd[26930]: debug: retrieving prefs for root from SQL server
 Jan  5 15:54:54 elmo spamd[26930]: debug: Failed to parse line in SpamAssassin 
 configuration, skipping: report_header^I0
 Jan  5 15:54:54 elmo spamd[26930]: debug: Failed to parse line in SpamAssassin 
 configuration, skipping: defang_mime^I0
 Jan  5 15:54:54 elmo spamd[26930]: debug: Failed to parse line in SpamAssassin 
 configuration, skipping: blacklist_from
 Jan  5 15:54:54 elmo spamd[26930]: debug: Failed to parse line in SpamAssassin 
 configuration, skipping: blacklist_from
 Jan  5 15:54:54 elmo spamd[26930]: debug: Failed to parse line in SpamAssassin 
 configuration, skipping: bayes_learn_to_journal^I1
[snip..]

Start with fixing the things that SpamAssassin throws errors on.
I see lots of errors in the above log related to parsing your config
data that it's pulled out of MySQL.

It's best to test a complex system one piece at a time. I'd start by
taking MySQL out of the equation. Put your prefs into a plain text
local.cf file, test that, add in bigevil.cf, retest, when you have that
working put MySQL back in.
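
To see at a glance which settings spamd is choking on, you could tally the "Failed to parse line" entries from the log. A small Python sketch, assuming each log entry sits on one line in the actual maillog (the wrapping in the quoted log above comes from the mail client) and that the setting name is the run of letters and underscores right after "skipping:":

```python
import re
from collections import Counter

PARSE_FAILURE = re.compile(
    r"Failed to parse line in SpamAssassin configuration, skipping: ([A-Za-z_]+)"
)

def failed_config_keys(log_text):
    """Count how often each config setting name was rejected."""
    return Counter(PARSE_FAILURE.findall(log_text))

prefix = "Jan  5 15:54:54 elmo spamd[26930]: debug: "
failure = "Failed to parse line in SpamAssassin configuration, skipping: "
sample_log = "\n".join([
    prefix + "retrieving prefs for root from SQL server",
    prefix + failure + "report_header^I0",
    prefix + failure + "defang_mime^I0",
    prefix + failure + "blacklist_from",
    prefix + failure + "blacklist_from",
    prefix + failure + "bayes_learn_to_journal^I1",
])

counts = failed_config_keys(sample_log)
print(counts)
```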



Re: [SAtalk] Having trouble coding a local rule

2003-12-29 Thread David B Funk

On Mon, 29 Dec 2003, Peter Kiem wrote:

 Hi David,

  So you either need to change your rule to match the header from address or
  code it to look for the envelope from address.

 What is the rule for matching envelope from address?

That is mail system dependent, as there is no standard requirement for
envelope from address to be present within a message. The example that
you posted had a 'Return-Path:' header that looked like the envelope
from.
Often the envelope from is embedded within a 'Received:' header.

You will have to look at an example of your messages -as presented to SA-.
(I.e., SA may not 'see' the same thing that shows up in your INBOX.)

I use SA with sendmail and a milter. I had to modify the milter to get it
to synthesize headers that presented the envelope sender & envelope
recipients so that I could use them in filtering & white/black list rules.

Dave


Re: [SAtalk] Having trouble coding a local rule

2003-12-28 Thread David B Funk

On Mon, 29 Dec 2003, Peter Kiem wrote:

 Hi,

 I'm trying to add local rules to allow certain senders that always get
 caught by SA to lower their scores and give them a better chance of
 getting through.

 The rule I added was
 header LOCAL_GOOD_SENDER_11 From =~ /[EMAIL PROTECTED]/
 score  LOCAL_GOOD_SENDER_11 -2.0

 The headers on the email are
 Return-Path: [EMAIL PROTECTED]
 Delivered-To: [EMAIL PROTECTED]
[snip..]
 Date: Sun, 28 Dec 2003 16:16:35 UT
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Homes for Sale - Homes for Sale Alert
 X-Spam-Status: Yes, hits=5.2 tag1=-999.0 tag2=5.0 kill=5.0
  tests=BigEvilList_29, HTML_70_80, HTML_FONTCOLOR_BLUE, HTML_FONTCOLOR_RED,
  HTML_FONTCOLOR_UNSAFE, HTML_IMAGE_RATIO_08, HTML_MESSAGE,
  HTML_RELAYING_FRAME, MIME_HTML_ONLY, NO_REAL_NAME
 X-Spam-Level: *


 Why isn't the local rule being activated?

Very simply, [EMAIL PROTECTED] != [EMAIL PROTECTED]

The envelope from is '[EMAIL PROTECTED]' but the Header From is '[EMAIL PROTECTED]'

They are not the same, so the rule does not match. (You coded the rule to
look at the header 'From' address).

So you either need to change your rule to match the header from address or
code it to look for the envelope from address.

Note that depending upon how SA is integrated into your mail system,
the envelope from address may not be readily available to SA.
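
A quick way to see the two addresses side by side is to parse a saved copy of the message. This Python sketch assumes your MTA records the envelope sender in a 'Return-Path:' header, which not every setup does; the addresses in the sample are made-up example.com placeholders, since the real ones are masked in the post above.

```python
from email import message_from_string
from email.utils import parseaddr

def sender_addresses(raw_message):
    """Return (header_from, envelope_from) as lowercase addresses."""
    msg = message_from_string(raw_message)
    header_from = parseaddr(msg.get("From", ""))[1].lower()
    # Assumption: the envelope sender was written into Return-Path by the MTA.
    envelope_from = parseaddr(msg.get("Return-Path", ""))[1].lower()
    return header_from, envelope_from

sample = (
    "Return-Path: <bounce@mailer.example.com>\n"
    "From: Listings <alerts@example.com>\n"
    "To: user@example.net\n"
    "Subject: Homes for Sale Alert\n"
    "\n"
    "body\n"
)

header_from, envelope_from = sender_addresses(sample)
print("header From:  ", header_from)
print("envelope From:", envelope_from)
print("same address: ", header_from == envelope_from)
```

A 'header ... From =~' rule only ever sees the first of the two.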



Re: [SAtalk] Re: Having trouble coding a local rule

2003-12-28 Thread David B Funk

On Mon, 29 Dec 2003, Peter Kiem wrote:

  Preferably not as if someone does forge it, then the mail goes straight
  through...
 
  Isn't that what whitelist_from_rcvd is for?  man Mail::SpamAssassin::Conf

 The point is I *DON'T* want to whitelist.  I wanted just to lower the SA
 scores with a local rule.

Actually, 'whitelist_from_rcvd' is the way to go, as it will only apply
if -both- the From address and the DNS host name of the sending system
match the rule. However looking back at your first post I see that the
DNS reverse map for the 'sneezy' system is FUBAR, so you cannot use it.



Re: [SAtalk] Single image spams with random info

2003-12-23 Thread David B Funk

On Tue, 23 Dec 2003, Greg Webster wrote:

 We're getting a TON of these, all of similar format.

 htmlbody
 center!--2rdxveiyf7a8--a
 href=http://www.mdv678.com?rid=1098;!--srz4f4qaLBUw--img
 src=http://www.whosout.com/c2.gif; border=0/a/center
 /html/body

 The '2rdxveiyf7a8' and 'srz4f4qaLBUw' some random string of characters
 in the same place all the time. The domains are completely random it
 appears - sometimes with words (like whosout.com) or a random set of
 characters.

 Suggestions on how to block them? I've been adding the domains to a
 special rule, but they must own hundreds of them.

One clue, all those domains are hosted by a registrar in China,
XIN NET CORP with a whois server of: whois.paycenter.com.cn.

They're all served out of a few DNS servers:
  NS0.DNSIN.COM
  NS1.DNSIN.COM


DNS & whois based clues could be used to automate the adding of
domains to a special rule or filter list.
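
As a rough idea of how the DNS half of that could work, here is a Python sketch. The NS lookup itself is passed in as a function, because the standard library has no NS query call; in practice you would plug in whatever resolver you use. `fake_lookup_ns` and its table are invented purely so the example runs, and the domain names in it are placeholders.

```python
# Name servers named in the post above, normalized to lowercase.
SUSPECT_NAMESERVERS = {"ns0.dnsin.com", "ns1.dnsin.com"}

def uses_suspect_nameserver(domain, lookup_ns, suspects=SUSPECT_NAMESERVERS):
    """True if any of the domain's name servers is on the suspect list.

    `lookup_ns` is any callable that returns the NS host names for a domain.
    """
    nameservers = {ns.rstrip(".").lower() for ns in lookup_ns(domain)}
    return bool(nameservers & suspects)

def fake_lookup_ns(domain):
    """Stand-in resolver with a made-up table, for demonstration only."""
    table = {
        "spam-domain.example": ["NS0.DNSIN.COM.", "NS1.DNSIN.COM."],
        "good-domain.example": ["ns1.hosting.example."],
    }
    return table.get(domain, [])

candidates = ["spam-domain.example", "good-domain.example"]
flagged = [d for d in candidates if uses_suspect_nameserver(d, fake_lookup_ns)]
print(flagged)
```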



Re: [SAtalk] We have big evil now we need big good...

2003-12-22 Thread David B Funk

On Sat, 20 Dec 2003, Gary Smith wrote:

 So we implemented SA some time ago because our clients were getting too much spam.  
 Lately we have found that several html marked up emails have been getting marked as 
 spam.  These ones are clearly fp's.

 Some of the domains include Morningstar.com, charlesswab.com and several other 
 financial institutions.  Some of the clients get their weekly reportings sent to 
 them, and it has of course the remove me tag at the bottom as well as a bunch of 
 html so it gets marked as spam.

 I know I could just create a simple white list but it might be more useful to create 
 a project of good companies to fix the fp's.

 Looking for feedback on the topic

Check out Habeas mark or bonded sender services. That's their whole
business, providing ways to list or indicate white-hats. Support for
them is already incorporated in SA, and businesses such as eBay use
them.



Re: [SAtalk] new spamming techniques are flooding me. Any suggestions?

2003-12-18 Thread David B Funk

On Thu, 18 Dec 2003, mairhtin wrote:

 I am getting a new flood of spam that appears not to be even selling anything, but 
 merely trying to get through the filters.
 Could they be trying to learn from this?  I don't see how, but someone suggested 
 as much.

 Here's a copy of the spam mail and headers :

  Return-Path: [EMAIL PROTECTED]
  Received: from SERCOSE01 ([200.223.153.82])
  by mail.techsolutionsgroupllc.com (8.12.5/8.12.5) with SMTP id hBIHFDq2004976
  for [EMAIL PROTECTED]; Thu, 18 Dec 2003 09:15:15 -0800
  Received: from [200.223.153.82] by 2004hosting.orgIP with HTTP;
  Thu, 18 Dec 2003 16:15:02 -0200
  From: Draper Merrill [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Subject: Re: IXCH, the sky over
  Mime-Version: 1.0
  X-Mailer: mPOP Web-Mail 2.19
  X-Originating-IP: [2004hosting.orgIP]
  Date: Thu, 18 Dec 2003 12:12:02 -0600
  Reply-To: Merrill [EMAIL PROTECTED]
  Content-Type: multipart/alternative;
  boundary=--ALT--NUPL13668734694178
  Message-Id: [EMAIL PROTECTED]
  X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on
  mail.techsolutionsgroupllc.com
  X-Spam-Level: *
  X-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_IMAGE_ONLY_06,
  HTML_MESSAGE autolearn=no version=2.60
  Status:

Do you have any kind of network checks (DSBLs) enabled?
That IP address hit half a dozen of my net checks, including
my MTA block lists (RBL-Plus,list.dsbl.org,dnsbl.sorbs.net) so my
SA never even saw that garbage.



Re: [SAtalk] Trouble with bayesian classification and autolearn

2003-12-12 Thread David B Funk

On Fri, 12 Dec 2003, J. S. Greenfield wrote:

 I've been experimenting with configuration of spamassassin for sitewide
 use (in particular, using spamassassin 2.60 with sa-exim 3.1 and exim
 4.30, under Solaris 8), and for the life of me, I can't seem to get
 bayesian classification and autolearn working.

 No matter what I do, my X-Spam-Status header indicates that
 autolearn=no, and the bayes files never get created.

 I'm currently invoking spamd as follows:

   /usr/local/bin/spamd -d -c -u spamd -r ${PIDFILE}

 And I have the following in my local.cf:

 use_bayes   1
 auto_learn  1
 bayes_path  /etc/mail/spamassassin/bayes
 bayes_file_mode 0666

 where both the bayes_path directory, and the spamd home directory (it's
 parent), are owned by spamd, and world writeable, at this point, for
 good measure.
[snip..]

Take a piece of spam, transfer it to the server, make it readable by
user 'spamd', log in (or su) as user 'spamd' and run the following
command:

  sa-learn -D --spam --file spam-example.txt

Where 'spam-example.txt' is the file that contains your example
spam message. This should produce a whole bunch of lines of output,
one of them being: Learned from 1 message(s) (1 message(s) examined).

If not, look for error messages in that bunch of stuff.

One suggestion: if you're running Bayes site-wide (particularly on
a busy server), turn on journaling. It will help to avoid file-locking
problems. Add the following line to your config file:

bayes_learn_to_journal 1



Re: [SAtalk] sa-learn ... R/W: tie failed!

2003-12-11 Thread David B Funk

On Wed, 10 Dec 2003, AthlonRob wrote:

  Just for S&G, try doing a 'sa-learn --dump magic' and see if it
  likes what it sees. If you cannot even --dump magic then it's
  truly corrupted, no repair, just delete and start fresh.

 I got some funky output:

 [EMAIL PROTECTED]:~/.spamassassin$ sa-learn --dump magic
 Cannot open bayes databases /var/amavis/.spamassassin/bayes_* R/O: tie failed:
 Use of uninitialized value in numeric lt () at 
 /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1281.
 0.000  0  0  0  non-token data: bayes db version
 0.000  0  0  0  non-token data: nspam
 0.000  0  0  0  non-token data: nham
 0.000  0  0  0  non-token data: ntokens
 0.000  0  0  0  non-token data: oldest atime
 0.000  0  0  0  non-token data: current scan-count
 0.000  0  0  0  non-token data: last expiry atime

  You've got a boat-load of stuff in that bayes_journal, so delete
  the bayes_seen & bayes_toks and let that journal seed you a fresh start.

 I'm guessing, from the above output... that won't work for me

The bad values from the corrupt bayes_toks could cause that
Perl error. (It seems that the SA database code needs a bit of an
update to better defend itself from bad database values ;).

Just try completely removing those bayes_toks & bayes_seen files
and do a 'sa-learn --rebuild'. It should take that bayes_journal file
and use its data to create a new database.
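
A minimal sketch of the "remove those files" step, assuming you would
rather move the suspect files aside than delete them outright. The
function name set_aside_bayes_db and the 'corrupt-backup' directory are
made up for illustration; only the file names bayes_toks, bayes_seen and
bayes_journal come from this thread. The journal is left in place so that
a later 'sa-learn --rebuild' can pick it up.

```python
import shutil
from pathlib import Path

# File names taken from the thread; everything else here is illustrative.
SUSPECT_FILES = ("bayes_toks", "bayes_seen")


def set_aside_bayes_db(bayes_dir, backup_name="corrupt-backup"):
    """Move bayes_toks and bayes_seen into a backup subdirectory.

    bayes_journal is deliberately left where it is. Returns the list of
    file names that were actually moved.
    """
    bayes_dir = Path(bayes_dir)
    backup_dir = bayes_dir / backup_name
    backup_dir.mkdir(exist_ok=True)
    moved = []
    for name in SUSPECT_FILES:
        src = bayes_dir / name
        if src.exists():
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved
```

Run it as the user that owns the Bayes files, pointing it at that user's
.spamassassin directory, and then run 'sa-learn --rebuild' by hand.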



RE: [SAtalk] Content Analysis

2003-12-10 Thread David B Funk

On Tue, 9 Dec 2003, Thomas Shoaf (PromoStep) wrote:

 As for correcting the items listed in my original post, I am looking for an
 example of the correct content that should be included in the content of the
 HTML message relating to such items appearing in the Content Analysis when
 checked through the Content Checked at Lyris.

Those were MIME header errors.

Read the MIME RFC (RFC 2045) for the official statement of how
MIME headers should be used. RFCs can be found at the IETF site.



RE: [SAtalk] Content Analysis

2003-12-10 Thread David B Funk

On Tue, 9 Dec 2003, Thomas Shoaf (PromoStep) wrote:


 The answer to your question, Gary... We are an incentives marketing firm
 with an affiliate element.  Our members can send virtual promotions from
 their account to friends, family, colleagues, etc; however, some email
 services such as Hotmail, Yahoo, etc appear to be blocking these promotions.

 So - we have a duty to ensure such promotions are delivered as perceived
 by our members.

 Likewise, we send various updates/newsletters to our members periodically
 and we feel that a majority of such messages are not being delivered to our
 members.  Therefore, it is pertinent for our company to check such SPAM scoring
 to ensure the customers of our members receive what they send them and that
 our members receive the communications from us as a company.

 So we are not trying to see how much spam related content can go into an
 email nor are we trying to find a way around SA... We are trying to allow our
 members to communicate with their customers as well as allow our company to
 communicate to our members.


Thomas,
A simple, non-technical solution to your problem is to obtain a Habeas
Warrant mark and use it (see http://www.habeas.com/).
Any site using Spamassassin will honor such a mark and pass the message
on, even if the content is muddy.

This will work for current and future versions of Spamassassin.

There is a cost associated with obtaining a Warrant mark, but since you
are -not- sending spam it should not be prohibitive. You should be
able to consider it part of the cost of doing marketing business on the
internet.

Dave


Re: [SAtalk] Log Help!

2003-12-10 Thread David B Funk

On Wed, 10 Dec 2003, Ryan Lumsden wrote:

 Hi all.

 how do I get spamd to log to a different file besides messages and mail.log?

 I am up2date with sa and I am running debian woody, anybody have any ideas?

 Thanks in advance.

 Ryan

Yes, look at the man pages for syslogd and spamd. Note the usage
of the syslog facility. Pick a facility that is not already in use
on your system, tell spamd to use that particular facility and tell
your syslogd to log that facility to your desired file.
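
To make the "facility" idea concrete, here is a small sketch using the
constants from Python's standard logging.handlers.SysLogHandler. It only
shows how a facility and a severity are combined into the single priority
number carried with each syslog message (facility * 8 + severity), which
is what syslogd routes on. The choice of local5 is an assumption for
illustration; use whichever facility is free on your system.

```python
from logging.handlers import SysLogHandler


def syslog_priority(facility, severity):
    """Combine a facility and a severity into the syslog PRI value."""
    return (facility << 3) | severity


# local5 is only an example of a facility that is often unused.
pri = syslog_priority(SysLogHandler.LOG_LOCAL5, SysLogHandler.LOG_INFO)
print(pri)
```

Messages logged under different facilities carry different PRI values,
which is why syslogd can send one facility to its own file.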

Dave


RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread David B Funk

On Wed, 10 Dec 2003, Gary Funck wrote:

  It might be convenient to view each these transformations as
  operating on the output of the previous. I think you were.
  By doing so, it avoids replicating the description of the
  previous phase.

 I meant to add the following suggested additional
 transformation:

 PHONEMED in this form, the words are either converted into their
 phoneme form and/or spell-checked (perhaps augmented by a custom
 dictionary of popular spammer spellings). The words would be
 de-rooted as well.

 This paragraph suggests that the spelling transformation would
 precede the ALPHED transformation.

 
  Note that numbers are sometimes substituted for letters. Such
  as Gr8t and zer0, any1, me2, all41 and 14all. This argues for
  phoneming and/or spell-checking before ALPHA-ing.

What might be easier to implement would be an enhanced version of
the soundex transformation (see Text::Soundex module).

The El337 version of soundex would know about the various
graphical character-to-sound mappings and return results that
would be appropriate.

The only difficulty I can see would be dealing with the ambiguity
factor. (E.g., is '14all' one-for-all or Laall?)
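
A rough sketch of the idea, in Python rather than Perl and not using
Text::Soundex: plain American Soundex is reimplemented here, and the LEET
table is a made-up illustration, not a vetted list of spammer spellings.
To cope with the ambiguity, every reading of an ambiguous character is
tried and the function returns the whole set of resulting codes.

```python
from itertools import product

# Standard American Soundex digit groups.
_GROUPS = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
           ("l", "4"), ("mn", "5"), ("r", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}

# Hypothetical "leet" table: each symbol maps to every reading to try.
LEET = {"0": ["o"], "1": ["l", "i", "one"], "3": ["e"],
        "4": ["a", "for"], "5": ["s"], "7": ["t"], "8": ["b", "ate"]}


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":      # h and w do not separate equal codes
            prev = code
    return (result + "000")[:4]


def leet_soundex(token):
    """Return the set of Soundex codes for every reading of the token."""
    choices = [LEET.get(ch, [ch]) for ch in token.lower()]
    return {soundex("".join(combo)) for combo in product(*choices)}
```

For '14all' the returned set contains both the code for 'laall' and the
code for 'oneforall', so a caller can match against either reading
instead of having to guess.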



Re: [SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread David B Funk

On Wed, 10 Dec 2003, Matt Kettler wrote:

 At 02:08 PM 12/10/2003, Justin wrote:
 So that's how check_rbl and check_rbl_sub work?  I always wondered about
 that.  So what happens if an IP exists in two subzones at the same time?

 With SORBS, it's done by returning multiple results for a single query.

 host 138.81.106.218.dnsbl.sorbs.net
 138.81.106.218.dnsbl.sorbs.net has address 127.0.0.2
 138.81.106.218.dnsbl.sorbs.net has address 127.0.0.3

 OPM looks like a bit-mask system, so one result can encode 8 different
 DNSBLs at once.

The MAPS RBL+ is also a bit-mask system. See http://mail-abuse.org/rbl+/
for info on what values they return, then look at the code at the
bottom of 20_dnsbl_tests.cf to see how SA uses it.

So you make one query with 'check_rbl' and then parse out the
results with 'check_rbl_sub'.
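
As an illustration of the "one query, several sub-results" idea, here is a
small Python sketch. It builds the lookup name the usual DNSBL way
(reversed IPv4 octets in front of the zone) and decodes a bit-mask style
answer. The bit assignments in EXAMPLE_LABELS are invented for the
example; they are NOT the real RBL+ or OPM values, so check the list's
own documentation for those.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def decode_bitmask(answer, labels):
    """Return the labels whose bit is set in the last octet of the answer."""
    last_octet = int(ipaddress.IPv4Address(answer)) & 0xFF
    return [name for bit, name in sorted(labels.items()) if last_octet & bit]


# Hypothetical bit assignments, for illustration only.
EXAMPLE_LABELS = {1: "list-a", 2: "list-b", 4: "list-c", 8: "list-d"}
```

With this table, an answer of 127.0.0.6 would mean the address is on both
'list-b' and 'list-c', all learned from a single DNS query.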

Dave


Re: [SAtalk] sa-learn ... R/W: tie failed!

2003-12-10 Thread David B Funk

On Wed, 10 Dec 2003, AthlonRob wrote:

 On Wed, 2003-12-10 at 19:50, Adam Denenberg wrote:
  in the same directory as the bayes DB files.

 Unfortunately, there are no .lock files in that directory.

 [EMAIL PROTECTED]:~/.spamassassin$ sa-learn --rebuild -DD

 debug: Final PATH set to: /usr/local/bin:/bin:/usr/bin
 debug: using /usr/share/spamassassin for default rules dir
 debug: using /etc/mail/spamassassin for site rules dir
 debug: using /var/amavis/.spamassassin/user_prefs for user prefs file
 debug: bayes: 10390 tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
 Cannot open bayes databases /var/amavis/.spamassassin/bayes_* R/O: tie failed:
[snip..]
 debug: lock: 10390 link to /var/amavis/.spamassassin/bayes.lock: link ok
 debug: bayes: 10390 tie-ing to DB file R/W /var/amavis/.spamassassin/bayes_toks
 debug: unlock: 10390 unlink /var/amavis/.spamassassin/bayes.lock
 Cannot open bayes databases /var/amavis/.spamassassin/bayes_* R/W: tie failed: File 
 exists
 debug: lock: 10390 created 
 /var/amavis/.spamassassin/bayes.lock.linuxbox.linux.box.10390
 debug: lock: 10390 trying to get lock on /var/amavis/.spamassassin/bayes with 0 
 retries
 debug: lock: 10390 link to /var/amavis/.spamassassin/bayes.lock: link ok
 debug: bayes: 10390 tie-ing to DB file R/W /var/amavis/.spamassassin/bayes_toks
 debug: unlock: 10390 unlink /var/amavis/.spamassassin/bayes.lock
 Cannot open bayes databases /var/amavis/.spamassassin/bayes_* R/W: tie failed: File 
 exists
 [EMAIL PROTECTED]:~/.spamassassin$ ls -lha
 total 6.4M
 drwx--3 vscansweep4.0k Dec 10 20:06 .
 drwxrwxrwx6 vscansweep 12k Dec 10 19:00 ..
 drwxr-xr-x2 vscansweep4.0k Dec 10 18:04 baye
 -rw---1 vscansweep1.3M Dec 10 18:06 bayes_journal
 -rw---1 vscansweep4.7k Dec 10 18:06 bayes_msgcount
 -rw---1 vscansweep1.2M Dec 10 18:06 bayes_seen
 -rw---1 vscansweep4.9M Dec 10 18:06 bayes_toks
 -rw-r--r--1 vscansweep 346 Dec 10 19:00 user_prefs

 :-(

 Rob

Hate to say it, but it looks like your database is hosed.
Permissions are OK, it's looking at the correct files, locks are good,
etc.
Those 'File exists' errors indicate that the DB_File library found
a file with that name which contains the wrong 'magic', i.e. it's
not a valid database file.

Just for S&G, try doing a 'sa-learn --dump magic' and see if it
likes what it sees. If you cannot even --dump magic then it's
truly corrupted, no repair, just delete and start fresh.

You've got a boat-load of stuff in that bayes_journal, so delete
the bayes_seen & bayes_toks and let that journal seed you a fresh start.



Fishing lesson (Re: [SAtalk] DB_File)

2003-12-09 Thread David B Funk

On Tue, 9 Dec 2003, Jack Gostl wrote:


 I sent you the error message, I'm pretty sure there was no user associated
 with it. There were tens of thousands of those errors in the log. I'm not
 sure how to pinpoint the culprit. I guess I'll have to go to each user and
 rebuild their database.

Yes, you did but did not include the -other- lines associated with that
particular spamd run which had the information that you seek.
I tried to point you to what you are looking for but evidently you didn't
understand.

Intro to Syslog Interp 101

Think of syslog interpretation like trying to follow a conversation
at a crowded party. Everybody is talking at once, you hear 'snippets'
of speech, so you need to be able to thread together the groups of
words to extract a complete and comprehensible conversation.

On a Unix system syslog gathers report messages from programs
and stores them with identifying info as lines of text in a log file.
Each message is only a single line of text so if the program has a lot
of info to log, it will report it as multiple messages. These may
(and probably will) end up interspersed with reports from other programs
logging at the same time on a busy system.

So you need to look at the identifying info to find all the lines that
relate to one specific program and its job, to be able to thread them
together for the full report.

Each line in a syslog file has a standardized format:

date time host program[PID]: text-of-message-from-program

You need to match up the program-name and Process-ID to find all the
lines of text that relate to one program job.

So the line:

 Dec  8 23:44:53 argos spamd[766]: checking message [EMAIL PROTECTED] for (jbuser):115.

was logged on the host argos at 23:44:53, Dec 8, by the 'spamd' program
with the Process-ID of 766.

(Some lines may not follow this precise format: the PID is optional and a
client process may choose not to provide it, and if the message comes from
the system level (kernel), there will not be a process name associated.)
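
The line format above can be picked apart mechanically. What follows is
only a sketch in Python: the regular expression is my own approximation
of the classic "date time host program[PID]: text" layout, not something
taken from syslogd, and it treats the [PID] part as optional.

```python
import re

# date time host program[PID]: text   (the [PID] part is optional)
SYSLOG_LINE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2}\s+\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s*"
    r"(?P<text>.*)$"
)


def parse_syslog_line(line):
    """Split one syslog line into its fields, or return None if it doesn't fit."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["pid"] is not None:
        fields["pid"] = int(fields["pid"])
    return fields
```

Feeding it the spamd line above gives host 'argos', program 'spamd' and
pid 766; program and pid are the two fields you match on to thread the
lines of one job together.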

Now looking at a snippet from a real syslog file on a mail server you
would see lines like:

Nov  8 01:09:44 server13 sm-mta[2541]: hA879g3Z002541: from=[EMAIL PROTECTED], 
size=23558, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=SMTP, daemon=MTA, 
relay=discovery.neiu.edu [66.99.13.30]
Nov  8 01:09:44 server13 spamd[2542]: checking message [EMAIL PROTECTED] for 
(unknown):115.
Nov  8 01:09:46 server13 spamd[2540]: clean message (-9.0/6.0) for (unknown):115 in 
13.9 seconds, 7892 bytes.
Nov  8 01:09:47 server13 miltrassassin[1786]: hA879T3Z002531: spamlevel=-90
Nov  8 01:09:47 server13 sm-mta[2531]: hA879T3Z002531: to=[EMAIL PROTECTED], 
delay=00:00:14, mailer=lrelay, pri=3580, stat=queued
Nov  8 01:09:47 server13 spamd[2530]: clean message (-3.3/6.0) for (unknown):115 in 
24.0 seconds, 16016 bytes.
Nov  8 01:09:47 server13 miltrassassin[1786]: hA879M3Y002527: spamlevel=-33
Nov  8 01:09:47 server13 sm-mta[2527]: hA879M3Y002527: to=[EMAIL PROTECTED], 
delay=00:00:21, mailer=lrelay, pri=3441, stat=queued
Nov  8 01:09:55 server13 sm-mta[2543]: NOQUEUE: connect from 
swfirewall1.andersencorp.com [65.217.82.3]
Nov  8 01:09:55 server13 sm-mta[2543]: hA879t3Y002543: swfirewall1.andersencorp.com 
[65.217.82.3] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Nov  8 01:09:58 server13 sm-mta[2544]: NOQUEUE: connect from moon.its.uiowa.edu 
[128.255.56.76]
Nov  8 01:09:58 server13 sm-mta[2544]: hA879w3Y002544: from=[EMAIL PROTECTED], 
size=17075, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA, 
relay=moon.its.uiowa.edu [128.255.56.76]
Nov  8 01:09:58 server13 spamd[2545]: checking message [EMAIL PROTECTED] for 
(unknown):115.
Nov  8 01:10:02 server13 spamd[2542]: clean message (-3.5/6.0) for (unknown):115 in 
17.7 seconds, 24235 bytes.
Nov  8 01:10:02 server13 miltrassassin[1786]: hA879g3Z002541: spamlevel=-35
Nov  8 01:10:02 server13 sm-mta[2541]: hA879g3Z002541: to=[EMAIL PROTECTED], 
delay=00:00:18, mailer=lrelay, pri=3859, stat=queued
Nov  8 01:10:14 server13 spamd[2545]: clean message (1.4/6.0) for (unknown):115 in 
15.2 seconds, 17494 bytes.
Nov  8 01:10:14 server13 miltrassassin[1786]: hA879w3Y002544: spamlevel=14
Nov  8 01:10:14 server13 sm-mta[2544]: hA879w3Y002544: to=[EMAIL PROTECTED], 
delay=00:00:16, mailer=lrelay, pri=3919, stat=queued

Note that there are a number of messages from different processes
intermingled in there.

If we pick out one particular spamd run (say PID[2545])  we will find:
Nov  8 01:09:58 server13 spamd[2545]: checking message [EMAIL PROTECTED] for 
(unknown):115.
Nov  8 01:10:14 server13 spamd[2545]: clean message (1.4/6.0) for (unknown):115 in 
15.2 seconds, 17494 bytes.

If there had been an error logged it would have been something like:
Nov  8 01:10:10 server13 spamd[2545]: Use of uninitialized value in numeric eq (==) at 
/usr/local/

So you would use the PID to tie it to the line that has the user name,
as I described before.

So now that

Re: [SAtalk] Generic V-whatever drug with no GV rule hits (fwd)

2003-12-08 Thread David B Funk

On Mon, 8 Dec 2003, Matt Kettler wrote:

 At 10:54 AM 12/8/2003, Christopher X. Candreva wrote:

 I just opened a Bugzilla report for this:
 
 http://bugzilla.spamassassin.org/show_bug.cgi?id=2817
 (SA 2.60, Solaris, perl 5.6.1)

 For the moment, I'd suggest a rule like this one that I just cooked up:

 body LOCAL_GAPPY_VIAG   /\bV\Wi\Wa\Wg\Wr\Wa\b/i
 score LOCAL_OBFU_VIAG   1.0

 Note: not heavily tested yet, but should work nicely.. it looks for the
 v-word with each letter spaced off by non-word characters, such as
 punctuation, spaces, etc. Note that perl considers underscore to be a word
 character so this will miss that variant, but you can quickly cook a rule
 to get that one.

Small enhancement suggestion: follow each one of those '\W' with '?',
thus making successive obfuscating characters optional. With your
rule there -must- be an obfuscating character between each pair of
regular characters; with the '?', it will catch all permutations of
normal and obfuscating characters.

Something like:

body LOCAL_OBFU_VIAG   /\bV\W?i\W?a\W?g\W?r\W?a\b/i
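
To see the difference between the two patterns, here is a quick check
written with Python's re module instead of Perl (for these plain ASCII
samples \b and \W behave the same way in both). The sample strings are
made up for the test.

```python
import re

# Original rule: exactly one non-word character between every pair of letters.
GAPPY = re.compile(r"\bV\Wi\Wa\Wg\Wr\Wa\b", re.IGNORECASE)
# With '?' after each \W, the separator becomes optional.
OPTIONAL = re.compile(r"\bV\W?i\W?a\W?g\W?r\W?a\b", re.IGNORECASE)

samples = ["V.i.a.g.r.a", "Vi.agra", "V-iagr a", "V_i_a_g_r_a"]
for text in samples:
    # Columns: sample, hit by the original rule, hit by the '?' variant.
    print(text, bool(GAPPY.search(text)), bool(OPTIONAL.search(text)))
```

Note that the '?' variant also matches the plain, unobfuscated word, and
that neither pattern catches the underscore variant, since underscore
counts as a word character.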



[SAtalk] Re: Generic V-whatever drug with no GV rule hits (fwd)

2003-12-08 Thread David B Funk

On 8 Dec 2003, Scott A Crosby wrote:

 On Mon, 08 Dec 2003 16:43:15 -0500, Matt Kettler [EMAIL PROTECTED] writes:

  Or *, to catch more than one obfuscating character..
 
  ie: V...i..a.gr..a
 
  As I suggested in my email, there's lots of combinations that spammers
  can do to avoid the original rule. There's also lots of ways to
  construct the rule to get a broader hit-base, at the expense of
  greater processing time.

 In theory, this isn't that much additional matching time, especially
 with an automata. In practice though, these sorts of rules will kill
 performance because Perl cannot apply the literal optimization,
 especially if they're applied widely. (There's more than just Vx
 -- most of the phrase rules need this sort of treatment.)

 Scott

Scott,
If it's a bounded wild-card (.{0,6}) as opposed to unbounded (.*),
is it less of a hit? (I.e., is it a reasonable thing to do?)

Are there any reasonably simple ways to do this without killing things?

(EG .? == OK, .* == BAD, .{0,n} == acceptable, for small values of 'n')

Are there any studies of the Perl matching engine for efficiency
and rules-of-thumb?
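
For reference, this is what the bounded form looks like when generated
for an arbitrary word. It is a Python sketch with a hypothetical helper
name (gappy_pattern), it uses \W{0,n} rather than .{0,n} so that it
lines up with the \W rule discussed earlier in the thread, and it says
nothing about how fast the Perl engine runs such a pattern.

```python
import re


def gappy_pattern(word, max_gap):
    """Build a regex allowing 0..max_gap non-word chars between letters."""
    gap = r"\W{0,%d}" % max_gap
    body = gap.join(re.escape(ch) for ch in word)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)
```

With max_gap=3 the generated pattern matches 'V...i..a.gr..a', while
max_gap=1 does not, so 'n' directly bounds how much padding between
letters the rule tolerates.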

Dave


Re: [SAtalk] (no subject)

2003-12-08 Thread David B Funk

On Mon, 8 Dec 2003 [EMAIL PROTECTED] wrote:

 We've got SA 2.60 running on a Solaris 8 box with SunONE Messaging
 Server.  It is doing spam scanning for all our users (~7000).  In
 order to keep the system as speedy as possible, I've configured the
 bayes journal to sit on /tmp which is a memory file system.

 I'd like to back up the journal periodically to disk.  Right now I have
 a shell script which shuts spamd down, copies the spamassassin
 directory to disk, then restarts spamd.  However, during this period
 of downtime (about 10-20 seconds), the MTA cannot process email.  I
 didn't write it, I can't change it. :(

 searching through the archives and docs, I've not found anything
 regarding this  issue. How have others solved this problem of backing
 up a live bayes journal database?  Would a perl script using file
 locking be a viable workaround, at least allowing the MTA to continue
 processing email?

 Thanks for any assistance.

 cary

As long as you're using bayes_learn_to_journal, the journal file will
be the only thing that's changing frequently. (The actual Bayes database
files are R/O except during expire or journal sync.) So grabbing it
live may miss a few journal entries (but that shouldn't be a major
loss). If you want to be sure that you don't hit a sync/expire during
your copy, create a .lock file before you copy and remove it after
you're done (see SA module 'UnixLocker.pm' for the grubby details ;)

Suggestion: copy the database files to another directory in your
memory file system (RAM-2-RAM snapshot), which should go fast. Then you
can copy the snapshot to a real disk at your leisure.
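
A minimal sketch of the snapshot step in Python, assuming the Bayes files
all match 'bayes_*' in one directory. The function name and any paths you
pass in are illustrative. It does NOT implement SpamAssassin's lock
protocol, so by itself it only gives you the "may miss a few journal
entries" kind of copy.

```python
import shutil
from pathlib import Path


def snapshot_bayes(live_dir, snapshot_dir):
    """Copy every bayes_* file from live_dir into snapshot_dir.

    No locking is attempted here. Returns the sorted names of the files
    that were copied.
    """
    live_dir, snapshot_dir = Path(live_dir), Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(live_dir.glob("bayes_*")):
        if src.is_file():
            shutil.copy2(src, snapshot_dir / src.name)
            copied.append(src.name)
    return copied
```

Call it first with a snapshot directory on the same memory file system,
then copy that snapshot directory to real disk with whatever tool you
like, without spamd having to wait for it.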



Re: [SAtalk] DB_File

2003-12-08 Thread David B Funk

On Mon, 8 Dec 2003, Jack Gostl wrote:

  My bet is that your Bayes database got trashed.

 Possible, but which database? We have many users all with their own? Also,
 if its a trashed Bayes db, why does the message go away when I restart
 spamd?

Whichever database it was looking at when it logged those error
messages. Associated with each message process, there should be a line
that looks like:

Dec  8 23:44:53 argos spamd[766]: checking message [EMAIL PROTECTED] for (jbuser):115.

where 'jbuser' is the particular user-recipient of that message.
So look at the process ID of the spamd that is logging the error, find
the corresponding 'checking message' log entry and that should point you
at the offending culprit.

If the entry says '(unknown)' then there was no user specified via the
spamc process (or milter, or whatever process called the spamd). In that
case spamd is not looking at any user's database, just the global one.
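
The lookup described above can be automated. This is a Python sketch
under a few assumptions: the 'checking message ... for (user):NNN' wording
is the one shown in this thread, the error is recognised by a substring
you supply, and PIDs are not reused within the stretch of log you feed
in. The function and pattern names are my own.

```python
import re

# "program[PID]: text" part of a syslog line.
ENTRY = re.compile(r"\s(?P<program>[^\s:\[]+)\[(?P<pid>\d+)\]:\s*(?P<text>.*)$")
# The spamd line that names the user, as quoted in this thread.
CHECKING = re.compile(r"checking message \S+ for \((?P<user>[^)]*)\):\d+")


def users_with_errors(lines, error_marker):
    """Map each spamd PID that logged error_marker to the user it scanned for."""
    user_by_pid = {}
    error_pids = []
    for line in lines:
        entry = ENTRY.search(line)
        if entry is None or entry.group("program") != "spamd":
            continue
        pid = int(entry.group("pid"))
        text = entry.group("text")
        checking = CHECKING.search(text)
        if checking:
            user_by_pid[pid] = checking.group("user")
        elif error_marker in text and pid not in error_pids:
            error_pids.append(pid)
    return {pid: user_by_pid.get(pid, "(no 'checking message' line found)")
            for pid in error_pids}
```

A result of 'unknown' for a PID means that spamd was working with the
global database rather than a particular user's.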



Re: [SAtalk] bayes permission errors

2003-12-07 Thread David B Funk

On Sat, 6 Dec 2003, Lukreme wrote:

 spamd[33762]: Cannot open bayes databases
 /home/user/.spamassassin/bayes_* R/O: tie failed: Permission denied
 spamd[33762]: processing message
 [EMAIL PROTECTED] for kremels:5003.
 spamd[33762]: clean message (0.8/5.0) for user:5003 in 0.2 seconds,
 5526 bytes.

 /home/user/.spamassassin $ ls -lstr
 total 6338
 2 -rw-rw-rw-  1 user  staff 1218 Oct  6 15:28 user_prefs
 2 -rw-rw-rw-  1 user  staff  199 Oct  6 15:28 bayes_msgcount
 4160 -rw---  1 user  staff  5111808 Dec  4 09:51 bayes_toks
 2112 -rw-rw-rw-  1 user  staff  2637824 Dec  4 09:51 bayes_seen
62 -rw---  1 user  staff62030 Dec  4 09:51 bayes_journal

 where user is any user on the system.

 $ psa spamd
 postfix   565  0.0  3.3 21908 4132  ??  Is9:17PM   0:04.03
 /usr/local/bin/spamd -a -c -d -u postfix (perl)

 is how spamd is running.

You've got spamd running as the user postfix (that -u postfix
command line argument). Thus the user postfix needs to have write
permissions to the bayes_* files, but in that directory listing
you show:

 4160 -rw---  1 user  staff  5111808 Dec  4 09:51 bayes_toks

So 'postfix' has -no- access permissions to the user's bayes_toks,
thus the permission errors.

You have two different options:
1) Run spamd as root and be sure that you pass the correct user
   name via spamc -u user for each message.
2) Set the global 'bayes_file_mode' option to 0666 so that the
   spamd process always has read-write permission, regardless
   of who it is run as.
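
The reasoning behind both options is just the classic Unix
owner/group/other check, which can be sketched in a few lines of Python.
The numeric uids and gids in the usage note are invented for the example,
and the sketch ignores ACLs and directory permissions.

```python
import stat


def can_read_write(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Apply the classic Unix owner/group/other check for read+write access."""
    if proc_uid == 0:                      # root bypasses permission bits
        return True
    if proc_uid == file_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif file_gid in proc_gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (mode & needed) == needed
```

Take a file owned by uid 1001 and gid 20, and a spamd running as uid 89
with gid 89. With mode 0600 the check fails, which is the 'Permission
denied' case above. With mode 0666 (option 2), or with the process
running as root (option 1), it passes.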


Re: [SAtalk] Custom Rules

2003-12-04 Thread David B Funk

On Wed, 3 Dec 2003, Fred   I-IS.COM wrote:

 Just a minor correction,

 try this:

 header   __BLOCKTOFFICEOUT  To =~ /[EMAIL PROTECTED]/i
 header   __BLOCKFOFFICEOUT  From =~ /[EMAIL PROTECTED]/i
 meta     BLOCK_MY_OFFICE    (__BLOCKTOFFICEOUT && !__BLOCKFOFFICEOUT)
 describe BLOCK_MY_OFFICE    No E-mail to alias from outside
 score    BLOCK_MY_OFFICE    100.0

 The syntax is slightly different in my rule and I used a meta rule to
 accomplish what you want.

 Frederic Tarasevicius

 Nayana Hettiarachchi wrote:
  hi i am trying to setup a rule so that we wont get mail to our local
  alias from an outside address, this is what i wrote but it doesnt seem
  to work as i thought it would, can u give any advice
 
  header   BLOCKTTOFFICEOUT   To = [EMAIL PROTECTED]
  header   BLOCKTTOFFICEOUT   From != [EMAIL PROTECTED]
  score    BLOCKTTOFFICEOUT   100.0
  describe BLOCKTTOFFICEOUT   No Email To alias from outside
 
  thanks
 
  Nayana

Of course, you should realize that the message header values can be
forged to be -anything- that a spammer wants them to be, and they have
no relation to where the mail actually gets routed.

The thing that actually controls delivery is something called the
envelope recipient and can be completely different from the 'To:'
header. Depending upon how your mail system is configured,
Spamassassin probably has no way to see the value of the envelope
recipient.

This sort of thing is far better handled by your MTA, which has to
deal completely with the envelope recipient address.
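
A small illustration of why a header test cannot stand in for the
envelope, using Python's standard email package. All addresses here are
invented, and ENVELOPE_RCPT simply stands for whatever the sending side
gave in the SMTP dialogue (RCPT TO).

```python
from email import message_from_string

RAW_MESSAGE = """\
From: someone@outside.example
To: friend@elsewhere.example
Subject: hello

Just text.
"""

# What the sending side said in the SMTP dialogue (RCPT TO); this never
# has to appear in the headers at all.
ENVELOPE_RCPT = "office-alias@inside.example"


def to_header_mentions(raw_message, address):
    """True if the To: header text contains the given address."""
    to_header = message_from_string(raw_message).get("To", "")
    return address.lower() in to_header.lower()
```

Here the message is delivered to the alias, yet a rule that looks for the
alias in 'To:' never fires, because the header names someone else
entirely.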

Dave


Re: [SAtalk] spamds that don't finish

2003-12-04 Thread David B Funk

On Thu, 4 Dec 2003, Cheryl L. Southard wrote:

 Hi All,

 I've got two spamd processes that just wont go away.  They've been
 running for well over 11 hours and are taking up 100% of my cpu.
 I've run truss spamd-pid but it doesn't report anything.  The same
 user, coincidentally, is the recipient of both e-mails, but this
 user doesn't have any special rules in his user_prefs file.  This user's
 home directory and mail file seem accessable  and there don't seem to
 be any weird messages in the spamd log file

One idea, there's something in the mail that particular user is getting
that is triggering some kind of bug in SA (buffer overflow, etc). Can
you find the offending message and try feeding it to SA by hand?

With one of the RC versions of 2.60, if a message had a weird long header
it would cause the spamd to blow up. I've not seen it with the release
version of 2.60, but it doesn't mean that it couldn't happen.

 I am running spamassassin 2.60 on a Solaris 9 computer with procmail.

  ps -ef | grep spamd
   cc 27379  2447 48 20:36:36 ?   277:37 /usr/local/bin/perl -T 
 /usr/local/bin/spamd -d -a -c -m 5
   cc 19967  2447 48 13:14:29 ?   603:31 /usr/local/bin/perl -T 
 /usr/local/bin/spamd -d -a -c -m 5
 root  2447 1  0   Oct 27 ?   30:17 /usr/local/bin/perl -T 
 /usr/local/bin/spamd -d -a -c -m 5

 Can anyone suggest things I can try to figure out what is going on?
 Since we have a 5 process spamd limit on our computer, these processes
 are really causing a traffic jam on my mail server.

Another idea, are you using Bayes, and if so do you not have
bayes_learn_to_journal enabled?

If you are not journaling, then each spamd wants to update the bayes
database and there could be locking contention. On some types of machine
(particularly SMP) Berkeley_DB uses a spinlock which can use high CPU,
particularly if something gets stuck.

Dave


RE: [SAtalk] spamds that don't finish

2003-12-04 Thread David B Funk

On Thu, 4 Dec 2003, Pete Henshall wrote:

 Hi dan, list,

  I think it's simply a function of load. The first system gets the bulk of
 the mail thoughput.  You can see that the  erratic loads
  tail off over the weekend.  It's wierd.  I have tried disabling RBL, bayes
 and even removing all my third party
  rules.  No dice.

 If it is still leaving spamds lying around with bayes disabled then I don't
 know I have just set bayes_learn_to_journal 0 (thanks David Funk) and my
 problem seems to have stopped maybe.

I'm sorry if I gave you the wrong impression. If you are using Bayes with
auto-learning enabled (auto_learn 1), then you most likely -do- want
bayes_learn_to_journal set to 1 (enabled).

If you use auto_learn and disable journaling, then each spamd tries to
update the Bayes database with each new message (thus increasing the
probability of lock contention problems).

If you enable journaling, then each spamd just appends to the end of the
journal file (no locking needed for a simple text append). Periodically
the journal gets incorporated into the database in a rebuild pass.
So only that occasional rebuild needs to lock the database.
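
The difference in lock traffic can be sketched with a toy model in
Python. This is only an illustration of the idea, not SpamAssassin's
implementation: the class, its names, and the journal limit are all
hypothetical, and the "lock" is just a counter.

```python
from collections import Counter


class ToyBayesStore:
    """Toy model: a token-count 'database' plus an append-only journal."""

    def __init__(self, journal_max_entries=4):
        self.db = Counter()        # stands in for the token database
        self.journal = []          # stands in for the journal file
        self.journal_max_entries = journal_max_entries
        self.db_lock_count = 0     # how often the database had to be locked

    def learn_direct(self, tokens):
        """No journaling: every learned message locks the database."""
        self.db_lock_count += 1
        self.db.update(tokens)

    def learn_to_journal(self, tokens):
        """Journaling: append only; sync once the journal is over its limit."""
        self.journal.extend(tokens)
        if len(self.journal) > self.journal_max_entries:
            self.sync()

    def sync(self):
        """Fold the whole journal into the database under a single lock."""
        self.db_lock_count += 1
        self.db.update(self.journal)
        self.journal.clear()


messages = [["free", "offer"], ["free", "money"], ["meeting"], ["free"]]

direct = ToyBayesStore()
for tokens in messages:
    direct.learn_direct(tokens)

journaled = ToyBayesStore()
for tokens in messages:
    journaled.learn_to_journal(tokens)
journaled.sync()   # final sync so both stores hold the same counts

print("locks without journal:", direct.db_lock_count)
print("locks with journal   :", journaled.db_lock_count)
```

Both stores end up with identical token counts; the journaled one simply
takes the database lock less often.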


 As far as I am concerned spamd should NEVER have rouge spamd's coming off it
 that don't have a matching spamc.  (is that right??)

I'm not so sure about this. If you have bayes_learn_to_journal enabled,
then a spamd child will need to be run whenever the journal file gets
full (size > bayes_journal_max_size) or it's been around for more
than one day. Also, unless you've explicitly disabled it, a db expire
is done daily (which would be another spamd child).

So unless you disable all automatic Bayes maintenance operations
(learn, expire, etc.), there will be the possibility of spamd
children and potential lock contention.

Dave


Re: [SAtalk] Interesting about BIG HUGE EVIL RULEs

2003-12-04 Thread David B Funk

On Thu, 4 Dec 2003, Scott Harris wrote:

 Because I don't have sourceforge whitelisted, 6 of the last 20 messages to
 the list were labeled as spam.

 Rules that hit were:

  3.0 BigEvilList_70 BODY: Generated BigEvilList_70
  3.0 BigEvilList_150BODY: Generated BigEvilList_150
  3.0 BigEvilList_175BODY: Generated BigEvilList_175

 70 and 150 hit in every one, 175 only in a few.

 This is # BigEvilList Beta version 1.57a

One way to deal with this is to modify the area that the rules
search. Replace the rawbody with uri and they will only hit
against references in URLs, not just floating random text.
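
A rough Python sketch of why this helps (an illustration only: the URL
extraction below is a simple regex, not SpamAssassin's URI parser, and
the domain is a placeholder). The same pattern is tested once against the
whole body and once against only the URLs found in it.

```python
import re

# Simplistic URL finder, assumed good enough for this demonstration.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def rawbody_hit(pattern, body):
    """Match anywhere in the text, like a body-style rule."""
    return re.search(pattern, body, re.IGNORECASE) is not None


def uri_hit(pattern, body):
    """Match only inside URLs extracted from the text, like a uri rule."""
    return any(re.search(pattern, url, re.IGNORECASE)
               for url in URL_RE.findall(body))


pattern = r"spamdomain\.example"
discussion = "Rule 70 lists spamdomain.example, is that a false positive?"
spam = "Click http://www.spamdomain.example/offer now"

print(rawbody_hit(pattern, discussion), uri_hit(pattern, discussion))
print(rawbody_hit(pattern, spam), uri_hit(pattern, spam))
```

A list post that merely mentions the domain trips the body-style match
but not the URL-only match, while the spam with a real link trips both.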

Most spammer use of those domains is inside URLs that direct
victims to spamvertizement sites, so this -should- not
reduce the effectiveness of the rules in the good fight. ;)

Of course, the better way would be to set up an effective whitelist
for this list (that's what I did some time ago).

Dave


Re: [SAtalk] What is this? Bayes poison?

2003-12-04 Thread David B Funk

On Thu, 4 Dec 2003, Kenneth Porter wrote:

 I'm getting a bunch of these. Are these just intended to poison Bayes DB's?
 What's the sender's objective?

  Forwarded Message 
 Return-Path: [EMAIL PROTECTED]
 Received: from 212.199.108.10.forward.012.net.il
 (212.199.108.10.forward.012.net.il [212.199.108.10])
   by smtp.kensingtonlabs.com (8.12.8/8.12.8) with SMTP id hB52hnEE032538
   for [EMAIL PROTECTED]; Thu, 4 Dec 2003 18:43:54 -0800
 Date: Tue, 18 Jun 2002 07:44:03 -0500
 From: Betty Tumlinson [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Get a better homeloan
 Message-ID: [EMAIL PROTECTED]
 MIME-Version: 1.0
 X-Accept-Language: en-us, en
 X-Security: MIME headers sanitized on uugw.kensingtonlabs.com
   See http://www.impsec.org/email-tools/sanitizer-intro.html
   for details. $Revision: 1.139 $Date: 2003-09-07 10:14:23-07
 Content-Type: multipart/alternative;
 boundary=224_21A3C8F2.FD632120
[snip..]
 -- End Forwarded Message --

Note that the message was MIME multipart/alternative, yet I saw only
the part that was obvious Bayes poison. Is it possible that your
MIME 'sanitizer' removed the spam 'payload' component?
(Or it's just as likely that the spammer's software fubared and
didn't add the payload. ;)




Re: [SAtalk] Problem with email=no content

2003-12-03 Thread David B Funk

On Tue, 2 Dec 2003, Robert Menschel wrote:

 header    RM_hx_from  exists:From
 describe  RM_hx_from  From header found
 score RM_hx_from  0.001
 meta  RM_hn_from  !RM_hx_from
 describe  RM_hn_from  From header not found
 score RM_hn_from  1.00

 The first rule tests for the existence of a FROM header. Score minimal
 (could be a non-scoring __RM_hx_from rule for that matter).

 The second rule then reverses that test, checking for the lack of a FROM
 header. (There may be a better way to do this -- anyone?) Results:

 RM_hx_from -- 45925s/16069h of 63136 corpus
 RM_hn_from -- 1136s /0h of 63136 corpus

SA has a special 'missing-match' syntax to detect missing headers.
Here's a rule that I wrote to test for missing Message-Id: headers:

header   L_MESSAGEID_MISSING  Message-Id =~ /^UNSET$/ [if-unset: UNSET]
describe L_MESSAGEID_MISSING  Missing Message-Id: header
score    L_MESSAGEID_MISSING  1.5

So to create your rule:

header    RM_hn_from   From =~ /^UNSET$/ [if-unset: UNSET]
describe  RM_hn_from   From header not found
score RM_hn_from   1.00

I do not know if it is any more efficient than the way that you
did it; I just copied something that I found in one of the distributed
rules.
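
The logic of the 'if-unset' trick can be mimicked in a few lines of
Python. This is a sketch of the idea only (the helper name is made up and
it says nothing about how SA evaluates the rule internally): a missing
header is replaced by a placeholder string, and the pattern then matches
exactly that placeholder.

```python
import re
from email import message_from_string


def header_test(msg, name, pattern, if_unset=""):
    """Test a header against a pattern, substituting if_unset when absent."""
    value = msg.get(name, if_unset)
    return re.search(pattern, value) is not None


no_from = message_from_string("Subject: hello\n\nbody\n")
with_from = message_from_string(
    "From: someone@example.com\nSubject: hello\n\nbody\n")

print(header_test(no_from, "From", r"^UNSET$", if_unset="UNSET"))
print(header_test(with_from, "From", r"^UNSET$", if_unset="UNSET"))
```

One caveat of the approach: a message whose From header literally reads
UNSET would also match, so the placeholder should be a value that cannot
plausibly occur.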

Dave


Re: [SAtalk] BIG HUGE EVIL RULE NEWS!!!!

2003-12-03 Thread David B Funk

On Tue, 2 Dec 2003, Chris Santerre wrote:

 BIG HUGE NEWS

 A major breakthrough has taken place

 ALL EVILRULES FILES HAVE BEEN COMBINED!! 2622 domains into 178 rules!!!
 Ramdon/tracking hosts tags removed!

 They only increase spamd memory by 1 meg!!! 1 meg!

 You read correctly! Every evil domain since august has been added! Remove
 all you old evilrules files. Grab BigEvil.cf and place it in either your
 /etc/mail/spamassassin dir and restart spamd; or into your
 $home/.spamassassin dir.

 I plan to just keep adding to this file!!!

 http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf

rule 43 hits on cauce.org ;(

That's ironic.



RE: [SAtalk] Adding another RBL

2003-12-03 Thread David B Funk

On Thu, 4 Dec 2003, Richard Bewley wrote:

 Hi,

 Now, I have the following:
 header RCVD_IN_MY_BNBL    eval:check_rbl('bl', 'bl.blueshore.net.', '2')
 describe RCVD_IN_MY_BNBL  Listed by bl.blueshore.net
 tflags RCVD_IN_MY_BNBL    net
 score RCVD_IN_MY_BNBL     5.0

 And it still doesn't work.  I also tried it without the '2', any more ideas?

 Thanks,
 Richard

Richard,
Which version of SA are you using? There were changes in the DNSBL stuff
between 2.55 and 2.60.

If you are using 2.60, try this:

header RCVD_IN_MY_BNBL rbleval:check_rbl('bl', 'bl.blueshore.net.')
describe RCVD_IN_MY_BNBL   Listed by bl.blueshore.net
tflags RCVD_IN_MY_BNBL net
score RCVD_IN_MY_BNBL  5.0

Also make sure that your perl has the Net::DNS module available, your
config has not disabled the network tests, your DNS server is
working correctly, etc.

Do a:
  spamassassin -D --lint

and make sure that lines that look like this show up somewhere in
the output:

  debug: is Net::DNS::Resolver available? yes
  debug: is DNS available? 1
  debug: RBL: success for 2 of 2 queries

Note that there will be a bunch of other stuff intermixed, so you
may have to look closely.

Have you tried doing a DNS lookup on that RBL by hand to make sure
that you can resolve from it?

Try doing:
  nslookup 2.0.0.127.bl.blueshore.net.

or:
  dig 2.0.0.127.bl.blueshore.net.

Make sure that you get back a valid IP resolution.
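
The query name in those commands follows the usual DNSBL convention:
reverse the octets of the IP address and append the list's zone
(127.0.0.2 is the conventional test entry for DNS blocklists). A small
Python sketch of the name construction, with a hypothetical helper name
and no actual DNS lookup:

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the fully qualified DNSBL query name for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)            # validates the input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.rstrip('.')}."


print(dnsbl_query_name("127.0.0.2", "bl.blueshore.net."))
```

Feeding the resulting name to nslookup or dig is the by-hand test
described above.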

Note that your score is rather stiff; be absolutely sure that
that RBL is -always- good, otherwise you're going to get FPs.




Re: [SAtalk] How to whitelist this mailinglist

2003-11-27 Thread David B Funk

On Thu, 27 Nov 2003, Martin Lyberg wrote:

 Hi,

 I want to whitelist the SA mailinglist. Is this the right way to do it:

 whitelist_from_rcvd [EMAIL PROTECTED] sourceforge.net

 Thanks in advance

 / Martin

Almost,

  whitelist_from_rcvd [EMAIL PROTECTED] sourceforge.net

will work PROVIDED your mail system makes the envelope sender
address available to SA in a form that it understands.

Ordinarily the 'From:' address is checked, but for this list that
is set to the originator of the message and thus unpredictable.
The envelope sender is always the same [EMAIL PROTECTED]
(most mailing-list systems work this way).

You need to configure your mail system to put the envelope sender address
in one of the following headers before it hands it to SA:

  Envelope-Sender
  Resent-Sender
  X-Envelope-From
  Return-Path
  Resent-From

(see EvalTests.pm for the details).
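
As a rough illustration of what "make the envelope sender available in a
header" means, here is a Python sketch that looks for the first of those
headers present in a message. The function name and the lookup order
(simply the order of the list above) are assumptions for the example; the
real precedence is whatever EvalTests.pm implements.

```python
from email import message_from_string

# Header names from the list above; the order here is an assumption.
ENVELOPE_SENDER_HEADERS = [
    "Envelope-Sender",
    "Resent-Sender",
    "X-Envelope-From",
    "Return-Path",
    "Resent-From",
]


def find_envelope_sender(msg):
    """Return (header_name, address) for the first header found, else None."""
    for name in ENVELOPE_SENDER_HEADERS:
        value = msg.get(name)
        if value:
            return name, value.strip().strip("<>")
    return None


raw = (
    "Return-Path: <list-admin@lists.example.org>\n"
    "From: some.poster@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(find_envelope_sender(message_from_string(raw)))
```

If none of the headers is present, the helper returns None, which is the
situation in which a whitelist entry keyed on the list's envelope sender
cannot match.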


Re: [SAtalk] Bayes expiry

2003-11-26 Thread David B Funk

On Tue, 25 Nov 2003, Yevgeniy Miretskiy wrote:

 Hello,

 sa-learn stopped learning messages.  Debugging shows that it can
 successfully tie Bayes db, extracts tokens, etc, but never actually
 writes data to the database.

 I had a db corruption issue some time ago, so, this could very
 well be remnants of that.

 Anyway, I'm trying to run  sa-learn -D --force-expire because my db
 grew to be very large (~100MB).

 No matter what bayes_expiry_max_db_size is set to (I tried anything from 100K to 
 3Mil),
 sa-learn reports, after running for quite some time:
   bayes: couldn't find a good delta atime, need more token difference, skipping 
 expire.

 here is the output of sa-learn --dump magic
 0.000          0          2          0  non-token data: bayes db version
 0.000          0      50976          0  non-token data: nspam
 0.000          0       1050          0  non-token data: nham
 0.000          0    2895991          0  non-token data: ntokens
 0.000          0 1013932420          0  non-token data: oldest atime
 0.000          0     115170          0  non-token data: newest atime
 0.000          0 1069789057          0  non-token data: last journal sync atime
 0.000          0 1069787463          0  non-token data: last expiry atime
 0.000          0          0          0  non-token data: last expire atime delta
 0.000          0          0          0  non-token data: last expire reduction count


 Any suggestions on how to force expiration run in this case?

You've got a corrupted or poisoned database.

Looking at that 'non-token data: newest atime' line, you've got some
tokens in there with a date from the future. That 'non-token data: oldest
atime' date is a fair ways in the past too, so my guess is that those
bogus dates are throwing off the expire code.
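
The atime values in the dump are Unix timestamps, so this kind of sanity
check is easy to sketch in Python. The helper names and the one-year
"stale" threshold are assumptions for illustration, and the future value
in the example is hypothetical, not taken from the dump.

```python
from datetime import datetime, timezone


def as_utc(atime):
    """Render a Unix timestamp as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(atime, tz=timezone.utc).isoformat()


def classify_atime(atime, now, max_age_days=365):
    """Label a token atime relative to a reference time 'now'."""
    if atime > now:
        return "future"
    if now - atime > max_age_days * 86400:
        return "stale"
    return "ok"


now = 1069787463              # 'last expiry atime' from the dump
oldest = 1013932420           # 'oldest atime' from the dump
hypothetical_future = now + 900 * 86400   # made-up value for illustration

print(as_utc(now), as_utc(oldest))
print(classify_atime(oldest, now), classify_atime(hypothetical_future, now))
```

An atime later than the current time, or one far older than the rest of
the database, is the kind of record that could confuse an expiry pass
that works from atime differences.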

Something similar happened to me (extreme, whacko dates): I couldn't
expire the database and finally had to dump the whole kit.
It may be worth writing some kind of DB editing tool to manually
remove bogus records, but it is easiest just to dump it.

Dave


Re: [SAtalk] bayes database size

2003-11-26 Thread David B Funk

On Wed, 26 Nov 2003, alan premselaar wrote:

 I've recently noticed something I think is a little strange but I'd
 like to confirm it with the list.

 My bayes database seems excessively large at 967M:

 -rw-rw-rw-    1 defang   defang        61k Nov 26 16:34 bayes_journal
 -rw-rw-rw-    1 defang   defang       624k Nov 26 15:58 bayes_seen
 -rw-rw-rw-    1 defang   defang       967M Nov 26 15:58 bayes_toks

 sa-learn --dump magic
 0.000          0          2          0  non-token data: bayes db version
 0.000          0       3236          0  non-token data: nspam
 0.000          0       2628          0  non-token data: nham
 0.000          0     121176          0  non-token data: ntokens
 0.000          0 1066969971          0  non-token data: oldest atime
 0.000          0 1069829904          0  non-token data: newest atime
 0.000          0 1069829905          0  non-token data: last journal sync atime
 0.000          0 1069735390          0  non-token data: last expiry atime
 0.000          0    2764800          0  non-token data: last expire atime delta
 0.000          0      38065          0  non-token data: last expire reduction count


 is this really larger than it should be? or am i delusional?

 i'm running redhat 7.3 , sendmail 8.12.10 , mimedefang 2.37 and
 spamassassin 2.60

 any ideas are welcome

Yes, that size seems way out of line. It should be using about 30~50
bytes per token, assuming typical token size.
According to your 'non-token data: ntokens' that bayes_toks file should
be using about 5~6 Mbytes; unless something is whacko, or you have some
-very- large tokens in there.
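
The estimate is plain arithmetic and can be checked with a few lines of
Python. The 30 and 50 bytes-per-token figures are the rough range given
above, and treating the listed 967M as mebibytes is an assumption.

```python
def estimated_db_bytes(ntokens, low_per_token=30, high_per_token=50):
    """Return the (low, high) expected size in bytes for ntokens tokens."""
    return ntokens * low_per_token, ntokens * high_per_token


ntokens = 121176                      # from the --dump magic output
low, high = estimated_db_bytes(ntokens)
actual = 967 * 1024 * 1024            # the reported bayes_toks size

print(f"expected: {low / 1e6:.1f} to {high / 1e6:.1f} MB")
print(f"actual is about {actual // high}x the high estimate")
```

Even against the high end of the estimate, the file is more than a
hundred times larger than the token count would explain.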

One possibility: the --dump magic may be looking at a different set
of files. Just to double-check, run sa-learn -D --dump magic to see
which set of files it is looking at.

Dave


Re: [SAtalk] lock failed: little help PLEASE

2003-11-26 Thread David B Funk

On Wed, 26 Nov 2003, German Staltari wrote:

 On Wed, 26 Nov 2003 14:30:02 -0500, JC [EMAIL PROTECTED] wrote:

  I'm going to guess that you are running spamassassin as an unprivliged
  user... Based on that, I would like to suggest that you run spamd and
  spamc
  on a port higher than 1024. That's what worked for me. Give it a try and
  let
  me know how it works out for ya.

 Can anybody give me real help please
 I really appreciate your help JC, but I think you don't understand my
 problem very well.
 It's a simple question:
 It's possible that SA runs opportunistic expiry and sync at the same
 time(so lock failed error occurs)?
 TIA
 German

You are running on an SMP box, so it may be that it is truly trying to
expire and sync at the SAME time (literally the same time). The locking
code may not be robust enough to cope with that. (SMP introduces all
kinds of interesting problems.)
Try turning off bayes_auto_expire and running an expire by hand or from
cron at slack times (say 5:00 AM?).

Also check your version of Berkeley_DB, there may be an issue with that
on SMP machines.

  I'm running SA 2.60, RedHat 8.0/9.0 (up2date), fresh installs, on SMP
  servers with 1GB of RAM. The bayes_journal_max_size and
  bayes_expiry_max_db_size are in default state (102400/15).
  bayes_learn_to_journal and bayes_auto_expire is set to 1. With this
  settings, SA is doing an opportunistic journal sync aprox every 10 mins,
  and an opportunistic expiry every 12h.


RD: Re: [SAtalk] Lint and SaUriCustomRules

2003-11-26 Thread David B Funk

On Wed, 26 Nov 2003, Matt Kettler wrote:

 At 11:10 PM 11/26/03 +, Alan Munday wrote:
 If I lint with this in the SaUriCustomRules
 
 uri MY_YAHOO_BOUNCED  /http:\/\/srd\.yahoo\.com\/drst\/.*\*
 http:\/\/
 describe MY_YAHOO_BOUNCED Trying to hide real URL through Yahoo redirect
 score MY_YAHOO_BOUNCED0.5
 
 I get a shed load of errors of which the first is:
 
 Bareword found where operator expected at
 /etc/mail/spamassassin/SaUriCustomRules.cf, rule MY_YAHOO_BOUNCED, line 12,
 near usr
(Might be a runaway multi-line // string starting on line 1)
  (Missing operator before usr?)

 That rule is missing a trailing /.. at casual glance it looks like it has
 one, but it does not.
 The end part should be: http:\/\//

One thing that you can do which makes writing this kind of rule easier
is to specify an alternative match delimiter character (something other
than / ).
For example, if you use ! that rule could then be written as:

uri MY_YAHOO_BOUNCED    m!http://srd\.yahoo\.com/drst/.*\*http://!

MUCH easier to see where things fit without the flying Ws \/\/
Note that if you are going to use an alternative delimiter, the
explicit 'm' match operator becomes necessary.

FYI, my version of that rule looks like:

uri __L_URI_REDIR   m!https?://.{1,170}/\*http://!i
uri __L_YAHOO_REDIR m!https?://us\.ard\.yahoo\.com/.{1,170}/\*https?://!i
meta L_URI_REDIR    ( __L_URI_REDIR && !__L_YAHOO_REDIR )
describe L_URI_REDIR    URI redirector
score L_URI_REDIR   4.5

I don't like unbounded wildcard searches (.*), as a potential
time-eater and DOS attack point, so I like to bound them with a reasonable
limit (EG: .{1,170}).
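
The two patterns and the meta logic can be tried out in Python, whose
re module accepts the same bounded quantifier. This is a sketch for
experimenting with the expressions, not a model of how SA runs uri
rules; the function name and test URLs are made up. Python takes the
pattern as a plain string, so no delimiter (and no slash escaping) is
needed at all.

```python
import re

# Same expressions as the two uri sub-rules above, case-insensitive.
URI_REDIR = re.compile(r"https?://.{1,170}/\*http://", re.IGNORECASE)
YAHOO_REDIR = re.compile(
    r"https?://us\.ard\.yahoo\.com/.{1,170}/\*https?://", re.IGNORECASE)


def is_abused_redirector(uri):
    """Mirror the meta rule: generic redirect AND NOT the Yahoo form."""
    return bool(URI_REDIR.search(uri)) and not YAHOO_REDIR.search(uri)


generic = "http://redirect.example.com/track/*http://spam.example.net/"
yahoo = "http://us.ard.yahoo.com/SIG=abc/M=1/*http://www.example.org/"
plain = "http://www.example.org/index.html"
too_long = "http://" + "a" * 200 + "/*http://spam.example.net/"

for uri in (generic, yahoo, plain, too_long):
    print(is_abused_redirector(uri), uri[:60])
```

The last URL shows the effect of the bound: with more than 170
characters between the scheme and the embedded redirect, the pattern
gives up instead of scanning an arbitrarily long string.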

This should get anybody's abused redirector and honor valid YAHOO ones.
I hope ;)

Dave


Re: [SAtalk] Way to disable auto-learn for messages with specific subject lines?

2003-11-25 Thread David B Funk

On Tue, 25 Nov 2003, Brian Knittel wrote:

 Hi,

 Is there a way to inhibit auto-learning when any specific rule
 is matched? I noted that the auto-whitelist filters inhibit auto-
 learning, but can this be extended to arbitrary other rules?

 I'm my network's postmaster and I get nondeliverable message
 notifications quite frequently. These messages have a specific
 subject line, and I can see how to build a rule to guarantee
 they're never marked as spam:

 header   DELIVERY_FAIL_REPORT  Subject=Delivery Failure
 describe DELIVERY_FAIL_REPORT  Message is a delivery failure report
 score    DELIVERY_FAIL_REPORT  -99

 But -- since these messages also include a chunk of the
 nondeliverable message, I need to disable auto-learning, otherwise
 lots of juicy spam words will be seen as ham.

 Any suggestions will be greatly appreciated. I suppose I could
 redirect the notifications to an account that isn't filtered by
 SpamAssassin, but I'd like to keep things as they are, if possible.

 Thanks,
 Brian Knittel

In the man page for the SpamAssassin configuration file, in the
section on bayes_auto_learn it says:

Note that certain tests are ignored when determining whether a
  message should be trained upon:
   - auto-whitelist (AWL)
   - rules with tflags set to 'learn' (the Bayesian rules)
   - rules with tflags set to 'userconf' (user white/black-listing
  rules, etc)

So all you need to do is to add a 'tflags' line to your rule definition:

  tflags   DELIVERY_FAIL_REPORT  userconf



Re: [SAtalk] Bayesian 100% on all my mail

2003-11-25 Thread David B Funk

On Tue, 25 Nov 2003, Robert Menschel wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Hello Aaron,

 Tuesday, November 25, 2003, 8:58:58 AM, you wrote:

 AY ... Recently I started getting a lot of false positives with SA 2.60.
 AY  I noticed that all my mail was getting a bayesian score of 99 to
 AY 100%. ...My best guess is that since the bayes database only holds a
 AY limited number of tokens, my DB was filling up with spam tokens and
 AY not enough non-spam tokens.  Maybe this happened because I only get
 AY about 10-20 legitimate emails a week versus about 100+ spam emails a
 AY day.

 In November to date, I've trained my Bayes on 683 ham and 6816 spam.
 Ratio therefore seems to be about the same as yours. I haven't seen any
 evidence of the problem -- Bayes is working wonderfully here.

 Bob Menschel

Having had an experience similar to Aaron's, I can believe that he could
be having problems with a poisoned Bayes. For example, suppose that you've
received a large number of Nigerian spams that were learned as such.
That would put spam scores on a large number of conversational words.

In a fit of pique, I had tossed a whole bunch of Nigerian spams in
my bayes. It got so bad that a test email that contained only one word
(Hi) got a Bayes 99% spam score. I had to trash the DB and start from
scratch.

So the quality of Bayes scoring does depend upon how it is trained.
It is a tool, not a magic bullet, and like any tool can be misused
or abused. Spammers seem to be learning this; I'm seeing an increasing
number of spams that contain Bayes poison.

Dave


RE: [SAtalk] SpamScore check

2003-11-25 Thread David B Funk

On Tue, 25 Nov 2003, Nick Tong wrote:

 If so is this possible on a windows platform?

 Nick Tong

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Nick
 Tong
 Sent: 25 November 2003 17:12
 To: [EMAIL PROTECTED]
 Subject: [SAtalk] SpamScore check

 Dear community,

 I send out email news letters each week for my clients but I want to
 offer them the ability to see if there emails will be blocked due to
 there email containing a large amount of spam scoring text?

 Does anyone out there know of a plug-in for spamassassin or any other
 product where I can pass the text to the object and get a score back?

 Many thanks
 Regards,
 Nick Tong

Nick,
Very easy answer, just enroll in Sender Warranted Email (SWE)
http://www.habeas.com. SpamAssassin gives a strong ham bias
to such warranted messages, thus they will not be blocked.

There is a cost for acquiring such a warrant but it should not
be a problem for somebody in the business of sending messages,
provided that those messages ARE NOT SPAM.

As you appear to be in the business of sending out newsletters,
this could be considered a cost of doing business and thus
manageable, as you are NOT sending spam.

There, easy business solution, no technological problems, it will
work with the current and FUTURE versions of SpamAssassin, even
works from windows platforms.

You are welcome.

Dave


Re: [SAtalk] SpamAssassin/Milter not scan outgoing emails

2003-11-24 Thread David B Funk

On Mon, 24 Nov 2003, Chris Cook wrote:

Hello,
We are currently using Spamassassin + sendmail + spamass-milter to tag
our mail, but we would like to not have outgoing mail scanned. lda and
outgoing mail are on the same box. I have tried to just use
spamassassin+procmail but the load spawned a million procmail instances
and mail was not getting delivered.

Is there a way to tell spamassassin not to scan outgoing mail when using
spamass-milter?

Thanks,

A couple of different ideas:
1) set up a dual-config system; have the MTA listen for incoming messages
with the spamass-milter to filter them, have the MSA listen for locally
submitted messages and have it configured without the milter so it does
not filter them.

2) customize the milter as proposed by Chris Adams in his post to
comp.mail.sendmail:
http://groups.google.com/groups?q=macro++Milter+bypass&hl=en&lr=&ie=UTF-8&oe=UTF-8&scoring=d&selm=vpts2mdibgq53e%40corp.supernews.com&rnum=2
Then set up a sendmail rule to recognize locally generated messages
and set the 'skip_check' macro.

I use the dual-config setup: one daemon for incoming messages and one for
outgoing ones. I use miltrassassin and am in the process of hacking it
to add support for the 'skip_check' macro idea. (We have a local campus
listserv that spews out copious messages for the students with great
regularity, and when it runs it hammers my SA server. I want to skip_check
its babbling. ;)

I've already customized miltrassassin to add support for passing the
envelope sender & recipient into SA, which makes whitelist construction
much easier. ;)

Dave


RE: [SAtalk] Bayes File Ownership

2003-11-21 Thread David B Funk

 From: Gorm Jensen [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 21, 2003 12:04 PM

 I run sa-learn as root using SA 2.55 and 2.6 on two redhat systems.
 Both systems run spamd and call spamc from procmail with -u user1 (or
 user2).  Because there are only two users, each system has a common
 bayes database with file access permitted to both users.

 Occasionally, I have discovered that the ownership of one of the bayes
 files has been changed from spamd.spamd to root.root.  This change
 renders my bayes database unreachable because I run spamd as user
 spamd.

 I can't find a workaround in the docs.  Is there one, or do I have to
 change the ownership somehow?

Here is a workaround that might qualify for the 'kluge' category, but it
does work. Put this in your local.cf:

 bayes_file_mode 0666

Please no devilish jokes. ;)

Dave


Re: [SAtalk] header reports missing??

2003-11-21 Thread David B Funk

On Fri, 21 Nov 2003, Dan Tappin wrote:

 I have recently installed SA 2.60 from the source code on OS X along side Tenon's 
 Post.Office mail server.  This was a manual
 upgrade from the Tenon supplied SA 2.55 release to be used with their supplied SA 
 'plug-in'.  All is well and e-mail is being
 filters via SA except SA is not adding the score headers to all mail being processed?

 I was told by Tenon that the SA package that they released to their users would not 
 re-write headers but just simply score e-mail
 for PO's own filter system.

 Now that I installed SA from scratch I was hoping to fix this.  I have 
 'always_add_headers' and 'always_add_report' set to 1 and my
 'spamassassin --lint -D' output looks normal with no errors.

 Does SA require a perl module to re-write the headers or am I missing something 
 simple here?

 Thanks,

 Dan

It probably depends upon how the PO+plugin stuff works.
There are two general ways to use SA; in a filter pipeline or as a
scoring stub on a tee.

In the first, the MTA feeds the message to SA on stdin and takes the
result from stdout as the new version of the message to pass on. In
this case any changes SA makes to the message will show up in what gets
delivered. All the SA config options will have an effect on the
appearance of the final message. (this is usually how postfix, qmail,
etc are configured).

In the second way, the MTA feeds a -copy- of the message to SA, looks
at what SA returns and then uses selected parts of the SA output to decide
what to do with the original message. (EG it looks at just the score; think
about how 'spamc -c' works and would be used.)
In this case it's up to the MTA (or its agent/plugin) to decide what
changes, if any, to make in the message appearance, regardless of
SA config options.
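
To make the difference concrete, here is a minimal Python sketch of the
two hookups. Everything in it is hypothetical: fake_scanner just stands in
for SA, and the function names are made up for illustration, they are not
part of any real MTA or plugin.

```python
from email import message_from_string


def fake_scanner(raw):
    """Hypothetical stand-in for SA: returns the message with score headers added."""
    return ("X-Spam-Status: No, hits=1.2 required=5.0\n"
            "X-Spam-Report: (details)\n" + raw)


def filter_pipeline(raw):
    # Way 1: whatever the scanner writes to stdout IS the delivered message.
    return fake_scanner(raw)


def scoring_tee(raw, copy_headers=()):
    # Way 2: the scanner only sees a copy; the caller picks which
    # headers (if any) get added to the untouched original.
    scanned = message_from_string(fake_scanner(raw))
    kept = "".join(f"{name}: {scanned[name]}\n"
                   for name in copy_headers if scanned[name])
    return kept + raw


original = "From: a@example.com\nSubject: hi\n\nbody\n"
```

With the tee, a plugin that passes no header names delivers the original
unchanged, which would look just like "SA is not adding headers".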

I'm not familiar with PO but do know sendmail+milter systems.

That system uses the second way of connecting with SA. The milter
gets a copy of the message to pass into spamd. It can take some
of the spamd output (eg a particular header such as X-Spam-Report)
and tell sendmail to incorporate that into the original message, but
it does not have to. Thus it is up to the milter author what the
final message looks like; it could totally ignore all headers
returned by spamd and generate its own based upon the spamd
results. (The usual practice though -is- to pass in the SA report headers.)

My guess is that the PO+plugin works like sendmail+milter and
thus that Tenon plugin may not be passing the headers back to PO.

The sendmail libmilter architecture has specific provision for
the milter to return message modifications (EG add headers or
replace the message body with a new one). If the PO plugin
architecture does not have similar functionality (EG only returns
an exit status) it may not be possible for it to add the
SA headers.

Check with the Tenon people. There should be lists associated with
MacOS X stuff, maybe somebody can tell you how the PO plugin system
works.

MacOS X -is- a Unix based system, you could use one of the more
traditional methods of incorporating SA into your environment. ;)

Dave

PS, anybody recognize the expression JLRU?


Re: [SAtalk] Bayes File Ownership

2003-11-21 Thread David B Funk

On Fri, 21 Nov 2003, Gorm Jensen wrote:

 I run sa-learn as root using SA 2.55 and 2.6 on two redhat systems.
 Both systems run spamd and call spamc from procmail with -u user1 (or
 user2).  Because there are only two users, each system has a common
 bayes database with file access permitted to both users.

 Occasionally, I have discovered that the ownership of one of the bayes
 files has been changed from spamd.spamd to root.root.  This change
 renders my bayes database unreachable because I run spamd as user
 spamd.

 I can't find a workaround in the docs.  Is there one, or do I have to
 change the ownership somehow?

Revisiting your message, you say: "I run sa-learn as root".
So you may be doing it to yourself.

When you run sa-learn it rebuilds the database as part of its operation
unless you add the option --no-rebuild. Sometimes when rebuilding
the database it creates a new bayes_toks file rather than just updating
the existing one. If that happens when you (root) are running it, the
new file is owned by root.

So I see a few possible workarounds:
1) always run sa-learn as spamd not root
2) always give sa-learn the --no-rebuild option and let spamd do the
   rebuild
3) always check the bayes file ownership after a sa-learn run.
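
Workaround 3 is easy to script. A rough Python sketch follows; the helper
name and the demo directory are made up for illustration, and on a real
system you would point it at your own bayes directory and the uid of the
spamd user.

```python
import os
import tempfile


def files_not_owned_by(directory, expected_uid):
    """Return sorted names of regular files in `directory` not owned by expected_uid."""
    wrong = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.stat(path).st_uid != expected_uid:
            wrong.append(name)
    return wrong


# Demo in a throwaway directory; the file is owned by whoever runs this.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "bayes_toks"), "w").close()
    me = os.stat(demo_dir).st_uid
    same_owner = files_not_owned_by(demo_dir, me)       # nothing to report
    other_owner = files_not_owned_by(demo_dir, me + 1)  # file is flagged
```

Run something like this right after sa-learn and chown anything it
reports back to the spamd user.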

Dave


Re: [SAtalk] DCC and Pyzor Problems

2003-11-20 Thread David B Funk

On Thu, 20 Nov 2003, Josh Dayberry wrote:

 I would appreciate any help that can be offered to me.  I was using spamd and spamc, 
 and everything was working fine.  For some reason I upgraded spamassassin to the 
 newest version and now spamd and spamc won't run the dcc and pyzor tests.  If I use 
 spamassassin however the tests are run correctly.  I am not sure why it would work 
 with spamassassin, but not spamd.  Is there anything like spamassassin -D or a way 
 to log what is going on when running spamc so I can figure out what the problem is?

 Josh

Yes. If you had checked the man page for spamd you would see that it
too has a '-D' option for debugging.



Re: [SAtalk] sa-learn and SpamAssassin headers

2003-11-20 Thread David B Funk

On Thu, 20 Nov 2003, Carlos Jorge Santos wrote:

 Thanks a lot for your answer.

 The problem now is that SpamAssassin (spamc to be more precise) is
 called by Qmail-Scanner, which in turn adds this headers to emails:

 Received: from [EMAIL PROTECTED] by mail.host-services.com by
 uid 101 wi
   (spamassassin: 2.60.  Clear:RC:0:SA:1(11.3/8.0):.
   Processed in 0.328309 secs); 19 Nov 2003 23:34:45 -
 X-Spam-Status: Yes, hits=11.3 required=8.0

 The header X-Spam-Status shouldn't cause any trouble to sa-learn, right?

 What about the other header ? Is there any problem removing all the
 Received headers ? Something like this :

 bayes_ignore_header Received

 Thanks again
 Carlos Jorge Santos

In general you don't want to remove/ignore the Received: headers as they
often provide clues about the spamminess or hamminess of a message.
For example, if a Received: header contains a reference to
'optinsender.com' then that's a spam clue, but if it contains a reference
to your CEO/Dean/president's machine then it's a ham clue (we hope ;).

If there is one specific Received: header that is added by your local
mail system you could remove that but it is probably not necessary.
If the header is always added to each message (both ham & spam) and
you learn an adequate amount of both ham & spam, then that header
should get a 'neutral' scoring value and thus be effectively ignored.
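
A back-of-the-envelope illustration of why such a token goes neutral,
in Python. This is a simplified relative-frequency estimate and NOT the
exact formula SA's Bayes code uses; the function name and the message
counts are made up.

```python
def token_spam_probability(spam_hits, spam_total, ham_hits, ham_total):
    """Simplified per-token estimate: relative frequency in spam vs. ham."""
    spam_freq = spam_hits / spam_total
    ham_freq = ham_hits / ham_total
    return spam_freq / (spam_freq + ham_freq)


# A header token present in every learned message, ham and spam alike:
always_present = token_spam_probability(400, 400, 1000, 1000)

# A token seen in most spam but hardly any ham:
spam_only_clue = token_spam_probability(300, 400, 10, 1000)
```

The always-present token lands at exactly 0.5, i.e. it says nothing
either way, while the spam-heavy token ends up close to 1.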

One question, is that header added before or after the SA processing?
If after, then this is a total non-issue. ;)

Dave


RE: Re[2]: [SAtalk] Sanity checking new uri rules?

2003-11-19 Thread David B Funk

On Wed, 19 Nov 2003, Chris Santerre wrote:

 Thanks, I got it now. I updated my evilrules last night, and they tested
 great overnight! I shall post them shortly. This should speed them up
 greatly for everyone! Would this help even more?

 /?:\bsomedomain\.com\b/i

 would the addition of the ?: make it even faster?

No, that is specific to the capturing/clustering use of parentheses, EG:
  (?:this|that)

So if you're not clustering with (|), it doesn't make any difference.
See pages 182-186 of the aforementioned book.
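
A quick way to see the difference, sketched with Python's re module
(which uses the same (?:...) syntax as perl for this). The patterns are
made-up examples, not rules from anybody's ruleset.

```python
import re

# Capturing parentheses: the alternation is remembered as group 1.
capturing = re.compile(r"\b(this|that)\.com\b", re.I)

# Clustering-only parentheses: same matches, nothing is remembered.
clustering = re.compile(r"\b(?:this|that)\.com\b", re.I)

# No alternation at all, so there is nothing for ?: to apply to.
no_group = re.compile(r"\bsomedomain\.com\b", re.I)

url = "http://THAT.com/"
```

Both grouped patterns match the same strings; only the bookkeeping
differs, and the ungrouped pattern has no bookkeeping to save.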

 LOL, the only reason I recognise the name Larry Wall is because of Theo's
 sigs! :)
 I guess I need to go buy that book and help support the man.

 --Chris Santerre

Um, have you ever done a perl -v ?
Try it and look at the Copyright line.



[SAtalk] Perl history (was: Re: Sanity checking new uri rules?)

2003-11-19 Thread David B Funk

On Wed, 19 Nov 2003, Chris Santerre wrote:

 LOL, the only reason I recognise the name Larry Wall is because of Theo's
 sigs! :)
 I guess I need to go buy that book and help support the man.

 --Chris Santerre

Mumble Mumble, kids these days, no respect for their elders.

OK, mandatory homework assignment:
  http://www.faqs.org/docs/artu/ch02s01.html

Search for references to 'Larry Wall' on that page.

OBH, I spent many hours as a graduate-student-slave sweating
in front of a PDP-11/45 that looked almost exactly like the
PDP-11/35 pictured in the middle of that page.

Dave
( GreyBeard ;)


Re: [SAtalk] bayes works!

2003-11-19 Thread David B Funk

On Wed, 19 Nov 2003, Bryan Hoover wrote:

 The reason I mention it - aside from being pleased - is to point out
 that it appears the problem was either the old spambouncer headers, or
 forgetting, wasn't (the latter being what I started out suspecting,
 until I discovered the spambouncer headers).  Another imperfection in my
 experience analysis is that I suppose the database could have been
 corrupted, as I'd gotten knocked off line several times while running
 sa-learn.

Assuming that the spambouncer headers are unique to your site you
can put them in bayes_ignore_header lines in your local.cf
and not worry about them.



Re: [SAtalk] Negative score for SAtalk messages

2003-11-19 Thread David B Funk

On Wed, 19 Nov 2003, Marc Steuer wrote:

 Hi list members,

 I want to score messages from [SAtalk] with a negative score so examples
 posted to the list won't be tagged as spam.  This is my first venture into
 regex and I've tried:

 header MY_SATALK  Subject =~ /\[SAtalk\]\b/
 describe MY_SATALKMessage from [SAtalk]
 score MY_SATALK   -10

 Messages with [SAtalk] in the subject aren't always matched by this rule.

 Two questions:
 1.  Do I have the rule syntax correct?
 2.  Is there an alternate way to ensure SAtalk messages won't be tagged as
 spam?

 Thanks,
 Marc

whitelist_from_rcvd  [EMAIL PROTECTED] lists.sourceforge.net

Works for me and is harder for spammers to abuse.
Note that for this to work your mail system must be set up in such
a way that spamassassin can see the envelope-sender address.
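
As for question 1, a likely reason the quoted rule misses some subjects is
the trailing \b: a word boundary needs a word character on one side, and
"]" followed by a space has none. A small Python sketch (Python's \b
behaves like perl's for these ASCII strings; the sample subjects are
made up):

```python
import re

# The rule as posted: "]" must be followed directly by a word character.
rule = re.compile(r"\[SAtalk\]\b")

# Same rule without the trailing \b.
relaxed = re.compile(r"\[SAtalk\]")

typical_subject = "Re: [SAtalk] bayes works!"
squashed_subject = "[SAtalk]bayes works!"
```

The posted rule only fires on the squashed form, so dropping the \b makes
it match the usual "[SAtalk] ..." subjects as well.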

Dave


Re: [SAtalk] Need help with a simple custom rule

2003-11-19 Thread David B Funk

On Wed, 19 Nov 2003, Robert Davidson wrote:


 Hi everyone,

 I set up a SpamAssassin system for a company but they alerted me to a
 problem today.

 They have content filtering rules to stop people from abusing their
 employees.  Basically any e-mails with naughty words are given 50 points
 and are stuffed into quarantine.

 The below rule is causing problems:

 body CUSTOM_RULE_01 /\bFUCK/i

 They contacted me because they were sent an MS Word document which looks
 like it has previously been infected with a virus that puts ALT-F11
 it's the fuck! or something into the document somewhere.

 I am running SA 2.55

 My question is how can I get SA to only scan the message text and not
 look into attachments like .doc files and so on?

Check out MIMEDefang. It can be used to split a message into separate
parts and then process them appropriately. (EG feed text parts to
a content filter like SA, feed binary attachments to an AV scanner, etc.)

It will add to the processing overhead of each message but it should
do what you want.
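
To illustrate the idea of splitting a message and only handing the text
parts to a content filter, here is a sketch using Python's standard email
package. It is NOT how MIMEDefang itself is implemented or configured; the
function name and the demo message are made up.

```python
from email import message_from_bytes
from email.message import EmailMessage


def text_parts(raw_bytes):
    """Return decoded text/* bodies that are not marked as attachments."""
    msg = message_from_bytes(raw_bytes)
    parts = []
    for part in msg.walk():
        if (part.get_content_maintype() == "text"
                and part.get_content_disposition() != "attachment"):
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "ascii"
            parts.append(payload.decode(charset, "replace"))
    return parts


# Demo message: one plain-text body plus a fake .doc attachment.
demo = EmailMessage()
demo["Subject"] = "report"
demo.set_content("Please see the attached document.\n")
demo.add_attachment(b"\xd0\xcf\x11\xe0 binary junk",
                    maintype="application", subtype="msword",
                    filename="r.doc")
bodies = text_parts(demo.as_bytes())
```

Only the plain-text body comes back; the Word attachment never reaches
the word-matching rule.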


Re: Re[2]: [SAtalk] Sanity checking new uri rules?

2003-11-18 Thread David B Funk

On Tue, 18 Nov 2003, William Stearns wrote:

  anchoring with \b = fast

   OK, cool.  As I'm doing full domains, I'll change:
 uri  WLS_URI_1 /0-go.org/i
   to
 uri  WLS_URI_1 /\b0-go.org\b/i
   in the next version.

Also escape that '.' so that it's taken as a literal, not a wildcard.
(reduces need for back-tracking ;)

IE:
  uri  WLS_URI_1 /\b0-go\.org\b/i
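
The effect of the escape is easy to check. A small Python sketch (same
regex syntax as perl for this pattern; the test URLs are made up):

```python
import re

# Unescaped dot: any character may sit between "0-go" and "org".
unescaped = re.compile(r"\b0-go.org\b", re.I)

# Escaped dot: only a literal "." is accepted there.
escaped = re.compile(r"\b0-go\.org\b", re.I)

lookalike = "http://0-goXorg/"
real = "http://stuff.0-go.org/stuff"
```

The unescaped version also hits the lookalike, while the escaped one only
hits the real domain.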



RE: Re[2]: [SAtalk] Sanity checking new uri rules?

2003-11-18 Thread David B Funk

On Tue, 18 Nov 2003, Chris Santerre wrote:

   uri  WLS_URI_1 /^http:.*\b0-go.org\b/i

 Regex confusion on my part! '\b' is bounding, but I thought that meant bound
 by space??? wouldn't this above regex _NOT_ hit :

 http://stuff.0-go.org/stuff

 Isn't it looking for:
 http://stuff. 0-go.org

 I'm confused! (it's not the first time, won't be the last!)

 --Chris

The \b match operator is a bit special in that it does not
match a specific character but the gap between two adjacent
characters. Think of it like the insertion cursor of a word
processor, it points between the characters, not on a character.
Sort of like ^ points to the beginning of a line, not at the
first character of the line.

If you know what the perl \w and \W character classes are,
then \b points to the boundary between two characters that are
matched by either the regex \W\w or the regex \w\W

See page 180 of the O'Reilly Programming Perl book (Third edition).
(Good book, written by a guy named Larry Wall ;)

So that WLS_URI_1 regex is looking for:
start-of-line, followed by the literal character string "http:",
possibly followed by some number of unidentified characters, the
last one of which -must- match the \W character class
(note that there could be zero of the above critters, as the ":"
at the end of "http:" nicely matches the \W requirement),
followed by the literal character string "0-go", followed by
one random character (note that "." is a wildcard), followed by
the literal character string "org", followed by something that
matches \W or by the end of the string. (Thus "orga" would not match here.)

Boy, it takes a bunch of words to explain what that little jumble
of regex does, powerful stuff these regexes ;)
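
For anybody who would rather poke at it than read about it, here is the
same regex tried against a few made-up URLs in Python (whose \b, \w and \W
behave like perl's for these ASCII strings):

```python
import re

wls_uri = re.compile(r"^http:.*\b0-go.org\b", re.I)

samples = [
    "http://stuff.0-go.org/stuff",  # "." before "0" is \W, so \b holds
    "http:0-go.org",                # zero filler chars; ":" is the \W
    "http://stuff.0-go.orga/",      # "a" after "org" is \w, no \b there
    "http://x0-go.org/",            # "x" before "0" is \w, no \b there
]

results = {url: bool(wls_uri.search(url)) for url in samples}
```

So the URL from the question does match; no space is needed, just any
non-word character (or the string edge) next to the word character.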

Dave


Re: [SAtalk] SA LIST PROBLEM? Quoted Printable problem?

2003-11-18 Thread David B Funk

On Tue, 18 Nov 2003, Charles Gregory wrote:


 Hello,

 Lately on several e-mails from the list, I've been seeing an error message
 in my Pine mail program that says:
   [Error: Formatting error: Non-hexadecimal character in QP encoding]

 More importantly, the message is *truncated* in the display.
 Oddly enough, when I quote the message to reply, I see the full text.
 The line in error appears to have an = at the end of line, which
 looks like it was used to mark line wrapping. But Pine doesn't seem to
 recognize this usage.

 Here is an example line - first the quoted printable:
  while(-1 !=3D (opt =3D getopt(argc,argv,-BcrRd:e:fhyp:t:s:u:xSHU:))=
 )
 Then the rendered version:
  while(-1 != (opt = getopt(argc,argv,-BcrRd:e:fhyp:t:s:u:xSHU:)))

 In Pine, it decodes the line properly up to the parentheses before the
 last '=', then gives that error. Clearly Pine expects a hex code, not a
 NL/CR. Now was there a code there originally, but mailman stripped it out?
 Or is Pine failing to recognize a legitimate code sequence?

 Red Hat 9  with whatever Pine is default for that disty.

 - C

Based upon the headers of your message, it looks like you're using pine
v4.05. (look at the top-left corner of your pine display for the version
string, or invoke it as pine -v).

That's a pretty ancient version, I think that the U-Washington site is
up to version 4.58. IIRC, that got fixed somewhere around v4.3*, I'm
using 4.44 and it doesn't have that problem.

So the answer is to upgrade your pine.
http://www.washington.edu/pine/
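
For reference, an "=" at the very end of a quoted-printable line is a
legitimate soft line break (RFC 2045), so the list software did nothing
wrong. Python's standard quopri module decodes a line shaped like the one
quoted above; the getopt option string is replaced here by a made-up
placeholder name, opts.

```python
import quopri

# "=3D" is an encoded "=", and the trailing "=" before the newline is a
# soft line break that simply joins the two lines.
encoded = b"while(-1 !=3D (opt =3D getopt(argc,argv,opts))=\n)"

decoded = quopri.decodestring(encoded)
```

The decoder rejoins the line at the soft break, which is what a fixed
pine does too.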


Re: [SAtalk] Proposal for a delete option to spamc

2003-11-18 Thread David B Funk

On Tue, 18 Nov 2003, Øystein Halvorsen wrote:

 Our only MTA for externally received email is sendmail, which again
 forwards user emails to an internal exchange server.  In fact, we have
 tried this out, and it works quite nicely (at least our local users are
 delighted).  In order to make spamc drop emails, the -D option has to be
 explicitely specified, otherwise the spamc client will work exactly as
 before.  However, if another MTA than sendmail tries to make use of this
 option, then it should of course be throroughly tested out before put into
 real production.

If you are using sendmail+spamassassin as a filtering gateway in front
of some backend server (EG exchange), then I'd suggest using a sendmail
'milter' daemon (such as miltrassassin) rather than your spamc
configuration.

This has the advantage of being more fault-tolerant. (What happens with
your config if something breaks and causes spamc to choke on each
message?)
You can also configure it to do an SMTP reject of a spam message rather
than just silently deleting it.

That has the major advantage that the sender is notified of the
rejection (assuming that the sending system is legitimate).
Thus if something unexpected happens and you mark a legitimate message
as spam there's hope for discovery and repair rather than things just
mysteriously disappearing. ;(


Re: [SAtalk] Validate Sender Users

2003-11-17 Thread David B Funk

On Mon, 17 Nov 2003, Eduardo Alfonso  wrote:

 Hi

 I've been trying to configure SpamAssasin to check for the existence of the user on 
 the local machine that is
 trying to send the message and I couldn't find how to do this.

 Is it possible ??

 Thanx

 I'm using sendmail MTA in a RedHat9 box

If I understand you correctly, what you want is something that will
examine an incoming message and try to dynamically verify that the sender
of record (mail from: [EMAIL PROTECTED] ) is indeed a valid user
at the remote system.

What you're looking for is something like this:
  http://www.snert.com/Software/milter-sender/index.shtml

Although it sounds like a good idea, it has potential problems.

If you're a low traffic site, it wouldn't be too bad, but if you
handle more than 10k messages a day, it would increase your network load
and could be potentially abusive of remote sites that are innocent
bystanders. (Spammers often use randomly generated fake
hotmail/yahoo/msn addresses.)

Some legitimate lists send out messages with a deliberately
invalid 'from' address (they don't want to be bothered with bounces).
This kind of system would block such messages.

Some systems have an incoming gateway that will accept -everything-
and let an internal mail server decide if the user is valid, thus
making the sender verification system useless.

Some heavily loaded systems (such as hotmail) occasionally get backed
up for hours at a time and do not accept connections. This would slow
down your incoming mail from those sites, not to mention putting
additional load on their servers.

What might be a better solution would be to interlock your MTA with
some kind of database & SA. When a message gets rejected by a remote
site with a DSN, put that address in the database as invalid and
then have SA take that into consideration when scoring more messages
from that address.

You can do that by hand editing your sendmail access-db to add rejects
for specific bogus from: addresses. It might be a worthwhile project
to try to automate the process.
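
The automated version might look roughly like this Python sketch. It is
only an outline of the idea: the function names, the in-memory set
standing in for the database, and the penalty value are all made up, and
hooking it into sendmail and SA is the real work.

```python
def record_bounce(invalid_senders, address):
    """Remember an address that a remote site rejected with a DSN."""
    invalid_senders.add(address.strip().lower())


def sender_penalty(invalid_senders, envelope_from, penalty=2.0):
    """Extra score for mail claiming to come from a known-invalid address."""
    if envelope_from.strip().lower() in invalid_senders:
        return penalty
    return 0.0


known_bad = set()  # stand-in for the real database
record_bounce(known_bad, "Fake123@Example.com ")
```

Treating it as a score adjustment rather than an outright reject leaves
room for the legitimate lists that use deliberately invalid addresses.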

Dave


Re: [SAtalk] whitelist_from_rcvd

2003-11-17 Thread David B Funk

On Mon, 17 Nov 2003, Martin McWhorter wrote:

I am having a problem with whitelist_from_rcvd not working.

I have Spamassassin running on a redhat 9 box with sendmail 8.12.8 as
our companies gateway MTA. I have MIMEdefang running as well, but with
the Spamassassin portion of the defang.conf commented out and
spamassassin running with the spamass-milter.
Our imap mailboxes are hosted internaly on an NT box running Iplanet
4.15 named mercury. All mail in and out goes through the
sendmail-spamassassin MTA.
In the spamassasin conf I have:

whitelist_from_rcvd [EMAIL PROTECTED] mercury.prairiegroup.com
[snip..]

whitelist_from_rcvd needs appropriate values for the trusted_networks
parameter before it works.
Given that you have a firewall configuration with internal/external
addresses on that SA box, it might be confused when trying to
automagically determine trusted_networks.
Try explicitly setting trusted_networks.

Another possibility: that "Add to Address Book" stuff in the 'From:'
field might be confusing SA.
Silly question, why are you scanning your outgoing messages?
(Don't you trust your users not to spam? :)
You've got two separate interfaces on that redhat box. Set up a
sendmail w/ SA on the external IP for incoming mail and another
one w/o SA on the internal IP for outgoing mail.
Dave

--
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{



[SAtalk] Re: URI database lookup feature (was Sanity checking new uri rules?)

2003-11-17 Thread David B Funk

On Mon, 17 Nov 2003, Justin Mason wrote:

 BTW, given that a URI DB cannot use regular expressions, or patterns,
 would this really be useful?

 Basically with a DB you only gain efficiency when looking up exact
 strings.  So for this to be useful against URIs, you'd have to pick out
 *just* the domain part of the URI and look it up. e.g.:

 http://www.stearns.org/sa-blacklist/sa-blacklist.2003111402.uri.cf

 would be looked up as www.stearns.org or stearns.org.)

The parser in the Bayes routine (tokenize_line in Bayes.pm) creates 'UD:'
lookup tokens for each component of the domain name. So for the above
example, it would create:
UD:www.stearns.org
UD:stearns.org
UD:org

Thus the DB would only need to contain one entry for the lowest common
denominator [1]. IE: stearns.org.
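
As a rough illustration of that token expansion (a Python sketch, not the actual tokenize_line code from Bayes.pm; the function name ud_tokens is made up for this example), each hostname yields one 'UD:' token per trailing run of its labels:

```python
def ud_tokens(hostname):
    """Return one 'UD:' token for the full name and for each parent domain."""
    labels = hostname.lower().strip(".").split(".")
    return ["UD:" + ".".join(labels[i:]) for i in range(len(labels))]


tokens = ud_tokens("www.stearns.org")
print(tokens)
```

A database keyed on stearns.org would then be hit by any host under that domain, since every such host produces the UD:stearns.org token.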

 I suspect doing this with a DB lookup may not be such a win, compared
 to using a local eval test that parses a config file and creates an
 in-memory hash table.

 - --j.

Au contraire, a DB lookup is a big win compared to a regex match for
speed/memory consumption. The Bayesian engine does hundreds of lookups
per message against a database that has tens or hundreds of thousands
(50k~200k) of entries. Other people on this list have found that with regex
matches (EG 'evilrules'), a set of just a few thousand patterns makes a
major hit on processor load.

One of the big advantages of using a DB type system is that it can be
updated 'hot' on a running system. A system based upon parsing a config
file and creating an in-memory hash table would require restarting spamd
every time an update was made.

If we want to have any hope of automating such a system, it needs to be
updatable 'hot' (note how Bayes operates).
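
To make the 'hot' update point concrete, here is a minimal Python sketch. It assumes an on-disk key/value file (Python's dbm.dumb stands in for whatever DB format SA would really use) and made-up hostnames and helper names. The reader opens the file for each lookup, so an entry added by another process shows up without restarting anything. It does no locking, which a real implementation would need.

```python
import dbm.dumb
import os
import tempfile


def update(db_path, hostname, value):
    """Add or replace one entry; flag 'c' creates the DB if it is missing."""
    with dbm.dumb.open(db_path, "c") as db:
        db[hostname.encode()] = value.encode()


def lookup(db_path, hostname):
    """Open the DB for this lookup only, so later updates are picked up."""
    with dbm.dumb.open(db_path, "r") as db:
        value = db.get(hostname.encode())
    return value.decode() if value is not None else None


db_path = os.path.join(tempfile.mkdtemp(), "uri_db")
update(db_path, "spammer.example", "1.2,Example spam site")

before = lookup(db_path, "newspammer.example")   # not listed yet
update(db_path, "newspammer.example", "2.0,Added while running")
after = lookup(db_path, "newspammer.example")    # visible without a restart
print(before, after)
```

An in-memory hash built from a config file at startup would not see the second update until the daemon re-read its config.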

Yes, you are right in that a URI DB cannot use regular expressions or
patterns. However, if we're just looking for a 'catcher' for spammer
sites in URIs, that's probably not necessary. We just want to grab a
host/site name out of a spam and slam it in there. Ask people such as
Chris how much time he spent regexing each entry in his 'evilrules'
set. Speed of update and search are far more important IMHO.

I envision this working in a couple of possible ways: either updated from
a central site (EG the rules emporium) via wget/rsync etc, or by a local
engine that would use some kind of heuristics on suspect host names found
in potential spam (do DNS lookups, check for IPs that point to spammer nets,
look at 'whois' data for spammer hosting, look at DNS TTLs, etc).

Part of my motivation is a local competition. Our central campus IT
group looked at SA and then decided that it was too much work to manage,
so they spent money and bought Activestate's PureMessage product.
(Which is based upon a commercialization of SA. Many of the header
tags even match ;).
Part of our mail streams thru the central servers so I get to compare
the SA scoring against the PMX scores. Most of the time SA does a better
job (fewer FP/FN) but sometimes PMX wins, and when it does it is
usually because of a 'sparse' spam that has just a few URL
images (and a bunch of Bayes fodder). The PMX score will often be
pushed up by a rule that is labeled: KNOWN_ADVERT_URL

So my guess is that PMX already has something like this. I want it TOO!

Dave

[1] In a mathematical context, 'lowest common denominator' makes no sense.
The number 1 is always the lowest common denominator for any value.
Mathematically we're looking for the GCD ('greatest common divisor').

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




[SAtalk] Re: URI database lookup feature

2003-11-16 Thread David B Funk

On Sat, 15 Nov 2003, Carl R. Friend wrote:

On Sat, 15 Nov 2003, David B Funk wrote:
 
  I've been thinking about that exact topic. The Bayes engine
  already parses and tokenizes hostnames from URIs (the UD: tokens).
  If there were a hash DB made with the spam-site hostname as key and
  score,description as value (something like the sendmail access db)
  then it should be pretty easy to take those UD: tokens and do a
  lookup and add results to total score.

As near as I can tell, there's no current way for a rule
 to return a score -- they're boolean in nature.  That said,
 even that would be a win for looking up domains in URIs and
 whatnot.

I had assumed that it would need new code to support. That's
why I've been trying to figure out how the scoring code works.
(and trying to get the developers to notice this idea ;).

I don't think that it should be too hard to do, most of the
hooks are already there, but the devil is in the details. ;)

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Problem with spamd spinning on bayes_toks

2003-11-15 Thread David B Funk

On Fri, 14 Nov 2003, Bob Amen wrote:


   We've been seeing a problem with spamd that happens at random times.
 Occasionally, a spamd thread will spin, clocking up CPU time and never
 finish. This causes other spamd processes to hang and eventually all
 memory and swap is used up by multiple spamd and sendmail processes.
 We've set limits on the number of threads of spamd and sendmail but that
 doesn't help. Eventually everything stops, either because there are no
 resources available or there are no more threads available and they're
 all waiting for some to exit, which never happens.

   We've determined that it's the Bayes DB but are not sure why. This
 happens on a RH 6.2 system with Perl 5.6.1 and a Gentoo (2.4.20 kernel)
 with Perl 5.8. Below is the output (very shortened) of strace and then
 lsof of a spamd process that exhibited this behaviour:

[snip..]
   Does anyone have any ideas or suggestions for other diagnostics that we
 can try?

 TIA,
 Bob

You didn't say what version of SA you were running, so I'll assume it's
2.60 (or 2.55 using DB_File). DB_File is based upon the Berkeley DB
kit (www.sleepycat.com). Usually Berkeley DB is robust but occasionally
it has shown itself to be a bit sensitive to certain OS dependent
factors (threads implementation, file locking, shared memory, etc).

I'd go to the sleepycat site and check on the version of Berkeley DB
that you're using to see if there are any known issues for your OS.

The sleepycat Berkeley DB source kit comes with an extensive test
suite; you could get the kit and do a 'make test' to see if it finds
anything with your libs.
Alternatively, just build a new set of libs from the source kit and
try those.

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




[SAtalk] URI database lookup feature (was Re: Sanity checking new uri rules?)

2003-11-15 Thread David B Funk

On Fri, 14 Nov 2003, Carl R. Friend wrote:

For the assembled group -- is it possible to do a DB lookup,
 either in an eval() or some other mechanism, in a uri rule?
 If we could do a DB lookup on URIs (or, more properly, the
 domain portion of URIs) I think that'd be a win (at, of course,
 the expense in human time).


I've been thinking about that exact topic. The Bayes engine
already parses and tokenizes hostnames from URIs (the UD: tokens).
If there were a hash DB made with the spam-site hostname as key and
score,description as value (something like the sendmail access db)
then it should be pretty easy to take those UD: tokens and do a
lookup and add results to total score.

It would be much faster and use less memory than the various
collections of regex spam-host rules that have been discussed
here (such as William's or Chris's evilrules ;).
(I have a 15,000 line sendmail access db that doesn't bother
it a bit ;).

Another advantage is that it would be possible to update the
database 'hot' (IE without having to kill and restart spamd,
the way that you have to do to update regex rules).

It might even be possible to automate the updating of the
database. (take hostnames found by Bayes in spam, do DNS lookup
and add if IP in spamhaus nets, in trusted DSBLs, has short TTL,
etc).

I can see one of two different implementations:
1) Have the value be just score,description and synthesize the
rule name from the hostname. EG:

spammer.com 1.2,Spamhause business site
  -> rule == L_URI_SPAMMER_COM
     score == 1.2
     description == Spamhause business site

2) have the value be a triple, name,score,description and
explicitly store all attributes:

spammer.com MEDS_SPAMHAUS,1.2,Spamhause business site

1) would be simpler to update and use less memory,
2) would be more flexible and let you combine several different
sites into one class of rule.
Assumption: once a host matches a rule, subsequent matches on that
rule name would be ignored.
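
The two value layouts and the "count each rule name once" assumption can be sketched as follows. This is illustrative Python, not SA code: a plain dict stands in for the hash DB, and the function names and the .example hostnames are invented for the example.

```python
import re


def parse_entry(hostname, value):
    """Return (rule_name, score, description) for either value layout."""
    fields = value.split(",", 2)
    try:
        # Layout 1: "score,description"; the rule name is synthesized.
        score = float(fields[0])
        name = "L_URI_" + re.sub(r"[^A-Za-z0-9]", "_", hostname).upper()
        description = value.split(",", 1)[1]
    except ValueError:
        # Layout 2: "name,score,description"; everything is explicit.
        name, score, description = fields[0], float(fields[1]), fields[2]
    return name, score, description


def score_hosts(hostnames, db):
    """Sum scores for listed hosts, counting each rule name only once."""
    total, hits = 0.0, {}
    for host in hostnames:
        value = db.get(host)
        if value is None:
            continue
        name, score, description = parse_entry(host, value)
        if name not in hits:      # later matches on the same rule are ignored
            hits[name] = description
            total += score
    return total, hits


db = {
    "spammer.com": "1.2,Spamhause business site",
    "pills.example": "MEDS_SPAMHAUS,1.2,Spamhause business site",
    "morepills.example": "MEDS_SPAMHAUS,1.2,Spamhause business site",
}
total, hits = score_hosts(["spammer.com", "pills.example", "morepills.example"], db)
print(total, sorted(hits))
```

With layout 2, the two pill sites share one rule name, so a message mentioning both is only charged once for that rule.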

Probably should also add some kind of time-stamp to each entry to
facilitate automated updating.

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Run spamd as root ?

2003-11-15 Thread David B Funk

On Thu, 13 Nov 2003, MIKE YRABEDRA wrote:


 I have found that spamd will not use razor on my system because of
 permissions. Is it safe to run spamd as root?

Mike,
spamd will not run as root, it is a security risk.
If you start it as root and you do not tell it who you want to run as
(IE leave off the '-u' option) it will automagically switch to user
nobody.

If you have your permissions set correctly, spamd and razor can
happily run as non-root. Why not just fix your permissions?


-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Razor does not work with spamd?

2003-11-13 Thread David B Funk

On Thu, 13 Nov 2003, MIKE YRABEDRA wrote:



 I have been trying to get razor to work with spamd.

 I know it works with ./spamassassin --lint -D

 It also works with CGPSA (calls SA directly).

 But it does not want to work with spamd?

 Where are some places I can look , things I can try,  to narrow down the
 problem?

razor wants a ~/.razor directory to store its configs and
working data (servers lists etc). You've probably tested it
as either root or your personal account.

Have you tested it as the user that spamd runs as?
(EG the user-id that you have in -u user-id for your
spamd command line).

su to the spamd user-id and test razor again. It's probably
either a path or permissions error.

Other thing to try, run spamd with the '-D' flag to turn on
debugging output. (But don't run it that way for too long unless
you have a -large- disk to hold the output ;).

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{



---
This SF.Net email sponsored by: ApacheCon 2003,
16-19 November in Las Vegas. Learn firsthand the latest
developments in Apache, PHP, Perl, XML, Java, MySQL,
WebDAV, and more! http://www.apachecon.com/
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk

RE: [SAtalk] SMTP gateway/filter

2003-11-12 Thread David B Funk

On Tue, 11 Nov 2003, Larry Gilson wrote:

 The preferred method is any way you prefer. ;)  That is really an honest
 answer.  Everyone has their own preferred method and a lot of times it
 depends on your specific situation.  Some people will pipe to a filter shell
 script, Procmail, maildrop, or spamc directly.  I prefer Procmail as it
 allows me to do more post SMTP processing with the message than the shell
 script or a direct pipe to spamc.  maildrop works well for some people but I
 honestly am not familiar with it.  I would like to hear from someone who has
 chosen maildrop rather than Procmail just to have a comparison though.

 --Larry

  -Original Message-
  From: Robban
  Sent: Tuesday, November 11, 2003 2:58 PM
  To: [EMAIL PROTECTED]
  Subject: [SAtalk] SMTP gateway/filter
 
 
  I'm pretty new to spamassassin and I've only done a few
  spamassassin/postfix installations. My next task is to set
  up some sort of SMTP gateway that filters e-mail for spam and
  if approved, forwards the mail to the real mail server. The
  real mail server will probably be an exchange server but we
  might also end up with good ol' sendmail. What would be the
  preferred practice in setting up such a thing. Any ideas?
 
  //robban

Larry,
I agree with the first part of your advice to Robban but completely
disagree with the Procmail part.

Robban is asking specifically for a filtering front-end to some
kind of back-end mail server (such as Exchange). Procmail would
require him to fake a delivery to each account on the SA processing
machine, which would mean that they would have to create user accounts
for every Exchange user on the SA box.

I think that Robban is looking for some kind of filtering appliance
that mail flows thru as a SMTP stream and the back end server handles
the delivery/user part.

Something like sendmail+milter, sendmail+mailscanner, postfix+spamc
or postfix+MIMEDefang would be better suited to this application.
These can process and tag mail without needing any specific user account
information.

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Attachments

2003-11-12 Thread David B Funk

On Wed, 12 Nov 2003, Matt Kettler wrote:

 At 01:38 PM 11/12/2003, Scott Antonivich wrote:
 but can attachments be tagged as spam per user? If
 so, what do I need to place in this users config file?

 You'd have to create a custom rule to look for mime boundaries..

 However, to do it per-user, you'll need to have per-user configs, and
 per-user rules, something that most site-wide SA configurations have no
 capability to do.

You'll have to be discriminating in what kind of MIME boundaries
you look for.

For example, many modern mail clients (such as Eudora, Outlook, Mozilla)
have the ability to send combo text/html or text/rtf mail as MIME
multipart/alternative messages. Most modern clients will show such a
message as just a single-part message and give no clue as to the
internal structure.
Some systems use MIME parts for such things as PGP signatures or .vcard
signatures.

Even such things as sendmail error bounce messages often come as
multi-part MIME messages.

Now what about a message that has only one part, but that part is
a Base-64 encoded jpg of a spam-ad? It would not necessarily have
any MIME boundary other than the content tag in the header.
Or for that matter, a single Base-64 encoded virus (I've seen that
too. ;( ).

So there's no simple definition of what constitutes an 'Attachment'.
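
Both cases are easy to reproduce with Python's standard email package. This is only a sketch with made-up message contents (the image bytes are not a real JPEG). The first message has MIME boundaries but nothing a user would call an attachment; the second has no boundary at all yet consists entirely of an encoded image.

```python
from email.message import EmailMessage


def describe(msg):
    """Return (content type, is multipart?, number of attachment parts)."""
    return (msg.get_content_type(), msg.is_multipart(),
            len(list(msg.iter_attachments())))


# Text plus HTML alternative: multipart, but no attachment.
combo = EmailMessage()
combo.set_content("Plain text version")
combo.add_alternative("<p>HTML version</p>", subtype="html")

# Single-part image body: no boundary, just a Content-Type header.
image_only = EmailMessage()
image_only.set_content(b"\xff\xd8\xff\xe0 placeholder bytes",
                       maintype="image", subtype="jpeg")

print(describe(combo))
print(describe(image_only))
print(image_only["Content-Transfer-Encoding"])
```

A rule that keys on boundaries alone would fire on the first message and miss the second, which is the opposite of what the user wants.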

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] order of preferences in white / black listing

2003-11-11 Thread David B Funk

On Sun, 9 Nov 2003, Tristan Nixon wrote:

 Hello all,

 I have a question regarding the way in which SA deals
 with whitelisting and blacklisting.  If I want to whitelist
 all but a few select entries from a domain, how would I do it?
 Should the following work?

 whitelist_from [EMAIL PROTECTED]
 unwhitelist_from [EMAIL PROTECTED]
 unwhitelist_from [EMAIL PROTECTED]

I think that should work, but I've never tried it.


 I can't seem to find a way to do this.  Here is my predicament,
 I have been getting a whole pile of Viagra spams a day, all of which
 have my email address as both the To: and From: address.  I have
 whitelisted my own domain, but would love to be able to unwhitelist my
 own email address ( I don't have much call for sending messages to
 myself ).  Anyone know how to do this?

Silly question, why are you whitelisting your own domain?
Is it to make sure that locally generated mail doesn't get filtered?
In that case you probably should be able to make 'whitelist_from_rcvd'
do the job, which is much harder for remote spammers to abuse.

Another option would be to set up an MSA for locally generated mail
and have it totally avoid SA processing.

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Re: Trouble restarting spamd

2003-11-11 Thread David B Funk

On Sat, 8 Nov 2003, Debbie D wrote:

 Thanks.. but yes I should have stated that.. stop & start..

 [root admin]# /etc/rc.d/init.d/spamd stop
 Shutting down spamd: ok
 [root admin]# /etc/rc.d/init.d/spamd start
 Starting spamd:  spamdCould not create INET socket: Address already in use
 IO::Socket::INET: Address already in use

 Any other ideas???

The error "INET: Address already in use" means that there's a process
that already has that network socket open. Given that you had just done
a 'spamd stop', the offending process is probably a spamd child process
that hasn't yet finished up a message.

Make sure that -all- spamd processes are gone before you try the start.
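
One way to check this before restarting is to probe the port. The sketch below is Python with invented helper names. It assumes spamd listens on localhost at its usual default port of 783, so adjust if yours was started with a different port. It only tells you that something still accepts connections there, not which process it is.

```python
import socket
import time


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something still accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        return probe.connect_ex((host, port)) == 0


def wait_until_free(port, attempts=10, delay=1.0):
    """Poll until the port is free; return False if it never frees up."""
    for _ in range(attempts):
        if not port_in_use(port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    SPAMD_PORT = 783  # assumption: default spamd port
    print("safe to start" if wait_until_free(SPAMD_PORT) else "still in use")
```

If the port never frees up, look for leftover spamd processes with ps and deal with those before running the start script.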

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] [MailServer Notification]To recipient: Message matched eManager setting and action was taken.

2003-11-11 Thread David B Funk

On Mon, 10 Nov 2003 [EMAIL PROTECTED] wrote:

  eManager Notification *

 The following mail was blocked since it contains sensitive content.

Love the stupid -PC- double-talk here. Gee, what was the content
sensitive to? (Is it sensitive to light, heat, shock...)

How about saying "was blocked because it contains objectionable content"
and be done with it.

Sad. A spam-blocking list cannot even discuss the kinds of things
that it's supposed to block.

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] rule to whitelist Listserv (tm) list traffic

2003-11-10 Thread David B Funk

On Mon, 10 Nov 2003, Chris Barnes wrote:

 I am in need of a rule that will tell SpamAssassin to whitelist all
 email traffic which comes from our local Listserv (tm - www.lsoft.com)
 lists.

 The problem is that messages from the Listserv list have the original
 author's email address in the From: line.  The Listserv list address is
 in a header tag of:

 Sender: Name-of-list [EMAIL PROTECTED]


 In other words, SA needs to look at a header tag of SENDER:, not FROM:
 How would this rule look?

 (my guess)
 header LISTSERV_GOOD_SENDER Sender =~listserv.tamu.edu
 score  LISTSERV_GOOD_SENDER -100

 Would that work?

Almost. It needs to be a valid perl pattern-match regex:

  header LISTSERV_GOOD_SENDER Sender =~ /listserv.tamu.edu/

Only problem with that is that it will be susceptible to spammer abuse
if they ever find out about it. (Note that empirical evidence points
to spammers reading this list ;).
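
A side note on the pattern itself: an unescaped '.' in a regex matches any character, so the rule as written is looser than it looks. The Python sketch below (Python's re treats these constructs the same way Perl does; the addresses are made up) compares it with an escaped variant. Escaping the dots only tightens the match. The Sender: header can still be forged, so the abuse concern stands either way.

```python
import re

loose = re.compile(r"listserv.tamu.edu")         # '.' matches any character
strict = re.compile(r"\blistserv\.tamu\.edu\b")  # literal dots only

genuine = "Sender: Some List <owner@listserv.tamu.edu>"
lookalike = "Sender: Some List <owner@listservXtamu-edu.example.net>"

for header in (genuine, lookalike):
    print(bool(loose.search(header)), bool(strict.search(header)))
```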

What would be better is if you could use 'whitelist_from_rcvd', as it's
much more difficult for an external agent to abuse.
However, this would require the predictable envelope-from address
being accessible to SA.
In addition to the From: header, SA looks for from-address info in
the headers:

  Envelope-Sender:
  Resent-Sender:
  X-Envelope-From:
  Return-Path:
  Resent-From:

Any chance you could get your listserv to put its Sender info into
one of these?

If you are only concerned about local SA filtering of these messages,
you could customize the 'EvalTests.pm' file in your SA installation
and add Sender: to that recognized from header list.

One other possibility depends upon how you call SA. If your method of
processing the mail has access to the envelope-sender, you could hack it
to synthesize an 'Envelope-Sender' header to pass that info in to SA.

I use spamd with sendmail and miltrassassin. I hacked the miltrassassin
code to synthesize an 'Envelope-Sender' header and it makes whitelisting
mailing lists via whitelist_from_rcvd much easier.
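
The header synthesis can be as small as the following Python sketch. This is not the miltrassassin patch; the function name and the way the filter obtains the SMTP MAIL FROM address are assumptions, and a real filter should probably also drop any Envelope-Sender header already present in the incoming message so that a sender cannot supply their own.

```python
def add_envelope_sender(raw_message: bytes, envelope_from: str) -> bytes:
    """Prepend an Envelope-Sender header built from the SMTP MAIL FROM address."""
    # Refuse CR/LF so a hostile sender address cannot inject extra headers.
    if "\r" in envelope_from or "\n" in envelope_from:
        raise ValueError("envelope sender contains a line break")
    # Match the line-ending style the message already uses.
    eol = b"\r\n" if b"\r\n" in raw_message else b"\n"
    address = envelope_from.strip().strip("<>")
    header = ("Envelope-Sender: <%s>" % address).encode("ascii", "replace")
    return header + eol + raw_message


original = b"From: someone@example.org\nSubject: list post\n\nbody\n"
tagged = add_envelope_sender(original, "<owner-list@lists.example.edu>")
print(tagged.decode())
```

The tagged message is what would be handed to spamc/spamd in place of the original.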

If you do go the whitelist_from_rcvd route be sure to set your
trusted_networks parameter.

FWIW, I prefer to use def_whitelist_from_rcvd instead of
whitelist_from_rcvd. Makes mistakes and successful forgeries less
damaging. ;)

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




[SAtalk] [RD] Re: Abused redirector URLs ?

2003-11-07 Thread David B Funk

On Fri, 7 Nov 2003, Chr. von Stuckrad wrote:


 Hi
[snip..]
 The mail also contained a broken variant of the wrong/forgotten
 Parameter of their Spam-Mailer:  ' $RANDOM IZE '

 So:

 body RANDOM_IZE / \$RANDOM IZE /
 score RANDOM_IZE 2
 describe RANDOM_IZE contains broken spamrobot parameter


Even better, sprinkle in \s? between each of those letters and
you'll catch all permutations of that garbage.
EG:
 body RANDOM_SPAM_TOOL  /\s\$RA\s?N\s?D\s?O\s?M\s?I\s?Z\s?E\s/

Using the \s will catch it at line breaks too. ;)
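
A quick way to sanity-check such a pattern before putting it in a rule file is to try it against a few samples. The sketch below uses Python's re module, which handles \s, \$ and ? the same way Perl does for this pattern; the sample strings are made up, apart from the line-break case, which mimics the spam being discussed.

```python
import re

pattern = re.compile(r"\s\$RA\s?N\s?D\s?O\s?M\s?I\s?Z\s?E\s")

samples = [
    "text $RANDOMIZE more",              # intact token
    "text $RANDOM IZE more",             # split by a space
    "bobbins $RANDO\nMIZE counsellors",  # split by a line break
    "text RANDOMIZE more",               # no dollar sign: should not match
]
matches = [bool(pattern.search(s)) for s in samples]
print(matches)
```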

I've combined that with other popular spam-tool 'RANDOM' signatures:

body L_SPAM_TOOL_3  /\[RANDOMIZE\].{0,3}\[RANDOMIZE\]|\s\$RA\s?N\s?D\s?O\s?M\s?I\s?Z\s?E\s|\b%%RANCHAR%%\b|\b%random_word|\b%random_text/
describe L_SPAM_TOOL_3  Spam tool pattern
score L_SPAM_TOOL_3 4.4

(Now we'll see if my own rules fire on this message. ;)

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{
---BeginMessage---

achieve householder icosahedra corrugate bourgeoisie poets bender create postfix hustling bacchus imaginatively talking courses cranking matters hostelry bowel excommunicated polytope acclaim cries microscope abyssinia metronome scooping adjustors ternary scraps bobbins $RANDO
MIZE counsellors adject cribs miller braking medial plugs bothered ethically pounding bombard poetical braided mealtime hurdle pneumococcus tearfully branches avis exams saws expositor bakhtiari memorandum merchandising tank scholastic creature excretions menhaden $RANDOM
IZE brazenness meddle thallium hotelman playgrounds evens excise houseflies sears crate expunged huts scarcity popped boners expositions methylene plumage meandered possessor advice
 


 schemas coverlets saxophone crosshatch coursing addictions branches pluperfect at bodhisattva craftsman brainwashes teacher hue meretricious horrible meticulous at tassel cowpoke adumbrates saturater port extemporaneous teletypewrite bodybuilder bangor pleading boaster crosswort $RANDOM
IZE hydraulic andrea plugs bosom berlin acorn hygiene practitioner excited brambly postulated excise critique plied corridors ac cottonwood adores scrolls evince bombarding teletype sardonic savers meekness bamako potatoes sates telling admiralty $RANDOMI
ZE adjustment poison excitement acculturated evenings migrated hygiene imagined additives hurrah cousins schedulers australia crossroad bahama sclerosis accoutrements banks advanced bluegrass bodhisattva

---End Message---

Re: [SAtalk] problem with Razor and spamd

2003-11-07 Thread David B Funk

On Thu, 6 Nov 2003, Peter Buonora wrote:

 Ok, I am running on Solaris 8, latest version of Spamassassin, perl 5.8.

 For some reason when using the 'spamassassin' executable, razor works.
 When I try to switch over to spamc/spamd everything works except razor.
 There isnt even anything in the logs referring to razor.

 I am running spamd with the -H option and in debug mode.

 I even tested just from the command line as the same user. Here are the
 results from the exact same email. Any help appreciated.


It's probably either a permissions or a path issue.

'spamd' does not run as user root. It runs as the user that you have
specified on the command line with the '-u username' option, or as 'nobody'
if you didn't specify one.

On the machine running spamd, 'su' to the appropriate user and try testing
again, do a:
  spamassassin --lint -D

See what that shows you about razor.

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] Re: 'random' character sets

2003-11-07 Thread David B Funk

On Fri, 7 Nov 2003, Justin Mason wrote:

 BTW, SpamAssassin originally started with accumulating rules.  But I took
 it out, as it meant a long hammy mail had a much higher chance of FP'ing,
 due to containing more text.

 I'd be worried that accumulating hits would reintroduce the same
 problem...

 - --j.

Easy way to deal with that, divide the accumulated score by a factor
based upon the message size.

EG, if I see 20 'BR' tags in a 1Kbyte message it has a high probability
of being spam but if that message is 30Kbytes then it's lower.
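
A minimal sketch of that idea in Python, assuming a hypothetical per-hit score of 0.25 and a divisor of "message size in KB" (neither is an SA setting; both are picked only to make the arithmetic visible):

```python
def size_adjusted_score(hits, score_per_hit, message_bytes):
    """Scale an accumulated score down as the message gets larger."""
    size_kb = max(1.0, message_bytes / 1024.0)  # never divide by less than 1 KB
    return hits * score_per_hit / size_kb


small = size_adjusted_score(20, 0.25, 1 * 1024)    # 20 tags in a 1 KB message
large = size_adjusted_score(20, 0.25, 30 * 1024)   # the same 20 tags in 30 KB
print(small, large)
```

The same 20 hits contribute thirty times less in the larger message, which is the damping effect that would keep long ham from piling up points.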

Dave

-- 
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include std_disclaimer.h
Better is not better, 'standard' is better. B{




Re: [SAtalk] installation fails [perl]

2003-11-07 Thread David B Funk

On Fri, 7 Nov 2003, Maarten J H van den Berg wrote:


 I eventually found the reason for this behaviour after I noticed that root
 could start it okay, but the unpriv user spamd couldn't...:

 machine:~ # ls -la /usr/lib/perl5/site_perl/5.6.0/Mail/SpamAssassin
 drwx------    2 root root  942 Nov  7 11:45 .

Try setting your umask to 022 before doing any installs. ;)

Another good rule of thumb: always test out a new installation as the
user who will ultimately run the stuff. It's a good way to catch
permissions and path problems. (I ran into the exact same problem when
I did my first SA upgrade. ;)
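
To illustrate why the umask matters, here is a small Python sketch of the permission arithmetic. The helper names are made up for this example, and a umask of 077 is only assumed as one way an install could end up with an owner-only directory.

```python
import stat


def mode_after_umask(requested_mode, umask):
    """Permission bits that result from creating a file/dir under a umask."""
    return requested_mode & ~umask


def usable_by_others(mode):
    """True if a directory mode lets 'other' users list and enter it."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


# A directory requested as 0777 under two different umasks.
restrictive = mode_after_umask(0o777, 0o077)  # owner-only
relaxed = mode_after_umask(0o777, 0o022)      # world-readable and searchable

restrictive_listing = stat.filemode(stat.S_IFDIR | restrictive)
relaxed_listing = stat.filemode(stat.S_IFDIR | relaxed)
```

With umask 022 the directory comes out as drwxr-xr-x, so an unprivileged spamd user can read the installed modules; with an owner-only mode it cannot.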

[snip..]
 I'll try to find out which step exactly triggers this.  In the meantime, I
 get bitten by some very lame dependencies which invariably lead to
 CPAN.pm attempting to install perl 5.8 (I have 5.6).  I don't understand
 this. In the first place it doesn't give me any choice to skip the
 install of perl 5.8 as it does with all other packages; secondly, the
 SpamAssassin FAQ says perl 5.8 is to be avoided, right?
 In any case, no matter what, I'll not allow such a perl upgrade on a
 production system because I cannot afford to break things.

IIRC, this was due to a bug in one of the versions of CPAN.
Try updating your CPAN module.






[SAtalk] Accumulator rules (Re: 'random' character sets)

2003-11-07 Thread David B Funk

On Fri, 7 Nov 2003, Robert Menschel wrote:

 Or better: what if we specified in the rule a maximum score to accumulate
 to? Maybe something like:

 accumbody  T_SAMPLE  /(?:word1|word2|word3|word4|word5)/i,max=2.5
 describe   T_SAMPLE  Message has medical words frequently used in spam
 score  T_SAMPLE  0.5

 Each time any of the five words was used, it'd score 0.5, to a maximum
 score of 2.5. No matter how long the message was, this rule could not by
 itself cause an FP, and would work in conjunction only with other rules
 to flag something as spam.

A slight modification of the above idea: rather than 'max=2.5', have
'maxhits=5'. IE that particular rule fires no more than 5 times, and then
the matching engine can drop it and move on to the next rule.

The final score would be 'nhits' * score. That way the matching engine
does not need to worry about any score calculations, just tallying up
the number of matches.
There should also be a default implicit 'maxhits' value to keep the
matching process moving along and not slow things down too much. ;)
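
A rough Python sketch of the 'maxhits' idea (SpamAssassin itself is Perl, and this is not its matching engine). The function names and the sample body text are invented for illustration; the pattern and the 0.5 score come from the T_SAMPLE example quoted above.

```python
import re


def count_hits(pattern, text, maxhits=1):
    """Count matches of a compiled regex, stopping once maxhits is reached."""
    nhits = 0
    for _ in pattern.finditer(text):
        nhits += 1
        if nhits >= maxhits:
            break  # rule is exhausted; the engine can move on to the next rule
    return nhits


def rule_score(nhits, score):
    """Final contribution of the rule: number of hits times its score."""
    return nhits * score


pattern = re.compile(r"(?:word1|word2|word3|word4|word5)", re.IGNORECASE)
body = "word1 word2 word3 word1 word4 word5 word2"  # seven matches in total

nhits = count_hits(pattern, body, maxhits=5)
total = rule_score(nhits, 0.5)
```

Because the loop stops at maxhits, a very long message costs no more matching work than a short one once the cap is reached, and the rule's contribution is bounded at maxhits * score.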





Re: [SAtalk] Re: Accumulator rules (Re: 'random' character sets)

2003-11-07 Thread David B Funk

 DBF On Fri, 7 Nov 2003, Robert Menschel wrote:

  Or better: what if we specified in the rule a maximum score to accumulate
  to? Maybe something like:
 
  accumbody  T_SAMPLE  /(?:word1|word2|word3|word4|word5)/i,max=2.5
  describe   T_SAMPLE  Message has medical words frequently used in spam
  score  T_SAMPLE  0.5
 
  Each time any of the five words was used, it'd score 0.5, to a maximum
  score of 2.5. No matter how long the message was, this rule could not by
  itself cause an FP, and would work in conjunction only with other rules
  to flag something as spam.

 DBF A slight modification of the above idea, rather than 'max=2.5' have
 DBF 'maxhits=5'. IE that particular rule fires no more than 5 times and then
 DBF the matching engine can drop it and move on to the next rule.

 DBF The final score would be 'nhits' * score. That way the matching engine
 DBF does not need to worry about any score calculations, just tallying up
 DBF number of matches.
 DBF There should also be a default implicit 'maxhits' value to keep the
 DBF matching process moving along and not slow things down too much. ;)

OK, here's the next revision (I've put my programmer's hat on ;).

When parsing the config files and generating the rule structures,
for each rule add two new variables: 'maxhits', defaulting to 1, and
'nhits', initialized to 0.
If the rule has a maxhits=n argument, set maxhits to that value.

When running the matching engine eval for a rule, each time there's a
hit, increment nhits and decrement maxhits. If maxhits < 1, terminate that
rule.

When scoring and running the meta-rules, consider 'nhits' to be the
value of that rule. IE if nhits == 0, the rule is false for boolean
purposes; if nhits != 0, it is true; and for arithmetic metas, the value
is 'nhits' itself.

The score added to 'hits' is 'nhits' * score for that rule.

So if you leave maxhits = 1 for each rule (the default),  you have
everything working as it does right now. The accumulating part only
kicks in if a maxhits=n argument is added to a particular rule.

Thus you would only need to modify one user-visible part of the
conf stuff: add the maxhits= argument. All other rule stuff would
still look & act the same (by default).

Theoretically you need only modify two parts of the whole kit: Conf.pm,
to parse the optional maxhits=n argument, and the matching eval engine,
to keep searching if maxhits > 0 (if I understand the code correctly ;).

Probably also want to modify the report stuff.

Looking at Robert's  'T_SAMPLE' example from his previous message, you
would implement it as:

body      __T_SAMPLE  /(?:word1|word2|word3|word4|word5)/i,maxhits=6
meta      T_SAMPLE    (__T_SAMPLE)
score     T_SAMPLE    0.5
describe  T_SAMPLE    Message has medical words frequently used in spam
meta      T_SAMPLEA   ( __T_SAMPLE > 5 )
score     T_SAMPLEA   2.0
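
One possible reading of how those two metas would score, sketched in Python. This assumes each meta contributes its score once when its expression is true, and that the comparison in T_SAMPLEA is "> 5"; both are assumptions about a proposal, not existing SpamAssassin behaviour, and the function name is made up.

```python
def evaluate_sample_rules(nhits):
    """Score the T_SAMPLE / T_SAMPLEA metas for a given __T_SAMPLE hit count."""
    total = 0.0
    fired = []
    if nhits != 0:   # meta T_SAMPLE (__T_SAMPLE): nhits used as a boolean
        total += 0.5
        fired.append("T_SAMPLE")
    if nhits > 5:    # meta T_SAMPLEA (__T_SAMPLE > 5): nhits used arithmetically
        total += 2.0
        fired.append("T_SAMPLEA")
    return total, fired


no_match = evaluate_sample_rules(0)
few_matches = evaluate_sample_rules(3)
capped = evaluate_sample_rules(6)  # maxhits=6 means nhits can never exceed 6
```

Under this reading the pair tops out at 0.5 + 2.0 = 2.5, the same ceiling as the 'max=2.5' in Robert's original proposal.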


Dave





Re: [SAtalk] How to balance spamd max children?

2003-11-04 Thread David B Funk

On Tue, 4 Nov 2003, Steven W. Orr wrote:

 My ISP went down, so when it came back up and my secondary mx record
 kicked it all back to me I got a lot of messages from sendmail saying

 Nov  4 06:58:26 saturn spamd[952]: hit max-children limit (5): waiting for
 some to exit

 I bumped it up to 10 (my server can well handle it) but can someone tell
 me how I can tell what the current number of max sendmail connections is
 so I can balance the spamd limit properly?

The standard default sendmail config puts no limit on the number
of connections. In fact, older versions of sendmail did not have
any connection limit based control, only load-average based controls.

Newer versions of sendmail have a config parameter 'MaxDaemonChildren'
that can be used to effectively limit the number of SMTP connections.
(It defaults to unset, and thus no limit; you must explicitly set it
in your config to activate that feature.)
Once that limit is reached, sendmail will close the 'LISTEN' on port
25, thus refusing new connections.

Note that this will refuse -all- connections to port 25. Depending upon
your configuration it may affect outgoing as well as incoming messages.
If you use only one sendmail daemon and a mail client (such as Eudora
or Outlook) that connects to port 25 to send messages, sending will
be blocked while sendmail is refusing connections.
(Particular consideration if you're an ISP with customers ;).

Unix clients that call sendmail directly (such as elm, mutt, or pine)
will not be affected.

One way to work around this limitation is to run another sendmail daemon
listening on an alternate port or alias IP address and configure your
SMTP-based clients to use that for sending out messages. (Use that as
an MSA, rather than an MTA.)
Using an MSA has other advantages: you can configure it to use
authentication to provide service for remote clients without fear of
open-relay abuse, customize its filtering (don't bother to SA-scan your
outgoing mail), etc.

Dave




