bayes_ignore_from with wildcard ?

2005-07-11 Thread lists
Hello, Does anyone know if this will work: bayes_ignore_from [EMAIL PROTECTED] The docs don't say specifically if this kind of directive is allowed. They do say that this kind of thing will work for whitelist_from. Regards, Devin

Re: How can I filter this kind of spam?

2005-07-11 Thread Michael Moyse
Kai Schaetzl wrote: Michael Moyse wrote on Fri, 08 Jul 2005 17:55:32 +0100: To me it looks like a duck and sounds like a duck I'm probably wrong and missing something here because I'm no expert so I'm happy to be enlightened. Ok, I enlighten you ;-) I hope I'm not wrong. Now that

Re: simultaneous sa-learn processes

2005-07-11 Thread JamesDR
Chavdar Videff wrote: Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do it at night by

RE: simultaneous sa-learn processes

2005-07-11 Thread Sander Holthaus - Orange XL
JamesDR wrote: Chavdar Videff wrote: Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do

Re: simultaneous sa-learn processes

2005-07-11 Thread Chavdar Videff
On Monday 11 July 2005 14:50, JamesDR wrote: Chavdar Videff wrote: Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to

Re: simultaneous sa-learn processes

2005-07-11 Thread Kai Schaetzl
Chavdar Videff wrote on Mon, 11 Jul 2005 13:40:14 +0300: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do it at night by supplying a daily snapshot of our spam and ham collections to sa-learn. Do I

Re: simultaneous sa-learn processes

2005-07-11 Thread Chavdar Videff
On Monday 11 July 2005 15:31, Kai Schaetzl wrote: Chavdar Videff wrote on Mon, 11 Jul 2005 13:40:14 +0300: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do it at night by supplying a daily snapshot of

Re: (repost) bayes_ignore_from with wildcard ?

2005-07-11 Thread Matt Kettler
At 04:43 AM 7/11/2005, [EMAIL PROTECTED] wrote: Hello, Does anyone know if this will work: bayes_ignore_from [EMAIL PROTECTED] The docs don't say specifically if this kind of directive is allowed. They do say that this kind of thing will work for whitelist_from. We all got your message the

Re: SURBL SA 3.0.4

2005-07-11 Thread Matt Kettler
Dr Robert Young wrote: Is there a particular port and/or protocol (TCP/UDP) that must be opened on any firewalls that might be on the network for the plugin to work? You don't need to open any ports, however you must be able to resolve DNS queries. In general you can test it by using host

Bypass URI check

2005-07-11 Thread Martin.Carnegie
Hi All, I have received a few messages like the following. This asks the receiver to copy and paste the link into their web browser. Since the href is missing, there is no URI check. That sucks, because the URIBL is my best friend right now (love black). We are close

Re: Rule: envelope to header to - help?

2005-07-11 Thread Matt Kettler
Michael W Cocke wrote: Does anyone have a rule to check the envelope To: against the header To:? I'm sure that there's a reason why it's allowed to be different, but it doesn't apply here, and almost half of the spam that gets through everything else would get stopped by that. No. It's

RE: Bypass URI check

2005-07-11 Thread Chris Santerre
I'm thinking it may be time for SARE to look at this phrase: "then copy // paste the below page into your window: " I'll see what I can do with it. --Chris (I also love the black ;)

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Matt Kettler
Joe Flowers wrote: I don't know if this will help anyone or not, but I wanted to report back just in case. In early April, I completely unhinged the dividing line between what SA score is used to mark a message as spam or ham (5.00 = default). This allows the system and this dividing line

RE: spamassassin with GORDANO

2005-07-11 Thread Bret Miller
Does anyone know If I can use Spammain with GMS (Gordano Mail Software for Linux) In theory, you could use MailScanner as a proxy in front of GMS to run SpamAssassin before the message gets to GMS. And, if I recall correctly (I haven't used GMS for several years), I think you can use their

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Matt Kettler wrote: The only problem I see with this approach is that it treats false positives and false negatives as being equally bad. We do get many more false negatives than false positives, even though we don't get false positives very often - they are rare. We certainly don't get

Re: sa-learn on a wide site HOWTO ?

2005-07-11 Thread Julien Reveret
On 16:56, Mon 11 Jul 05, Karl.Oulmi wrote: Hi, I always have a box with postfix/amavis and SpamAssassin running. Now, I'd like to run sa-learn so that my users (~500) can teach SpamAssassin what is spam and what is ham. The idea is the following. On every mail passed through my mailserver, a header or a

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Justin Mason
There's another thing worth noting -- the SpamAssassin score distribution for hams and spams isn't even. If you draw a graph of hams and spams, plotting the number of mails in each category as the vertical axis and the score they get as the
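The graph described above (number of mails on the vertical axis, score on the other) can be approximated by bucketing scores. This is only a rough sketch in Python; `score_histogram` is a hypothetical helper, the one-point bucket width is an assumption, and the sample scores are made up rather than taken from the thread.

```python
import math
from collections import Counter

def score_histogram(scores, bucket_width=1.0):
    """Count how many messages fall into each score bucket.

    Each key is the lower edge of a bucket, expressed as a multiple of
    bucket_width (hypothetical layout, chosen for illustration only).
    """
    if bucket_width <= 0:
        raise ValueError("bucket_width must be positive")
    counts = Counter()
    for score in scores:
        # floor() keeps negative scores in the bucket below zero
        counts[math.floor(score / bucket_width)] += 1
    return counts

if __name__ == "__main__":
    # Made-up scores, not data from the list.
    ham_scores = [-2.6, -1.2, 0.2, 0.9]
    spam_scores = [4.5, 7.8, 12.3, 12.9]
    for label, scores in (("ham", ham_scores), ("spam", spam_scores)):
        hist = score_histogram(scores)
        for bucket in sorted(hist):
            print(f"{label:4s} {bucket:4d} {'#' * hist[bucket]}")
```

Running it on separate ham and spam collections gives two text histograms that can be compared side by side.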

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Thanks Jason! That's good, new info for me. That'll help me *at the very least* visualize what I am trying to do a little better. I've been very curious to know what the rough shapes of those graphs look like. Joe Justin Mason wrote:

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Matt Kettler
Joe Flowers wrote: Matt Kettler wrote: The only problem I see with this approach is that it treats false positives and false negatives as being equally bad. We do get many more false negatives than false positives, even though we don't get false positives very often - they are rare.

Re: simultaneous sa-learn processes

2005-07-11 Thread Kai Schaetzl
Chavdar Videff wrote on Mon, 11 Jul 2005 16:13:44 +0300: If there is a way to set up a single bayes database I would prefer that There is one, just look in the SA documentation. (documentation for local.cf should do.) Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet

Re: How can I correctly detect these spams?

2005-07-11 Thread Kai Schaetzl
I repeat myself ;-) It seems you are not using *any* custom rules. You may want to check out RDJ and SARE. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.conactive.com IE-Center: http://ie5.de http://msie.winware.org

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Justin Mason
the real-world figures can be seen for various thresholds in the rules/STATISTICS*.txt files... --j. Matt Kettler writes: Joe Flowers wrote: Matt Kettler wrote: The only problem I see with this approach is that it treats false positives

Re: simultaneous sa-learn processes

2005-07-11 Thread jdow
From: Chavdar Videff [EMAIL PROTECTED] On Monday 11 July 2005 14:50, JamesDR wrote: Chavdar Videff wrote: Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each

RE: sa-learn on a wide site HOWTO ?

2005-07-11 Thread Aaron Grewell
Forget about this. Most of your users will only report spam, not ham, so they're going to screw up the bayes database. As a consequence, you'll have more spam, or more FPs. You should find another solution or educate your users (but it takes too much time) so they correctly feed the bayesian

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread jdow
From: Matt Kettler [EMAIL PROTECTED] Joe Flowers wrote: I don't know if this will help anyone or not, but I wanted to report back just in case. In early April, I completely unhinged the dividing line between what SA score is used to mark a message as spam or ham (5.00 = default). This

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Loren Wilton
score of -2.1532284. I have the dividing line set at 30% of the distance between the average ham score and average spam score (30% above the average ham score). So, the dividing line is currently floating around 0.55416414. The only problem I see with this approach is that it treats
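The rule quoted above (a dividing line placed 30% of the way from the average ham score to the average spam score) can be written out as a short calculation. A minimal sketch in Python, assuming nothing about how the poster actually implemented it; `floating_threshold` is a hypothetical name and the sample averages are made up.

```python
def floating_threshold(ham_mean, spam_mean, fraction=0.30):
    """Return a cut-off placed `fraction` of the way from ham_mean to spam_mean."""
    if spam_mean <= ham_mean:
        raise ValueError("expected the spam average to exceed the ham average")
    # threshold = ham average + fraction * (distance between the two averages)
    return ham_mean + fraction * (spam_mean - ham_mean)

if __name__ == "__main__":
    # Hypothetical averages, not figures from the thread.
    print(floating_threshold(-2.0, 8.0))
```

Setting `fraction=0.5` gives the plain midpoint of the two averages; a smaller fraction moves the line toward the ham average, which catches more spam at the cost of a higher risk of false positives.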

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Matt: I know you know a lot more about this than I do, but for what it's worth, your impressions/intuitions are very close to mine. Originally back in April, I started off using the average of the means, but that let through way too much spam. So, what I have now is it set to 30% above the

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Loren Wilton
There's another thing worth noting -- the SpamAssassin score distribution for hams and spams isn't even. I don't see that those particular curve shapes necessarily invalidate this method in any way, although they do bias the method somewhat. The two curves are essentially smooth

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
jdow wrote: The greater the separation choke the better the results for a decision point between them. But anything you can do that widens the typical score distribution between ham and spam is a good thing. Amen

RE: SURBL, SA 3.0.4, and firewalls

2005-07-11 Thread Stewart, John
All it needs is port 53 TCP and UDP open (outbound); how you open it depends on which firewall product you use. A bit of Googling for the ports on your product will yield what you need. One thing to note... if your firewall is proxying for you, make sure it doesn't think it's

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread jdow
A few weeks ago I'd have said Easy, Ducky! Then I ran into Dovecot, which uses an indexed almost-mbox file. There is no way to do it other than a good guess. However, for a traditional UNIX mbox file you should be able to nail it perfectly simply by looking for the From feature. The dirt stupid mail

Performance: files or SQL?

2005-07-11 Thread Mike Jackson
On my personal server, I'm running SA 3.0.4 with the user prefs, Bayes, and AWL in a MySQL database (mostly because it would be cooler that way). On my employer's server, I'm running the same SA version, but with file-based DBs and user prefs. We're going to be rolling out doing filtering for

procmail: Could not create INET socket on 127.0.0.1:783: Permission denied

2005-07-11 Thread prosolutions
Hello, I set up spamassassin to work with procmail according to instructions. Here is what is in ~/.procmailrc: #SPAM ASSASSIN SECTION :0fw: spamd.lock * 256000 | /usr/sbin/spamd :0: * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\* almost-certainly-spam :0: *

Re: SA 2.63 vs 2.64

2005-07-11 Thread Matthias Fuhrmann
On Sun, 10 Jul 2005, Matthias Fuhrmann wrote: [...] # jm: do not... the lines from Bayes.pm fit the error messages. I didn't check PerMsgStatus.pm, but I guess it's the same issue. Can someone explain the difference or the impact on the problem described above? what about replacing

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Joe Flowers wrote on Mon, 11 Jul 2005 12:09:29 -0400: We are very glad and happy about this concept and implementation. Well, the big question is: How many of your spam messages score between the default 5 and your floating score? If it is many there's obviously something wrong with your

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Loren Wilton wrote on Mon, 11 Jul 2005 11:30:07 -0700: Which of course means that by picking the ratio value you can pick pretty much any fp/fn ratio you want. Only if the distribution was equal. Kai

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kris Deugau
jdow wrote: A few weeks ago I'd have said Easy, Ducky! Then I ran into Dovecot, which uses an indexed almost-mbox file. There is no way to do it other than a good guess. However, for a traditional UNIX mbox file you should be able to nail it perfectly simply by looking for the From feature. The dirt

Re: Performance: files or SQL?

2005-07-11 Thread Michael Parker
Cami wrote: SQL simply doesn't scale very well for bayes. We have a server farm of 12 spamassassin servers storing bayes in SQL. We see on average about 4000 queries per second. The MySQL server has been optimized to hell and back and is running on high-end hardware, but just simply

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Kai Schaetzl wrote on Mon, 11 Jul 2005 22:31:29 +0200: With the default of 5 we get almost none, not even one per day. That was about FPs. Wrong. We don't get *any* FPs. We do not get even one *FN* per day. Kai

Help debugging spamc/spamd

2005-07-11 Thread email builder
Hi, We recently changed some of our network topology so that we are temporarily connecting with spamc to spamd over a regular external network connection (we usually keep it inside our LAN, but this is a temporary thing... don't ask). Unfortunately, spamd stops (mostly) responding it seems.

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
BTW, if anyone knows a command line program that can easily run through a bunch of mbox files and tell how many messages are in them, I will report back how many ham and how many spam messages that I have fed to bayes. Well, I thought this might give some good stats on the FP:FN ratio, but I
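For the mbox counting question above, one possible approach is Python's standard `mailbox` module, which indexes a traditional mbox file by its "From " separator lines. This is a sketch only; `count_mbox_messages` is a hypothetical helper name, and it assumes plain UNIX mbox files rather than indexed or otherwise modified variants.

```python
import mailbox
import sys

def count_mbox_messages(path):
    """Return the number of messages in a traditional UNIX mbox file."""
    # create=False makes a missing file an error instead of creating an empty mbox
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()

if __name__ == "__main__":
    total = 0
    # Each command-line argument is treated as the path of one mbox file.
    for mbox_path in sys.argv[1:]:
        n = count_mbox_messages(mbox_path)
        total += n
        print(f"{n:8d}  {mbox_path}")
    print(f"{total:8d}  total")
```

Saved under a name of your choosing, it would be run with the mbox files as arguments, once for the ham collection and once for the spam collection.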

Re: (repost) bayes_ignore_from with wildcard ?

2005-07-11 Thread Daryl C. W. O'Shea
Matt Kettler wrote: Although by looking at _check_whitelist, I wonder if it works the way the docs say. The docs claim it's file glob and not regex, but _check_whitelist looks a lot like it does a regex. _check_whitelist does use a regexp to do the matching but the config parser
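The point discussed above, that a file-glob pattern can still end up being matched by a regexp, can be illustrated with a small conversion routine. This is a sketch in Python, not SpamAssassin's actual Perl code; `glob_to_regex` is a hypothetical helper, and the example addresses are made up.

```python
import re

def glob_to_regex(pattern):
    """Turn a file-glob style address pattern into a compiled regex.

    '*' matches any run of characters and '?' matches a single character;
    everything else is matched literally. Hypothetical helper, for
    illustration only.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # escape regex metacharacters such as '.' so they match literally
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)

if __name__ == "__main__":
    rx = glob_to_regex("*@example.com")
    for addr in ("devin@example.com", "devin@exampleXcom"):
        print(addr, bool(rx.match(addr)))
```

Because the literal characters are escaped before the regexp is built, a dot in the glob only matches a dot, which is the difference between glob semantics and handing the raw pattern to a regex engine.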

Fedora changed SpamAssassin default level to 7?

2005-07-11 Thread Justin Mason
fyi, if you're using Fedora Core -- http://blog.dave.org.uk/archives/000715.html totally unconfirmed, but worth noting in case that really is the case. --j.

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kelson
Joe Flowers wrote: BTW, if anyone knows a command line program that can easily run through a bunch of mbox files and tell how many messages are in them, I will report back how many ham and how many spam messages that I have fed to bayes. It's far from perfect, but it may offer some interesting info

Re: Bypass URI check

2005-07-11 Thread Daryl C. W. O'Shea
[EMAIL PROTECTED] wrote: Hi All, I have received a few messages like the following. This asks the receiver to copy and paste the link into their web browser. Since the href is missing, there is no URI check. That sucks, because the URIBL is my best friend right now (love black). We are close

Re: Fedora changed SpamAssassin default level to 7?

2005-07-11 Thread Kelson
Justin Mason wrote: fyi, if you're using Fedora Core -- http://blog.dave.org.uk/archives/000715.html totally unconfirmed, but worth noting in case that really is the case. My copy of Fedora Core 4 has required_hits 5 in local.cf using the distribution's RPM for SpamAssassin. rpm -Va made no

Re: simultaneous sa-learn processes

2005-07-11 Thread Robert Menschel
Hello Chavdar, Monday, July 11, 2005, 3:40:14 AM, you wrote: CV Hi List, CV Our mailserver server serves about 100 users. Our config: CV Sendmail+Procmail+SpamAssassin. CV The question is: CV If I got it right, we should run sa-learn for each user in order to benefit CV from bayes. We intend

Bayes Questions

2005-07-11 Thread Andrew Ott
For those of you running large sites (we have about 12,000 users, with 210,000 messages a day), what do you have for a bayes_expiry_max_db_size? Also, is there any way to see the count of spam and ham messages that are in the bayes database? I can't seem to find any info on that. I want to make

Re: Bayes Questions

2005-07-11 Thread Daniel J. Cody
Andrew, Andrew Ott wrote: Also is there any way to see the count of spam and ham messages that are in the bayes database, I can't seem to find any info on that. I want to make sure there are a lot in there before I turn the bayes rules on. If you run spamassassin --lint -D you should see a