Re: Barracuda RBL in first place
> well, you have half of it, as any hit shown here by invaluement was
> missed by spamhaus. I can't give you the data for other cases because
> it's a short circuit -> 550 type of thing.

That's not an ideal metric. You really need to test every incoming message against each RBL (up to 4 or so, to avoid DNS timeouts). Postfix supports this with "warn_if_reject", which logs a warning instead of issuing the actual "5XX" reject. It's the warnings that yield valid data, or at least they do with large and representative samples (which IME means >= 100K msgs/day).

Roger Marquis
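The measurement described above can be sketched as a small log counter. This is a minimal sketch, assuming the restrictions were prefixed with warn_if_reject so that Postfix logs "reject_warning:" lines, and that each line carries the usual "blocked using <list>;" text. The sample lines, host names and list names are made up for illustration.

```python
import re
from collections import Counter

# Pulls the list name out of the "blocked using <list>;" reply text.
BLOCKED_USING = re.compile(r"blocked using (\S+?);")


def count_rbl_warnings(lines):
    """Count warn_if_reject hits per RBL, one count per matching log line."""
    hits = Counter()
    for line in lines:
        # Skip real rejects and unrelated lines; only warnings are counted.
        if "reject_warning:" not in line:
            continue
        match = BLOCKED_USING.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


# Hypothetical sample lines (assumed log layout), not real data.
SAMPLE_LOG = [
    "NOQUEUE: reject_warning: RCPT from h1[192.0.2.1]: 554 5.7.1 Service "
    "unavailable; Client host [192.0.2.1] blocked using rbl-a.example; listed",
    "NOQUEUE: reject_warning: RCPT from h1[192.0.2.1]: 554 5.7.1 Service "
    "unavailable; Client host [192.0.2.1] blocked using rbl-b.example; listed",
    "NOQUEUE: reject_warning: RCPT from h2[192.0.2.2]: 554 5.7.1 Service "
    "unavailable; Client host [192.0.2.2] blocked using rbl-a.example; listed",
    "NOQUEUE: reject: RCPT from h3[192.0.2.3]: 554 5.7.1 Service "
    "unavailable; Client host [192.0.2.3] blocked using rbl-c.example; listed",
]

if __name__ == "__main__":
    for rbl, count in count_rbl_warnings(SAMPLE_LOG).most_common():
        print(rbl, count)
```

Because every list is consulted for every message, the per-list counts are comparable, which is not the case when the first hit short-circuits to a reject.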
Re: Counting RAZOR2 hits
MySQL Student wrote:
> Hi,
>
> I thought "grep -c RAZOR2_CHECK" through my mail logs would give me a
> good approximation of the number of times RAZOR2 was consulted, but
> that doesn't seem to be the case. There are some mails that don't have
> it listed in the "tests=" section.
>
> I've also tried the razor-* commands, and they don't appear to be able
> to help here either. What am I missing?
>
> Does RAZOR2_CHECK mean that it was found in the RAZOR2 db, or that it
> merely consulted the db?

That means it was found and was above your min_cf, i.e. Razor believes it is spam.
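Since the rule name only appears when Razor reports a hit, counting it measures hits rather than lookups. A minimal sketch of that count, assuming each scanned message leaves one line with a comma-separated "tests=" field (the exact line layout differs between spamd, amavisd and header dumps, and the sample lines are invented):

```python
import re

# Captures the comma-separated rule list that follows "tests=".
TESTS_FIELD = re.compile(r"tests=([A-Za-z0-9_,]*)")


def rule_hit_rate(lines, rule="RAZOR2_CHECK"):
    """Return (messages that hit `rule`, messages with a tests= field)."""
    hits = 0
    scanned = 0
    for line in lines:
        match = TESTS_FIELD.search(line)
        if not match:
            continue  # not a scan result line
        scanned += 1
        # Exact name match, so RAZOR2_CF_RANGE_* rules are not miscounted.
        if rule in match.group(1).split(","):
            hits += 1
    return hits, scanned


# Hypothetical sample lines, not real log data.
SAMPLE = [
    "X-Spam-Status: Yes, score=14.2 required=5.0 "
    "tests=BAYES_99,RAZOR2_CHECK,URIBL_BLACK autolearn=no",
    "X-Spam-Status: Yes, score=6.1 required=5.0 "
    "tests=BAYES_99,RAZOR2_CF_RANGE_51_100 autolearn=no",
    "X-Spam-Status: No, score=0.1 required=5.0 "
    "tests=HTML_MESSAGE autolearn=ham",
    "some unrelated log line",
]

if __name__ == "__main__":
    hits, scanned = rule_hit_rate(SAMPLE)
    print(f"{hits} of {scanned} scanned messages hit RAZOR2_CHECK")
```

The second number is only the count of scanned messages seen in the log; the number of times Razor was actually queried is not recorded in the "tests=" field at all.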
RE: DKIM-Reputation list
Is this DKIM-Reputation setup for any *general* current SpamAssassin deployment, or does it only work with certain MTA setups? I am asking because I believe Amavis was the only one mentioned.

TIA - rh
Re: Barracuda RBL in first place
On Sat, 15 Aug 2009 13:28:01 -0400, MySQL Student wrote:
> Any chance someone has a bit of time to hack on it on this lazy
> Saturday afternoon? :-)

http://www.mikecappella.com/logwatch/

-- 
Benny Pedersen
Counting RAZOR2 hits
Hi, I thought "grep -c RAZOR2_CHECK" through my mail logs would give me a good approximation of the number of times RAZOR2 was consulted, but that doesn't seem to be the case. There are some mails that don't have it listed in the "tests=" section. I've also tried the razor-* commands, and they don't appear to be able to help here either. What am I missing? Does RAZOR2_CHECK mean that it was found in the RAZOR2 db, or that it merely consulted the db? Thanks, Alex
Re: Barracuda RBL in first place
Hi,

>> What log script do you good people use to generate the list above ? Is it
>> a home brew or one we can download so we can compare our own hits ?
>
> http://www.rulesemporium.com/programs/sa-stats.txt

Any chance someone knows where there is a compatible one that parses amavisd instead of spamd? I've tried, but guess I don't know enough Perl to get it right.

Any chance someone has a bit of time to hack on it on this lazy Saturday afternoon? :-)

Thanks,
Alex
Re: Barracuda RBL in first place
Hi,

> Unknown user                         32.00% (32.00%)  87427696
> Greylisted                           24.88% (16.92%)  46225401
> Throttled                            11.03%  (5.64%)  15399444
> Relay access denied                   0.01%  (0.00%)      7034
> Bogus DNS (Broadcast)                 0.01%  (0.00%)     11692
> Bogus DNS (RFC 1918 space)            0.07%  (0.03%)     82135
> Spoofed Address                       0.26%  (0.12%)    319551
> Unclassified Event                    0.77%  (0.35%)    949388
> Temporary Local Problem               0.01%  (0.00%)      8165
> Require FQDN sender address           0.04%  (0.02%)     51022
> Require FQDN for HELO hostname        8.97%  (4.02%)  10988455
[...]

Can I ask how you produced those stats? They look very helpful.

Thanks,
Alex
Re: Subject starts Re: but no References/In-Reply-To
On Sat, Aug 15, 2009 at 07:12:18AM -0700, Evan Platt wrote:
> At 02:56 AM 8/15/2009, you wrote:
>> How would I create a rule to match when a subject line begins /^Re: /i
>> but the message contains no References or In-Reply-To headers?
>
> Just FYI, I'm on a number of lists where different people insist on
> starting their subject with RE: ... YMMV. :)

Something here..

http://ruleqa.spamassassin.org/?rule=%2FFAKE_REPLY
http://svn.apache.org/repos/asf/spamassassin/trunk/rulesrc/sandbox/fredt/99_zFVGT_FakeReply.cf
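The condition being asked for can be tried out on sample messages before it is written as a rule. This is only a sketch of the logic in Python, not the SpamAssassin rule itself and not a copy of the FAKE_REPLY rules linked above; the function name is made up for illustration.

```python
import re
from email import message_from_string

# Same pattern as in the question: subject begins with "Re: ", any case.
RE_PREFIX = re.compile(r"^Re: ", re.IGNORECASE)


def looks_like_fake_reply(raw_message):
    """True if the subject claims a reply but no threading header exists."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    has_thread_header = (
        msg.get("References") is not None
        or msg.get("In-Reply-To") is not None
    )
    return bool(RE_PREFIX.match(subject)) and not has_thread_header


# Hypothetical test messages.
FAKE = "Subject: RE: your order\n\nBuy now.\n"
REAL = "Subject: Re: your order\nIn-Reply-To: <1@example.org>\n\nThanks.\n"
FRESH = "Subject: your order\n\nHello.\n"

if __name__ == "__main__":
    for name, raw in (("FAKE", FAKE), ("REAL", REAL), ("FRESH", FRESH)):
        print(name, looks_like_fake_reply(raw))
```

As noted in the thread, some legitimate senders type "RE:" into new subjects by hand, so a check like this is better suited to a small score than to outright rejection.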
SV: Subject starts SV: but no References/In-Reply-To
On Sat, 15 Aug 2009 07:12:18 -0700, Evan Platt wrote:
> At 02:56 AM 8/15/2009, you wrote:
>> How would I create a rule to match when a subject line begins /^Re: /i
>> but the message contains no References or In-Reply-To headers?
>
> Just FYI, I'm on a number of lists where different people insist on
> starting their subject with RE: ... YMMV. :)

so let's change the stats :)

-- 
Benny Pedersen
Re: Subject starts Re: but no References/In-Reply-To
At 02:56 AM 8/15/2009, you wrote:
> How would I create a rule to match when a subject line begins /^Re: /i
> but the message contains no References or In-Reply-To headers?

Just FYI, I'm on a number of lists where different people insist on starting their subject with RE: ... YMMV. :)
Re: [Solved] Bad performance of Bayes with MySQL cluster
Henrik K wrote:
> On Sat, Aug 15, 2009 at 09:50:41AM +0200, Jorn Argelo wrote:
>> Henrik K wrote:
>>> On Fri, Aug 14, 2009 at 07:43:37PM +0200, Jorn Argelo wrote:
>>>> Hi All,
>>>>
>>>> I'm running spamassassin 3.2.5 on RHEL 5.3 x86_64. We have three
>>>> boxes, and all three of them are sharing the same bayes DB using a
>>>> MySQL cluster, version 7.0.6 (based on 5.1.34). The cluster has 2
>>>> datanodes with a quadcore and 4 GB of memory. Everything is working
>>>> fine, even the AWL in SQL, except for Bayes. The bayes database
>>>> currently houses a bit less than 500k tokens and the database size
>>>> is not very big either, as the datanodes have less than 1 GB of
>>>> storage in use. I've followed the instructions from the Spamassassin
>>>> wiki, and I also used the supplied bayes_mysql.sql file to create my
>>>> tables. In case anyone is interested, you can find the cluster.ini
>>>> and the my.cnf used on the SQL nodes here:
>>>>
>>>> http://www.wcborstel.com/web/mysql/my.cnf
>>>
>>> skip-innodb
>>>
>>> That's pretty much the reason. You _need_ to use InnoDB as it has row
>>> level locking. MyISAM just kills Bayes.
>>
>> Actually I'm using NDB and not MyISAM. I need a clustered storage
>> engine, otherwise the bayes DB can't really be shared. If I create an
>> InnoDB table on one SQL node, it doesn't show up at the other SQL node,
>> while this is the case with an NDB storage engine.
>
> Ah right sorry.. I have no idea on NDB and how it performs for SA.
>
>> What I can do however, is point all mailservers to one SQL node. I just
>> need to synchronize the bayes_token table to the other SQL node I guess.
>> Do you have an idea about this?
>
> MySQL replication? Maybe search on spamassassin-users archives to find
> experiences.
>
>> Thanks for this, I was not aware of it. Running expiry runs manually is
>> done by sa-learn --force-expire, correct?
>
> Yep.

In case anybody else comes across the same, I've kicked out the MySQL cluster and am now using MySQL with multi-master replication. There we can use InnoDB, and this definitely solved all of the problems I had with bayes. Scan times are now below 1 second. I don't have much load as of yet, so I expect this to increase somewhat during business hours, but all in all things look a lot more promising.

I've used this howto: http://capttofu.livejournal.com/1752.html

Thanks for the pointers, Henrik.

Regards,

Jorn
Subject starts Re: but no References/In-Reply-To
How would I create a rule to match when a subject line begins /^Re: /i but the message contains no References or In-Reply-To headers? -- Mike Cardwell - IT Consultant and LAMP developer Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Re: (no report template found)
Matt Kettler-3 wrote:
> Loren Wilton wrote:
>> There is a standard template that gives the form of the report in the
>> mail message. I don't recall which cf file this is normally in, but
>> it sounds like that file is not being included in the cf files in your
>> configuration.
>>
>> I would check include paths and possibly permissions and the like, as
>> well as any special configuration files or options that may be
>> included by the process you are following.
>
> It normally lives in 10_misc.cf...
>
> However, it is also possible there's a clear_report_template command,
> with no new template declared after it.

I too had this problem and found the solution. Running "sa-compile" created a folder like /var/lib/spamassassin/3xxx which is being picked up by the spamassassin config and causing the 'template not found' error.

Solution:

    spamassassin --lint -D
    sa-compile
    rm -rf /var/lib/spamassassin/3xx
    /etc/rc.d/init.d/spamassassin restart

I hope this helps you guys, it took me a day to figure this one out.
Re: giftcardsurveys.us.com
John Hardin wrote:
> On Thu, 13 Aug 2009, Johnson, S wrote:
>> When I put in the email address of the user that was being sent these
>> survey offers for gift cards I got a message stating please allow 10
>> days for removal which makes me think they are not legit.
>
> That's not necessarily the case. One legitimate reason for claiming a
> delay like that is if a marketing promotion is already underway
> materials may already be in the pipeline. Granted, that's more true of
> physical mail than email, but the procedures in place for electronic
> marketing may have the same latency. It doesn't automatically mean
> they're lying about unsubscribing you as quickly as they practically
> can.
>
> However, I agree it's annoying.

But it's so easy to check if they are lying. Just set up a fake e-mail address that feeds right into your Bayes filter for spam; then, after keying in the user's e-mail address that you want to unsubscribe, submit your feeder address for "unsubscribing". If they are a bona fide spammer, when they see the virgin (to their database) feeder address punched into their unsubscribe link, they will immediately start spamming it.

Ted
OT - my eyes hurt (Re: Barracuda RBL in first place)
On Sat, Aug 15, 2009 at 10:02:52AM +0100, --[ UxBoD ]-- wrote:
> - "Marc Perkel" wrote:
>
> Aaron Wolfe wrote:
>
> On Fri, Aug 14, 2009 at 11:24 AM, Chris Owen wrote:
>
> On Aug 14, 2009, at 10:13 AM, Mike Cardwell wrote:
>
> The comparisons on that page are useless. What matters is list policy,
> reliability and reputation.
>
> SpamHaus is hands down the best dnsbl.
>
> While I certainly agree that SpamHaus is very good, I would argue that
> Invaluement is currently better. It certainly stops a lot more spam
> here and I think false positives are still extremely low.
>
> Invaluement lists are also the top performers at my site:
>
> Total messages:  273235355
> Total blocked:   227710956  83.34%
>
> Unknown user                         32.00% (32.00%)  87427696
> Greylisted                           24.88% (16.92%)  46225401
> Throttled                            11.03%  (5.64%)  15399444
> Relay access denied                   0.01%  (0.00%)      7034
> Bogus DNS (Broadcast)                 0.01%  (0.00%)     11692
> Bogus DNS (RFC 1918 space)            0.07%  (0.03%)     82135
> Spoofed Address                       0.26%  (0.12%)    319551
> Unclassified Event                    0.77%  (0.35%)    949388
> Temporary Local Problem               0.01%  (0.00%)      8165
> Require FQDN sender address           0.04%  (0.02%)     51022
> Require FQDN for HELO hostname        8.97%  (4.02%)  10988455
> Require DNS for sender's domain       0.78%  (0.32%)    870643
> Require Reverse DNS                  23.83%  (9.65%)  26372877
> Require DNS for HELO hostname         0.20%  (0.06%)    165157
> The Spamhaus Block List              21.87%  (6.74%)  18405091
> The Invaluement SIP Block List       22.14%  (5.33%)  14557404
> The SIP/24 Block List                 3.84%  (0.72%)   1965510
> The Barracuda Reputation Block List   3.89%  (0.70%)   1915628
> (several RBLs not widely used snipped)
>
> We have several hundred domains and each can use its own filtering
> options, so not all RBLs/checks are used on all mail. Checks are
> listed in order applied, so a message dropped by "unknown user" for
> instance is never seen by "greylisted".
>
> Invaluement lists block over 25% of all messages that make it past all
> the checks in front of them, including Spamhaus. That's massive.
> Barracuda is not used by a majority of clients and is used after the
> others, so the low number is not an indication of poor performance.
> I've actually had pretty good luck with it.
>
> -Aaron
>
> RANK  RULE NAME              COUNT  %OFMAIL  %OFSPAM  %OFHAM
>    1  URIBL_INVALUEMENT      27029    47.58    85.13    0.60
>    2  RCVD_IN_INVALUEMENT    26116    45.81    82.26    0.22
>    3  HTML_MESSAGE           25184    79.83    79.32   80.48
>    4  BAYES_99               23445    41.09    73.84    0.12
>    5  RCVD_IN_INVALUEMENT24  23290    40.85    73.35    0.18
>    6  URIBL_BLACK            22372    39.49    70.46    0.74
>    7  RCVD_IN_JMF_BL         16845    30.70    53.06    2.74
>    8  URIBL_JP_SURBL         15962    27.99    50.27    0.12
>    9  DKIM_SIGNED            12137    37.32    38.23   36.18
>   10  DKIM_VERIFIED          11051    33.93    34.81   32.84
>
> Chris
>
> Chris Owen - Garden City (620) 275-1900 - Lottery (noun):
> President  - Wichita     (316) 858-3000 -   A stupidity tax
> Hubris Communications Inc   www.hubris.net
>
> Yep Invaluement is a good list. But there's no public option to
> compare it.
>
> What log script do you good people use to generate the list above ? Is it a
> home brew or one we can download so we can compare our own hits ?

A bit OT, but please don't post HTML (Marc!) or send incomprehensible full-message quotes like this. It takes a good while to scroll through and understand all this using mutt.
Re: Barracuda RBL in first place
On 8/15/2009 11:02 AM, --[ UxBoD ]-- wrote:
> RANK  RULE NAME              COUNT  %OFMAIL  %OFSPAM  %OFHAM
>    1  URIBL_INVALUEMENT      27029    47.58    85.13    0.60
>    2  RCVD_IN_INVALUEMENT    26116    45.81    82.26    0.22
>    3  HTML_MESSAGE           25184    79.83    79.32   80.48
>    4  BAYES_99               23445    41.09    73.84    0.12
>    5  RCVD_IN_INVALUEMENT24  23290    40.85    73.35    0.18
>    6  URIBL_BLACK            22372    39.49    70.46    0.74
>    7  RCVD_IN_JMF_BL         16845    30.70    53.06    2.74
>    8  URIBL_JP_SURBL         15962    27.99    50.27    0.12
>    9  DKIM_SIGNED            12137    37.32    38.23   36.18
>   10  DKIM_VERIFIED          11051    33.93    34.81   32.84
>
> Chris
>
> Chris Owen - Garden City (620) 275-1900 - Lottery (noun):
> President  - Wichita     (316) 858-3000 -   A stupidity tax
> Hubris Communications Inc   www.hubris.net
>
> Yep Invalument is a good list. But there's no public option to
> compare it.
>
> What log script do you good people use to generate the list above ? Is it a
> home brew or one we can download so we can compare our own hits ?

http://www.rulesemporium.com/programs/sa-stats.txt
Re: Barracuda RBL in first place
- "Marc Perkel" wrote:
>
> Aaron Wolfe wrote:
>
> On Fri, Aug 14, 2009 at 11:24 AM, Chris Owen wrote:
>
> On Aug 14, 2009, at 10:13 AM, Mike Cardwell wrote:
>
> The comparisons on that page are useless. What matters is list policy,
> reliability and reputation.
>
> SpamHaus is hands down the best dnsbl.
>
> While I certainly agree that SpamHaus is very good, I would argue that
> Invaluement is currently better. It certainly stops a lot more spam
> here and I think false positives are still extremely low.
>
> Invaluement lists are also the top performers at my site:
>
> Total messages:  273235355
> Total blocked:   227710956  83.34%
>
> Unknown user                         32.00% (32.00%)  87427696
> Greylisted                           24.88% (16.92%)  46225401
> Throttled                            11.03%  (5.64%)  15399444
> Relay access denied                   0.01%  (0.00%)      7034
> Bogus DNS (Broadcast)                 0.01%  (0.00%)     11692
> Bogus DNS (RFC 1918 space)            0.07%  (0.03%)     82135
> Spoofed Address                       0.26%  (0.12%)    319551
> Unclassified Event                    0.77%  (0.35%)    949388
> Temporary Local Problem               0.01%  (0.00%)      8165
> Require FQDN sender address           0.04%  (0.02%)     51022
> Require FQDN for HELO hostname        8.97%  (4.02%)  10988455
> Require DNS for sender's domain       0.78%  (0.32%)    870643
> Require Reverse DNS                  23.83%  (9.65%)  26372877
> Require DNS for HELO hostname         0.20%  (0.06%)    165157
> The Spamhaus Block List              21.87%  (6.74%)  18405091
> The Invaluement SIP Block List       22.14%  (5.33%)  14557404
> The SIP/24 Block List                 3.84%  (0.72%)   1965510
> The Barracuda Reputation Block List   3.89%  (0.70%)   1915628
> (several RBLs not widely used snipped)
>
> We have several hundred domains and each can use its own filtering
> options, so not all RBLs/checks are used on all mail. Checks are
> listed in order applied, so a message dropped by "unknown user" for
> instance is never seen by "greylisted".
>
> Invaluement lists block over 25% of all messages that make it past all
> the checks in front of them, including Spamhaus. That's massive.
> Barracuda is not used by a majority of clients and is used after the
> others, so the low number is not an indication of poor performance.
> I've actually had pretty good luck with it.
>
> -Aaron
>
> RANK  RULE NAME              COUNT  %OFMAIL  %OFSPAM  %OFHAM
>    1  URIBL_INVALUEMENT      27029    47.58    85.13    0.60
>    2  RCVD_IN_INVALUEMENT    26116    45.81    82.26    0.22
>    3  HTML_MESSAGE           25184    79.83    79.32   80.48
>    4  BAYES_99               23445    41.09    73.84    0.12
>    5  RCVD_IN_INVALUEMENT24  23290    40.85    73.35    0.18
>    6  URIBL_BLACK            22372    39.49    70.46    0.74
>    7  RCVD_IN_JMF_BL         16845    30.70    53.06    2.74
>    8  URIBL_JP_SURBL         15962    27.99    50.27    0.12
>    9  DKIM_SIGNED            12137    37.32    38.23   36.18
>   10  DKIM_VERIFIED          11051    33.93    34.81   32.84
>
> Chris
>
> Chris Owen - Garden City (620) 275-1900 - Lottery (noun):
> President  - Wichita     (316) 858-3000 -   A stupidity tax
> Hubris Communications Inc   www.hubris.net
>
> Yep Invaluement is a good list. But there's no public option to
> compare it.

What log script do you good people use to generate the list above ? Is it a home brew or one we can download so we can compare our own hits ?
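The two percentage columns in the quoted block statistics are not labelled. The figures are consistent with the first being the share of messages that reached that check and the parenthesised one being the share of all messages; that reading is inferred from the arithmetic, not stated by the poster. A short sketch that recomputes both for the first three rows (the function name is made up for illustration):

```python
def stage_percentages(total, blocked_counts):
    """Return (pct of mail reaching the check, pct of all mail) per check.

    Checks are assumed to run in list order, each one seeing only the
    messages that every earlier check let through.
    """
    remaining = total
    rows = []
    for count in blocked_counts:
        rows.append((round(100 * count / remaining, 2),
                     round(100 * count / total, 2)))
        remaining -= count
    return rows


TOTAL_MESSAGES = 273235355
# Unknown user, Greylisted, Throttled (counts taken from the posted table).
FIRST_THREE = [87427696, 46225401, 15399444]

if __name__ == "__main__":
    for reached, overall in stage_percentages(TOTAL_MESSAGES, FIRST_THREE):
        print(f"{reached:6.2f}% ({overall:.2f}%)")
```

Later rows will not reproduce this way, because, as the post says, not every check is applied to every domain's mail.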
Re: Bad performance of Bayes with MySQL cluster
On Sat, Aug 15, 2009 at 09:50:41AM +0200, Jorn Argelo wrote:
> Henrik K wrote:
>> On Fri, Aug 14, 2009 at 07:43:37PM +0200, Jorn Argelo wrote:
>>
>>> Hi All,
>>>
>>> I'm running spamassassin 3.2.5 on RHEL 5.3 x86_64. We have three
>>> boxes, and all three of them are sharing the same bayes DB using a
>>> MySQL cluster, version 7.0.6 (based on 5.1.34). The cluster has 2
>>> datanodes with a quadcore and 4 GB of memory. Everything is working
>>> fine, even the AWL in SQL, except for Bayes. The bayes database
>>> currently houses a bit less than 500k tokens and the database size
>>> is not very big either, as the datanodes have less than 1 GB of
>>> storage in use. I've followed the instructions from the Spamassassin
>>> wiki, and I also used the supplied bayes_mysql.sql file to create my
>>> tables. In case anyone is interested, you can find the cluster.ini
>>> and the my.cnf used on the SQL nodes here:
>>>
>>> http://www.wcborstel.com/web/mysql/my.cnf
>>>
>>
>> skip-innodb
>>
>> That's pretty much the reason. You _need_ to use InnoDB as it has row level
>> locking. MyISAM just kills Bayes.
>>
> Actually I'm using NDB and not MyISAM. I need a clustered storage
> engine, otherwise the bayes DB can't really be shared. If I create an
> InnoDB table on one SQL node, it doesn't show up at the other SQL node,
> while this is the case with an NDB storage engine.

Ah right sorry.. I have no idea on NDB and how it performs for SA.

> What I can do however, is point all mailservers to one SQL node. I just
> need to synchronize the bayes_token table to the other SQL node I guess.
> Do you have an idea about this?

MySQL replication? Maybe search on spamassassin-users archives to find experiences.

> Thanks for this, I was not aware of it. Running expiry runs manually is
> done by sa-learn --force-expire, correct?

Yep.
Re: Bad performance of Bayes with MySQL cluster
Henrik K wrote:
> On Fri, Aug 14, 2009 at 07:43:37PM +0200, Jorn Argelo wrote:
>> Hi All,
>>
>> I'm running spamassassin 3.2.5 on RHEL 5.3 x86_64. We have three
>> boxes, and all three of them are sharing the same bayes DB using a
>> MySQL cluster, version 7.0.6 (based on 5.1.34). The cluster has 2
>> datanodes with a quadcore and 4 GB of memory. Everything is working
>> fine, even the AWL in SQL, except for Bayes. The bayes database
>> currently houses a bit less than 500k tokens and the database size
>> is not very big either, as the datanodes have less than 1 GB of
>> storage in use. I've followed the instructions from the Spamassassin
>> wiki, and I also used the supplied bayes_mysql.sql file to create my
>> tables. In case anyone is interested, you can find the cluster.ini
>> and the my.cnf used on the SQL nodes here:
>>
>> http://www.wcborstel.com/web/mysql/my.cnf
>
> skip-innodb
>
> That's pretty much the reason. You _need_ to use InnoDB as it has row
> level locking. MyISAM just kills Bayes.

Actually I'm using NDB and not MyISAM. I need a clustered storage engine, otherwise the bayes DB can't really be shared. If I create an InnoDB table on one SQL node, it doesn't show up at the other SQL node, while this is the case with an NDB storage engine.

What I can do however, is point all mailservers to one SQL node. I just need to synchronize the bayes_token table to the other SQL node I guess. Do you have an idea about this?

>> Now the problem at the first glance seems to be, from my perspective
>> (please correct me if I'm wrong), the actual queries being done. For
>> every mail being scanned by spamassassin, it seems to be doing the
>> "SELECT RPAD(token, 5, ' '), spam_count, ham_count, atime FROM
>> bayes_token" query every time. This is effectively requesting the
>> entire bayes_token table.
>
> What you are seeing are expiry runs. As you right now use MyISAM, the
> whole table is locked for such operations so you are pretty much hosed.
> In any case, you should use "bayes_auto_expire 0" and run expire for
> example once every night when traffic is slower.

Thanks for this, I was not aware of it. Running expiry runs manually is done by sa-learn --force-expire, correct?

>> It seems that the query cache is either not suitable for this or I am
>> doing something majorly wrong :)
>
> You are right. Better to disable completely if there's nothing else
> running that uses it and save little CPU.

Good to know. There will be other applications running on it as well so I'll reduce the size of the query cache for a good bit.

Thanks a lot for your feedback.

Jorn
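The advice to disable automatic expiry and expire once a night could look roughly like the fragment below. It is a sketch under assumptions: the option is spelled `--force-expire` in the sa-learn documentation, and the schedule and the choice of account running the job are placeholders to adapt to the local setup.

```
# local.cf (SpamAssassin): no opportunistic expiry runs during scanning
bayes_auto_expire 0

# crontab entry for the account that has access to the Bayes database;
# 03:30 is only an example of a quiet hour
30 3 * * * sa-learn --force-expire
```

With a shared SQL-backed Bayes database, one scheduled job on a single host should be enough; running it from every mail server would only repeat the same expiry.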