Re: translation help please
On 24.11.2006 at 04:22, Chris wrote: This was tossed into my spam folder tonight but it was during my NANAS report run. I'm not sure if it's a reply from abuse@ or just a spam: Neither. It's instructions on how to use the website galeon.com: configuring the browser to work with cookies, etc. Charlie -- Charlie Clark Helmholtzstr. 20 Düsseldorf D-40215 Tel: +49-211-938-5360 GSM: +49-178-782-6226
Interesting text content in the new spams
Looks like there are some pretty impressive self-learning systems out there. I'm enclosing the content of the text part of a new spam. I think it's quite an interesting vocabulary that they are using, presumably from their own trained ham database. This spam got through four different checks (postfix + blacklisting, spamassassin, spambayes and Opera's own spam system)! Give them a couple of years and we can finally close slashdot et al. and actually start reading this stuff! ;-) Charlie
Raquo Areas Bugs. Open total a bug Tracking Support or Requests in Tech Patches. Release archive is raquo of Areas? Framework gd Engine Details Developers Beta Intended Audience. In Create Newscreate Farm Mapcreate or Projectnew am Wantedmy? Statistics currently Browse Most! Of feeds available for this About by or the from. Activity Percentile last week View list of feeds available is. Language a License gnu of. Patches Patch Feature a Request. Details Developers Beta Intended Audience Education Technology. Education Technology or Other Topic English Unix name Registered. Language License gnu? Va Software Ostg Source Group all Rights Reserved or Find. Projectnew Wantedmy Statussite is. Areas in Bugs open total bug Tracking Support. Va Software Ostg Source Group all Rights Reserved or Find. Bug or Tracking Support Requests or Tech Patches am Patch in. Audience or Education Technology Other Topic English Unix. Support in Requests Tech Patches Patch Feature Request. Kolmafia sw Test Automation Framework gd. System of os Written an language of License gnu General Public. License gnu General Public gpl. Create Newscreate is Farm of Mapcreate Projectnew am Wantedmy Statussite Status web! Sprites a Release archive raquo of Areas Bugs? Open total a bug Tracking Support or Requests in Tech Patches. Book Search is Advanced log in Create is. Va Software Ostg Source Group in all Rights. Latest a News new or Graphics and Sprites Release archive. Va Software Ostg Source Group in all Rights. Intended Audience Education.
Re: Greylisting
On 21.11.2006 at 01:12, John Andersen wrote: On Monday 20 November 2006 15:08, Rick Macdougall wrote: It's possible that they could send it all twice but I've never seen it. Remember that some unbelievable number of infected Windows clients are the main source of spam and it would just be too much trouble for the spammer to try every address twice after a 15 minute interval. Oh come on! It costs the spammer NOTHING to make that adjustment to his botnet. It's someone else's bandwidth, and someone else's CPU cycles. They are reading this list and planning the changes already. Of course! Spam and Spamassassin is the ultimate game of cops and robbers! I'm sure the best spammers continually update the rules and run their own tests against them to develop new mails which get through. Despite everyone's best efforts we are fighting a losing battle with a solution that does not tackle the botnet problem at source, but for that to happen things might have to get a whole lot worse! :-/ Charlie
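For readers unfamiliar with the mechanism being argued over, here is a minimal sketch of greylisting in Python. It is an illustration only, not how any particular MTA or policy daemon implements it: the class name, the in-memory dictionary and the return strings are all hypothetical, and the 15 minute delay is simply the interval mentioned in the thread. It also shows John's point: a bot that does retry after the delay gets through.

```python
import time
from typing import Dict, Optional, Tuple

# (client IP, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]


class Greylist:
    """Temp-fail the first delivery attempt of an unknown triplet."""

    def __init__(self, min_delay: float = 15 * 60) -> None:
        self.min_delay = min_delay                # seconds a sender must wait
        self.first_seen: Dict[Triplet, float] = {}

    def check(self, ip: str, sender: str, rcpt: str,
              now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        key = (ip, sender.lower(), rcpt.lower())
        # Remember when this triplet was first seen; keep the old value if known.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"      # a retry after the delay is let through
        return "tempfail"        # the MTA would answer with a 4xx code here
```

A real deployment would also expire old entries and whitelist known good senders; both are left out to keep the sketch short.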
Re: blarsbl
On 21.11.2006 at 17:53, Thomas Lindell wrote: Att mail servers use his service. Which means I can't send to mediacom which is an att partner. I couldn't believe att used his service. What's odd is that my company uses att backhaul bandwidth in the form of 4 t1's. Grr, the whole thing is frustrating. The guy's a moron but I think his disclaimer lets him off: "The BlarsBL is maintained by Blars at his wim. Use for any purpouse should be done at your own risk, and Blars is not responsible for use by anyone but himself." While he is under no compulsion to remove an address I think his demand for money is ludicrous. If this is held under the right nose at ATT or Mediacom it should produce the right reaction. But this and other issues do pose the question: how easy is it going to be for spammers to start using blocking lists against normal users? Charlie
Re: would SA benefit from port to Java
On 17.11.2006 at 20:36, Eric A. Hall wrote: Thinking about the GPL Java announcement some, and trying to imagine the kinds of opportunities this allows for, it occurs to me that SpamAssassin might be a natural fit for Java. Why on earth do you come to that conclusion and what does Java going GPL have to do with it? I'm just thinking out loud here, not advocating anything... At best you are speculating rather than thinking. Would it run better? Would it be faster, have smaller memory footprint, better reclamation, better hooks for plugins etc? OTOH, would it be harder to build, given the dependence of SA on perl modules? Please do some research on programming languages and domains because one size almost never fits all. While I personally very much dislike perl, it is extremely well-suited to this task: text-centric, rapidly changing. SA was the first out there, has a large body of active developers and is extensible by rules. Charlie
Endusers and spam
Dear list, this is an obvious question but not part of the FAQs, or at least I couldn't find it! What is the best way of getting end users to identify spam getting through so that it can be learned? I have so far set up an extra account to which the e-mail is forwarded and then tell Spamassassin to learn from this, but I'm worried about the extra headers and formatting that are added when forwarding. The mail accounts are all mbox format so it also isn't possible to pass the individual messages in to be learnt. Thanks Charlie
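One possible way around the extra headers, sketched here under the assumption that users forward the spam as an attachment (so the original arrives intact as a message/rfc822 part) rather than inline: pull the originals back out of the reporting account's mbox with Python's standard `mailbox` module before learning. The function name and the workflow are hypothetical and not something SpamAssassin prescribes.

```python
import mailbox
from email.message import Message
from typing import Iterator


def iter_forwarded_originals(mbox_path: str) -> Iterator[Message]:
    """Yield every message found as a message/rfc822 attachment in an mbox."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for wrapper in box:
            for part in wrapper.walk():
                if part.get_content_type() != "message/rfc822":
                    continue
                payload = part.get_payload()
                # A parsed message/rfc822 part holds a one-element list
                # containing the embedded message.
                if isinstance(payload, list) and payload:
                    yield payload[0]
    finally:
        box.close()
```

Each yielded message can be written to its own file with `as_bytes()` and handed to `sa-learn --spam`. Note also that `sa-learn` accepts an `--mbox` option, so an mbox file that contains only unmodified spam can be learned from directly.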
Re: Distributed Bayes DB?
On 11.11.2006 at 10:48, Matthias Leisi wrote: I already took a look at using SQL, but this quote: | NB: This should be considered BETA, and the interface, schema, or | overall operation of SQL support may change at any time with future | releases of SA. stops me from using it. Unfortunately, I can not run software officially considered Beta on this system. I suppose you could use something like NFS so that all systems share the same DB, config files, etc. Use a SQL server backend. If you must have a no-failure option for the bayes DB, use a cluster of SQL servers. Example with mysql: http://www.howtoforge.com/loadbalanced_mysql_cluster_debian I suppose that every message passed through SpamAssassin will issue at least one query and one update statement to the DB. How does a MySQL cluster perform with 500'000 messages per day, considering that replication must also take place? How long is a piece of string? 500,000 queries per day shouldn't cause any problems for an RDBMS but the architecture of such a system should be given a bit of consideration - connection pooling et al. There is in fact a mail system that uses PostgreSQL to store all the mails. If you want more information on requirements, speed, etc. I'm pretty sure you could run Spamassassin on top of it. What is the best practice in that regard with Spamassassin? Using SQL is by far the best practice here. I do not see many mentions of the SQL approach - either because it is not used much or because it works so well? Probably the former. And you're right not to use something like the SQL backend for a large volume production system. Not because it's unreliable but because it's still in development and keeping the schema up to date could become a real headache. I suspect that at some point it might make sense to use something like SQLite for persistence (because it's relatively easy to distribute) which would make using alternative backends relatively easy. Charlie
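To put the 500,000 messages per day into perspective, the average statement rate can be worked out in a few lines. This is a back-of-the-envelope sketch only: it assumes Matthias's lower bound of one query plus one update per message and an even spread over the day. The real load is likely higher, since Bayes works per token and mail arrives in bursts.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def statements_per_second(messages_per_day: int,
                          statements_per_message: int = 2) -> float:
    """Average SQL statements per second for a given daily message volume."""
    return messages_per_day * statements_per_message / SECONDS_PER_DAY


avg = statements_per_second(500_000)
print(f"{avg:.1f} statements/second on average")  # prints 11.6 statements/second on average
```

Under these assumptions the average works out to fewer than a dozen statements per second.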
Re: Distributed Bayes DB?
On 11.11.2006 at 11:47, Matt Kettler wrote: I suppose you could use something like NFS so that all systems share the same DB, config files, etc. NFS would be HIGHLY not-recommended. http://article.gmane.org/gmane.mail.spam.spamassassin.general/72362/match=sql In fact, I personally would suggest never using NFS for anything at all, and I'm shocked that you'd even consider using it for any production purpose. NFS or equivalent has its place and can be made safe enough if required but I think other issues like concurrent access suggest that the SQL approach is the way to go. Besides, the point here is to eliminate any single-point-of-failure. NFS would offer no redundancy at all. If the server hosting the NFS share went down, the bayes DB would be unavailable. Agreed. I do not see many mentions of the SQL approach - either because it is not used much or because it works so well? Probably the former. And you're right not to use something like the SQL backend for a large volume production system. Not because it's unreliable but because it's still in development and keeping the schema up to date could become a real headache. But it's not still in development. It's the recommended configuration as of 3.1.0. SA's SQL support is solid. I personally don't use it, but many here do. Yes, sorry, I should have read all e-mails relating to the thread first. Charlie
Re: sa-update rules for SA 3.1.7 have been updated but they fail lint
On 11.11.2006 at 01:18, Daryl C. W. O'Shea wrote: Justin Mason wrote: Randal, Phil writes: I've just run sa-update -D and it's failed with return code 4. update 473327: config: warning: score set for non-existent rule PART_CID_STOCK config: warning: score set for non-existent rule PART_CID_STOCK_LESS As a result, the rules get rolled back. oops. now fixed. OK, try it soonish (it may take a few minutes for the mirrors to update and the cached DNS TXT record to expire). Ha! Remember what I said about feeling unlucky? ;) Rule #1 - Let someone else ask the really stupid question for you first! Thanks, this was biting me, too, and I saw it had been fixed in the past! tsk, tsk ;-) re. my own problems: it looks like things have settled down. Charlie
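For anyone running sa-update from a script, the return code is worth checking. The sketch below maps the codes as I understand them from the sa-update documentation: 0 for an installed update, 1 for no fresh updates, 4 or higher for errors. The function name is hypothetical, and the meaning of other codes varies between versions, so treat the mapping as an assumption to verify against your own man page.

```python
def describe_sa_update_exit(code: int) -> str:
    """Translate an sa-update exit code into a short human-readable status."""
    if code == 0:
        return "update downloaded and installed"
    if code == 1:
        return "no fresh updates available"
    if code >= 4:
        # This is the case reported in this thread: the update failed
        # and the previous rules stayed in place.
        return "error: update failed, existing rules kept"
    return "unrecognised code, check the sa-update documentation"
```

In a cron wrapper, the code could come from `subprocess.run(["sa-update"]).returncode`, with spamd restarted only when the result is 0.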
Re: Problem with spamd
On 09.11.2006 at 02:10, Daryl C. W. O'Shea wrote: Charlie Clark wrote: Looks like I'm on top of the resources problem but I am getting 421 delivery errors even though the e-mails are coming through. This looks very similar to bug 3828 (which is Spamassassin + Exim). Except this bug should have been closed a long time ago. Without looking at the bug, it sounds like you're saying that Exim temp fails messages when a filter (SA) isn't available to filter the message in time. If that's the case it's sensible for that to happen. Indeed it is. I just don't understand why it is happening on this machine which has a very low load. The strange thing is these errors never occurred before last week and having just upgraded to 3.1.7 I would hope to have a system including all relevant bug fixes. Of course, as Theo said, it might simply be easier to stop using spamd and just call spamassassin but it might also be helpful to track down the problem. Should I jump on the back of the old bug or make a new submission? Have you actually looked into making sure that you're not experiencing an expiry issue (like the expiry timing out and never completing), as Theo suggested you do right off the bat? No, and I'll admit to not really understanding exactly what you mean. Where can I check and if necessary change this? Charlie
Re: Problem with spamd
On 09.11.2006 at 19:27, Daryl C. W. O'Shea wrote: If your one and only child is busy doing an expire it can't scan messages too. Ah, so I could increase the number of children running to do this? The strange thing is these errors never occurred before last week and having just upgraded to 3.1.7 I would hope to have a system including all relevant bug fixes. Of course, as Theo said, it might simply be easier to stop using spamd and just call spamassassin but it might also be helpful to track down the problem. Should I jump on the back of the old bug or make a new submission? Have you actually looked into making sure that you're not experiencing an expiry issue (like the expiry timing out and never completing), as Theo suggested you do right off the bat? No, and I'll admit to not really understanding exactly what you mean. Where can I check and if necessary change this? Disable bayes_auto_expire in your local.cf and run an expire manually (and then set it up as a cron job) by running sa-learn --force-expire as the user that SA normally runs as (if SA runs as more than one user, run it for all the users it runs as). It's probably going to take a considerable amount of time for it to run... let it finish, it will eventually. bayes_auto_expire isn't actually in my local.cf so I've added it as bayes_auto_expire 0. It also strikes me that I can probably enable trusting the localhost on this machine - does this mean that spamassassin will not bother checking e-mail sent via the local SMTP? Thank you very much for your help! Charlie
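Daryl's advice (set `bayes_auto_expire 0` in local.cf, then run `sa-learn --force-expire` once per user that SA runs as) could be wrapped for cron roughly as follows. This is a sketch under assumptions: switching users with `sudo -u` is my choice, not something from the thread, and the user name in the usage note is simply the one that appears in the bayes.lock path elsewhere in this thread.

```python
import subprocess
from typing import List, Sequence


def expire_command(user: str) -> List[str]:
    """Build the command that forces a Bayes expiry for one user."""
    # Assumption: sudo is available and permitted for this switch of user.
    return ["sudo", "-u", user, "sa-learn", "--force-expire"]


def force_expire(users: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Run (or, with dry_run, only return) the expiry command for each user."""
    commands = [expire_command(user) for user in users]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if sa-learn fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `force_expire(["web1p2"])` only returns the command list; pass `dry_run=False` to actually run it.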
Re: Problem with spamd
On 09.11.2006 at 20:35, Daryl C. W. O'Shea wrote: Charlie Clark wrote: On 09.11.2006 at 19:27, Daryl C. W. O'Shea wrote: If your one and only child is busy doing an expire it can't scan messages too. Ah, so I could increase the number of children running to do this? You could (running at least 2 children, if you've got the resources to do it, isn't a bad idea), but it sounds like you've either got a lot of individual bayes databases to expire or the expiry of one or more of the databases is never being allowed to complete. I think I have the resources for more children. It's not a lot of mail going through the system but I think the network connection often seems to have problems. You're best off disabling auto expire and doing it manually. I ran it manually and it only took a couple of seconds, so I think that now that the performance issue has gone, this will hopefully go away eventually. Disable bayes_auto_expire in your local.cf and run an expire manually (and then set it up as a cron job) by running sa-learn --force-expire as the user that SA normally runs as (if SA runs as more than one user, run it for all the users it runs as). It's probably going to take a considerable amount of time for it to run... let it finish, it will eventually. bayes_auto_expire isn't actually in my local.cf so I've added it as bayes_auto_expire 0. Yeah, it's on by default, that's how you disable it. It also strikes me that I can probably enable trusting the localhost on this machine - does this mean that spamassassin will not bother checking e-mail sent via the local SMTP? That's not at all what it means. If you need help configuring that, search the archives or start another thread. Got it sussed; now all I need to do is tell Exim to unfreeze its queue... Charlie
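Why a single child hurts can be shown with a toy model. The sketch below is not how spamd schedules work internally: it simply assumes one child is tied up by an expire from time 0 and that each message goes to whichever child is free first. The durations (a 600 second expire, 2 seconds per scan) are made-up numbers for illustration.

```python
import heapq
from typing import List, Sequence


def scan_waits(children: int, expire_seconds: float,
               arrivals: Sequence[float],
               scan_seconds: float = 2.0) -> List[float]:
    """Return how long each arriving message waits before a child scans it."""
    if children < 1:
        raise ValueError("need at least one child")
    # One child is busy with the expire; any others start out idle.
    free_at = [expire_seconds] + [0.0] * (children - 1)
    heapq.heapify(free_at)
    waits = []
    for arrival in sorted(arrivals):
        earliest = heapq.heappop(free_at)     # child that frees up first
        start = max(arrival, earliest)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + scan_seconds)
    return waits
```

With one child, `scan_waits(1, 600, [10, 20])` gives waits of 590 and 582 seconds, long enough for an MTA to give up and temp-fail the message. With two children the same arrivals wait 0 seconds.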
Problem with spamd
Hi, about a week ago my server started experiencing load problems and eventually closed all connections. It is running at an ISP and has lots of software preconfigured, including SpamAssassin configured by the ISP. There are currently two problems: spamd is nearly monopolising the CPU, and the tcprcvbuf eventually gets used up; I suspect the two are related. As I did not configure the system I am having to work my way through it, but it looks like a default install. I could not find anything in the FAQ relating to this specifically apart from the reference to max-children (set to 1 in this case). It doesn't look like there are a lot of e-mails to process. The setup is Debian with spamd being called as an Exim transport. These are the active rules:

vs171127:/usr/share/spamassassin# ls -l
total 552
-rw-r--r-- 1 root root   6013 Jun 30 2005 10_misc.cf
-rw-r--r-- 1 root root   1600 Jun 30 2005 20_anti_ratware.cf
-rw-r--r-- 1 root root   8193 Jun 30 2005 20_body_tests.cf
-rw-r--r-- 1 root root   1608 Jun 30 2005 20_compensate.cf
-rw-r--r-- 1 root root  12078 Jun 30 2005 20_dnsbl_tests.cf
-rw-r--r-- 1 root root  15695 Jun 30 2005 20_drugs.cf
-rw-r--r-- 1 root root  11263 Jun 30 2005 20_fake_helo_tests.cf
-rw-r--r-- 1 root root  27706 Jun 30 2005 20_head_tests.cf
-rw-r--r-- 1 root root  15482 Jun 30 2005 20_html_tests.cf
-rw-r--r-- 1 root root  10934 Jun 30 2005 20_meta_tests.cf
-rw-r--r-- 1 root root  22094 Jun 30 2005 20_phrases.cf
-rw-r--r-- 1 root root   4961 Jun 30 2005 20_porn.cf
-rw-r--r-- 1 root root  14134 Jun 30 2005 20_ratware.cf
-rw-r--r-- 1 root root   5027 Jun 30 2005 20_uri_tests.cf
-rw-r--r-- 1 root root   2329 Jun 30 2005 23_bayes.cf
-rw-r--r-- 1 root root   9112 Jun 30 2005 25_body_tests_es.cf
-rw-r--r-- 1 root root   2733 Jun 30 2005 25_hashcash.cf
-rw-r--r-- 1 root root   2299 Jun 30 2005 25_spf.cf
-rw-r--r-- 1 root root   4698 Jun 30 2005 25_uribl.cf
-rw-r--r-- 1 root root  52288 Jun 30 2005 30_text_de.cf
-rw-r--r-- 1 root root  40677 Jun 30 2005 30_text_fr.cf
-rw-r--r-- 1 root root  57934 Jun 30 2005 30_text_nl.cf
-rw-r--r-- 1 root root  34798 Jun 30 2005 30_text_pl.cf
-rw-r--r-- 1 root root  29369 Jun 30 2005 50_scores.cf
-rw-r--r-- 1 root root   6882 Jun 30 2005 60_whitelist.cf
-rw-r--r-- 1 root root    939 Jun 30 2005 65_debian.cf
-rw-r--r-- 1 root root 101479 Jun 30 2005 languages
-rw-r--r-- 1 root root  18944 Jun 30 2005 triplets.txt
-rw-r--r-- 1 root root   1531 Jun 30 2005 user_prefs.template

This is from top:

3796 web1p239 15 46764 42m 4252 R 65.2 0.7 144:06.98 spamd

and this is a check of the tcprcvbuf use:

tcprcvbuf481548189607218840243681759 148967

(machine was rebooted this morning) Is it possible to get more information from spamd about why it's taking so long? Thanks for any help. Charlie
Re: Problem with spamd
On 08.11.2006 at 18:43, Theo Van Dinter wrote: On Wed, Nov 08, 2006 at 06:38:19PM +0100, Charlie Clark wrote: 2006-11-08 17:31:00 [9733] i: debug: refresh: 9733 refresh /home/confixx/web1p2/.spamassassin/bayes.lock Is this standard behaviour? It seemed okay when the lock is acquired but seems to spend most of its time actually refreshing the lock. It's ok if it's doing something to the DB, you want the lock refreshed. I'm guessing you're seeing a bayes expiry. Okay, seems to have calmed down now. I wonder if that's related to the fact that I seem to be having problems sending e-mail: The address to which the message has not yet been delivered is: [EMAIL PROTECTED] Delay reason: Connection timed out Presumably because my buffers have been filled. I've restarted Exim in the hope that will help but I wonder what's causing this in the first place - what is screwing my SMTP server? It really doesn't look like it should be that busy but I don't really know where I should be looking! Charlie
Re: Problem with spamd
On 08.11.2006 at 20:51, François Rousseau wrote: max-children (set to 1 in this case). Why 1??? That's the default for servers run by this ISP. Do you have a suggestion? How many emails do you receive per day? (or per minute???) Excluding spam it's probably less than 50 per day for all accounts on this server! So there shouldn't ever be a problem. I *think* that the changes I've made today, including restarting Exim, seem to be working. The problem may have been related to one account getting full and not accepting any new mail but I don't find this particularly convincing as a reason for the mail server running out of resources. Charlie
Re: Problem with spamd
On 08.11.2006 at 22:45, Theo Van Dinter wrote: On Wed, Nov 08, 2006 at 10:18:53PM +0100, Charlie Clark wrote: How many emails do you receive per day? (or per minute???) Excluding spam it's probably less than 50 per day for all accounts on this server! So there shouldn't ever be a problem. I *think* that the changes I've made today, including restarting Exim, seem to be working. If you only receive 2-3 messages per hour, just run spamassassin and don't bother with spamc/spamd. Why have another daemon? I didn't set this up originally and I generally try to follow the rule of messing with the system as little as possible. That said, I've extended the local.cf file, which had virtually no directives, and am in the process of upgrading from 3.0.3 to 3.1.7. I'm not pleased with my ISP for taking over a week to investigate the initial complaint while I'm actually using the trouble ticket to annotate the changes I make! Charlie
Re: Problem with spamd
On 08.11.2006 at 23:00, Charlie Clark wrote: On 08.11.2006 at 22:45, Theo Van Dinter wrote: On Wed, Nov 08, 2006 at 10:18:53PM +0100, Charlie Clark wrote: How many emails do you receive per day? (or per minute???) Excluding spam it's probably less than 50 per day for all accounts on this server! So there shouldn't ever be a problem. I *think* that the changes I've made today, including restarting Exim, seem to be working. If you only receive 2-3 messages per hour, just run spamassassin and don't bother with spamc/spamd. Why have another daemon? I didn't set this up originally and I generally try to follow the rule of messing with the system as little as possible. That said, I've extended the local.cf file, which had virtually no directives, and am in the process of upgrading from 3.0.3 to 3.1.7. I'm not pleased with my ISP for taking over a week to investigate the initial complaint while I'm actually using the trouble ticket to annotate the changes I make! Looks like I'm on top of the resources problem but I am getting 421 delivery errors even though the e-mails are coming through. This looks very similar to bug 3828 (which is Spamassassin + Exim). Except this bug should have been closed a long time ago. The strange thing is these errors never occurred before last week and having just upgraded to 3.1.7 I would hope to have a system including all relevant bug fixes. Of course, as Theo said, it might simply be easier to stop using spamd and just call spamassassin but it might also be helpful to track down the problem. Should I jump on the back of the old bug or make a new submission? Charlie
Re: Rule for raw HTML
On 09.11.2006 at 01:18, Ron wrote: A few spams have slipped by that contain HTML that is appearing as normal text (due to them not getting something right). For example: and you may have<BR>contempt seemed abundantly increasing with the length of his second speech, and at the end of it he<BR>and the mortification of kitty Is there a rule that will catch HTML-like tags that are not in the right MIME type section? I also see this a lot with <A HREF=...> links. I can't see the need for an extra rule for this as it should be caught by the Bayesian rules after the very briefest of training. That the HTML doesn't display correctly is par for the course for spam, which almost by definition does not play by the rules. Charlie
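To make Ron's question concrete, here is what such a check might look like outside SpamAssassin, using only Python's standard `email` package. It is not an SA rule and not a recommendation to replace Bayes training: the function name and the short list of tag names are assumptions chosen for illustration.

```python
import re
from email.message import Message

# Assumption: a small hand-picked set of tags that rarely occur in real plain text.
TAG_RE = re.compile(r"</?(?:br|a|p|div|font|img|table|tr|td|span)\b[^>]*>",
                    re.IGNORECASE)


def count_html_tags_in_plain_text(msg: Message) -> int:
    """Count HTML-like tags that appear inside text/plain parts of a message."""
    hits = 0
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)   # bytes, or None for containers
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:                        # unknown charset name
            text = payload.decode("latin-1")
        hits += len(TAG_RE.findall(text))
    return hits
```

A non-zero count on a message whose only body part is text/plain would be the pattern Ron describes; whether that deserves a score of its own is exactly the judgement call discussed above.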