United-MAP spam flood
Hello Folks,

Did you also get a lot of spam from United-MAP, "a dynamic company with rapid development, with a united team of professionals in its core"? :) Or maybe this new spam flood only targets Poland?

Here are a few spam samples:

http://pastebin.com/m178f4a58
http://pastebin.com/m6d07f79d
http://pastebin.com/m477546b9

My best regards,
Pawel
Re: maildrop+spamc: invalid usage
Петров Николай wrote:

> Hi, all!
> I have been trying for days to get spamc working through a Unix socket with maildrop, but without success. Please help me resolve the problem!

Hello Nicolas! ;)

Did you try TCP/IP mode too? I have never used the socket mode of spamc. We only use TCP/IP mode and have no problems.

> When I have an incoming message, there is nothing about 'spam' in the maillog, only '...spamc[2183]: invalid usage...'

I've looked at the spamc/spamc.c file and I can see the following piece of code there:

    int
    read_args(int argc, char **argv, int *max_size, char **username,
              struct transport *ptrn)
    {
    #ifndef _WIN32
        const char *opts = "-BcrRd:e:fyp:t:s:u:xSHU:ElhV";
    #else
        const char *opts = "-BcrRd:fyp:t:s:u:xSHElhV";
    #endif
        int opt;
        int ret = EX_OK;

        while ((opt = getopt(argc, argv, opts)) != -1) {
            switch (opt) {
            // [...]
            case '?':   /* getopt(): unknown option */
            case ':':   /* getopt(): option is missing its argument */
                {
                    libspamc_log(flags, LOG_ERR, "invalid usage");
                    ret = EX_USAGE;
                    /* FALLTHROUGH */
                }
            // [...]
            }
        }
        return ret;
    }

It seems that your maildrop runs spamc with strange arguments. Sorry, but I don't know why.

My best regards,
Pawel
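To find out which argument triggers "invalid usage", the option string quoted above can be fed to any getopt implementation. The following is only a rough sketch in Python, not spamc itself: it reuses the non-Windows option string with the leading "-" dropped (that is a GNU getopt ordering flag this sketch does not need), and the helper name `first_bad_option` is made up for illustration.

```python
import getopt

# Non-Windows option string from the quoted read_args(), without the leading "-".
SPAMC_OPTS = "BcrRd:e:fyp:t:s:u:xSHU:ElhV"

def first_bad_option(args):
    """Return the option letter that getopt rejects, or None if all parse."""
    try:
        getopt.getopt(args, SPAMC_OPTS)
    except getopt.GetoptError as err:
        return err.opt
    return None

print(first_bad_option(["-d", "127.0.0.1", "-p", "783"]))  # all options known
print(first_bad_option(["-U"]))   # -U is declared as "U:", so it needs a value
print(first_bad_option(["-Z"]))   # -Z is not in the option string
```

Pasting the exact spamc command line from the maildrop filter into such a check should show quickly whether an option is unknown or is missing its value.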
Re: [NEW SPAM FLOOD] www.shopXX.net
On Fri, 2009-07-10 at 16:48 -0700, fchan wrote:

> Don't tempt them, I already get enough spam, and not only from these guys. They will also flood the network with useless SMTP connections, and unless you have a good network attack mitigation system to keep that from becoming a DDoS, don't tempt them.

Please don't be afraid, and help to beat them. Do you only update your local rules? I think that's not a sufficient reaction. We should also send abuse reports to the spammers' Internet providers. They have to shut down that website.

P.
Never ending spam flood www.viaXX.net?
Hi,

Because of the Apache.org spam filters I can't send my message about the spammers here again:

Jul 9 22:32:07 hermes2 courieresmtp: id=00174B77.4A5653AA.7F82,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org: 552 spam score (15.4) exceeded threshold
Jul 9 22:32:07 hermes2 courieresmtp: id=00174B77.4A5653AA.7F82,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status: failure
[...]
Jul 10 10:48:59 hermes1 courieresmtp: id=000B43A2.4A57005C.346D,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org: 552 spam score (15.4) exceeded threshold
Jul 10 10:48:59 hermes1 courieresmtp: id=000B43A2.4A57005C.346D,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status: failure

Please see my initial post on Pastebin:

http://pastebin.com/f6a83e9fb

My best regards,
Pawel
Re: Never ending spam flood www.viaXX.net?
Terry Carmen wrote:

>> Hi,
>> Because of Apache.org spam filters I can't send here my message about spammers again:
>> . . .
>> http://pastebin.com/f6a83e9fb
>
> I'm new to this list, and may be missing something obvious, but this looks like a great candidate for a firewall DROP rule.

Hi Terry,

Welcome here! :)

> Is there any reason you don't just drop the packets instead of wasting time deciding if they're spam?

The few IP addresses I pasted belong to the web drug store that sells Viagra and other medicines for men with erection issues. The spam flood advertises that shop, but we receive the unsolicited messages from infected Windows machines, compromised or buggy webmails, etc. all over the world, so dropping packets from the shop's addresses would not stop the mail.

My best regards,
Pawel
Re: [NEW SPAM FLOOD] www.shopXX.net
On Sat, 2009-07-11 at 00:18 +0200, Paweł Tęcza wrote:

> I received very similar spam too. It also includes the www.ma29. net domain.

It's probably a personal dedication from the spammers to me ;) Thank you! I know you're watching this mailing list. Hey spammers! ;)

It's after midnight here, but I've updated my rules. So you have to think up something new.

P.
Re: AE_MEDS35 does not more work...
Michelle Konzack wrote:

> On 2009-07-02 15:18:16, John Hardin wrote:
>> Can you post the original raw message to a pastebin, please?
>
> I am on GSM (O2) and not able to upload to pastebin (I can view contents but not upload).
>
> I will try to upload it to http://devel.debian.tamay-dogan.net/tmp/spamassassin/

Hello,

$ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.00.msg
...
$ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.11.msg

$ spamassassin -D < non_working_sa.00.msg > non_working_sa.00.log 2>&1
...
$ spamassassin -D < non_working_sa.00.msg > non_working_sa.11.log 2>&1

$ grep "ran body rule LOCAL_BODY_WWW_MEDSXX_NET" non_working_sa.*.log
non_working_sa.00.log:[16376] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.01.log:[17726] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.02.log:[21854] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.10.log:[22118] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.11.log:[22291] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net

I probably have an older version of John's regexp, and as you can see above it works very well for me.

# Thanks to John Hardin! :)
body LOCAL_BODY_WWW_MEDSXX_NET /\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
score LOCAL_BODY_WWW_MEDSXX_NET 5.0
describe LOCAL_BODY_WWW_MEDSXX_NET (www medsXX net) spam

Kind regards,
P.
Re: AE_MEDS35 does not more work...
Paweł Tęcza wrote:

> Hello,
>
> $ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.00.msg
> ...
> $ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.11.msg
>
> $ spamassassin -D < non_working_sa.00.msg > non_working_sa.00.log 2>&1
> ...
> $ spamassassin -D < non_working_sa.00.msg > non_working_sa.11.log 2>&1
>                                     ^^

The input file in the last command should be non_working_sa.11.msg, of course. It's only a typo; I've checked all your spam samples.

P.
Re: [NEW SPAM FLOOD] www.shopXX.net
On Fri, 2009-06-26 at 14:15 -0700, John Hardin wrote:

> On Fri, 26 Jun 2009, Paweł Tęcza wrote:
>
>> On Tue, 2009-06-23 at 09:39 +0200, Paweł Tęcza wrote:
>>
>>> body OBFU_URI_WWDD_2 /\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
>>
>> The spammers strike at the weekend again. Unfortunately the rule above doesn't work for the latest incarnation of that spam, i.e. www. pill22. com.
>
> {sung to the tune of Peter Gabriel's "Kiss That Frog"} Whack that mole!
>
> /\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
>                                           ^

John,

Thanks a lot for the rule update! It works fine. I can say it's nearly perfect, because it is missing only one small backslash :) Please look above: the s\W alternative in the second group should be \s\W.

Have a nice weekend!

P.
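The effect of the missing backslash can be checked outside SpamAssassin. This is a minimal sketch using Python's `re` module, which treats `\b`, `\s`, `\w` and `\W` like Perl does for plain ASCII text; it is not how SA itself evaluates rules. The first two samples come from this thread, while "www. pill22 .com" is a made-up variant chosen to hit the broken alternative.

```python
import re

SEP_FIXED = r"(?:\s|\s\W|\W\s)"
SEP_TYPO = r"(?:\s|s\W|\W\s)"    # as posted: no backslash before the "s"
TLD = r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)"

def build(second_separator):
    """Assemble the rule with the given separator between name and TLD."""
    return re.compile(
        r"\bwww" + SEP_FIXED + r"\w{3,6}\d{2,6}" + second_separator + TLD + r"\b",
        re.IGNORECASE,
    )

fixed = build(SEP_FIXED)
typo = build(SEP_TYPO)

for sample in ["www. pill22. com", "www. gen88. net", "www. pill22 .com"]:
    print(sample, "| fixed:", bool(fixed.search(sample)),
          "| typo:", bool(typo.search(sample)))
```

Both versions catch the samples seen so far, which is why the posted rule "works fine"; the difference only shows once the spammers put the space before the punctuation.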
552 spam score (11.3) exceeded threshold
Hello There,

Yesterday I was trying to send a warning about a new www.shopXX.net spam flood here. It was a short letter with a few URLs to pastebin.com. Unfortunately my messages haven't arrived at the mailing list. Today I've checked the maillog of my servers and I can see the following lines:

Jun 21 21:10:15 hermes2 amavis[24103]: (24103-7) Passed CLEAN, AM-SOCK [:::195.225.120.114] [195.225.120.114] pte...@uw.edu.pl - users@spamassassin.apache.org, Queue-ID: 00140038.4A3E8597.6FAE, Message-ID: 1245611412.5515.5.ca...@localhost.localdomain, mail_id: yYNQAmuAAueT, Hits: -, 61 ms
Jun 21 21:10:15 hermes2 courierd: started,id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,module=esmtp,host=spamassassin.apache.org,addr=users@spamassassin.apache.org
Jun 21 21:10:51 hermes2 courieresmtp: id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org: 552 spam score (11.3) exceeded threshold
Jun 21 21:10:51 hermes2 courieresmtp: id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status: failure

Sorry, but the lines were broken by my stupid Thunderbird...

What's up? Do I really look like a spammer? ;)

My best regards,
Pawel
Re: [NEW SPAM FLOOD] www.shopXX.net
Jari Fredriksson wrote:

> Your rule does not compute. Syntax errors.

Hi,

I can confirm the issue:

# spamassassin --lint
[25805] warn: config: error: rule 'AE_MEDS35:' has invalid characters (not Alphanumeric + Underscore + starting with a non-digit)
[25805] warn: config: SpamAssassin failed to parse line, no value provided for score, skipping: score 3.0
[25805] warn: config: warning: description exists for non-existent rule obfuscated
[25805] warn: lint: 3 issues detected, please rerun with debug enabled for more information

Using the --lint option of SA to test a new rule is a good idea, in my humble opinion :)

Cheers.
Pawel
Re: [NEW SPAM FLOOD] www.shopXX.net
Michelle Konzack wrote:

> On 2009-06-22 10:52:54, Paweł Tęcza wrote:
>> # spamassassin --lint
>> [25805] warn: config: error: rule 'AE_MEDS35:' has invalid characters (not Alphanumeric + Underscore + starting with a non-digit)
>
> Copied wrongly? Here it is working.

Michelle,

Copied correctly, but it wasn't a copy-and-paste snippet as I thought :) Below is a diff between your version and a version ready to paste:

-body AE_MEDS35: /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?)/
-describe obfuscated domain in message
-score 3.0
+body AE_MEDS35 /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?\)/
+describe AE_MEDS35 obfuscated domain in message
+score AE_MEDS35 3.0

Please note the colon character in the rule name. `spamassassin --lint` says it's forbidden. The describe and score lines also need the rule name. Finally, you should escape the last round bracket in the regexp.

Cheers,
P.
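The two problems in the `body` line can be illustrated with a few lines of Python. This is only a rough imitation of what `spamassassin --lint` reports, not its implementation: the name pattern is my reading of the lint message above, the regexp is compiled with Python's engine rather than Perl's, and `check_body_rule` is a hypothetical helper.

```python
import re

# Letters, digits and underscore only, not starting with a digit.
NAME_OK = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def check_body_rule(line):
    """Return a list of problems found in one 'body NAME /regex/flags' line."""
    parts = line.split(None, 2)
    if len(parts) != 3 or parts[0] != "body":
        return ["not a 'body NAME /regex/' line"]
    _, name, pattern = parts
    problems = []
    if not NAME_OK.match(name):
        problems.append("invalid rule name: " + name)
    inner = pattern[1:pattern.rfind("/")]   # strip the / delimiters and flags
    try:
        re.compile(inner)
    except re.error as err:
        problems.append("regex does not compile: " + str(err))
    return problems

broken = r"body AE_MEDS35: /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?)/"
pasted = r"body AE_MEDS35 /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?\)/"

print(check_body_rule(broken))   # name and regex problems
print(check_body_rule(pasted))   # no problems
```

The real `--lint` run remains the authoritative check, since it also validates the `describe` and `score` lines.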
Re: Bayes and SQL.
Kasper Sacharias Eenberg wrote:

> Goodday.
> I'm installing a new spamfilter for my company, and I figured I'd try Bayes in SQL. However I have some problems with maintenance and a few general questions.

Hi Kasper,

We used Bayes in SQL in the past. I don't know or remember the answers to all your questions, but I can try to help you a little.

> 5) Another 'problem' I'm having is that restoring from backup - sql is horribly slow. Is this normal or might my mysql/network not be running optimally? I don't really know how to test bayes queries, but normal queries to the SQL go fast.

Did you try MySQL dumps? It should be a faster way of restoring.

mysqldump your_bayes_db > bayes_dump.sql
echo "drop database your_bayes_db" | mysql your_bayes_db
echo "create database your_bayes_db" | mysql
mysql your_bayes_db < bayes_dump.sql

> Versions:
> CentOS 5.3
> Spamassassin 3.2.5
> Perl: 5.8.8
> MySQL: 5.0.45-7.el5 (The mysql is run on another server of WAN)

What database storage engine do you use for your Bayes? I remember that we had to switch from MyISAM to InnoDB because of stability and performance issues.

Have a nice summer day :)
P.
Re: [NEW SPAM FLOOD] www.shopXX.net
McDonald, Dan wrote:

> I'm considering a low-scoring rule like:
>
> body AE_MEDS37 /\(\s?w{2,4}\s[:alpha:]{4}\d{1,4}\s(?:net|com|org)\s?\)/
> describe AE_MEDS37 rule to catch the next wave of spaced domains
> score AE_MEDS37 1.0

Hi Dan,

I have a score of 4.0 for that kind of spam, but I can see that even such a high score is sometimes not sufficient. My SA tags those messages as spam only if they also hit the RCVD_IN_BL_SPAMCOP_NET and RCVD_IN_SORBS_DUL tests.

My best regards,
Pawel
Re: new spam image with random body message
Anthony Peacock wrote:

> Adam Cécile (Le_Vert) wrote:
>> Hello,
>> Could you give us the line from your local.cf to enable such tests?
>> Thanks in advance,
>
> Which tests? You quote the whole list, some are standard some are additions.

Hi Anthony,

Please show us your additional tests, of course :D

My best regards,
P.
Re: new spam image with random body message
Anthony Peacock wrote:

> Hi,
>
> Paweł Tęcza wrote:
>> Hi Anthony,
>> Please show us your additional tests, of course :D
>
> Unless you are a UK Higher Education organisation you won't be able to use RCVD_IN_JANET_DUL.

What a pity. We are a Polish university :)

> Other than that I think the only additional one is the BOTNET plugin by John Rudd, which is available here:
>
> http://people.ucsc.edu/~jrudd/spamassassin/
>
> As far as I remember the rest are standard.

Thank you very much for the URL of the BOTNET plugin!

Have a nice weekend,
P.
Re: New www.medsXX.net spam
Benny Pedersen wrote:

> On Fri, June 19, 2009 11:24, Paweł Tęcza wrote:
>> Hello People,
>> http://pastebin.com/m5988eed
>
> are you sure you want email To: r...@uw.edu.pl from outside world ? assume its the envelope recipient, if not just ignore me :) check your aliases in mta

Hello Benny,

r...@uw.edu.pl is only an alias. We have a postmas...@uw.edu.pl alias too, but they are not the same aliases :)

>> http://pastebin.com/m5835257
>
> same here To: mailer-dae...@student.uw.edu.pl is mailer-daemon one that works local to you ?, if no then its clearly spam bounces or non working remote mta

It's another alias :)

>> http://pastebin.com/m11b07539
>
> your mta/sa is running on ipv6 host, ipv6 is not supported very well in sa, thats why you get low scores
>
>> Have a nice day,
>
> no problem

Thanks a lot for your comments! :)

P.
Re: [SA SPAM 1.4 ] Re: New www.medsXX.net spam
Randal, Phil wrote:

> Paweł Tęcza wrote:
>> What's the rule for deliberately misspelled words?
>> My best regards,
>> Pawel
>
> In this country, at least, "misspelled" belongs in that list of misspelt words. Oh, don't we all love American English? *grin*

Hi Phil,

It's funny, isn't it? :) Sorry if it hurt your pure British English ;) My typing was simply faster than my thinking :D

Have a nice weekend!
P.
Re: new spam image with random body message
Randal, Phil wrote:

> Anthony Peacock wrote:
>> Paweł Tęcza wrote:
>>> Anthony Peacock wrote:
>>>> Hi,
>>>> Paweł Tęcza wrote:
>>>>> Hi Anthony,
>>>>> Please show us your additional tests, of course :D
>>>> Unless you are a UK Higher Education organisation you won't be able to use RCVD_IN_JANET_DUL.
>>> What a pity. We are a Polish university :)
>> Yes, but this is just an academic feed of the MAPS RBL+ http://mail-abuse.com/index.html
>
> I've just had a quick look at our recent MAPS RBL+ hits and there are none which weren't already scoring highly.

It's good to know that we can live happily without that RBL :) Thanks!

> I'd recommend both the Botnet and iXhash SA plugins if you're not already using them.

Thank you for that recommendation! I'll try both plugins.

Best regards,
P.
Re: New www.medsXX.net spam
On Fri, 2009-06-19 at 09:45 -0700, John Hardin wrote:

> On Fri, 2009-06-19 at 09:24 -0700, John Hardin wrote:
>> On Fri, 2009-06-19 at 16:21 +0200, Paweł Tęcza wrote:
>>>> body AE_MEDS35 /w{2,4}\s{0,4}meds\d{1,4}\s{0,4}(?:net|com|org)/
>>>
>>> I've just noticed the missing 'i' switch in your rule regexp. Is it a bug or a feature? :)
>>
>> That depends. If the URIs are always lowercase in the spams, making the RE case-insensitive doesn't help and may hurt.

Hi John,

I have seen only lowercase URIs, but I prefer case-insensitive rules. I simply don't want to get a lot of spam because a spammer read this thread and changed only one letter :)

>>> BTW, probably \s+ will be better than \s{0,4}. Similarly with w{2,4} and \d{1,4}.
>>
>> No, it's not. In SA, unbounded matches are hazardous and should be avoided. {0,20} is safer than * and {1,20} is safer than +. This is not a general rule, it only applies where the text being scanned is from an untrusted (and possibly actively hostile) source.
>>
>> Another improvement: add word boundaries at the beginning and end:
>>
>> /\bw{2,4}\s{0,10}meds\d{1,4}\s{0,10}(?:net|com|org)\b/

Thanks a lot for your tips! It's the next valuable lesson for me today :)

>> If the parentheses in the original example are actually in the message, including them will help too. Are they actually in the message?

Yes, I can see the parentheses in all the spam messages I received. But the spammers can remove them soon, of course.

> D'oh, /me checks pastebins from first message...
>
> Also, body rules match cleaned-up text with runs of spaces collapsed, so you don't need to use + or {1,...}
>
> Try this:
>
> /\(\s?w{2,4}\smeds\d{1,4}\s(?:net|com|org)\s?\)/

Yes, I noticed it when I was testing my own rule:

[1438] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got hit: (www meds88 net)

My best regards,
Pawel
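John's point about collapsed whitespace can be tried out in isolation. The sketch below is an assumption-laden stand-in: `collapse_whitespace` only squeezes whitespace runs into single spaces, which is a small part of what SA does when it renders a message body, and the `IGNORECASE` flag reflects Pawel's stated preference rather than John's rule as posted. The raw sample text is invented.

```python
import re

# John's last suggestion, compiled case-insensitively (Pawel's preference).
RULE = re.compile(r"\(\s?w{2,4}\smeds\d{1,4}\s(?:net|com|org)\s?\)", re.IGNORECASE)

def collapse_whitespace(text):
    """Rough stand-in for SA body rendering: squeeze whitespace runs to one space."""
    return re.sub(r"\s+", " ", text).strip()

raw = "Visit (www   meds88 \n net) today"

print(bool(RULE.search(raw)))                        # raw text, runs of spaces intact
print(bool(RULE.search(collapse_whitespace(raw))))   # collapsed text
```

Because every quantifier in the rule is bounded, the match cost stays small even on hostile input, which is the reason given above for preferring `\s` or `\s{0,10}` over `\s+`.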
Re: new spam image with random body message
Ibrahim Harrani wrote:

> Do you have any solution for this kind of spam?

Hello Ibrahim,

Could you please show me the Content-* headers of the image attachment? Did you send all the headers of that spam in your previous post?

I have had some success fighting that spam, which I call BAD GOOD PENIS, but I can see that it evolves, so my rules have to be improved too.

My best regards,
Pawel
Re: new spam image with random body message
Ibrahim Harrani wrote:

> Hi,
> another header from another image spam. All images contain good, bad and a URL with numbers.

The spammers are cunning... It seems that they have stopped sending spam with an X-Mailer: header containing something like PHP v5.2.0 or PHP/4.4.5. They also no longer use only digits in attachment filenames. So I'm afraid that my Spamassassin rules are not effective for that kind of spam :(

> It seems that ocrad can't decode the strings in the images. FuzzyOcr version is 3.6.0

I've added BAD, GOOD and an example domain name to my FuzzyOcr word file, but unfortunately FuzzyOcr didn't recognise them :(

Maybe someone has a better idea how to fight that image spam?

Cheers,
P.
Re: InnoDB as storage engine for sa_bayes
Alex Woick [EMAIL PROTECTED] writes:

>> -rw-rw 1 mysql mysql 1010M Aug 28 08:25 ibdata1
>>
>> -rw-rw 1 mysql mysql 264M Aug 27 17:09 awl.ibd
>> -rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_expire.ibd
>> -rw-rw 1 mysql mysql 96K Aug 27 17:09 bayes_global_vars.ibd
>> -rw-rw 1 mysql mysql 468M Aug 27 21:11 bayes_seen.ibd
>> -rw-rw 1 mysql mysql 148M Aug 27 21:43 bayes_token.ibd
>> -rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_vars.ibd
>>
>> As you can see above, the new storage engine consumed 2 times bigger diskspace then the old. Is it a good behave or I should feel worried?
>
> Nothing to worry. But you have perhaps imported your data twice and have an empty ibdata1 file which only occupies space.

Hello Alex,

First of all, thanks a lot for your reply and interesting comments! :)

> I quoted the innodb data files. Since you have defined innodb_file_per_table, the table data is saved into the *.ibd files in the database directory. Without that option all table data would go to the ibdata* file(s) in the base data directory. As far as I know, data for one table is saved either in ibdata* or in the *.ibd file, but not both.

I can quote the innodb_data_file_path option to test it, of course, but I'm afraid that it's necessary. The MySQL doc [1] says:

  Note: InnoDB always needs the shared tablespace because it puts its internal data dictionary and undo logs there. The .ibd files are not sufficient for InnoDB to operate.

> Perhaps you played around and first imported the data without innodb_file_per_table, which imported into ibdata1. Then you perhaps dropped the tables and defined innodb_file_per_table and imported again, so the *.ibd files were created and filled. The ibdata1 may now be empty, but it will never shrink.

I remember that before injecting the MySQL dump I removed all ib* files and created the initial InnoDB tablespace and logs by running mysqld from the command line:

[EMAIL PROTECTED]:/var/lib/mysql# /usr/sbin/mysqld
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
070827 15:25:01 InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
070827 15:25:01 InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 10 MB
InnoDB: Database physically writes the file full: wait...
070827 15:25:01 InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
070827 15:25:02 InnoDB: Started; log sequence number 0 0
070827 15:25:02 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.38-Ubuntu_0ubuntu1-log' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Ubuntu 7.04 distribution

While injecting the dump I could see the ibdata1 file growing from 10MB to 1010MB.

> Try the following: Dump all databases which have innodb tables and drop all innodb tables. Stop the server, remove the ibdata1 and *.ibd files and restart the server. An empty and small ibdata1 file will be recreated. Now import your databases. I bet the ibdata1 file will not grow and all data will be imported into *.ibd.

Please look below. What have I won in the bet? ;)

> It is not necessary to dump/reload the data for changing the database engine of a table. Simply edit the tables with Mysql Query Browser or the Mysql Administrator and change the table engine from myisam to innodb. Or execute an SQL statement: ALTER TABLE mytable ENGINE=innodb.

Yes, I know, and I even tried to convert the MyISAM tables that way, but it was terribly slow, so I chose the method of injecting a MySQL dump. Unfortunately it wasn't faster ;)

[EMAIL PROTECTED]:/var/lib/mysql# date; mysql sa_bayes -u root -p < ~/sa_bayes_innodb.sql; date
Mon Aug 27 15:28:40 CEST 2007
Enter password:
Mon Aug 27 21:42:45 CEST 2007
[EMAIL PROTECTED]:/var/lib/mysql#

You can see above that it took more than 6 hours for a ~560MB dump file!

My best regards,
Pawel

[1] http://dev.mysql.com/doc/refman/5.0/en/multiple-tablespaces.html
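For scale, the two `date` timestamps and the approximate dump size give an average import rate. This is plain arithmetic on the numbers quoted above; the ~560MB figure is itself an approximation from the message, and treating MB as 1024 KB is my assumption.

```python
from datetime import datetime

# Timestamps printed by the two `date` calls around the import (same time zone).
start = datetime(2007, 8, 27, 15, 28, 40)
end = datetime(2007, 8, 27, 21, 42, 45)

elapsed = end - start
dump_mb = 560                                    # approximate dump size
rate_kb_per_s = dump_mb * 1024 / elapsed.total_seconds()

print(elapsed)                                   # 6:14:05
print(round(rate_kb_per_s), "KB/s on average")
```

An average in the low tens of KB/s is far below what the disks or network can sustain, which suggests the time goes into per-statement work on the server rather than into moving the data.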
Re: InnoDB as storage engine for sa_bayes
Paweł Tęcza [EMAIL PROTECTED] writes: Alex Woick [EMAIL PROTECTED] writes: Perhaps you played around and first imported the data without innodb_file_per_table, which imported into ibdata1. Then you perhaps dropped the tables and defined innodb_file_per table and imported again, so the *.ibd files were created and filled. The ibdata1 may now be empty, but it will never shrink. I remember that before injecting the MySQL dump I removed all ib* files and created initial InnoDB tablespace and logs running mysqld from command line: I've forgotten to add that my sa_bayes database was empty before injecting of MySQL dump, because firstly I dropped it and next created. Kind regards, Pawel
InnoDB as storage engine for sa_bayes
Hello Spamassassins! ;)

A few weeks ago I had problems with the capacity of my MySQL 5.0.38 server, with the sa_bayes database stored in MyISAM, when it was handling a lot of SQL queries from my Spamassassin cluster. The only solution was to disable Bayes. I wrote about my problems here and received a lot of useful advice. One suggestion was to convert my sa_bayes database from the MyISAM to the InnoDB storage engine.

I didn't have any experience with InnoDB, so I had to learn it. Now I know more about it, but I still have a few doubts...

Below you can see details of a copy of my old sa_bayes database with MyISAM:

# ls -lh sa_bayes/
total 809M
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 awl.frm
-rw-r- 1 ptecza ptecza 139M 2007-08-16 15:29 awl.MYD
-rw-r- 1 ptecza ptecza 112M 2007-08-16 15:29 awl.MYI
-rw-r- 1 ptecza ptecza 8,4K 2007-08-16 15:29 bayes_expire.frm
-rw-r- 1 ptecza ptecza 207 2007-08-16 15:29 bayes_expire.MYD
-rw-r- 1 ptecza ptecza 2,0K 2007-08-16 15:29 bayes_expire.MYI
-rw-r- 1 ptecza ptecza 8,4K 2007-08-16 15:29 bayes_global_vars.frm
-rw-r- 1 ptecza ptecza 20 2007-08-16 15:29 bayes_global_vars.MYD
-rw-r- 1 ptecza ptecza 2,0K 2007-08-16 15:29 bayes_global_vars.MYI
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 bayes_seen.frm
-rw-r- 1 ptecza ptecza 213M 2007-08-16 15:29 bayes_seen.MYD
-rw-r- 1 ptecza ptecza 278M 2007-08-16 15:29 bayes_seen.MYI
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 bayes_token.frm
-rw-r- 1 ptecza ptecza 24M 2007-08-16 15:29 bayes_token.MYD
-rw-r- 1 ptecza ptecza 44M 2007-08-16 15:29 bayes_token.MYI
-rw-r- 1 ptecza ptecza 8,8K 2007-08-16 15:29 bayes_vars.frm
-rw-r- 1 ptecza ptecza 52 2007-08-16 15:29 bayes_vars.MYD
-rw-r- 1 ptecza ptecza 3,0K 2007-08-16 15:29 bayes_vars.MYI
-rw-r- 1 ptecza ptecza 65 2007-08-16 15:29 db.opt

Here are details of the new sa_bayes database with InnoDB:

# ls -lh ib*
-rw-rw 1 mysql mysql 10M Aug 28 08:25 ib_logfile0
-rw-rw 1 mysql mysql 10M Aug 27 21:42 ib_logfile1
-rw-rw 1 mysql mysql 1010M Aug 28 08:25 ibdata1

# ls -lh sa_bayes/
total 882M
-rw-rw 1 mysql mysql 8.5K Aug 27 15:28 awl.frm
-rw-rw 1 mysql mysql 264M Aug 27 17:09 awl.ibd
-rw-rw 1 mysql mysql 8.4K Aug 27 17:08 bayes_expire.frm
-rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_expire.ibd
-rw-rw 1 mysql mysql 8.4K Aug 27 17:08 bayes_global_vars.frm
-rw-rw 1 mysql mysql 96K Aug 27 17:09 bayes_global_vars.ibd
-rw-rw 1 mysql mysql 8.5K Aug 27 17:08 bayes_seen.frm
-rw-rw 1 mysql mysql 468M Aug 27 21:11 bayes_seen.ibd
-rw-rw 1 mysql mysql 8.5K Aug 27 21:09 bayes_token.frm
-rw-rw 1 mysql mysql 148M Aug 27 21:43 bayes_token.ibd
-rw-rw 1 mysql mysql 8.8K Aug 27 21:42 bayes_vars.frm
-rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_vars.ibd
-rw-rw 1 mysql mysql 65 Aug 27 15:23 db.opt

It has exactly the same content as the old database and was simply injected from a MySQL dump.

As you can see above, the new storage engine consumes about twice as much disk space as the old one. Is this normal behaviour, or should I be worried? Could you please tell me the size of your sa_bayes database with InnoDB? How much disk space should I reserve?

You would probably like to know my InnoDB settings too:

# grep ^innodb /etc/mysql/my.cnf
innodb_data_file_path=ibdata1:10M:autoextend
innodb_autoextend_increment=10M
innodb_file_per_table
innodb_buffer_pool_size=60M
innodb_additional_mem_pool_size=5M
innodb_log_files_in_group=2
innodb_fast_shutdown=1
innodb_log_file_size=10M
innodb_log_buffer_size=5M
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=25

I agree that the buffers are not very big, but this is only my testing box, not a production machine.

My best regards,
Pawel
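The "two times" estimate can be broken down per table from the two listings. The sketch below only adds up the sizes shown above (in MB, ignoring the files of a few KB), assuming I have read the listings correctly; it says nothing about why ibdata1 grew.

```python
# Sizes in MB taken from the listings; files of a few KB are ignored.
myisam = {                      # .MYD (data) + .MYI (index)
    "awl": 139 + 112,
    "bayes_seen": 213 + 278,
    "bayes_token": 24 + 44,
}
innodb = {                      # per-table .ibd files
    "awl": 264,
    "bayes_seen": 468,
    "bayes_token": 148,
}
ibdata1 = 1010                  # shared tablespace

for table in myisam:
    print(table, myisam[table], "->", innodb[table])

myisam_total = sum(myisam.values())
innodb_total = sum(innodb.values())

print("tables only:  %.2fx" % (innodb_total / myisam_total))
print("with ibdata1: %.2fx" % ((innodb_total + ibdata1) / myisam_total))
```

On these numbers the per-table .ibd files are close to the MyISAM total (bayes_token is the exception), and the doubling comes almost entirely from the 1010M shared tablespace.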
Re: picture spams
Loren Wilton [EMAIL PROTECTED] writes:

>> Hi Loren,
>> I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten by that spam image.
>
> The normal scan setups for FuzzyOCR don't rotate the images, so will in all probability miss a rotated image like this. These were quite popular for a while and a couple of people developed scansets that contained rotation as one of the preprocessing steps.
>
> I don't seem to have saved any of the messages relating to that thread. As best I recall they found that rotating 8 degrees or so worked well. Or maybe it was 18. You can probably find info on the FuzzyOcr mailing list:

Hi Loren,

I was quite sure that the FuzzyOcr project was dead, because a few months ago I tried to contact its author, Decoder, without success. Probably he was very busy :) Fortunately, it seems that the FuzzyOcr project is still alive. That's very good news for me, because it's really a very useful utility :)

I've found a thread about rotated spam images on the FuzzyOcr page [1]. Currently Decoder has no time to implement image rotation checks, but he will try to do it in the future. For now we can only work around it, for example with the preprocessor/scanset settings.

Which of you rotate images in your FuzzyOcr? Do you use fixed degrees, or detect the skew angle and rotate the image accordingly? Could you share this?

Kind regards,
Pawel

[1] http://fuzzyocr.own-hero.net/ticket/408
Re: picture spams
[EMAIL PROTECTED] writes:

> On Fri, 17 Aug 2007, Paweł Tęcza wrote:
>> I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten by that spam image.
>
> You can manually mark this picture as bad :
>
> # fuzzy-find --delete image
> # fuzzy-find --learn-spam image

Hi,

Thanks for the hint! I believe that it's an effective method, but I have no time to train my FuzzyOcr manually ;)

Have a nice day,
Pawel
Re: Spam kills my MySQL with Bayes
Pawel Sasin [EMAIL PROTECTED] writes:

[...]

> Have you tried this on your SA servers?
> http://wiki.apache.org/spamassassin/DBIPlugin

Hello Pawel! :D

Thank you very much for the pointer to DBIPlugin! I've never used it before. It looks interesting, so I've just downloaded that plugin and I'm testing it on one of my SA nodes right now :)

> AFAIK spawning many connections to mysql servers causes quite a big load on them.

I didn't notice a big load on my MySQL server during the punctuation spam bombing. Yes, it increased, but only from 0.1 to 1.1 :) I think we didn't have many connections, but many SQL queries.

Greetings from Warsaw! :)
Pawel
Re: Spam kills my MySQL with Bayes
SM [EMAIL PROTECTED] writes:

[...]

>> Now I use the MyISAM storage backend, because I just created the Bayesian database using the Spamassassin sql/bayes_mysql.sql file :)
>
> The recommendations in the sql/bayes_mysql.sql file are for the average setup. It doesn't cover MySQL optimization techniques as that is a MySQL specific issue. You can change the engine from MyISAM to InnoDB (see ALTER TABLE). That should improve performance for INSERTs.
>
> With the amount of mail your server handles, you either have to improve MySQL performance, switch to more powerful hardware or disable Bayes. If you disable Bayes, the punctuation spam would still be caught in your setup as it scored over 19 points.

Hello again! :)

I'm working on converting the storage engine from MyISAM to InnoDB.

My hardware seems to be good enough. It's a Sun Fire x4100 M2 server with 2 x Dual-Core AMD Opteron 2220 SE CPUs and 8GB RAM on board, and it's bored with its job ;) I think I rather need faster disks.

My best regards,
Pawel
Re: picture spams
Loren Wilton [EMAIL PROTECTED] writes:

> FuzzyOcr should do a good job on something like that.
>
> Loren
>
>> http://dreams.741.com/spam.gif

Hi Loren,

I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten by that spam image. Here are the message headers:

X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on anubis2.poczta.uw.edu.pl
X-Spam-Level: x
X-Spam-Status: No, score=1.3 required=5.0 tests=SB_GIF_AND_NO_URIS autolearn=disabled version=3.2.1

And here is a piece of the output of `spamassassin -D`:

[17547] dbg: FuzzyOcr: Starting FuzzyOcr...
[17547] info: FuzzyOcr: Processing Message with ID [EMAIL PROTECTED] (Pawel Tecza [EMAIL PROTECTED] - [EMAIL PROTECTED])
[17547] dbg: FuzzyOcr: fname: spam.gif = spam.gif
[17547] dbg: message: decoding base64
[17547] info: FuzzyOcr: GIF: [342x434] spam.gif (9377)
[17547] dbg: FuzzyOcr: Saved: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17547] dbg: FuzzyOcr: Saved: /tmp/.spamassassin17547AFJ63Ztmp/raw.eml
[17547] info: FuzzyOcr: Found: 1 images
[17547] dbg: FuzzyOcr: Connecting to: dbi:mysql:database=FuzzyOcr;host=mysqlhost
[17547] dbg: dbiplugin: Creating uncached database handle to 'database=FuzzyOcr;host=mysqlhost_fuzzyocr_fuzzyocr_AutoCommit=1_PrintError=1_Username=fuzzyocr'
[17547] dbg: config: using /var/lib/courier/.spamassassin for user state dir
[17547] dbg: FuzzyOcr: pfile = /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17547] dbg: FuzzyOcr: efile = /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Errors to: /tmp/.spamassassin17547AFJ63Ztmp/raw.err
[17547] dbg: FuzzyOcr: File has Content-Type image/gif and File Extension gif
[17547] info: FuzzyOcr: Found GIF header name=spam.gif
[17547] dbg: FuzzyOcr: Saved pid: 17671
[17671] dbg: FuzzyOcr: Exec : /usr/bin/giftext /tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17671] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/giftext.info
[17671] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/giftext.err
[17547] dbg: FuzzyOcr: Elapsed [17671]: 0.016500 sec. (/usr/bin/giftext: exit 0)
[17547] info: FuzzyOcr: Image is single non-interlaced...
[17673] dbg: FuzzyOcr: Exec : /usr/bin/giffix /tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17673] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif-fixed.gif
[17673] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Saved pid: 17673
[17547] dbg: FuzzyOcr: Elapsed [17673]: 0.019540 sec. (/usr/bin/giffix: exit 0)
[17674] dbg: FuzzyOcr: Exec : /usr/bin/giftopnm /tmp/.spamassassin17547AFJ63Ztmp/spam.gif-fixed.gif
[17674] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17674] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Saved pid: 17674
[17547] dbg: FuzzyOcr: Elapsed [17674]: 0.173627 sec. (/usr/bin/giftopnm: exit 0)
[17547] info: FuzzyOcr: Calculating image hash for: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17681] dbg: FuzzyOcr: Exec : /usr/bin/ppmhist -noheader /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17681] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/ppmhist.info
[17681] dbg: FuzzyOcr: Stderr: /dev/null
[17547] dbg: FuzzyOcr: Saved pid: 17681
[17547] dbg: FuzzyOcr: Elapsed [17681]: 0.022073 sec. (/usr/bin/ppmhist: exit 0)
[17547] dbg: FuzzyOcr: Got: 445299:342:434:7::252:254:252:253:90487::44:106:172:95:21369::84:150:204:136:16164::108:182:220:164:8414::20:74:140:65:7329::252:206:4:197:2789
[17547] dbg: FuzzyOcr: delete from FuzzyOcr.Hash where Hash.check 1186058256
[17547] info: FuzzyOcr: Found[Safe]: Score='0.000' Info: ''
[17547] info: FuzzyOcr: Matched [2] time(s). Prev match: 9 min. 52 sec. ago
[17547] dbg: FuzzyOcr: update FuzzyOcr.Safe set Safe.match='2',Safe.check='1187354257' where Safe.key='252:254:252:253:90487::44:106:172:95:21369::84:150:204:136:16164::108:182:220:164:8414::20:74:140:65:7329::252:206:4:197:2789'
[17547] info: FuzzyOcr: Image in KNOWN_GOOD. Skipping OCR checks...
[17547] dbg: FuzzyOcr: Remove DIR: /tmp/.spamassassin17547AFJ63Ztmp
[17547] dbg: FuzzyOcr: FuzzyOcr ending successfully...
[17547] dbg: FuzzyOcr: Processed in 0.345128 sec.

My best regards,
Pawel
Re: Spam kills my MySQL with Bayes
Henrik Krohns [EMAIL PROTECTED] writes: [...] My hardware seems to be good enough. It's a Sun Fire x4100 M2 server with 2 x Dual-Core AMD Opteron 2220 SE CPUs and 8GB RAM on board, and it's bored with its job ;) I think I rather need faster disks. With that amount of memory you won't see much disk activity. You can happily increase the MySQL buffer cache sizes to a GB or two. It's all basic MySQL tuning. Hi Henrik, It's a good suggestion. Thanks a lot! :) My best regards, Pawel
Re: Suggested botnet rule scores
Henrik Krohns [EMAIL PROTECTED] writes: [...] If you want a simple solution, you can try http://sa.hege.li/ for the BadRelay plugin. Interesting license... ;) Have a nice day, Pawel
Re: Spam kills my MySQL with Bayes
Pawel Sasin [EMAIL PROTECTED] writes: [...] You said you have several servers running spamd - if updates are causing you much trouble then you could disable bayes_auto_learn on most of the servers, so that only some of them (down to 1) would update your Bayes DB, while the others would just query it. Thanks for another hint, Pawel! I didn't think about it :) I agree it's a better solution than disabling Bayes everywhere. Pawel
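A minimal sketch of the "one learner, many readers" idea from this message. It only generates the Bayes-related lines each spamd host would carry in its local.cf. The host names and the `bayes_lines` helper are illustrative assumptions; `use_bayes` and `bayes_auto_learn` are the SpamAssassin option names (note the underscore in `bayes_auto_learn`).

```python
def bayes_lines(host, learners):
    """Return the Bayes-related local.cf lines for one spamd host.

    Hosts listed in `learners` keep auto-learning on and so write new
    tokens to the shared Bayes DB; every other host has it switched off.
    """
    auto_learn = 1 if host in learners else 0
    return ["use_bayes 1", f"bayes_auto_learn {auto_learn}"]


# Illustrative host names; only one host is allowed to auto-learn.
hosts = ["anubis2", "anubis3", "anubis4"]
learners = {"anubis2"}

for host in hosts:
    print(f"# {host}")
    print("\n".join(bayes_lines(host, learners)))
```

Picking more than one learner is just a matter of adding hosts to `learners`; the trade-off is more write load on the database in exchange for faster learning.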
Re: Spam kills my MySQL with Bayes
SM [EMAIL PROTECTED] writes:

Hi Pawel, At 01:36 16-08-2007, Paweł Tęcza wrote: [...] Isn't it a new kind of spam that SpamAssassin should be improved to fight? I'm not sure...

No, it is not new. I posted the following reply a few days back regarding this type of message, referred to as punctuation spam. The message hits BAYES_99 and FRT_PRICE. As you did not include the headers, it's not possible to tell whether it would hit some of the DYNAMIC rules as well.

Hello mysterious SM! ;) Thanks a lot for the reply and the explanation! Here are the SpamAssassin headers for one of the spam mails we received:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on anubis3.poczta.uw.edu.pl
X-Spam-Level: xxx
X-Spam-Status: Yes, score=19.3 required=5.0 tests=FH_HELO_EQ_D_D_D_D,FRT_PRICE,
 FRT_STRONG1,FRT_SYMBOL,HTML_MESSAGE,MIME_QP_LONG_LINE,RCVD_IN_BL_SPAMCOP_NET,
 RCVD_IN_PBL,TVD_FUZZY_SYMBOL,TVD_STOCK1 autolearn=disabled version=3.2.1
X-Spam-Report: =?ISO-8859-1?Q?
 * 0.5 FH_HELO_EQ_D_D_D_D Helo is d-d-d-d
 * 2.5 FRT_PRICE BODY: ReplaceTags: Price
 * 3.6 FRT_SYMBOL BODY: ReplaceTags: Symbol
 * 1.4 TVD_FUZZY_SYMBOL BODY: TVD_FUZZY_SYMBOL
 * 2.9 FRT_STRONG1 BODY: ReplaceTags: Strong (1)
 * 3.8 TVD_STOCK1 BODY: TVD_STOCK1
 * 0.0 HTML_MESSAGE BODY: Wiadomo=b6=e6 zawiera kod HTML
 * 1.8 MIME_QP_LONG_LINE RAW: Linia QP d=b3u=bfsza ni=bf 76 znak=f3w
 * 2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Odebrane od systemu klasy RELAY w/g:
 *      bl.spamcop.net
 *      [Blocked - see http://www.spamcop.net/bl.shtml?89.191.164.221]
 * 0.5 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
 *      [89.191.164.221 listed in zen.spamhaus.org]?=

Bill Landry suggested using chickenpox.cf and mangled.cf rules from SARE.

Thanks for the hint! I'll take a look at them.

The result is that spam was killing our MySQL database, because we had ~50k queries per minute with INSERTs and UPDATEs of many tokens. The only solution was to disable Bayes.

MySQL can be optimized to handle such a load. If you aren't using InnoDB for Bayesian storage, switch to it.

Right now I use the MyISAM storage backend, because I just created the Bayes database using SpamAssassin's sql/bayes_mysql.sql file :)

Have a nice day, Pawel
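For scale, ~50k queries per minute is roughly 830 queries per second. As a rough sketch of the InnoDB suggestion, the snippet below only builds the `ALTER TABLE ... ENGINE=InnoDB` statements one could review and run by hand. The table names are an assumption: they are the ones I would expect the stock sql/bayes_mysql.sql schema to create, so check them against the actual database first. The `innodb_conversion_sql` helper is hypothetical.

```python
# Assumed table names from the stock SpamAssassin Bayes MySQL schema;
# verify with SHOW TABLES before running anything.
BAYES_TABLES = (
    "bayes_expire",
    "bayes_global_vars",
    "bayes_seen",
    "bayes_token",
    "bayes_vars",
)


def innodb_conversion_sql(tables=BAYES_TABLES):
    """Build one ALTER TABLE statement per table to switch it to InnoDB."""
    return [f"ALTER TABLE {name} ENGINE=InnoDB;" for name in tables]


def per_second(queries_per_minute):
    """Convert a per-minute query rate into a per-second rate."""
    return queries_per_minute / 60


if __name__ == "__main__":
    print(f"-- load: about {per_second(50_000):.0f} queries per second")
    for statement in innodb_conversion_sql():
        print(statement)
```

The output is plain SQL text, so it can be inspected and applied during a quiet period rather than executed blindly; converting a large token table can take a while.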
Re: Spam kills my MySQL with Bayes
Paweł Tęcza [EMAIL PROTECTED] writes:

[...] Here are the SpamAssassin headers for one of the spam mails we received:

Oops! That was spam received after we disabled Bayes. Below are the headers of spam we scanned earlier, while Bayes was still enabled:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on anubis4.poczta.uw.edu.pl
X-Spam-Level: x
X-Spam-Status: Yes, score=21.5 required=5.0 tests=AXB_XMID_1212,BAYES_99,
 FH_HELO_EQ_D_D_D_D,FRT_PRICE,FRT_STRONG1,HELO_DYNAMIC_IPADDR,RCVD_IN_PBL,
 RCVD_IN_SORBS_DUL,RDNS_DYNAMIC,STOX_REPLY_TYPE,TVD_STOCK1 autolearn=spam version=3.2.1
X-Spam-Report: =?ISO-8859-1?Q?
 * 3.5 BAYES_99 BODY: Bayesowskie prawdopodobie=f1stwo spamu wynosi 99 do
 *      100%
 *      [score: 1.]
 * 0.0 STOX_REPLY_TYPE STOX_REPLY_TYPE
 * 0.0 FH_HELO_EQ_D_D_D_D Helo is d-d-d-d
 * 2.4 HELO_DYNAMIC_IPADDR Relay HELO'd using suspicious hostname (IP addr
 *      1)
 * 3.5 AXB_XMID_1212 Barbera Fingerprint
 * 0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
 *      [90.14.168.63 listed in zen.spamhaus.org]
 * 0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
 *      [90.14.168.63 listed in dnsbl.sorbs.net]
 * 3.5 FRT_PRICE BODY: ReplaceTags: Price
 * 3.0 FRT_STRONG1 BODY: ReplaceTags: Strong (1)
 * 3.8 TVD_STOCK1 BODY: TVD_STOCK1
 * 0.1 RDNS_DYNAMIC Delivered to trusted network by host with
 *      dynamic-looking rDNS?=

What can you tell about it now? :)

Regards, Pawel
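When comparing headers like the ones above, it can help to pull the verdict, score and rule list out of X-Spam-Status mechanically. The following is a small sketch, not part of SpamAssassin: `parse_spam_status` is a hypothetical helper, and it assumes the header value has the `Yes|No, score=... required=... tests=... autolearn=...` layout seen in this thread.

```python
import re


def parse_spam_status(value):
    """Parse the value of an X-Spam-Status header into a dict."""
    # Unfold the header: folding leaves newlines and spaces inside the value.
    flat = " ".join(value.split())
    verdict = flat.split(",", 1)[0].strip()
    score = float(re.search(r"score=(-?\d+(?:\.\d+)?)", flat).group(1))
    required = float(re.search(r"required=(-?\d+(?:\.\d+)?)", flat).group(1))
    # The rule list ends where the next "key=value" field starts.
    tests_raw = re.search(r"tests=(.*?)(?=\s+\w+=|$)", flat).group(1)
    tests = [name.strip() for name in tests_raw.split(",") if name.strip()]
    return {
        "is_spam": verdict.lower() == "yes",
        "score": score,
        "required": required,
        "tests": tests,
    }


status = parse_spam_status(
    "Yes, score=21.5 required=5.0 tests=AXB_XMID_1212,BAYES_99,\n"
    " FH_HELO_EQ_D_D_D_D,FRT_PRICE,FRT_STRONG1,HELO_DYNAMIC_IPADDR,RCVD_IN_PBL,\n"
    " RCVD_IN_SORBS_DUL,RDNS_DYNAMIC,STOX_REPLY_TYPE,TVD_STOCK1"
    " autolearn=spam version=3.2.1"
)
print(status["score"], "BAYES_99" in status["tests"])
```

Running the same helper over both quoted headers makes it easy to see which rules fired only while Bayes was enabled.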
Re: Spam kills my MySQL with Bayes
Kai Schaetzl [EMAIL PROTECTED] writes: Paweł Tęcza wrote on Thu, 16 Aug 2007 12:25:48 +0200: What can you tell about it now? I think that's not the point, or is it? You don't seem to have a problem with detection but with token storage slowness on SQL, as these fuzzy mails seem to generate a lot of unique tokens. Is that what you wanted to get fixed? Hi Kai, I would like two things:
1. to speed up my MySQL server
2. to decrease the number of unique tokens for punctuation spam
The first of them is a task for me, of course. But the second is rather SpamAssassin's job. I wonder whether it's really necessary to keep *all* tokens for that kind of spam... Maybe SpamAssassin could save only some of them? What's your opinion about it? My best regards, Pawel
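To illustrate why punctuation spam inflates the token table, here is a toy simulation. It is only a sketch under loud assumptions: `naive_tokens` is a plain whitespace splitter standing in for a tokenizer, not SpamAssassin's real one, and `obfuscate`, `strip_marks` and `count_unique` are hypothetical helpers. Stripping the punctuation before counting is my own illustration of "fewer unique tokens", not something proposed in the thread.

```python
import random

MARKS = ".,;:'`-_*"


def obfuscate(word, rng):
    """Put one random punctuation mark between each pair of letters."""
    return "".join(ch + rng.choice(MARKS) for ch in word[:-1]) + word[-1]


def naive_tokens(text):
    """Whitespace tokenizer: a stand-in, NOT SpamAssassin's real tokenizer."""
    return text.split()


def strip_marks(token):
    """Drop the obfuscating punctuation so variants collapse to one token."""
    return "".join(ch for ch in token if ch not in MARKS)


def count_unique(messages, normalize=False):
    """Count distinct tokens over all messages, optionally normalized."""
    unique = set()
    for body in messages:
        for token in naive_tokens(body):
            unique.add(strip_marks(token) if normalize else token)
    return len(unique)


rng = random.Random(2007)  # fixed seed so the run is repeatable
messages = [f"buy {obfuscate('stock', rng)} now" for _ in range(1000)]

raw_count = count_unique(messages)
normalized_count = count_unique(messages, normalize=True)
print("unique tokens as sent:     ", raw_count)
print("unique tokens, normalized: ", normalized_count)
```

With four gaps and nine possible marks there are 6561 spellings of one five-letter word, so nearly every message contributes a token the database has never seen, while the normalized count stays at three.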
Re: Spam kills my MySQL with Bayes
Kai Schaetzl [EMAIL PROTECTED] writes: Paweł Tęcza wrote on Thu, 16 Aug 2007 14:28:05 +0200: [...] But the second is rather SpamAssassin's job. I wonder whether it's really necessary to keep *all* tokens for that kind of spam... Maybe SpamAssassin could save only some of them? What's your opinion about it? I really don't know enough about Bayes and SA to say much about it. I think it would be difficult for SA to determine what are good and bad tokens. Yes, it can be difficult to determine, but what about a configurable plugin option for the maximum number of tokens to store per message? If it doesn't exist yet, of course ;) Pawel
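The per-message cap proposed here could look roughly like the sketch below. This is purely hypothetical: `cap_tokens` and its `max_tokens` parameter are made-up names, and the thread does not say that SpamAssassin has such an option. The sketch simply keeps the first distinct tokens it sees, which sidesteps rather than solves Kai's point about telling good tokens from bad ones.

```python
def cap_tokens(tokens, max_tokens):
    """Keep at most max_tokens distinct tokens, in first-seen order.

    Hypothetical helper: it illustrates a per-message limit on what gets
    stored, and makes no attempt to judge which tokens are most useful.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must not be negative")
    distinct = list(dict.fromkeys(tokens))  # dedupe, preserving order
    return distinct[:max_tokens]


body_tokens = ["buy", "s.t,o;c-k", "now", "buy", "s-t.o,c;k", "today"]
print(cap_tokens(body_tokens, 3))
```

A real implementation would probably want a smarter selection rule than "first seen", for example preferring tokens that already exist in the database, but that choice is exactly the open question raised in the reply.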