United-MAP spam flood

2009-07-24 Thread Paweł Tęcza
Hello Folks,

Did you also get a lot of spam from United-MAP, "a dynamic company with
rapid development, with a united team of professionals in its core"? :)
Or maybe this new spam flood targets only Poland?

Here are a few spam samples:

http://pastebin.com/m178f4a58
http://pastebin.com/m6d07f79d
http://pastebin.com/m477546b9

My best regards,

Pawel



Re: maildrop+spamc: invalid usage

2009-07-22 Thread Paweł Tęcza
Петров Николай wrote:
 Hi,all!
 I always some day try to resolve the problem spamc through unix 
 socket+maildrop, but unsuccessfully, please help resolve the problem!

Hello Nicolas! ;)

Did you try TCP/IP mode too? I have never used the socket mode of spamc.
We use only TCP/IP mode and have no problems.

 When I have incoming message, nothing about 'spam' is maillog, only 
 '...spamc[2183]: invalid usage...'

I've looked at the spamc/spamc.c file and I can see the following piece of
code there:

int
read_args(int argc, char **argv,
          int *max_size, char **username,
          struct transport *ptrn)
{
#ifndef _WIN32
    const char *opts = "-BcrRd:e:fyp:t:s:u:xSHU:ElhV";
#else
    const char *opts = "-BcrRd:fyp:t:s:u:xSHElhV";
#endif
    int opt;
    int ret = EX_OK;

    while ((opt = getopt(argc, argv, opts)) != -1)
    {
        switch (opt)
        {
        // [...]
        case '?':
        case ':':
            {
                libspamc_log(flags, LOG_ERR, "invalid usage");
                ret = EX_USAGE;
                /* FALLTHROUGH */
            }
        // [...]
        }
    }

    return ret;
}

It seems that your maildrop runs spamc with strange arguments. Sorry,
but I don't know why.
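
The check in that code can be approximated outside spamc. Below is a minimal Python sketch, assuming the option string from the non-Windows branch of read_args(). The helper name check_spamc_args is made up for illustration, and Python's getopt module only imitates the C getopt() that spamc really calls, so this is a rough aid and not spamc's own parser:

```python
import getopt

# Option string copied from the non-Windows branch of read_args(), without the
# leading "-" (a GNU getopt extension that is not needed for this check).
SPAMC_OPTS = "BcrRd:e:fyp:t:s:u:xSHU:ElhV"


def check_spamc_args(args):
    """Return None if the arguments parse cleanly, else the parser's error text."""
    try:
        getopt.getopt(args, SPAMC_OPTS)
    except getopt.GetoptError as err:
        return str(err)
    return None


if __name__ == "__main__":
    # Hypothetical argument lists, as maildrop might pass them to spamc.
    for args in (["-d", "127.0.0.1", "-p", "783"], ["-U"], ["-Z"]):
        print(args, "->", check_spamc_args(args) or "ok")
```

An option that is missing from the option string, or an option such as -U given without its argument, is rejected here. Those are the two cases ('?' and ':') in which the C code logs "invalid usage", so printing the exact command line that maildrop runs and checking it this way may show which argument is the strange one.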

My best regards,

Pawel


Re: [NEW SPAM FLOOD] www.shopXX.net

2009-07-11 Thread Paweł Tęcza
On Fri, 2009-07-10 at 16:48 -0700, fchan wrote:
 Don't tempt them, I already get enough spam not only from these guys.
 Also they will flood the network with smtp useless connections and
 unless you have good network attack mitigation system so you don't
 have a DDoS, don't tempt them.

Please don't be afraid, and help to beat them.

Do you only update your local rules? I think that's not a sufficient
reaction. We should also send abuse reports to the spammers' Internet
providers. They have to shut down that website.

P.




Never ending spam flood www.viaXX.net?

2009-07-10 Thread Paweł Tęcza
Hi,

Because of the Apache.org spam filters, I again can't send my message about
the spammers here:

Jul  9 22:32:07 hermes2 courieresmtp:
id=00174B77.4A5653AA.7F82,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org:
552 spam score (15.4) exceeded threshold
Jul  9 22:32:07 hermes2 courieresmtp:
id=00174B77.4A5653AA.7F82,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status:
failure
[...]
Jul 10 10:48:59 hermes1 courieresmtp:
id=000B43A2.4A57005C.346D,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org:
552 spam score (15.4) exceeded threshold
Jul 10 10:48:59 hermes1 courieresmtp:
id=000B43A2.4A57005C.346D,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status:
failure

Please see my initial post on Pastebin:

http://pastebin.com/f6a83e9fb

My best regards,

Pawel


Re: Never ending spam flood www.viaXX.net?

2009-07-10 Thread Paweł Tęcza
Terry Carmen wrote:
 Hi,

 Because of Apache.org spam filters I can't send here my message about
 spammers again:
 . . .
 
 http://pastebin.com/f6a83e9fb
 
 I'm new to this list, and may be missing something obvious, but this looks
 like a great candidate for a firewall DROP rule.

Hi Terry,

You are welcome here! :)

 Is there any reason you don't just drop the packets instead of wasting time
 deciding if they're spam?

I pasted a few IP addresses of a web drug store selling Viagra and other
medicines for men with erection issues. The spam flood advertises
that shop, but the unsolicited messages themselves come from infected Windows
machines, compromised or buggy webmail services, etc. all over the world,
so a firewall rule for the shop's addresses would not stop the mail.

My best regards,

Pawel


Re: [NEW SPAM FLOOD] www.shopXX.net

2009-07-10 Thread Paweł Tęcza
On Sat, 2009-07-11 at 00:18 +0200, Paweł Tęcza wrote:

 I received very similar spam too. It also includes www.ma29. net
 domain. It's probably personal dedication from the spammers to me ;)
 Thank you! I know you're watching that mailing list.

Hey spammers! ;)

It's after midnight here, but I've updated my rules. So you have to
think up something new.

P.




Re: AE_MEDS35 does not more work...

2009-07-03 Thread Paweł Tęcza
Michelle Konzack wrote:
 On 2009-07-02 15:18:16, John Hardin wrote:
 Can you post the original raw message to a pastebin, please?
 
 I am on GSM (O2) and not able to upload to pastebin
 (I can view contents abut not upload)
 
 I will try to upload it to
 
 http://devel.debian.tamay-dogan.net/tmp/spamassassin/

Hello,

$ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.00.msg
...
$ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.11.msg

$ spamassassin -D < non_working_sa.00.msg > non_working_sa.00.log 2>&1
...
$ spamassassin -D < non_working_sa.00.msg > non_working_sa.11.log 2>&1

$ grep "ran body rule LOCAL_BODY_WWW_MEDSXX_NET" non_working_sa.*.log
non_working_sa.00.log:[16376] dbg: rules: ran body rule
LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.01.log:[17726] dbg: rules: ran body rule
LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.02.log:[21854] dbg: rules: ran body rule
LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.10.log:[22118] dbg: rules: ran body rule
LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net
non_working_sa.11.log:[22291] dbg: rules: ran body rule
LOCAL_BODY_WWW_MEDSXX_NET == got hit: www. gen88. net

I probably have an older version of John's regexp and, as you can see
above, it works very well for me.

# Thanks to John Hardin! :)
body     LOCAL_BODY_WWW_MEDSXX_NET  /\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
score    LOCAL_BODY_WWW_MEDSXX_NET  5.0
describe LOCAL_BODY_WWW_MEDSXX_NET  (www medsXX net) spam
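
A quick way to see why the rule hits those samples is to run the same expression over the strings from the debug log. This is only a sketch in Python: its re module accepts this particular pattern unchanged, but SpamAssassin evaluates rules with Perl against its own rendered body text, so treat the result as an approximation. The third sample is a made-up, non-obfuscated URL added for contrast.

```python
import re

# Pattern copied from the LOCAL_BODY_WWW_MEDSXX_NET rule; the trailing /i
# modifier becomes re.IGNORECASE.
MEDSXX = re.compile(
    r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

samples = [
    "www. gen88. net",   # hit reported in the debug log
    "www meds88 net",    # form named in the rule description
    "www.example.com",   # made-up ordinary URL, not obfuscated
]

for text in samples:
    verdict = "hit" if MEDSXX.search(text) else "no hit"
    print(f"{text!r}: {verdict}")
```

The ordinary URL is not matched because the rule requires whitespace next to each dot, which is what makes the obfuscated form stand out.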

Kind regards,

P.



Re: AE_MEDS35 does not more work...

2009-07-03 Thread Paweł Tęcza
Paweł Tęcza wrote:
 Hello,
 
 $ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.00.msg
 ...
 $ wget http://devel.debian.tamay-dogan.net/tmp/spamassassin/non_working_sa.11.msg
 
 $ spamassassin -D < non_working_sa.00.msg > non_working_sa.00.log 2>&1
 ...
 $ spamassassin -D < non_working_sa.00.msg > non_working_sa.11.log 2>&1
                                    ^^
That should be non_working_sa.11.msg, of course. It's only a typo; I've checked
all your spam samples.

P.


Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-26 Thread Paweł Tęcza
On Fri, 2009-06-26 at 14:15 -0700, John Hardin wrote:
 On Fri, 26 Jun 2009, Paweł Tęcza wrote:
 
  On Tue, 2009-06-23 at 09:39 +0200, Paweł Tęcza wrote:
 
   body OBFU_URI_WWDD_2
  /\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
 
  The spammers strike in weekend again. Unfortunately the rule above
  doesn't work for the latest incarnation of that spam, it means www.
  pill22. com.
 
 {sung to the tune of Peter Gabriel's Kiss That Frog} Whack that mole!
 
 /\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
                                           ^
John,

Thanks a lot for the rule update! It works fine. I'd say it's nearly
perfect, because it's missing only one small backslash :) Please look
above.
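
To show what that one backslash changes, here is a small Python sketch comparing the expression as posted (with `s\W`, a literal letter "s" followed by a non-word character) against the corrected one (with `\s\W`). The first test string is hypothetical, chosen so that only the second separator differs; Python's re module stands in for the Perl engine that SpamAssassin really uses.

```python
import re

HEAD = r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}"
TAIL = r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b"

# Second separator as posted: "s\W" means the letter s, then a non-word char.
AS_POSTED = re.compile(HEAD + r"(?:\s|s\W|\W\s)" + TAIL, re.IGNORECASE)
# Second separator with the backslash restored: whitespace, then a non-word char.
CORRECTED = re.compile(HEAD + r"(?:\s|\s\W|\W\s)" + TAIL, re.IGNORECASE)

for text in ("www. pill22 .com", "www. pill22. com"):
    print(
        f"{text!r}: as posted={bool(AS_POSTED.search(text))}, "
        f"corrected={bool(CORRECTED.search(text))}"
    )
```

Both versions catch "www. pill22. com", which is why the posted rule already worked on the current spam. Only a variant with the space before the dot slips past the version without the backslash.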

Have a nice weekend!

P.




552 spam score (11.3) exceeded threshold

2009-06-22 Thread Paweł Tęcza
Hello There,

Yesterday I tried to send a warning here about the new www.shopXX.net spam
flood. It was a short letter with a few URLs to pastebin.com.
Unfortunately my messages haven't arrived at the mailing list.

Today I checked the mail logs of my servers and I can see the following
lines:

Jun 21 21:10:15 hermes2 amavis[24103]: (24103-7) Passed CLEAN, AM-SOCK
[:::195.225.120.114] [195.225.120.114] pte...@uw.edu.pl -
users@spamassassin.apache.org, Queue-ID:
00140038.4A3E8597.6FAE, Message-ID:
1245611412.5515.5.ca...@localhost.localdomain, mail_id: yYNQAmuAAueT,
Hits: -, 61 ms
Jun 21 21:10:15 hermes2 courierd:
started,id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,module=esmtp,host=spamassassin.apache.org,addr=users@spamassassin.apache.org
Jun 21 21:10:51 hermes2 courieresmtp:
id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org:
552 spam score (11.3) exceeded threshold
Jun 21 21:10:51 hermes2 courieresmtp:
id=00140038.4A3E8597.6FAE,from=pte...@uw.edu.pl,addr=users@spamassassin.apache.org,status:
failure

Sorry, the log lines above were wrapped by my stupid Thunderbird...

What's up? Do I really look like a spammer? ;)

My best regards,

Pawel



Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-22 Thread Paweł Tęcza
Jari Fredriksson wrote:
  
 Your tule does not compute. Syntax errors.

Hi,

I can confirm the issue:

# spamassassin --lint
[25805] warn: config: error: rule 'AE_MEDS35:' has invalid characters
(not Alphanumeric + Underscore + starting with a non-digit)
[25805] warn: config: SpamAssassin failed to parse line, no value
provided for score, skipping: score 3.0
[25805] warn: config: warning: description exists for non-existent rule
obfuscated
[25805] warn: lint: 3 issues detected, please rerun with debug enabled
for more information

Using SA's --lint option to test a new rule is a good idea, in my
humble opinion :)

Cheers.

Pawel



Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-22 Thread Paweł Tęcza
Michelle Konzack wrote:
 On 2009-06-22 10:52:54, Paweł Tęcza wrote:
 # spamassassin --lint
 [25805] warn: config: error: rule 'AE_MEDS35:' has invalid characters
 (not Alphanumeric + Underscore + starting with a non-digit)
 
 Copied wrongly?  Here it is working.

Michelle,

Copied correctly, but it wasn't a copy-and-paste-ready snippet as I thought :)

Below is a diff between your version and a version ready to paste:

-body AE_MEDS35: /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?)/
-describe obfuscated domain in message
-score 3.0
+body AE_MEDS35 /\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?\)/
+describe AE_MEDS35 obfuscated domain in message
+score AE_MEDS35 3.0

Please note the colon in the rule name: `spamassassin --lint`
says it's forbidden. The describe and score lines also need the rule
name. Finally, you should escape the last round bracket in
the regexp.
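
Both problems can be caught before the rule ever reaches SpamAssassin. The Python sketch below is a rough pre-check, not a replacement for `spamassassin --lint`: the name pattern is my approximation of the constraint quoted in the lint warning (alphanumerics and underscores, not starting with a digit), check_rule is a made-up helper, and Python's re module only approximates Perl's regexp syntax.

```python
import re

# Approximation of the naming constraint from the lint warning:
# letters, digits and underscores only, and no leading digit.
RULE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_rule(name, pattern):
    """Return a list of problems found in a rule name and its regexp."""
    problems = []
    if not RULE_NAME.match(name):
        problems.append("invalid rule name")
    try:
        re.compile(pattern)
    except re.error:
        problems.append("invalid regexp")
    return problems


if __name__ == "__main__":
    # Name and regexp before and after the fix shown in the diff.
    before = check_rule(
        "AE_MEDS35:", r"\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?)"
    )
    after = check_rule(
        "AE_MEDS35", r"\(\s?w{2,4}\s(meds|shop)\d{1,4}\s(?:net|com|org)\s?\)"
    )
    print("before:", before)
    print("after: ", after)
```

The unescaped closing bracket is rejected because it has no matching opening bracket; once it is escaped, it matches a literal ")" in the message.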

Cheers,

P.



Re: Bayes and SQL.

2009-06-22 Thread Paweł Tęcza
Kasper Sacharias Eenberg wrote:
 Goodday.
 
 I'm installing a new spamfilter for my company, and i figured i'd try
 Bayes is SQL.
 However i have some problems with maintenance and a few general
 questions.

Hi Kasper,

We used Bayes in SQL in the past. I don't know (or remember) the
answers to all of your questions, but I can try to help a little.

 5) Another 'problem' i'm having is that restoring from backup - sql is
 horribly slow. Is this normal or might my mysql/network not be running
 optimally? I don't really know how to test bayes queries, but normal
 queries to the SQL go fast.

Did you try MySQL dumps? That should be a faster way to restore.

mysqldump your_bayes_db > bayes_dump.sql
echo "drop database your_bayes_db" | mysql your_bayes_db
echo "create database your_bayes_db" | mysql
mysql your_bayes_db < bayes_dump.sql

 Versions:
 CentOS 5.3
 Spamassassin 3.2.5
 Perl: 5.8.8
 MySQL: 5.0.45-7.el5  (The mysql is run on another server of WAN)

Which storage engine do you use for your Bayes database? I remember that we
had to switch from MyISAM to InnoDB because of stability and performance issues.

Have a nice summer day :)

P.



Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-22 Thread Paweł Tęcza
McDonald, Dan wrote:

 I'm considering a low-scoring rule like:
 body   AE_MEDS37  /\(\s?w{2,4}\s[:alpha:]{4}\d{1,4}\s(?:net|com|org)\s?\)/
 describe AE_MEDS37  rule to catch the next wave of spaced domains
 score  AE_MEDS37  1.0

Hi Dan,

I have a score of 4.0 for that kind of spam, but I can see that even such a
high score is sometimes not sufficient. My SA tags those messages as spam
only if they also hit the RCVD_IN_BL_SPAMCOP_NET and RCVD_IN_SORBS_DUL tests.

My best regards,

Pawel


Re: new spam image with random body message

2009-06-19 Thread Paweł Tęcza
Anthony Peacock wrote:
 Adam Cécile (Le_Vert) wrote:

 Hello,
 
 Could you give us the line from your local.cf to enable such tests ?
 
 Thanks in advance,
 
 Which tests?  You quote the whole list, some are standard some are 
 additions.

Hi Anthony,

Please show us your additional tests, of course :D

My best regards,

P.



Re: new spam image with random body message

2009-06-19 Thread Paweł Tęcza
Anthony Peacock wrote:
 Hi,
 
 Paweł Tęcza wrote:

 Hi Anthony,
 
 Please show us your addition tests, of course :D
 
 Unless you are a UK Higher Education organisation you won't be able to 
 use RCVD_IN_JANET_DUL.

What a pity. We are a Polish university :)

 Other than that I think the only additional one is the BOTNET plugin by 
 John Rudd, which is available here:
 
 http://people.ucsc.edu/~jrudd/spamassassin/
 
 As far as I remember the rest are standard.

Thank you very much for the URL to BOTNET plugin!

Have a nice weekend,

P.



Re: New www.medsXX.net spam

2009-06-19 Thread Paweł Tęcza
Benny Pedersen wrote:
 On Fri, June 19, 2009 11:24, Paweł Tęcza wrote:
 Hello People,
 
 http://pastebin.com/m5988eed
 
 are you sure you want email To: r...@uw.edu.pl from outside world ?
 
 assume its the envelope recipient, if not just ignore me :)
 
 check your aliases in mta

Hello Benny,

r...@uw.edu.pl is only an alias. We have a postmas...@uw.edu.pl alias too,
but they are not the same aliases :)

 http://pastebin.com/m5835257
 
 same here To: mailer-dae...@student.uw.edu.pl is mailer-daemon one that
 works local to you ?, if no then its clearly spam bounces or non working
 remote mta

It's another alias :)

 http://pastebin.com/m11b07539
 
 your mta/sa is running on ipv6 host, ipv6 is not supported very well in
 sa, thats why you get low scores
 
 Have a nice day,
 
 no problem

Thanks a lot for your comments! :)

P.



Re: [SA SPAM 1.4 ] Re: New www.medsXX.net spam

2009-06-19 Thread Paweł Tęcza
Randal, Phil wrote:
 Paweł Tęcza wrote:

 What's the rule for deliberately misspelled words?
 
 My best regards,
 
 Pawel
 
 In this country, at least, misspelled belongs in that list of misspelt 
 words.
 
 Oh, don't we all love American English?  *grin*

Hi Phil,

It's funny, isn't it? :)

Sorry if it hurt your pure British English ;) My typing was
simply faster than my thinking :D

Have a nice weekend!

P.



Re: new spam image with random body message

2009-06-19 Thread Paweł Tęcza
Randal, Phil wrote:
 Anthony Peacock wrote:
 Paweł Tęcza wrote:
 Anthony Peacock wrote:
 Hi,
 
 Paweł Tęcza wrote:
 
 Hi Anthony,
 
 Please show us your addition tests, of course :D
 Unless you are a UK Higher Education organisation you won't be able
 to use RCVD_IN_JANET_DUL.
 
 What a pity. We are Polish university :)
 
 Yes, but this is just an academic feed of the MAPS RBL+
 http://mail-abuse.com/index.html 
 
 I've just had a quick look at our recent MAPS RBL+ hits and there are
 none which weren't already scoring highly.

It's good to know that we can live happily without that RBL :) Thanks!

 I'd recommend both the Botnet and iXhash SA plugins if you're not
 already using them.

Thank you for that recommendation! I'll try both plugins.

Best regards,

P.



Re: New www.medsXX.net spam

2009-06-19 Thread Paweł Tęcza
On Fri, 2009-06-19 at 09:45 -0700, John Hardin wrote:
 On Fri, 2009-06-19 at 09:24 -0700, John Hardin wrote:
  On Fri, 2009-06-19 at 16:21 +0200, Paweł Tęcza wrote:
  
body   AE_MEDS35  /w{2,4}\s{0,4}meds\d{1,4}\s{0,4}(?:net|com|org)/
  
   I've just noticed missing 'i' switch for your rule regexp. Is it a bug
   or a feature? :)
  
  That depends. If the URIs are always lowercasein the spams, making the
  RE case-insensitive doesn't help and may hurt.

Hi John,

I have seen only lowercase URIs so far, but I prefer case-insensitive
rules. I simply don't want to get a lot of spam just because the spammer
read this thread and changed a single letter :)
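
The trade-off can be seen with a short Python sketch, using the rule expression quoted above and a made-up variant of the spam text with a single capital letter. Python's re.IGNORECASE plays the role of the /i switch here; SpamAssassin itself uses Perl.

```python
import re

# Expression from the AE_MEDS35 rule quoted above.
RULE = r"w{2,4}\s{0,4}meds\d{1,4}\s{0,4}(?:net|com|org)"

case_sensitive = re.compile(RULE)
case_insensitive = re.compile(RULE, re.IGNORECASE)

seen_in_spam = "(www meds88 net)"    # form from the pastebin samples
one_letter_off = "(www Meds88 net)"  # hypothetical variant with one capital

for text in (seen_in_spam, one_letter_off):
    print(
        f"{text!r}: sensitive={bool(case_sensitive.search(text))}, "
        f"insensitive={bool(case_insensitive.search(text))}"
    )
```

John's caution still applies: a case-insensitive rule matches more text, so it can also hit legitimate mail that the case-sensitive form would have ignored.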

   BTW, probably \s+ will be better than \s{0,4}. Similarly with w{2,4} and
   \d{1,4}.
  
  No, it's not. In SA, unbounded matches are hazardous and should be
  avoided. {0,20} is safer than * and {1,20} is safer than +.
  
  This is not a general rule, it only applies where the text being scanned
  is from an untrusted (and possibly actively hostile) source.
  
  Another improvement: add word boundaries at the beginning and end:
  
/\bw{2,4}\s{0,10}meds\d{1,4}\s{0,10}(?:net|com|org)\b/

Thanks a lot for your tips! That's another valuable lesson for me today :)

  If the parentheses in the original example are actually in the message,
  including them will help to. Are they actually in the message?

Yes, I can see the parentheses in all the spam messages I received. But
spammers can remove them soon, of course.

 D'oh, /me checks pastebins from first message...
 
 Also, body rules match cleaned-up text with runs of spaces collapsed, so
 you don't need to use + or {1,...}
 
 Try this:
 
/\(\s?w{2,4}\smeds\d{1,4}\s(?:net|com|org)\s?\)/

Yes, I noticed it when I was testing my own rule:

[1438] dbg: rules: ran body rule LOCAL_BODY_WWW_MEDSXX_NET == got
hit: (www meds88 net)
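
John's point about collapsed whitespace can be illustrated with a small Python sketch. The collapse_whitespace helper is a rough stand-in that I made up for the clean-up SpamAssassin applies before body rules run; the real rendering does more than this, and the raw sample string is hypothetical.

```python
import re

# Expression suggested by John above: single \s, no + or {1,n} needed.
RULE = re.compile(r"\(\s?w{2,4}\smeds\d{1,4}\s(?:net|com|org)\s?\)")


def collapse_whitespace(text):
    """Replace every run of whitespace with one space (simplified stand-in)."""
    return re.sub(r"\s+", " ", text).strip()


raw = "( www   meds88 \t net )"  # hypothetical raw text with padded spacing
cleaned = collapse_whitespace(raw)

print(repr(cleaned))
print("raw matches:    ", bool(RULE.search(raw)))
print("cleaned matches:", bool(RULE.search(cleaned)))
```

Because the rule only ever sees the cleaned form, a single `\s` is enough and the bounded or unbounded repetition can be dropped.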

My best regards,

Pawel




Re: new spam image with random body message

2009-06-17 Thread Paweł Tęcza
Ibrahim Harrani wrote:
 Do you have any solution about this kind of spams?

Hello Ibrahim,

Could you please show me the Content-* headers of the image attachment?
Did you send all the headers of that spam in your previous post?

I have had some success fighting that spam, which I call BAD GOOD PENIS,
but I can see that it evolves, so my rules need to be improved too.

My best regards,

Pawel


Re: new spam image with random body message

2009-06-17 Thread Paweł Tęcza
Ibrahim Harrani wrote:
 Hi,
 
 another header from another image spams.
 All images contain god, bad and a url with numbers.

The spammers are cunning... It seems that they have stopped sending spam
with an X-Mailer: header containing something like PHP v5.2.0 or
PHP/4.4.5. Also, they no longer use only digits in attachment filenames.
So I'm afraid that my SpamAssassin rules are not effective against that
kind of spam :(

 It seems that ocrad can't decode the strings in the images.
 FuzzyOcr version is 3.6.0

I've added BAD, GOOD and an example domain name to my FuzzyOcr word
file, but unfortunately FuzzyOcr didn't recognise them :(

Maybe someone has a better idea of how to fight that image spam?

Cheers,

P.



Re: InnoDB as storage engine for sa_bayes

2007-08-30 Thread Paweł Tęcza
Alex Woick [EMAIL PROTECTED] writes:

 -rw-rw 1 mysql mysql 1010M Aug 28 08:25 ibdata1

 -rw-rw 1 mysql mysql 264M Aug 27 17:09 awl.ibd
 -rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_expire.ibd
 -rw-rw 1 mysql mysql  96K Aug 27 17:09 bayes_global_vars.ibd
 -rw-rw 1 mysql mysql 468M Aug 27 21:11 bayes_seen.ibd
 -rw-rw 1 mysql mysql 148M Aug 27 21:43 bayes_token.ibd
 -rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_vars.ibd

 As you can see above, the new storage engine consumed 2 times
 bigger diskspace then the old. Is it a good behave or I should
 feel worried?

 Nothing to worry. But you have perhaps imported your data twice and have an
 empty ibdata1 file which only occupies space.

Hello Alex,

First of all, thanks a lot for your reply and interesting comments! :)

 I quoted the innodb data files. Since you have defined
 innodb_file_per_table, the table data is saved into the *.ibd files in
 the database directory. Without that option all table data would go to
 the ibdata* file(s) in the base data directory. As far as I know, data
 for one table is saved either in ibdata* or in the *.ibd file, but not
 both.

I can comment out the innodb_data_file_path option to test it, of course,
but I'm afraid that it's necessary. The MySQL doc [1] says:

Note: InnoDB always needs the shared tablespace because it puts its
internal data dictionary and undo logs there. The .ibd files are not
sufficient for InnoDB to operate.

 Perhaps you played around and first imported the data without
 innodb_file_per_table, which imported into ibdata1. Then you perhaps
 dropped the tables and defined innodb_file_per table and imported
 again, so the *.ibd files were created and filled. The ibdata1 may now
 be empty, but it will never shrink.

I remember that before loading the MySQL dump I removed all
ib* files and created the initial InnoDB tablespace and logs by running
mysqld from the command line:

[EMAIL PROTECTED]:/var/lib/mysql# /usr/sbin/mysqld
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
070827 15:25:01  InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
070827 15:25:01  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 10 MB
InnoDB: Database physically writes the file full: wait...
070827 15:25:01  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
070827 15:25:02  InnoDB: Started; log sequence number 0 0
070827 15:25:02 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.38-Ubuntu_0ubuntu1-log'  socket: '/var/run/mysqld/mysqld.sock'  
port: 3306  Ubuntu 7.04 distribution

While injecting I could see how the ibdata1 file was growing
from 10MB to 1010MB.

 Try the following: Dump all databases which have innodb tables and
 drop all innodb tables. Stop the server, remove the ibdata1 and *.ibd
 files and restart the server. An empty and small ibdata1 file will be
 recreated. Now import your databases. I bet the ibdata1 file will not
 grow and all data will be imported into *.ibd.

Please look at below. What have I won in the bet? ;)

 It is not neccessary to dump/reload the data for changing the database
 engine of a table. Simply edit the tables with Mysql Query Browser or
 the Mysql Administrator and change the table engine from myisam to
 innodb. Or execute an SQL statement: ALTER TABLE mytable
 ENGINE=innodb.

Yes, I know, and I even tried to convert the MyISAM tables
that way, but it was terribly slow, so I chose to
load a MySQL dump instead. Unfortunately it wasn't faster ;)

[EMAIL PROTECTED]:/var/lib/mysql# date; mysql sa_bayes -u root -p < ~/sa_bayes_innodb.sql; date
Mon Aug 27 15:28:40 CEST 2007
Enter password:
Mon Aug 27 21:42:45 CEST 2007
[EMAIL PROTECTED]:/var/lib/mysql#

You can see above that it took more than 6 hours for a ~560MB
dump file!
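
For a sense of scale, the two timestamps can be turned into an average load rate. This is a small Python calculation, assuming the dump is 560 MB as stated (the size is only given approximately) and using the times printed by the two `date` calls:

```python
from datetime import datetime

# Times printed by the two `date` calls around the import (both CEST).
start = datetime(2007, 8, 27, 15, 28, 40)
end = datetime(2007, 8, 27, 21, 42, 45)

elapsed = end - start
dump_mb = 560  # approximate dump size from the message

# Average throughput over the whole import, in KB per second.
rate_kb_s = dump_mb * 1024 / elapsed.total_seconds()

print("elapsed:", elapsed)
print(f"average rate: {rate_kb_s:.1f} KB/s")
```

That works out to a few tens of kilobytes per second on average, far below what the disks can stream, which suggests the time goes into per-row work (index updates, log flushes) rather than into reading the dump.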

My best regards,

Pawel


[1] http://dev.mysql.com/doc/refman/5.0/en/multiple-tablespaces.html


Re: InnoDB as storage engine for sa_bayes

2007-08-30 Thread Paweł Tęcza
Paweł Tęcza [EMAIL PROTECTED] writes:

 Alex Woick [EMAIL PROTECTED] writes:

 Perhaps you played around and first imported the data without
 innodb_file_per_table, which imported into ibdata1. Then you perhaps
 dropped the tables and defined innodb_file_per table and imported
 again, so the *.ibd files were created and filled. The ibdata1 may now
 be empty, but it will never shrink.

 I remember that before injecting the MySQL dump I removed all
 ib* files and created initial InnoDB tablespace and logs running
 mysqld from command line:

I forgot to add that my sa_bayes database was empty
before I loaded the MySQL dump, because I first dropped it
and then created it again.

Kind regards,

Pawel


InnoDB as storage engine for sa_bayes

2007-08-28 Thread Paweł Tęcza
Hello Spamassassins! ;)

A few weeks ago I had problems with the capacity of my MySQL 5.0.38
server, with the sa_bayes database stored in MyISAM, when it was handling
a lot of SQL queries from my SpamAssassin cluster. The only
solution was to disable Bayes.

I wrote about my problems here and received a lot of useful advice.
One suggestion was to convert my sa_bayes database from the MyISAM to the
InnoDB storage engine.

I didn't have any experience with InnoDB, so I had to learn it.
Now I know more about it, but I still have a few doubts...

Below you can see details about a copy of my old sa_bayes
database with MyISAM:

# ls -lh sa_bayes/
total 809M
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 awl.frm
-rw-r- 1 ptecza ptecza 139M 2007-08-16 15:29 awl.MYD
-rw-r- 1 ptecza ptecza 112M 2007-08-16 15:29 awl.MYI
-rw-r- 1 ptecza ptecza 8,4K 2007-08-16 15:29 bayes_expire.frm
-rw-r- 1 ptecza ptecza  207 2007-08-16 15:29 bayes_expire.MYD
-rw-r- 1 ptecza ptecza 2,0K 2007-08-16 15:29 bayes_expire.MYI
-rw-r- 1 ptecza ptecza 8,4K 2007-08-16 15:29 bayes_global_vars.frm
-rw-r- 1 ptecza ptecza   20 2007-08-16 15:29 bayes_global_vars.MYD
-rw-r- 1 ptecza ptecza 2,0K 2007-08-16 15:29 bayes_global_vars.MYI
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 bayes_seen.frm
-rw-r- 1 ptecza ptecza 213M 2007-08-16 15:29 bayes_seen.MYD
-rw-r- 1 ptecza ptecza 278M 2007-08-16 15:29 bayes_seen.MYI
-rw-r- 1 ptecza ptecza 8,5K 2007-08-16 15:29 bayes_token.frm
-rw-r- 1 ptecza ptecza  24M 2007-08-16 15:29 bayes_token.MYD
-rw-r- 1 ptecza ptecza  44M 2007-08-16 15:29 bayes_token.MYI
-rw-r- 1 ptecza ptecza 8,8K 2007-08-16 15:29 bayes_vars.frm
-rw-r- 1 ptecza ptecza   52 2007-08-16 15:29 bayes_vars.MYD
-rw-r- 1 ptecza ptecza 3,0K 2007-08-16 15:29 bayes_vars.MYI
-rw-r- 1 ptecza ptecza   65 2007-08-16 15:29 db.opt

Here are details about a new sa_bayes database with InnoDB:

# ls -lh ib*
-rw-rw 1 mysql mysql   10M Aug 28 08:25 ib_logfile0
-rw-rw 1 mysql mysql   10M Aug 27 21:42 ib_logfile1
-rw-rw 1 mysql mysql 1010M Aug 28 08:25 ibdata1

# ls -lh sa_bayes/
total 882M
-rw-rw 1 mysql mysql 8.5K Aug 27 15:28 awl.frm
-rw-rw 1 mysql mysql 264M Aug 27 17:09 awl.ibd
-rw-rw 1 mysql mysql 8.4K Aug 27 17:08 bayes_expire.frm
-rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_expire.ibd
-rw-rw 1 mysql mysql 8.4K Aug 27 17:08 bayes_global_vars.frm
-rw-rw 1 mysql mysql  96K Aug 27 17:09 bayes_global_vars.ibd
-rw-rw 1 mysql mysql 8.5K Aug 27 17:08 bayes_seen.frm
-rw-rw 1 mysql mysql 468M Aug 27 21:11 bayes_seen.ibd
-rw-rw 1 mysql mysql 8.5K Aug 27 21:09 bayes_token.frm
-rw-rw 1 mysql mysql 148M Aug 27 21:43 bayes_token.ibd
-rw-rw 1 mysql mysql 8.8K Aug 27 21:42 bayes_vars.frm
-rw-rw 1 mysql mysql 112K Aug 28 08:25 bayes_vars.ibd
-rw-rw 1 mysql mysql   65 Aug 27 15:23 db.opt

It has exactly the same content as the old database and it was
simply loaded from a MySQL dump.

As you can see above, the new storage engine consumes about twice
as much disk space as the old one. Is that normal behaviour, or should
I be worried?

Could you please tell me the size of your sa_bayes
database with InnoDB? How much disk space should I reserve?
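
The comparison can be made explicit with a little arithmetic. The Python sketch below uses only the totals printed in the two listings and assumes they are accurate to the megabyte:

```python
# Sizes in MB, read from the directory listings.
myisam_total = 809      # "total 809M" for the old sa_bayes directory
innodb_tables = 882     # "total 882M" for the new sa_bayes directory
innodb_shared = 1010    # ibdata1, the shared tablespace
innodb_logs = 10 + 10   # ib_logfile0 + ib_logfile1

innodb_total = innodb_tables + innodb_shared + innodb_logs

# Ratio including the shared tablespace and logs, and for table files alone.
ratio_all = innodb_total / myisam_total
ratio_tables_only = innodb_tables / myisam_total

print(f"InnoDB total: {innodb_total} MB")
print(f"everything vs MyISAM:   {ratio_all:.2f}x")
print(f"table files vs MyISAM:  {ratio_tables_only:.2f}x")
```

Counted this way, the per-table .ibd files are only slightly larger than the MyISAM files; most of the extra space is the ibdata1 shared tablespace.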

Probably you would like to know my InnoDB settings too:

# grep ^innodb /etc/mysql/my.cnf
innodb_data_file_path=ibdata1:10M:autoextend
innodb_autoextend_increment=10M
innodb_file_per_table
innodb_buffer_pool_size=60M
innodb_additional_mem_pool_size=5M
innodb_log_files_in_group=2
innodb_fast_shutdown=1
innodb_log_file_size=10M
innodb_log_buffer_size=5M
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=25

I agree that the buffer sizes are not very large, but this is only
my test box, not a production machine.

My best regards,

Pawel


Re: picture spams

2007-08-20 Thread Paweł Tęcza
Loren Wilton [EMAIL PROTECTED] writes:

 Hi Loren,

 I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten
 by that spam image.

 The normal scan setups for FuzzyOCR don't rotate the images, so will
 in all probability miss a rotated image like this.  These were quite
 popular for a while and a couple of people developed scansets that
 contained rotation as one of the preprocessing steps.  I don't seem to
 have saved any of the messages relating to that thread.  As best I
 recall they found that rotating 8 degrees or so worked well.  Or maybe
 it was 18.

 You can probably find info on the FuzzyOcr mailing list:

Hi Loren,

I was quite sure that the FuzzyOcr project was dead, because a few
months ago I tried to contact its author, Decoder,
without success. He was probably very busy :) Fortunately, it seems
that the FuzzyOcr project is still alive. That's very good news
for me, because it's a really useful utility :)

I've found a thread about rotated spam images on the FuzzyOcr page [1].
Currently Decoder has no time to implement image rotation checking,
but he will try to do it in the future. For now we can only work around it,
for example using the preprocessor/scanset settings.

Which of you rotate images in your FuzzyOcr? Do you use a fixed
angle, or detect the skew angle and rotate the image accordingly?
Could you share your settings?

Kind regards,

Pawel

[1] http://fuzzyocr.own-hero.net/ticket/408


Re: picture spams

2007-08-20 Thread Paweł Tęcza
[EMAIL PROTECTED] writes:

 On Fri, 17 Aug 2007, Paweł Tęcza wrote:

 I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten by
 that spam image.

 You can manually mark this picture as bad :

 # fuzzy-find --delete image
 # fuzzy-find --learn-spam image

Hi,

Thanks for the hint! I believe that it's an effective method,
but I have no time to train my FuzzyOcr manually ;)

Have a nice day,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-17 Thread Paweł Tęcza
Pawel Sasin [EMAIL PROTECTED] writes:
[...]
 Have you tried this on your SA servers?
 http://wiki.apache.org/spamassassin/DBIPlugin

Hello Pawel! :D

Thank you very much for the pointer to DBIPlugin!  I've never
used it before.  It looks interesting to me, so I've just
downloaded the plugin and I'm testing it on one of my SA nodes
right now :)

 AFAIK spawning many connections to mysql servers causes quite a big
 load on them.

I didn't notice a big load on my MySQL server during the punctuation
spam bombing.  Yes, it increased, but only from 0.1 to 1.1 :)  I think
we didn't have many connections, but many SQL queries.

Greetings from Warsaw! :)

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-17 Thread Paweł Tęcza
SM [EMAIL PROTECTED] writes:
[...]
Now I use the MyISAM storage backend, because I just created the Bayesian
database using the SpamAssassin sql/bayes_mysql.sql file :)

 The recommendations in the sql/bayes_mysql.sql file are for the
 average setup.  It doesn't cover MySQL optimization techniques as that
 a MySQL specific issue.

 You can change the engine from MyISAM to InnoDB (see ALTER TABLE).
 That should improve performance for INSERTs.  With the amount of mail
 your server handles, you either have to improve MySQL performance,
 switch to more powerful hardware or disable Bayes.  If you disable
 Bayes, the punctuation spam would still be caught in your setup as it
 scored over 19 points.

Hello again! :)

I'm working on converting the storage engine from MyISAM to InnoDB.

My hardware seems to be good enough.  It's a Sun Fire x4100 M2 server
with 2 x Dual-Core AMD Opteron 2220 SE CPUs and 8GB RAM on board,
and it's bored with its job ;)  I think I need faster disks instead.

My best regards,

Pawel


Re: picture spams

2007-08-17 Thread Paweł Tęcza
Loren Wilton [EMAIL PROTECTED] writes:

 FuzzyOcr should do a good job on something like that.

Loren

 http://dreams.741.com/spam.gif

Hi Loren,

I did the test and unfortunately my FuzzyOcr (3.5.1) was bitten
by that spam image.

Here are the message headers:

X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on
anubis2.poczta.uw.edu.pl
X-Spam-Level: x
X-Spam-Status: No, score=1.3 required=5.0 tests=SB_GIF_AND_NO_URIS
autolearn=disabled version=3.2.1

And here is a piece of output of `spamassassin -D`:

[17547] dbg: FuzzyOcr: Starting FuzzyOcr...
[17547] info: FuzzyOcr: Processing Message with ID [EMAIL PROTECTED] (Pawel 
Tecza [EMAIL PROTECTED] - [EMAIL PROTECTED])
[17547] dbg: FuzzyOcr: fname: spam.gif = spam.gif
[17547] dbg: message: decoding base64
[17547] info: FuzzyOcr: GIF: [342x434] spam.gif (9377)
[17547] dbg: FuzzyOcr: Saved: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17547] dbg: FuzzyOcr: Saved: /tmp/.spamassassin17547AFJ63Ztmp/raw.eml
[17547] info: FuzzyOcr: Found: 1 images
[17547] dbg: FuzzyOcr: Connecting to: dbi:mysql:database=FuzzyOcr;host=mysqlhost
[17547] dbg: dbiplugin: Creating uncached database handle to 
'database=FuzzyOcr;host=mysqlhost_fuzzyocr_fuzzyocr_AutoCommit=1_PrintError=1_Username=fuzzyocr'
[17547] dbg: config: using /var/lib/courier/.spamassassin for user state dir
[17547] dbg: FuzzyOcr: pfile = /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17547] dbg: FuzzyOcr: efile = /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Errors to: /tmp/.spamassassin17547AFJ63Ztmp/raw.err
[17547] dbg: FuzzyOcr: File has Content-Type image/gif and File Extension 
gif
[17547] info: FuzzyOcr: Found GIF header name=spam.gif
[17547] dbg: FuzzyOcr: Saved pid: 17671
[17671] dbg: FuzzyOcr: Exec : /usr/bin/giftext 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17671] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/giftext.info
[17671] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/giftext.err
[17547] dbg: FuzzyOcr: Elapsed [17671]: 0.016500 sec. (/usr/bin/giftext: exit 0)
[17547] info: FuzzyOcr: Image is single non-interlaced...
[17673] dbg: FuzzyOcr: Exec : /usr/bin/giffix 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif
[17673] dbg: FuzzyOcr: Stdout: 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif-fixed.gif
[17673] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Saved pid: 17673
[17547] dbg: FuzzyOcr: Elapsed [17673]: 0.019540 sec. (/usr/bin/giffix: exit 0)
[17674] dbg: FuzzyOcr: Exec : /usr/bin/giftopnm 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif-fixed.gif
[17674] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17674] dbg: FuzzyOcr: Stderr: /tmp/.spamassassin17547AFJ63Ztmp/spam.gif.err
[17547] dbg: FuzzyOcr: Saved pid: 17674
[17547] dbg: FuzzyOcr: Elapsed [17674]: 0.173627 sec. (/usr/bin/giftopnm: exit 
0)
[17547] info: FuzzyOcr: Calculating image hash for: 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17681] dbg: FuzzyOcr: Exec : /usr/bin/ppmhist -noheader 
/tmp/.spamassassin17547AFJ63Ztmp/spam.gif.pnm
[17681] dbg: FuzzyOcr: Stdout: /tmp/.spamassassin17547AFJ63Ztmp/ppmhist.info
[17681] dbg: FuzzyOcr: Stderr: /dev/null
[17547] dbg: FuzzyOcr: Saved pid: 17681
[17547] dbg: FuzzyOcr: Elapsed [17681]: 0.022073 sec. (/usr/bin/ppmhist: exit 0)
[17547] dbg: FuzzyOcr: Got: 
445299:342:434:7::252:254:252:253:90487::44:106:172:95:21369::84:150:204:136:16164::108:182:220:164:8414::20:74:140:65:7329::252:206:4:197:2789
[17547] dbg: FuzzyOcr: delete from FuzzyOcr.Hash where Hash.check < 1186058256
[17547] info: FuzzyOcr: Found[Safe]: Score='0.000' Info: ''
[17547] info: FuzzyOcr: Matched [2] time(s). Prev match: 9 min. 52 sec. ago
[17547] dbg: FuzzyOcr: update FuzzyOcr.Safe set 
Safe.match='2',Safe.check='1187354257' where 
Safe.key='252:254:252:253:90487::44:106:172:95:21369::84:150:204:136:16164::108:182:220:164:8414::20:74:140:65:7329::252:206:4:197:2789'
[17547] info: FuzzyOcr: Image in KNOWN_GOOD. Skipping OCR checks...
[17547] dbg: FuzzyOcr: Remove DIR: /tmp/.spamassassin17547AFJ63Ztmp
[17547] dbg: FuzzyOcr: FuzzyOcr ending successfully...
[17547] dbg: FuzzyOcr: Processed in 0.345128 sec.

My best regards,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-17 Thread Paweł Tęcza
Henrik Krohns [EMAIL PROTECTED] writes:
[...]
 My hardware seems to be good enough.  It's a Sun Fire X4100 M2 server
 with 2 x Dual-Core AMD Opteron 2220 SE CPUs and 8 GB of RAM on board,
 and it's bored with its job ;)  I think I rather need faster disks.

 With that amount of memory you won't see much disk activity. You can happily
 increase mysql buffer cache sizes to a GB or two. It's all basic mysql
 tuning.

Hi Henrik,

It's a good suggestion. Thanks a lot! :)
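
For the record, here is a rough sketch of what "a GB or two" could look
like as a my.cnf fragment on an 8 GB box. The 25% fraction and the function
name are assumptions of mine, not a recommendation from the thread;
innodb_buffer_pool_size and key_buffer_size are the usual MySQL variables
for the InnoDB and MyISAM caches respectively.

```python
def mysqld_buffer_settings(ram_gb, fraction=0.25, engine="InnoDB"):
    """Render a [mysqld] fragment giving `fraction` of RAM to the main cache."""
    size_mb = int(ram_gb * 1024 * fraction)
    # innodb_buffer_pool_size caches InnoDB data and indexes;
    # key_buffer_size caches MyISAM index blocks only.
    variable = "innodb_buffer_pool_size" if engine == "InnoDB" else "key_buffer_size"
    return f"[mysqld]\n{variable} = {size_mb}M\n"


if __name__ == "__main__":
    # 8 GB of RAM as on the server described above; 25% is an assumed share.
    print(mysqld_buffer_settings(8))
```

The right value depends on what else runs on the machine, so this is a
starting point to measure from rather than a final setting.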

My best regards,

Pawel


Re: Suggested botnet rule scores

2007-08-17 Thread Paweł Tęcza
Henrik Krohns [EMAIL PROTECTED] writes:
[...]
 If you want a simple solution, you can try http://sa.hege.li/ for BadRelay
 plugin.

Interesting license... ;)

Have a nice day,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-17 Thread Paweł Tęcza
Pawel Sasin [EMAIL PROTECTED] writes:
[...]
 You said you have several servers running spamd - if updates are
 causing you much trouble then you could disable bayes_auto_learn on
 most of the servers, so that only some of them (down to 1) would
 update your bayes DB, while the others would just query it.

Thanks for another hint, Pawel!  I didn't think about it :)
I agree it's a better solution than disabling Bayes everywhere.
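
A minimal sketch of that setup, assuming each scanning host has its own
local.cf and that one host is picked as the only learner. The host names
below are hypothetical; use_bayes and bayes_auto_learn are the standard
SpamAssassin options.

```python
def bayes_learning_config(hosts, learner):
    """Map each host to a local.cf fragment; only `learner` auto-learns."""
    if learner not in hosts:
        raise ValueError(f"{learner!r} is not one of the scanning hosts")
    return {
        # Every host still queries Bayes; only the learner writes new tokens.
        host: f"use_bayes 1\nbayes_auto_learn {1 if host == learner else 0}\n"
        for host in hosts
    }


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for host, fragment in bayes_learning_config(["scan1", "scan2", "scan3"], "scan1").items():
        print(f"# {host}\n{fragment}")
```

Even the read-only hosts may still touch the database (for example,
updating token access times), so this should reduce the write load rather
than remove it entirely.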

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-16 Thread Paweł Tęcza
SM [EMAIL PROTECTED] writes:

 Hi Pawel,
 At 01:36 16-08-2007, Paweł Tęcza wrote:
[...]
Is it not a new kind of spam and Spamassassin should be improved
to fight it?  I'm not sure...

 No, it is not new.  I posted the following reply a few days back regarding
 this type of message, referred to as punctuation spam.

 The message hits BAYES_99 and FRT_PRICE.  As you did not include the
 headers, it's not possible to tell whether it would hit some of the DYNAMIC
 rules as well.

Hello mysterious SM! ;)

Thanks a lot for the reply and the explanation!

Here are the SpamAssassin headers for one of the spam mails we received:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on
anubis3.poczta.uw.edu.pl
X-Spam-Level: xxx
X-Spam-Status: Yes, score=19.3 required=5.0
tests=FH_HELO_EQ_D_D_D_D,FRT_PRICE,FRT_STRONG1,FRT_SYMBOL,HTML_MESSAGE,
MIME_QP_LONG_LINE,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,TVD_FUZZY_SYMBOL,
TVD_STOCK1 autolearn=disabled version=3.2.1
X-Spam-Report: =?ISO-8859-1?Q?
*  0.5 FH_HELO_EQ_D_D_D_D Helo is d-d-d-d
*  2.5 FRT_PRICE BODY: ReplaceTags: Price
*  3.6 FRT_SYMBOL BODY: ReplaceTags: Symbol
*  1.4 TVD_FUZZY_SYMBOL BODY: TVD_FUZZY_SYMBOL
*  2.9 FRT_STRONG1 BODY: ReplaceTags: Strong (1)
*  3.8 TVD_STOCK1 BODY: TVD_STOCK1
*  0.0 HTML_MESSAGE BODY: Wiadomo=b6=e6 zawiera kod HTML
*  1.8 MIME_QP_LONG_LINE RAW: Linia QP d=b3u=bfsza ni=bf 76
znak=f3w
*  2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Odebrane od systemu klasy
RELAY w/g:
*  bl.spamcop.net
*  [Blocked - see
http://www.spamcop.net/bl.shtml?89.191.164.221]
*  0.5 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
*  [89.191.164.221 listed in zen.spamhaus.org]?=
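
As a sanity check, the per-rule scores in a report like the one above can
be pulled out and summed with a few lines of Python. This is just a sketch;
the regular expression assumes that every rule line looks like
"*  <score> <RULE_NAME> ..." and that continuation lines carry no score.

```python
import re

# One "*  <score> <RULE_NAME> ..." line per rule; continuation lines lack a score.
RULE_LINE = re.compile(r"^\*\s+(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)")


def rule_scores(report):
    """Return {rule_name: score} parsed from the text of an X-Spam-Report."""
    scores = {}
    for line in report.splitlines():
        match = RULE_LINE.match(line.strip())
        if match:
            scores[match.group(2)] = float(match.group(1))
    return scores


REPORT = """\
*  0.5 FH_HELO_EQ_D_D_D_D Helo is d-d-d-d
*  2.5 FRT_PRICE BODY: ReplaceTags: Price
*  3.6 FRT_SYMBOL BODY: ReplaceTags: Symbol
*  1.4 TVD_FUZZY_SYMBOL BODY: TVD_FUZZY_SYMBOL
*  2.9 FRT_STRONG1 BODY: ReplaceTags: Strong (1)
*  3.8 TVD_STOCK1 BODY: TVD_STOCK1
*  0.0 HTML_MESSAGE BODY: Wiadomo=b6=e6 zawiera kod HTML
*  1.8 MIME_QP_LONG_LINE RAW: Linia QP d=b3u=bfsza ni=bf 76
znak=f3w
*  2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Odebrane od systemu klasy
RELAY w/g:
*  bl.spamcop.net
*  0.5 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
*  [89.191.164.221 listed in zen.spamhaus.org]
"""

if __name__ == "__main__":
    scores = rule_scores(REPORT)
    print(len(scores), "rules, total", round(sum(scores.values()), 1))
```

The displayed scores add up to 19.2, while the header says 19.3; the small
gap is consistent with each rule score being rounded to one decimal place
in the report.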

 Bill Landry suggested using chickenpox.cf and mangled.cf rules from SARE.

Thanks for the hint!  I'll take a look at them.

The result is that the spam was killing our MySQL database, because we
had ~50k queries per minute with INSERTs and UPDATEs of many tokens.
The only solution was to disable Bayes.

 MySQL can be optimized to handle such a load.  If you aren't using InnoDB for
 Bayesian storage, switch to it.

Now I use the MyISAM storage backend, because I just created the Bayesian
database using the SpamAssassin sql/bayes_mysql.sql file :)

Have a nice day,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-16 Thread Paweł Tęcza
Paweł Tęcza [EMAIL PROTECTED] writes:
[...]
 Here are the SpamAssassin headers for one of the spam mails we received:

Oops!  That was a spam received after we had disabled Bayes.  Below are
the headers of a spam we scanned before that:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on
anubis4.poczta.uw.edu.pl
X-Spam-Level: x
X-Spam-Status: Yes, score=21.5 required=5.0
tests=AXB_XMID_1212,BAYES_99,FH_HELO_EQ_D_D_D_D,FRT_PRICE,FRT_STRONG1,
HELO_DYNAMIC_IPADDR,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RDNS_DYNAMIC,
STOX_REPLY_TYPE,TVD_STOCK1 autolearn=spam version=3.2.1
X-Spam-Report: =?ISO-8859-1?Q?
*  3.5 BAYES_99 BODY: Bayesowskie prawdopodobie=f1stwo spamu
wynosi 99 do
*  100%
*  [score: 1.]
*  0.0 STOX_REPLY_TYPE STOX_REPLY_TYPE
*  0.0 FH_HELO_EQ_D_D_D_D Helo is d-d-d-d
*  2.4 HELO_DYNAMIC_IPADDR Relay HELO'd using suspicious
hostname (IP addr
*  1)
*  3.5 AXB_XMID_1212 Barbera Fingerprint
*  0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
*  [90.14.168.63 listed in zen.spamhaus.org]
*  0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic
IP address
*  [90.14.168.63 listed in dnsbl.sorbs.net]
*  3.5 FRT_PRICE BODY: ReplaceTags: Price
*  3.0 FRT_STRONG1 BODY: ReplaceTags: Strong (1)
*  3.8 TVD_STOCK1 BODY: TVD_STOCK1
*  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
*  dynamic-looking rDNS?=

What can you tell about it now? :)

Regards,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-16 Thread Paweł Tęcza
Kai Schaetzl [EMAIL PROTECTED] writes:

 Paweł Tęcza wrote on Thu, 16 Aug 2007 12:25:48 +0200:

 What can you tell about it now?

 I think that's not the point, or is it? You don't seem to have a problem 
 with detection but with token storage slowness on SQL as these fuzzy 
 mails seem to generate a lot of unique tokens. Is that what you wanted to 
 get fixed?

Hi Kai,

I would like to do two things:

1. speed up my MySQL server
2. decrease the number of unique tokens stored for punctuation spam

The first of them is a task for me, of course.  But the second
is rather Spamassassin's job.

I'm wondering whether it's really necessary to keep *all* tokens
for that kind of spam...  Maybe Spamassassin could save only
some part of them?  What's your opinion about it?

My best regards,

Pawel


Re: Spam kills my MySQL with Bayes

2007-08-16 Thread Paweł Tęcza
Kai Schaetzl [EMAIL PROTECTED] writes:

 Paweł Tęcza wrote on Thu, 16 Aug 2007 14:28:05 +0200:
[...]
 But the second
 is rather Spamassassin's job.
 
 I'm thinking whether it's really necessary to keep *all* tokens
 for that kind of spam...  Maybe Spamassassin could save only
 some part of them?  What's your opinion about it?

 I really don't know enough about Bayes and SA to say much about it. I 
 think it would be difficult for SA to determine what are good and bad 
 tokens.

Yes, it can be difficult to determine, but what about a configurable
plugin option for the maximum number of tokens to store per message?
If it doesn't exist yet, of course ;)
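
To make the idea concrete, here is a sketch of what such a cap could do.
It is purely hypothetical: as far as this thread goes, no such option
exists, and the function name and the "keep the first N distinct tokens"
policy are assumptions chosen only for illustration.

```python
def cap_tokens(tokens, max_tokens):
    """Keep the first `max_tokens` distinct tokens, preserving message order."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    kept = []
    seen = set()
    for token in tokens:
        if token in seen:
            continue  # duplicates do not count against the cap
        if len(kept) >= max_tokens:
            break  # cap reached: the remaining tokens are not stored
        seen.add(token)
        kept.append(token)
    return kept


if __name__ == "__main__":
    # Punctuation spam tends to split words into many tiny unique tokens.
    print(cap_tokens(["v", "1", "a", "v", "g", "r", "a"], 3))
```

Keeping the first N tokens is the simplest policy but probably not the
best one; a real option might rather keep the tokens that are most
informative for classification.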

Pawel