Re: Bayes timeouts and database handle being DESTROY'd without explicit disconnect

2010-10-26 Thread Micah Anderson
Dominic Benson domi...@lenny.cus.org writes:

 On 19 Oct 2010, at 17:05, Micah Anderson wrote:

 
 Hello,
 
 I'm running a busy mail server. We've got a bayes database on its own
 server, with InnoDB tables. 

 What is your total DB size / server RAM? Could you include a snapshot of the 
 output of top from the DB server? I would guess that your problem is 
 indexing/tuning or server capacity MySQL side rather than in SA, but without 
 more data it is just a guess.

The databsae size is 2.74gig.

$ free
 total   used   free sharedbuffers cached
Mem:   805587668727401183136  0 5840325403916
-/+ buffers/cache: 8847927171084
Swap:  1959912 5694321390480

top - 07:26:39 up 10 days, 20:37,  1 user,  load average: 9.24, 6.80, 6.15
Tasks:  24 total,   2 running,  22 sleeping,   0 stopped,   0 zombie
Cpu(s): 83.3%us, 16.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.5%si,  0.0%st
Mem:   8055876k total,  6890032k used,  1165844k free,   584364k buffers
Swap:  1959912k total,   569432k used,  1390480k free,  5405264k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND   
10744 mysql 20   0  655m 110m 5500 S  190  1.4   9296:14 mysqld 
10765 stunnel4  20   0  123m 109m 1416 S2  1.4 179:38.73 stunnel4   
1 root  20   0  1984  636  548 S0  0.0   2:40.15 init   
  397 bind  20   0 82856  23m 2632 S0  0.3   0:46.72 named  
 1812 root  20   0  3120 1176  772 S0  0.0   0:15.04 syslog-ng  
 3551 messageb  20   0  2488  648  488 S0  0.0   0:00.00 dbus-daemon
 3610 nobody20   0  6368 2668  888 S0  0.0   0:11.94 nagios-statd   
 4828 root  20   0  5484 1824 1476 S0  0.0   0:09.44 master 
10707 root  20   0  3784 1276 1076 S0  0.0   0:00.02 mysqld_safe
10745 root  20   0  2892  608  532 S0  0.0   0:00.00 logger 
10760 stunnel4  20   0  3836  688  348 S0  0.0   1:25.14 stunnel4   
10761 stunnel4  20   0  3836  692  352 S0  0.0   1:16.94 stunnel4   
10762 stunnel4  20   0  3836  692  352 S0  0.0   1:16.24 stunnel4   
10763 stunnel4  20   0  3836  692  352 S0  0.0   1:16.45 stunnel4   
10764 stunnel4  20   0  3836  692  352 S0  0.0   1:20.77 stunnel4   
11311 root  20   0  2044  888  704 S0  0.0   0:09.02 cron   
15444 postfix   20   0  5496 1788 1452 S0  0.0   0:00.00 pickup 

I'm averaging around 150 mysql threads, with peaks during peak mail
times. 

 and a few of these, although not that many:
 
 Oct 17 12:02:29 spamd3 spamd[6367]: prepare_cached(SELECT max(runtime) from 
 bayes_expire WHERE id = ?) statement handle DBI::st=HASH(0xadbb060)still 
 Active at /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 722


 Try an EXPLAIN SELECT max(runtime) from bayes_expire WHERE id = some value; 
 as you know it to be slow it might give a clue where to look to improve 
 performance. Or try turning the general query log on for a while and see what 
 queries are taking up time. MonYog is quite a nice frontend to this, but you 
 can do it by hand fairly simply.

mysql EXPLAIN SELECT max(runtime) from bayes_expire WHERE id = 5;
++-+--+--+---+---+-+---+--+---+
| id | select_type | table| type | possible_keys | key  
 | key_len | ref   | rows | Extra |
++-+--+--+---+---+-+---+--+---+
|  1 | SIMPLE  | bayes_expire | ref  | bayes_expire_idx1 | 
bayes_expire_idx1 | 2   | const |  198 |   | 
++-+--+--+---+---+-+---+--+---+
1 row in set (0.00 sec)

Note, this might be related to the post I made today about sa-learn
--expire taking hours... 

micah



Re: Bayes timeouts and database handle being DESTROY'd without explicit disconnect

2010-10-26 Thread Dominic Benson

On 26/10/10 15:38, Micah Anderson wrote:

The databsae size is 2.74gig.


top - 07:26:39 up 10 days, 20:37,  1 user,  load average: 9.24, 6.80, 6.15
Tasks:  24 total,   2 running,  22 sleeping,   0 stopped,   0 zombie
Cpu(s): 83.3%us, 16.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.5%si,  0.0%st
Mem:   8055876k total,  6890032k used,  1165844k free,   584364k buffers
Swap:  1959912k total,   569432k used,  1390480k free,  5405264k cached

   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
10744 mysql 20   0  655m 110m 5500 S  190  1.4   9296:14 mysqld


I'm averaging around 150 mysql threads, with peaks during peak mail
times.

   


The thing that jumps out at me from this is that MySQL is only using 
~112MB of memory; sure your FS cache is a respectable size, but I would 
expect MySQL itself to want to use some of the 1GB free memory. Which 
would suggest that my.cnf needs tuning somewhat.


Have a look at SHOW STATUS; - a few particular things come to mind:
Innodb_buffer_pool_read_requests/Innodb_buffer_pool_reads
Innodb_buffer_pool_wait_free
Innodb_log_waits
Handler_read_%
Created_tmp_%
Sort_%

and a few of these, although not that many:

Oct 17 12:02:29 spamd3 spamd[6367]: prepare_cached(SELECT max(runtime) from 
bayes_expire WHERE id = ?) statement handle DBI::st=HASH(0xadbb060)still Active 
at /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 722
   


Try an EXPLAIN SELECT max(runtime) from bayes_expire WHERE id =some value; as 
you know it to be slow it might give a clue where to look to improve performance. Or 
try turning the general query log on for a while and see what queries are taking up 
time. MonYog is quite a nice frontend to this, but you can do it by hand fairly 
simply.
 

mysql  EXPLAIN SELECT max(runtime) from bayes_expire WHERE id = 5;
++-+--+--+---+---+-+---+--+---+
| id | select_type | table| type | possible_keys | key  
 | key_len | ref   | rows | Extra |
++-+--+--+---+---+-+---+--+---+
|  1 | SIMPLE  | bayes_expire | ref  | bayes_expire_idx1 | 
bayes_expire_idx1 | 2   | const |  198 |   |
++-+--+--+---+---+-+---+--+---+
1 row in set (0.00 sec)
   


This looks well indexed. I can only see it taking a long time if there 
are locking issues with other queries.

Note, this might be related to the post I made today about sa-learn
--expire taking hours...
   


Very probably. Try SHOW FULL PROCESSLIST to see the whole query. 
bayes_token looks like a problem table - how many rows are in it?

micah

   

Dominic



Bayes timeouts and database handle being DESTROY'd without explicit disconnect

2010-10-19 Thread Micah Anderson

Hello,

I'm running a busy mail server. We've got a bayes database on its own
server, with InnoDB tables. 

I'm seeing a number of these entries in my log files and am struggling
to determine what could be causing them and how to fix them:

Oct 19 07:02:10 spamd3 spamd[27474]: learn: exceeded time limit in pms learn
Oct 17 06:30:12 spamd3 spamd[25651]: plugin: eval failed: bayes: (in learn) 
__alarm__ignore__(15190)
Oct 17 06:30:42 spamd3 spamd[25598]: plugin: eval failed: bayes: (in learn) 
child processing timeout at /usr/sbin/spamd line 1283, GEN1295 line 185.

I get quite a few of these:

Oct 19 07:02:19 spamd3 spamd[18746]: Issuing rollback() for database handle 
being DESTROY'd without explicit disconnect() at 
/usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm line 1516, GEN19133 line 2.

and a few of these, although not that many:

Oct 17 12:02:29 spamd3 spamd[6367]: prepare_cached(SELECT max(runtime) from 
bayes_expire WHERE id = ?) statement handle DBI::st=HASH(0xadbb060)still Active 
at /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 722

Oct 19 05:33:13 spamd3 spamd[1630]: bayes: db_seen corrupt: value='1287482415' 
for 5d6fb52248450ee7528848c3a78b5a0650a24...@sa_generated, ignored at 
/usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm line 397, GEN18675 line 
112.

thanks for any insights!
micha


pgpOWKtRHjXPz.pgp
Description: PGP signature


Re: Bayes timeouts and database handle being DESTROY'd without explicit disconnect

2010-10-19 Thread Dominic Benson

On 19 Oct 2010, at 17:05, Micah Anderson wrote:

 
 Hello,
 
 I'm running a busy mail server. We've got a bayes database on its own
 server, with InnoDB tables. 

What is your total DB size / server RAM? Could you include a snapshot of the 
output of top from the DB server? I would guess that your problem is 
indexing/tuning or server capacity MySQL side rather than in SA, but without 
more data it is just a guess.

 
 I'm seeing a number of these entries in my log files and am struggling
 to determine what could be causing them and how to fix them:
 
 Oct 19 07:02:10 spamd3 spamd[27474]: learn: exceeded time limit in pms learn
 Oct 17 06:30:12 spamd3 spamd[25651]: plugin: eval failed: bayes: (in learn) 
 __alarm__ignore__(15190)
 Oct 17 06:30:42 spamd3 spamd[25598]: plugin: eval failed: bayes: (in learn) 
 child processing timeout at /usr/sbin/spamd line 1283, GEN1295 line 185.
 
 I get quite a few of these:
 
 Oct 19 07:02:19 spamd3 spamd[18746]: Issuing rollback() for database handle 
 being DESTROY'd without explicit disconnect() at 
 /usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm line 1516, GEN19133 line 
 2.
 

If the processing timeout above occurs inside an uncommitted transaction, you 
will get this. 

 and a few of these, although not that many:
 
 Oct 17 12:02:29 spamd3 spamd[6367]: prepare_cached(SELECT max(runtime) from 
 bayes_expire WHERE id = ?) statement handle DBI::st=HASH(0xadbb060)still 
 Active at /usr/share/perl5/Mail/SpamAssassin/BayesStore/SQL.pm line 722


Try an EXPLAIN SELECT max(runtime) from bayes_expire WHERE id = some value; 
as you know it to be slow it might give a clue where to look to improve 
performance. Or try turning the general query log on for a while and see what 
queries are taking up time. MonYog is quite a nice frontend to this, but you 
can do it by hand fairly simply.

 
 Oct 19 05:33:13 spamd3 spamd[1630]: bayes: db_seen corrupt: 
 value='1287482415' for 5d6fb52248450ee7528848c3a78b5a0650a24...@sa_generated, 
 ignored at /usr/share/perl5/Mail/SpamAssassin/Plugin/Bayes.pm line 397, 
 GEN18675 line 112.
 
 thanks for any insights!
 micha

Dominic