subject:"Serious MySQL internal deadlock"

Re: Re[2]: Serious MySQL internal deadlock

2001-02-14 Thread Andreas Steinmetz


OK,
the problem doesn't occur with --skip-locking. Still, I don't believe this to
be a lockd problem as the partition mysqld is working on is a local ext2 fs.
lockd isn't even running on the test system (no nfs). So, this leaves either
(in no particular sequence):
1. a mysql problem
2. a glibc 2.2 problem
3. a kernel (2.2.18) problem
I understand that you can't investigate further as --skip-locking is an easy
workaround. I will, however, try to investigate, if I do find some spare time.
If I do find out what's going on in case of --enable-locking I'll let you know.

On 14-Feb-2001 Sinisa Milivojevic wrote:
 Andreas Steinmetz writes:
   Hi,
   just FYI: the deadlock problem is still in MySQL 3.23.33, the test I did
 send
   you behaves as in 3.23.32.
   
   
   Andreas Steinmetz
   D.O.M. Datenverarbeitung GmbH
   
 
 Thank you for contacting me, as I was just now starting to search for
 your e-mail address !!
 
 Your test case did cause a deadlock, but this is not MySQL fault !!
 
 As our manual clearly says, lockd is broken on Linux and some other
 systems, so --enable-locking will quite often result in a deadlock.
 
 If you start mysqld with --skip-locking , there are no problems. 
 
 Simply never use --enable-locking on OS's with broken lock daemon.
 
 We do not have time to fix it !!
 
 
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaca, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

2001-02-07 Thread Andreas Steinmetz


Hi,
first of all: the deadlock happened again today, this time with no slave
running so it isn't a replication issue. It seems we're getting closer as when
I did run 'show processlist', the pending query was (excerpt from output):

1666logreader   10.1.1.4syslog  Query   114 Sending data   
select max(swap_used+swap_free) from pcp where host='apollo' and
time='2001020614' and time'20

This is the same type of query as the last time the deadlock occured. Actually
looking with netstat there was no more connection between the webserver on
10.1.1.4 and the database. Then I tried to kill the thread which resulted in
(excerpt from 'show processlist' output):

1666logreader   10.1.1.4syslog  Killed  850 Sending data   
select max(swap_used+swap_free) from pcp where host='apollo' and
time='2001020614' and time'20

The slow query log didn't show the query. After killing mysql (with SIGTERM)
the slow query log contained:

# Time: 010207 14:38:22
# User@Host: logreader[logreader] @  [10.1.1.4]
# Time: 891  Lock_time: 0  Rows_sent: 1
use syslog;
select max(swap_used+swap_free) from pcp where host='apollo' and
time='2001020614' and time'20010207141000';

Running the 'explain' statement on the query shows something very interesting:

mysql explain select max(swap_used+swap_free) from pcp where host='apollo' and
time='2001020614' and time'20010207141000';
+---+--+---+--+-+---+--++
| table | type | possible_keys | key  | key_len | ref   | rows | Extra  |
+---+--+---+--+-+---+--++
| pcp   | ref  | time,host | host |   8 | const | 8318 | where used |
+---+--+---+--+-+---+--++
1 row in set (0.00 sec)
 
mysql explain select max(swap_used+swap_free) from pcp where host='apollo' and
time='2001020614' and time'20010207141000';
+---+--+---+--+-+---+--++
| table | type | possible_keys | key  | key_len | ref   | rows | Extra  |
+---+--+---+--+-+---+--++
| pcp   | ref  | time,host | host |   8 | const | 5549 | where used |
+---+--+---+--+-+---+--++
1 row in set (0.00 sec)
 
mysql

Both commands were run within about a minute. The number of rows, however, is
vastly different, especially the amount of rows of the first query is vastly 
larger than the amount of rows of the second query, but there is no delete on
this table, only inserts and queries. I now did count the total rows in the
table for host apollo:

mysql select count(time) from pcp where host='apollo';
+-+
| count(time) |
+-+
|6684 |
+-+
1 row in set (0.08 sec)
 
mysql

Then I counted the rows for the time range:

mysql select count(time) from pcp where host='apollo' and
time='2001020614' and time'20010207141000';
+-+
| count(time) |
+-+
|1427 |
+-+
1 row in set (0.09 sec)
 
mysql

The latter value is somewhat OK, given the fact that there's one insert/minute
for every host (1440 inserts a day) and that there was a database deadlock
during the given time span so some inserts were lost while I didn't restart
mysql during the search for the reason of the deadlock.

As an additional information 'max(swap_used+swap_free)' currently results in
the same value (1GByte) for all rows of host apollo,during the given time range
all rows do contain swap_used=0 and swap_free=1GByte.

The structure of the table in the query in question runs against is as follows:

CREATE TABLE pcp (
   time timestamp(14),
   host varchar(8) NOT NULL,
   cpu_user smallint(5) unsigned DEFAULT '0' NOT NULL,
   cpu_nice smallint(5) unsigned DEFAULT '0' NOT NULL,
   cpu_sys smallint(5) unsigned DEFAULT '0' NOT NULL,
   cpu_idle smallint(5) unsigned DEFAULT '0' NOT NULL,
   mem_used int(10) unsigned DEFAULT '0' NOT NULL,
   mem_free int(10) unsigned DEFAULT '0' NOT NULL,
   mem_shared int(10) unsigned DEFAULT '0' NOT NULL,
   mem_cached int(10) unsigned DEFAULT '0' NOT NULL,
   mem_bufmem int(10) unsigned DEFAULT '0' NOT NULL,
   swap_free int(10) unsigned DEFAULT '0' NOT NULL,
   swap_used int(10) unsigned DEFAULT '0' NOT NULL,
   nprocs smallint(5) unsigned DEFAULT '0' NOT NULL,
   in_lo int(10) unsigned DEFAULT '0' NOT NULL,
   in_bond0 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth0 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth1 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth2 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth3 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth4 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth5 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth6 int(10) unsigned DEFAULT '0' NOT NULL,
   in_eth7 int(10) unsigned DEFAULT '0' NOT NULL,
   out_lo int(10) unsigned DEFAULT '0' NOT NULL,

Re[4]: Serious MySQL internal deadlock

2001-02-06 Thread Peter Zaitsev


Hello Sinisa,

Sunday, February 04, 2001, 3:15:21 PM, you wrote:

SM Peter Zaitsev writes:
SM   Hello Andreas,
SM   
SM   Thursday, February 01, 2001, 7:42:31 PM, you wrote:
SM   
SM   
SM   I must confirm the problem with table locks. Mysql realy may deadlock
SM   sometimes, and the funny thing is the solution to this case is to kill
SM   the oldest locked thread waiting this condition - afterwards
SM   everything resolves. So this may mean something like broadcast is lost
SM   sometimes (?)
SM   
SM   


SM Hi!

SM I guess we may sound to be obnoxious, but can you make a repeatable
SM case out of it ??

Well. I wish I could - I was never able to repeate this, althought it
appears again and again - on my 15 servers under mysql I usually see
this about once per week, so it seems to be seldom one :)


-- 
Best regards,
 Petermailto:[EMAIL PROTECTED]



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

2001-02-06 Thread Andreas Steinmetz


Hi,
well, the query is over a network. Single switch with two VLANs between the
systems, network speed is *200MBit/s*. The php code structure executing was:

... prepare query ...
mysql_connect();
mysql_query();
mysql_fetch_row();
mysql_close();
... process and send data to browser ...

The table the query executed on did have a total of about 25000 rows at that
time. The query executing had to return exacty one row (max()). When the query
thread did terminate all threads waiting for inserts did keep *hanging* either
in state 'System lock' or state 'Opening table'. Thus:

The whole database was *deadlocked* with only insert queries pending after a
a data retrieval query thread did execute. In this state a regular shutdown
of the database is *not* possible, you have to send SIGTERM to shut down
mysql. Oh, the same thing did happen *again* today.

The behaviour looks like in some cases mysql doesn't unlock after a data
retrieval query or that pending inserts don't get properly restarted.

As a hint the inserting daemons keep continously connected to mysql as long as
the database is up, there are automatic reconnect attempts in case the database
is not available.

To finally assert that replication isn't involved I'm running this now with the
slaves stopped.

If it happens again I will try to set up code that is able to reproduce the
problem on a test system (single database without replication there).

On 06-Feb-2001 Sinisa Milivojevic wrote:
 Hi!
 
 First of all, please do not send us entire schema and your PHP
 scripts, as they make sense on your site only.
 
 Second, the above processlist does not show any deadlock. There is a
 query that is sending data, on which INSERT and UPDATE queries are
 waiting for.
 
 Judging by the above output I would also recommend you to increase
 table_cache and speed up connections if possible. If the above
 "Sending data" is done over network, you could turn compression on or
 make faster network.
 
 
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

2001-02-06 Thread Sinisa Milivojevic


Andreas Steinmetz writes:
  Hi,
  well, the query is over a network. Single switch with two VLANs between the
  systems, network speed is *200MBit/s*. The php code structure executing was:
  
  ... prepare query ...
  mysql_connect();
  mysql_query();
  mysql_fetch_row();
  mysql_close();
  ... process and send data to browser ...
  
  The table the query executed on did have a total of about 25000 rows at that
  time. The query executing had to return exacty one row (max()). When the query
  thread did terminate all threads waiting for inserts did keep *hanging* either
  in state 'System lock' or state 'Opening table'. Thus:
  
  The whole database was *deadlocked* with only insert queries pending after a
  a data retrieval query thread did execute. In this state a regular shutdown
  of the database is *not* possible, you have to send SIGTERM to shut down
  mysql. Oh, the same thing did happen *again* today.
  
  The behaviour looks like in some cases mysql doesn't unlock after a data
  retrieval query or that pending inserts don't get properly restarted.
  
  As a hint the inserting daemons keep continously connected to mysql as long as
  the database is up, there are automatic reconnect attempts in case the database
  is not available.
  
  To finally assert that replication isn't involved I'm running this now with the
  slaves stopped.
  
  If it happens again I will try to set up code that is able to reproduce the
  problem on a test system (single database without replication there).
  


HI!

Did INSERT queries changed their state after SELECT has finished
sending data ??

What OS is that and is there anything in the error log file that might
indicate some problems ??

If you are using Linux, there could be some problems with fast
Ethernet cards in duplex mode. If that is a case, please try disabling
duplexing (on hubs and switches). Also enable compression. 

And yes, we would be very much interested in repeatable test case.


Regards,

Sinisa

    __ _   _  ___ ==  MySQL AB
 /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
/*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
   /*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
  /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
  /*/^^^\*\^^^
 /*/ \*\Developers Team

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

2001-02-06 Thread Andreas Steinmetz


Hi,
the inserts did not change the state after the select finished. The network is
running full duplex but the only problem I ever encountered with the cards was
initalization on fast systems. I fixed this and posted it to the linux kernel
mailing list (search for epic100).
If the problem continues to appear with the slaves stopped I'll try to code a
reproducable test case.
The only thing that's visible is the following entry in the slow query log
which matches the process list attached to my earlier mail. The funny thing is
that this query will result in exactly one row and according to the log the
this row was sent. The query time, however, of 277 seconds for a somewhat
simple query makes me wonder...

# Time: 010205 23:13:00
# User@Host: logreader[logreader] @  [10.1.1.4]
# Time: 277  Lock_time: 0  Rows_sent: 1
use syslog;
select max(swap_used+swap_free) from pcp where host='castor' and
time='2001020423' and time'20010205231000';

On 06-Feb-2001 Sinisa Milivojevic wrote:
 Andreas Steinmetz writes:
   Hi,
   well, the query is over a network. Single switch with two VLANs between
 the
   systems, network speed is *200MBit/s*. The php code structure executing
 was:
   
   ... prepare query ...
   mysql_connect();
   mysql_query();
   mysql_fetch_row();
   mysql_close();
   ... process and send data to browser ...
   
   The table the query executed on did have a total of about 25000 rows at
 that
   time. The query executing had to return exacty one row (max()). When the
 query
   thread did terminate all threads waiting for inserts did keep *hanging*
 either
   in state 'System lock' or state 'Opening table'. Thus:
   
   The whole database was *deadlocked* with only insert queries pending after
 a
   a data retrieval query thread did execute. In this state a regular
 shutdown
   of the database is *not* possible, you have to send SIGTERM to shut down
   mysql. Oh, the same thing did happen *again* today.
   
   The behaviour looks like in some cases mysql doesn't unlock after a data
   retrieval query or that pending inserts don't get properly restarted.
   
   As a hint the inserting daemons keep continously connected to mysql as
 long as
   the database is up, there are automatic reconnect attempts in case the
 database
   is not available.
   
   To finally assert that replication isn't involved I'm running this now
 with the
   slaves stopped.
   
   If it happens again I will try to set up code that is able to reproduce
 the
   problem on a test system (single database without replication there).
   
 
 
 HI!
 
 Did INSERT queries changed their state after SELECT has finished
 sending data ??
 
 What OS is that and is there anything in the error log file that might
 indicate some problems ??
 
 If you are using Linux, there could be some problems with fast
 Ethernet cards in duplex mode. If that is a case, please try disabling
 duplexing (on hubs and switches). Also enable compression. 
 
 And yes, we would be very much interested in repeatable test case.
 
 
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

2001-02-05 Thread Andreas Steinmetz


Hi,
unfortunately I can't set up a scenario that easily reproduces the problem. The
last lockups happened on saturday and today (monday). What I can confirm is:

1. It doesn't seem to be a slave related problem. During the last two lockups I
checked the master as well as the slave statuses, all were in sync. I then
issued 'slave stop' on both slaves, killed the master (TERM), restarted it and
then issued 'slave start' on both slaves then continued syncing just fine.

2. The lockup always happens when there's a concurrent insert and a query.
Today I just looked at some performance statiststics (a web page with 6
generated performance graphics) when the lockup occured. 5 of the 6 graphics
were complete, the last one didn't complete and the database lockup was back
again.

What I could do would be to send you the database schema and some of the php
report scripts.

Fortunately I managed today to react fast to retrieve the process list
(attached). Looking at it, could it be that the problem occurs when a query is
processed on a table and more than one insert to the same table is pending?

The termination of id 584 didn't help. All further inserts were stuck in state
'System lock' (all but 1) or 'Opening table' (1, this was the successor of id 2
which I killed to test what would happen after a reconnect) and all further
queries were stuck in state 'Opening tables'.

As a remainder: I already tried mandatory write locks for the inserters and
mandatory read locks for the queriers which didn't help.
I didn't check, if more than one database is affected by the lockup.

On 04-Feb-2001 Sinisa Milivojevic wrote:
 Peter Zaitsev writes:
   Hello Andreas,
   
   Thursday, February 01, 2001, 7:42:31 PM, you wrote:
   
   
   I must confirm the problem with table locks. Mysql realy may deadlock
   sometimes, and the funny thing is the solution to this case is to kill
   the oldest locked thread waiting this condition - afterwards
   everything resolves. So this may mean something like broadcast is lost
   sometimes (?)
   
   
 
 
 Hi!
 
 I guess we may sound to be obnoxious, but can you make a repeatable
 case out of it ??
 
 d
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH


+-+---+---++-++-
---+
--+
| Id  | User  | Host  | db | Command | Time   | State   
   | Info   
  |
+-+---+---++-++-
---+
--+
|   1 | logdaemon | localhost | syslog | Sleep   | 17 | 
   | NULL   
  |
|   2 | logdaemon | 10.1.1.3  | syslog | Query   | 54 | System lock 
   | INSERT INTO pcp (time,host,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_
free,mem_shared,mem_cache |
|   3 | logdaemon | 10.1.1.6  | syslog | Query   | 43 | Opening table   
   | INSERT INTO pcp (time,host,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_
free,mem_shared,mem_cache |
|   4 | logdaemon | 10.1.1.4  | syslog | Sleep   | 16 | 
   | NULL   
  |
|   5 | logdaemon | 10.1.1.7  | syslog | Sleep   | 17 | 
   | NULL   
  |
|   7 | logdaemon | 10.1.1.2  | syslog | Sleep   | 17 | 
   | NULL   
  |
|  14 | replicate | 10.1.1.2  | NULL   | Binlog Dump | 176368 | Waiting for upda
te | NULL   
  |
|  15 | replicate | 10.1.1.6  | NULL   | Binlog Dump | 176358 | Waiting for upda
te | NULL   
  |
| 519 | dbmaster  | 10.1.1.4  | syslog | Query   | 0  | NULL
   | show processlist   
  |
| 584 | logreader | 10.1.1.4  | syslog | Query   | 54 | Sending data
   | select

Re: Re[2]: Serious MySQL internal deadlock

2001-02-04 Thread Sinisa Milivojevic


Peter Zaitsev writes:
  Hello Andreas,
  
  Thursday, February 01, 2001, 7:42:31 PM, you wrote:
  
  
  I must confirm the problem with table locks. Mysql realy may deadlock
  sometimes, and the funny thing is the solution to this case is to kill
  the oldest locked thread waiting this condition - afterwards
  everything resolves. So this may mean something like broadcast is lost
  sometimes (?)
  
  


Hi!

I guess we may sound to be obnoxious, but can you make a repeatable
case out of it ??

d
Regards,

Sinisa

    __ _   _  ___ ==  MySQL AB
 /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
/*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
   /*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
  /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
  /*/^^^\*\^^^
 /*/ \*\Developers Team

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Serious MySQL internal deadlock

2001-02-02 Thread Andreas Steinmetz



On 01-Feb-2001 Sinisa Milivojevic wrote:
 
 HI!
 
 Most probably processes are waiting for the slave to get updated.
 
 To circumvent the problem, you should : 
 
 - use our binary (if possible)
 
 - avoid LOCK TABLES, which truly is necessary only in some rare cases
 
 
 
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Definitely not a replication problem. Just happened again (see attachment) with
table locks removed (see my earlier mails). I then stopped the slave dbs and
killed the associated threads on the master. What you can easily see is that the
problem still persists. The database is locked for good.
This is where the fun really starts. When you kill all the threads (the log
daemons do recognize this and reconnect) *ALL* threads are in state system lock
(see second attachment). This clearly means to me that the database lock
handling is corrupted.


Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH


Id  UserHostdb  Command TimeState   Info
226 logdaemon   10.1.1.6syslog  Killed  64  System lock INSERT 
INTO pcp (time,host,c
pu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
448 logdaemon   10.1.1.3syslog  Query   64  System lock INSERT 
INTO pcp (time,host,cp
u_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
449 logdaemon   10.1.1.4syslog  Query   64  System lock INSERT 
INTO pcp (time,host,cp
u_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
450 logdaemon   10.1.1.7syslog  Query   64  System lock INSERT 
INTO pcp (time,host,cp
u_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
451 logdaemon   10.1.1.2syslog  Query   53  System lock INSERT 
INTO ipchains (time,ho
st,interface,proto,src_ip,src_port,dst_ip,dst_port,length,tos,id,frag_o
452 logdaemon   localhost   syslog  Query   51  System lock INSERT 
INTO pcp (time,host,c
pu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
453 dbmasterlocalhost   NULLQuery   0   NULLshow 
processlist


Id  UserHostdb  Command TimeState   Info
223 logdaemon   10.1.1.7syslog  Query   358 Opening table   INSERT 
INTO pcp (time,host
,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
224 logdaemon   10.1.1.4syslog  Query   417 Opening table   INSERT 
INTO pcp (time,host
,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
225 logdaemon   10.1.1.3syslog  Query   402 Opening table   INSERT 
INTO pcp (time,host
,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
226 logdaemon   10.1.1.6syslog  Query   415 Opening table   INSERT 
INTO pcp (time,host
,cpu_user,cpu_nice,cpu_sys,cpu_idle,mem_used,mem_free,mem_shared,mem_cache
227 logdaemon   10.1.1.2syslog  Query   1279System lock INSERT 
INTO ipchains (time,
host,interface,proto,src_ip,src_port,dst_ip,dst_port,length,tos,id,frag_o
228 logdaemon   localhost   syslog  Query   635 System lock INSERT 
INTO ipchains (time,
host,interface,proto,src_ip,src_port,dst_ip,dst_port,length,tos,id,frag_o
437 logreader   10.1.1.4syslog  Query   421 Opening tables  select 
src_ip,count(src_i
p) as val,count(distinct dst_ip) as tot from ipchains where time='20010202
438 system  localhost   mysql   Query   396 Opening tables  SELECT User, 
Select_priv FRO
M user where User = 'dbmaster'
444 dbmasterlocalhost   NULLQuery   0   NULLshow 
processlist



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Serious MySQL internal deadlock

2001-02-01 Thread Andreas Steinmetz


It seems that MySQL 3.23.32 has an internal deadlock problem, causing the
database to stop responding.

Situation:

6 processes (each on a different system) connect to a mysql database and insert
various system data into a set of MyISAM tables. Each process uses the user
'logdaemon' which has insert privilege to the required tables.
Data is extacted by php scripts using the user 'logreader', which has select
privilege to the required tables.

It happens at least once a day that MySQL stops responding. First I did assume
that missing locks could cause the problem, so I modified all accesses to the
database to use read and write locks on the required tables. This didn't help.

If you look at the attached output of the 'show processlist' command the most
obvious thing to see is that 2 IDs are in state 'System Lock' on the same table.
Killing the threads doesn't help. The thread state changes to 'killed' but
that's it. Regular shutdown of MySQL is impossible, all MySQL PIDs must be sent
the TERM signal.

Note that even as MySQL is compiled with BDB support there are only MyISAM
tables. All inserts are "regular" inserts, there is no 'insert delayed'. the
amount of inserts/day of all tables of this database is about 1. The amount
of selects is currently much lower (less than 1000/day).

Thus concurrent inserts and selects to the same table may easily cause a denial
of service condition.

The only perhaps non-standard thing I did here was not to set a default database
but to address the tables directly (db.table) which may be, ahem, not usual but
still is a valid way to access a table. I'm going now to change this, i.e. all
connecting processes will set the database and see if the problem persists.
Nevertheless this is a somewhat bad situation that should be resolved as soon
as possible.

As a side effect of this problem I had the database first set to replicate in
Robin Round manner between three systems. Unfortunately, when the above
situation happens and MySQL must be killed, replication is killed for good,
too, as the slave thread of the killed MySQL database will just loop in a
'waiting to reconnect after a failed read' state after MySQL restart. Thus
the current replication setup is already that there's just two shadow databases
for the master that handles all reads and writes.

This leads to an additional thought: What happens to replication after a power
outage or a system crash? Yes, I' using a ups but it can happen, that these
things fail, causing a home made power outage. It wouldn't be too good if a
non-standard shutdown of a slave would mean to restart replication from scratch.


Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH


Id  UserHostdb  Command TimeState   Info
7   replicate   10.1.1.2NULLBinlog Dump 78339   Waiting for 
update  NULL
9   replicate   10.1.1.6NULLBinlog Dump 78339   Waiting for 
update  NULL
187 logdaemon   10.1.1.3NULLQuery   7015System lock LOCK 
TABLES syslog.pcp WRITE
188 logdaemon   10.1.1.7NULLQuery   6963Opening tables  LOCK 
TABLES syslog.pcp WRI
TE
189 logdaemon   10.1.1.6NULLQuery   6961Opening tables  LOCK 
TABLES syslog.pcp WRI
TE
190 logdaemon   10.1.1.4NULLQuery   7017System lock LOCK 
TABLES syslog.pcp WRITE
191 logdaemon   10.1.1.2NULLQuery   6961Opening tables  LOCK 
TABLES syslog.pcp WRI
TE
192 logdaemon   localhost   NULLQuery   6963Opening tables  LOCK 
TABLES syslog.pcp WR
ITE
246 logreader   10.1.1.3NULLQuery   6465Opening tables  lock 
tables syslog.ipchain
s read
247 logreader   10.1.1.3NULLQuery   6447Opening tables  lock 
tables syslog.ipchain
s read
248 logreader   10.1.1.3NULLQuery   6230Opening tables  lock 
tables syslog.ipchain
s read
249 logreader   10.1.1.3NULLQuery   5651Opening tables  lock 
tables syslog.ipchain
s read
250 logreader   10.1.1.3NULLQuery   5640Opening tables  lock 
tables syslog.ipchain
s read
251 logreader   10.1.1.3NULLQuery   4606Opening tables  lock 
tables syslog.ipchain
s read
252 logreader   10.1.1.3NULLQuery   4493Opening tables  lock 
tables syslog.ipchain
s read
253 logreader   10.1.1.3NULLQuery   3445Opening tables  lock 
tables syslog.ipchain
s read
254 logreader   10.1.1.3NULLQuery   3145Opening tables  lock 
tables syslog.ipchain
s read
255 logreader   10.1.1.3NULLQuery   2329Opening tables  lock 
tables syslog.ipchain
s read
256 logreader   10.1.1.3NULLQuery   1332Opening tables  lock 
tables syslog.ipchain
s read
257 logreader   10.1.1.3NULLQuery   1088Opening tables  lock 
tables syslog.ipchain
s read

Re: Serious MySQL internal deadlock

2001-02-01 Thread Andreas Steinmetz



On 01-Feb-2001 Sinisa Milivojevic wrote:
 
 HI!
 
 Most probably processes are waiting for the slave to get updated.
 
 To circumvent the problem, you should : 
 
 - use our binary (if possible)
 
 - avoid LOCK TABLES, which truly is necessary only in some rare cases
 
 
 
 Regards,
 
 Sinisa
 
     __ _   _  ___ ==  MySQL AB
  /*/\*\/\*\   /*/ \*\ /*/ \*\ |*| Sinisa Milivojevic
 /*/ /*/ /*/   \*\_   |*|   |*||*| mailto:[EMAIL PROTECTED]
/*/ /*/ /*/\*\/*/  \*\|*|   |*||*| Larnaka, Cyprus
   /*/ /*/  /*/\*\_/*/ \*\_/*/ |*|
   /*/^^^\*\^^^
  /*/ \*\Developers Team
 

Actually I did avoid using lock tables but got hit by this problem. Thus I
tried with lock tables.

Using supplied binaries is no choice as for certain reasons all systems
involved are completely built from source.

I can't really see a reason why a slave being updated should lock the master
for good. Nevertheless there's just one programmable switch between the master
and the slaves involved and the network speed is 200MBits/s (channel bonding) so
this can't really be the reason for the problem.

If I can do anything to help to sort this out I'll happily do so.


Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re[2]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re[4]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re: Re[2]: Serious MySQL internal deadlock

Re: Serious MySQL internal deadlock

Serious MySQL internal deadlock

Re: Serious MySQL internal deadlock

11 matches

Site Navigation

Mail list logo

Footer information