-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Reading the latest comments at https://github.com/PyMySQL/PyMySQL/issues/275, it seems to me that the issue is not to be solved in drivers themselves but instead in libraries that arrange connections (sqlalchemy/oslo.db), correct?
Will the proposed connection reopening help? /Ihar On 05/12/14 23:43, Mike Bayer wrote: > Hey list - > > I’m posting this here just to get some ideas on what might be > happening here, as it may or may not have some impact on Openstack > if and when we move to MySQL drivers that are async-patchable, like > MySQL-connector or PyMySQL. I had a user post this issue a few > days ago which I’ve since distilled into test cases for PyMySQL and > MySQL-connector separately. It uses gevent, not eventlet, so I’m > not really sure if this applies. But there’s plenty of very smart > people here so if anyone can shed some light on what is actually > happening here, that would help. > > The program essentially illustrates code that performs several > steps upon a connection, however if the greenlet is suddenly > killed, the state from the connection, while damaged, is still > being allowed to continue on in some way, and what’s > super-catastrophic here is that you see a transaction actually > being committed *without* all the statements proceeding on it. > > In my work with MySQL drivers, I’ve noted for years that they are > all very, very bad at dealing with concurrency-related issues. The > whole “MySQL has gone away” and “commands out of sync” errors are > ones that we’ve all just drowned in, and so often these are due to > the driver getting mixed up due to concurrent use of a connection. > However this one seems more insidious. Though at the same time, > the script has some complexity happening (like a simplistic > connection pool) and I’m not really sure where the core of the > issue lies. > > The script is at > https://gist.github.com/zzzeek/d196fa91c40cb515365e and also below. > If you run it for a few seconds, go over to your MySQL command line > and run this query: > > SELECT * FROM table_b WHERE a_id not in (SELECT id FROM table_a) > ORDER BY a_id DESC; > > and what you’ll see is tons of rows in table_b where the “a_id” is > zero (because cursor.lastrowid fails), but the *rows are > committed*. If you read the segment of code that does this, it > should be impossible: > > connection = pool.get() rowid = execute_sql( connection, "INSERT > INTO table_a (data) VALUES (%s)", ("a",) ) > > gevent.sleep(random.random() * 0.2) try: execute_sql( connection, > "INSERT INTO table_b (a_id, data) VALUES (%s, %s)", (rowid, "b",) > ) connection.commit() pool.return_conn(connection) > > except Exception: connection.rollback() > pool.return_conn(connection) > > so if the gevent.sleep() throws a timeout error, somehow we are > getting thrown back in there, with the connection in an invalid > state, but not invalid enough to commit. > > If a simple check for “SELECT connection_id()” is added, this query > fails and the whole issue is prevented. Additionally, if you put a > foreign key constraint on that b_table.a_id, then the issue is > prevented, and you see that the constraint violation is happening > all over the place within the commit() call. The connection is > being used such that its state just started after the > gevent.sleep() call. > > Now, there’s also a very rudimental connection pool here. That is > also part of what’s going on. If i try to run without the pool, > the whole script just runs out of connections, fast, which suggests > that this gevent timeout cleans itself up very, very badly. > However, SQLAlchemy’s pool works a lot like this one, so if folks > here can tell me if the connection pool is doing something bad, > then that’s key, because I need to make a comparable change in > SQLAlchemy’s pool. Otherwise I worry our eventlet use could have > big problems under high load. > > > > > > # -*- coding: utf-8 -*- import gevent.monkey > gevent.monkey.patch_all() > > import collections import threading import time import random > import sys > > import logging logging.basicConfig() log = > logging.getLogger('foo') log.setLevel(logging.DEBUG) > > #import pymysql as dbapi from mysql import connector as dbapi > > > class SimplePool(object): def __init__(self): self.checkedin = > collections.deque([ self._connect() for i in range(50) ]) > self.checkout_lock = threading.Lock() self.checkin_lock = > threading.Lock() > > def _connect(self): return dbapi.connect( user="scott", > passwd="tiger", host="localhost", db="test") > > def get(self): with self.checkout_lock: while not self.checkedin: > time.sleep(.1) return self.checkedin.pop() > > def return_conn(self, conn): try: conn.rollback() except: > log.error("Exception during rollback", exc_info=True) try: > conn.close() except: log.error("Exception during close", > exc_info=True) > > # recycle to a new connection conn = self._connect() with > self.checkin_lock: self.checkedin.append(conn) > > > def verify_connection_id(conn): cursor = conn.cursor() try: > cursor.execute("select connection_id()") row = cursor.fetchone() > return row[0] except: return None finally: cursor.close() > > > def execute_sql(conn, sql, params=()): cursor = conn.cursor() > cursor.execute(sql, params) lastrowid = cursor.lastrowid > cursor.close() return lastrowid > > > pool = SimplePool() > > # SELECT * FROM table_b WHERE a_id not in # (SELECT id FROM > table_a) ORDER BY a_id DESC; > > PREPARE_SQL = [ "DROP TABLE IF EXISTS table_b", "DROP TABLE IF > EXISTS table_a", """CREATE TABLE table_a ( id INT NOT NULL > AUTO_INCREMENT, data VARCHAR (256) NOT NULL, PRIMARY KEY (id) ) > engine='InnoDB'""", """CREATE TABLE table_b ( id INT NOT NULL > AUTO_INCREMENT, a_id INT NOT NULL, data VARCHAR (256) NOT NULL, -- > uncomment this to illustrate where the driver is attempting -- to > INSERT the row during ROLLBACK -- FOREIGN KEY (a_id) REFERENCES > table_a(id), PRIMARY KEY (id) ) engine='InnoDB' """] > > connection = pool.get() for sql in PREPARE_SQL: > execute_sql(connection, sql) connection.commit() > pool.return_conn(connection) print("Table prepared...") > > > def transaction_kill_worker(): while True: try: connection = None > with gevent.Timeout(0.1): connection = pool.get() rowid = > execute_sql( connection, "INSERT INTO table_a (data) VALUES (%s)", > ("a",)) gevent.sleep(random.random() * 0.2) > > try: execute_sql( connection, "INSERT INTO table_b (a_id, data) > VALUES (%s, %s)", (rowid, "b",)) > > # this version prevents the commit from # proceeding on a bad > connection # if verify_connection_id(connection): # > connection.commit() > > # this version does not. It will commit the # row for table_b > without the table_a being present. connection.commit() > > pool.return_conn(connection) except Exception: > connection.rollback() pool.return_conn(connection) > sys.stdout.write("$") except gevent.Timeout: # try to return the > connection anyway if connection is not None: > pool.return_conn(connection) sys.stdout.write("#") except > Exception: # logger.exception(e) sys.stdout.write("@") else: > sys.stdout.write(".") finally: if connection is not None: > pool.return_conn(connection) > > > def main(): for i in range(50): > gevent.spawn(transaction_kill_worker) > > gevent.sleep(3) > > while True: gevent.sleep(5) > > > if __name__ == "__main__": main() > > > > > > > _______________________________________________ OpenStack-dev > mailing list OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iQEcBAEBCgAGBQJUivtJAAoJEC5aWaUY1u57nn8IAJP5zK/htG8EeoOSWZVV1ksA en+lQuIA09aNdkHSNS1b/lNOYwsF4X5SM0dU5Cs4LCsumC5jM9S/cNOn3sfpVooA vd31O1kKtd255YtnsKSmrOPiytoI69n2/65tVqgWLHpuXRSaj4HtqOEY/vOWMX6g BON50QUYwwxAZLNOPmEO7vUnJ3VYO6zquH2mQrA1Vg/LCm3+VaodEHOVCxieaJ/n iQPB4Vx1dkuP10HzWyjQW0j4kbUakqgkq/VHaiCYNC85HzPz6KJUOK/neZcBrWsZ RQcLae1dX1yGMXDd5hyJaoe3qUfjuvSZmV5jS3ok/x8rnKdmVl65PtUUlLfVOU0= =wJ55 -----END PGP SIGNATURE----- _______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev