My initial thought is that INSERTs shouldn't be locking the whole table, at least not for the duration of a whole transaction. There are some MySQL hints that can help with this. If you're on MyISAM, take a look at http://dev.mysql.com/doc/refman/5.0/en/concurrent-inserts.html , possibly using the `concurrent_insert` setting, as well as some of the INSERT hints specific to MySQL: http://dev.mysql.com/doc/refman/5.5/en/insert.html . InnoDB doesn't lock the whole table unless you do so explicitly.
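To make the throughput point concrete, here's a minimal sketch of batching rows so each INSERT statement (and each lock acquisition) covers many rows instead of one. It uses the stdlib sqlite3 module standing in for MySQL, and the table/column names are invented for the example:

```python
# Sketch only: one batched INSERT instead of one round trip per row.
# sqlite3 stands in for MySQL; "results" and its columns are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (job_id INTEGER, value REAL)")

rows = [(job, job * 0.5) for job in range(1000)]

# executemany() runs the whole batch in one statement/transaction --
# far cheaper than 1000 individual INSERTs, each taking its own lock.
conn.executemany("INSERT INTO results (job_id, value) VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
print(count)  # 1000
```

With SQLAlchemy the same pattern is a single `conn.execute(my_table.insert(), list_of_dicts)` call, which compiles to an executemany; declaring the table with `mysql_engine='InnoDB'` gets you row-level rather than table-level locking.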
SQLAlchemy is just a tool that helps you emit statements to the database. You'd need to figure out what scheme you want to use to improve your INSERT throughput first, then apply that strategy through SQLAlchemy's API. It might be as easy as switching to InnoDB; I'm not sure. If you're looking to serialize inserts after the fact, you can use a message queue for that, such as a Celery process. That might be the way to go if you get large bursts of data with quiet periods, and it's OK for the MySQL database to receive all the data "eventually".

On May 23, 2012, at 4:56 PM, Jeff wrote:

> More data:
> A typical not-quite-worst-but-in-the-class-of-worst case scenario is
> half a million rows per insert. Absolute worst case scenarios could be
> 10 times that. So that insert will take a while.
>
> Would there be any logic to breaking up all the inserts into one row
> per insert? Would that allow independent jobs to meaningfully
> interleave their inserts? Does the overhead of sqlalchemy's session
> adding and committing not really work that way?
>
> One uglier route that has been suggested to me is to first write all
> the rows to independent local files, then later merge and insert them.
> Does SQLAlchemy have functionality that might help in this direction?
>
> On May 23, 4:19 pm, Jeff <jeffalst...@gmail.com> wrote:
>> Hello,
>>
>> I have hundreds of independent jobs on a cluster all writing entries
>> to the same MySQL database table. Every time one job INSERTs, it locks
>> the table, and the other jobs have to queue up for their turn. So at
>> that point, the massively parallel cluster has turned into a massively
>> serial cluster :-( What are routes available in SQLAlchemy to improve
>> this situation?
>>
>> Thanks!
>> Jeff
>
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
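A middle ground between Jeff's "one row per insert" idea and one giant statement is to split the big insert into fixed-size chunks, each committed in its own transaction, so other jobs' INSERTs can interleave between commits. A minimal sketch, again with stdlib sqlite3 standing in for MySQL; the chunk size and schema are illustrative, not tuned recommendations:

```python
# Sketch only: chunking one huge insert into smaller transactions so
# concurrent writers can get a turn between commits. Names and the
# chunk size of 10000 are invented for the example.
import sqlite3

def chunks(seq, size):
    """Yield successive slices of seq, each at most `size` items long."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (job_id INTEGER, value REAL)")

rows = [(job, job * 0.5) for job in range(50000)]

for batch in chunks(rows, 10000):
    conn.executemany("INSERT INTO results (job_id, value) VALUES (?, ?)", batch)
    conn.commit()  # releasing the transaction here lets other writers in

total = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
print(total)  # 50000
```

Each chunk still amortizes per-statement overhead over thousands of rows, but no single transaction holds its locks for the whole half-million-row run.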