In article <[email protected]>, Chris Angelico <[email protected]> wrote:
> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <[email protected]> wrote:
> > In article <[email protected]>,
> >  Dennis Lee Bieber <[email protected]> wrote:
> >
> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
> >> still compatible with MySQL v4 (and maybe even v3), and since those
> >> versions don't have "prepared statements", .executemany() essentially
> >> turns into something that creates a newline delimited "list" of
> >> "identical" (but for argument substitution) statements and submits that
> >> to MySQL.
> >
> > Shockingly, that does appear to be the case.  I had thought during my
> > initial testing that I was seeing far greater throughput, but as I got
> > more into the project and started doing some side-by-side comparisons,
> > the differences went away.
>
> How much are you doing per transaction? The two extremes (everything
> in one transaction, or each line in its own transaction) are probably
> the worst for performance. See what happens if you pepper the code
> with 'begin' and 'commit' statements (maybe every thousand or ten
> thousand rows) to see if performance improves.
>
> ChrisA

We're doing it all in one transaction, on purpose.  We start with an
initial dump, then get updates about once a day.  We want to make sure
that the updates either complete without errors, or back out cleanly.
If we ever had a partial daily update, the result would be a mess.

Hmmm, on the other hand, I could probably try doing the initial dump
the way you describe.  If it fails, we can just delete the whole thing
and start again.
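
For the initial dump, here's roughly the kind of batching I have in
mind -- just a sketch, untested; the connection parameters, table name,
and columns are placeholders for whatever the real schema uses:

    import MySQLdb

    BATCH_SIZE = 10000  # commit every 10k rows; tune as needed

    def bulk_load(rows):
        # rows is assumed to be an iterable of (col1, col2) tuples
        conn = MySQLdb.connect(host="localhost", user="loader",
                               passwd="secret", db="mydb")
        cur = conn.cursor()
        sql = "INSERT INTO mytable (col1, col2) VALUES (%s, %s)"
        try:
            batch = []
            for row in rows:
                batch.append(row)
                if len(batch) >= BATCH_SIZE:
                    cur.executemany(sql, batch)
                    conn.commit()      # each batch is its own transaction
                    batch = []
            if batch:                  # flush whatever is left over
                cur.executemany(sql, batch)
                conn.commit()
        except Exception:
            conn.rollback()            # only the current batch is lost
            raise
        finally:
            cur.close()
            conn.close()

If it blows up partway through, we just drop the table and rerun, so
losing atomicity on the initial load costs us nothing; the daily
updates would stay in a single transaction as before.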
