On Apr 21, 2:15 pm, Philip Semanchuk <phi...@semanchuk.com> wrote: > On Apr 21, 2009, at 2:02 PM, ericwoodwo...@gmail.com wrote: > > > > > Hi, > > I have a python script I'm writing that grabs some data from a > > com object, does a little formatting, and then inserts that data into > > a MS SQL 2003 DB. Because I'm using COM objects I'm importing > > win32com.client. That also allows me to use ADODB.connection and > > ADODB.command objects for working with SQL. > > > The program works fine but it's a little slow. Inserting ~5500 > > rows of data takes about 10 seconds using a DB that is on the same > > machine running the script. > > > I've done some general searches on how to speed this up and in > > other languages people suggest sending batches of inserts off at a > > time instead of executing 1 insert at a time. For java and .net > > people recommend using a stringbuilder function to create strings > > quickly. I don't know of such a function in python s I tried grouping > > my inserts into a single string using string += syntax. I knew that > > would be terrible but I wanted to see how terrible. Final reults: It > > was pretty terrible. Script went from taking ~18sec to taking > > 240sec. The overhead for recreating the strings was monster. No real > > surprise there. > > > So I then loaded up the commands into a list and at the end I > > used the strong join method to create the string. This was far faster > > than using += to create my strings but still took twice as long as > > just running my inserts one at a time. So I'm looking for > > suggestions. > > > Basically I have 5000 SQL inserts that I want to do as quickly as > > possible. This is purely academic as I can live with the 18 seconds > > the script needs to run (9 to talk to the com object and format the > > data and 10 to write to SQL) but I'm still curious how to improve on > > what I have running. > > Are you sure your logjam is in Python? Inserting 5500 rows can take a > few seconds if you're COMMITting after each INSERT. Wrap the whole > thing in an explicit transaction and see if that helps. > > Also, toss in a few print statements containing timestamps so you know > more about where the script is spending time. > > bye > Philip
I'm not 100% sure it's python and not SQL but I do suspect there's a better way to do this than just serial inserts. I could be wrong about that which is what i'm trying to explore. I already do use the time stamps and this is what I see: at 9 secs in I've gotten my data, formatted it, and placed it on the list at 9.047 secs in the string.join() is done and I have my command string at 35 secs the program ends. So somehow my SQL is taking a lot longer when I format it as single string. How would I make the whole thing one transaction? Just insert BEGIN TRANSACTION at the start and COMMIT at the end? Is that enough to do it? -- http://mail.python.org/mailman/listinfo/python-list