[sqlalchemy] Re: mysql extended inserts

Michael Bayer Sun, 01 Nov 2009 15:37:04 -0800


On Nov 1, 2009, at 5:37 PM, BEES INC wrote:


>
> Ah yes I'm not sure of the terminology either, i got the 'extended'
> from the mysqldump usage info (mysqldump -e, on by default but you can
> turn it off with mysqldump --extended-insert=false)
>
> Anyway, from looking at the mysqldump output, extended inserts are
> covered by this bit from the INSERT documentation:
>
> "INSERT statements that use VALUES syntax can insert multiple rows. To
> do this, include multiple lists of column values, each enclosed within
> parentheses and separated by commas. Example:
>
> INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);"
>
> So its the difference between a single INSERT with lots of data vs a
> separate INSERT for each row
>
> The execute many thing is an option, but the relationships are a
> little complex (26 tables and counting). There is only one object
> added to the session, everything else is linked from there.
> Implementing the inserts without the ORM might get a little involved
> when it comes to ensuring the relations are established correctly, but
> if that's the way to do it then so be it.
>
> I am also open to the possibility to rejiggering my schema/session
> usage to fit better with SA, which is a fairly undefined goal I know.
> One thing that is very common in my program is a lot of (datetime,
> float) observations. They are all mapped to different objects and all
> go in different tables as they belong to different objects by the
> business logic. I'm not sure if it would be better to put them in one
> table and use a secondary relationship table with 2 foreign keys
> mapping objects to their given date/float observations. Indeed I had
> it that way originally but it has not seemed to have any impact on
> performance either way.

the ORM can't bundle INSERT statements together in any way since its  
necessary to retrieve information about each INSERT statement  
individually, particularly for MySQL where it's the only way we can  
get at the "last inserted ID", necessary for assigning identity to  
each ORM-mapped instance.  Additionally, when many objects are being  
INSERTed by the ORM, there's no guarantee that each INSERT statement  
would render in the same way, since missing values on an instance  
translate into that column being omitted from just that statement so  
that a server side default can fire off.

so you can have effortless or large-scale-performant for a large bulk  
insert, take your pick.

I also would hypothesize that standard executemany() is a better  
choice than than the "INSERT" with many values version, since the  
latter places a toll on mysql's query parser and buffer.







>
> Thanks for your reply
>
> On Nov 2, 5:38 am, Michael Bayer <mike...@zzzcomputing.com> wrote:
>> On Nov 1, 2009, at 9:52 AM, BEES INC wrote:
>>
>>
>>
>>
>>
>>> Hi
>>
>>> Does sqlalchemy support mysql extended insert syntax?
>>> I have been chasing down very slow performance. My program  
>>> generates a
>>> whole bunch of data then does a big SA commit() at the end, which is
>>> proving to be inconveniently slow.
>>
>>> This is a raw import of a mysqldump without extended inserts
>>
>>> dbregress$ time mysql -ubees -pbees tt_test < up50ma.dbdump6
>>
>>> real    0m33.043s
>>> user    0m1.343s
>>> sys     0m1.572s
>>> dbregress$
>>
>>> Which roughly corresponds to the time it is taking SA to do the
>>> commit.
>>
>>> This is a raw import using extended inserts:
>>
>>> dbregress$ time mysql -ubees -pbees tt_test < up50ma.dbdump6.e
>>
>>> real    0m5.517s
>>> user    0m0.081s
>>> sys     0m0.024s
>>> dbregress$
>>
>>> Which is clearly noticeably faster.
>>
>>> Also this is for a test dataset, the real dataset is much larger,  
>>> and
>>> when used the final commit can take upwards of 25 minutes.
>>
>>> I could not see anything in the documentation regarding this,
>>> apologies if i missed something. I tried periodic commits vs a  
>>> single
>>> commit at the end but this did not offer any real difference in
>>> performance, and also introduced other issues.
>>
>> you'll get much faster performance for bulk inserts if you forego the
>> usage of the ORM and use executemany() syntax which is documented  
>> here:
>>
>> http://www.sqlalchemy.org/docs/05/sqlexpression.html#executing- 
>> multip...
>>
>> I've googled and searched MySQL's site for "extended insert syntax"
>> and while there are many mentions of it, there are no definitions or
>> examples I can find.  For example the word "extended" does not appear
>> on this page in any form:  http://dev.mysql.com/doc/refman/5.0/en/insert.html
>>   .
>>
>>
> >


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

[sqlalchemy] Re: mysql extended inserts

Reply via email to