Thanks Niphlod! What a great discussion, too. Yes, my case had 6500 rows of about 30 vars to be updated/inserted, but with a primary key of one column determining whether to update or insert.
I am so amazed at how easy it was to move from SQLIte to PostgreSQL with web2py. Massimo and the guys who wrote it really know their stuff. -- Joe B. On Thursday, December 6, 2012 1:45:36 PM UTC-8, Niphlod wrote: > > uhm, by "large update_or_insert" do you mean "multiple statements of > update_or_insert" or "update_or_insert with a huge number of fields" ? > it seems obvious to me that an "automatic" update_or_insert() on a table > with columns, e.g., a,b,c,d will have to do > select * from testtable where a=avalue and b=bvalue and c=cvalue and > d=dvalue > while a "raw mode", e.g, scanning only the 'a' column that you know is a > "primary key" would trigger a > select * from testtable where a=avalue > and the second can be faster. > Plus, if you know you have a lot more records to be updated instead of > being inserted, it's faster to do > update testtable set a=avalue, b=bvalue, c=cvalue, d=dvalue > and see the returned updated values than doing a select and then updating. > Web2py does not know in advance how many records are you going to insert > and how many are going to be updated, so it resorts to querying if there > are values BEFORE and then updating those values if no rows are returned. > > However - always in the spirit of "let web2py users reproduce your issues" > I created a small script, that on my machine returned > Summary > started update_or_insert (full-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 16.4003179073 sec > started update_or_insert (semi-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 15.1824979782 sec > started update_or_insert (explicit-mode), 10000 UPDATE, 0 INSERT > finished in 22.3497450352 sec > started update_or_insert (semi-auto-mode), 10000 UPDATE, 0 INSERT > finished in 15.8405299187 sec > started update_or_insert (full-auto-mode), 10000 UPDATE, 0 INSERT > finished in 17.2484908104 sec > started update_or_insert (explicit-mode), 5000 UPDATE, 5000 INSERT > finished in 15.6090190411 sec > on sqlite and > Summary > started update_or_insert (full-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 26.6851980686 sec > started update_or_insert (semi-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 26.9120578766 sec > started update_or_insert (explicit-mode), 10000 UPDATE, 0 INSERT > finished in 23.3219130039 sec > started update_or_insert (semi-auto-mode), 10000 UPDATE, 0 INSERT > finished in 34.6620731354 sec > started update_or_insert (full-auto-mode), 10000 UPDATE, 0 INSERT > finished in 35.9468638897 sec > started update_or_insert (explicit-mode), 5000 UPDATE, 5000 INSERT > finished in 24.5943310261 sec > > try it yourself (it's meant to create a table named "testtable" on the db, > place it into web2py's folder and start it with web2py.py -M -S yourapp -R > update_or_insert.py) > > Seems to me that web2py's default update_or_insert() is not taking that > much, and chose a right balanced "algorithm". > > BTW: I'll never end advising placing indexes on the right spots.... adding > an index on the d column lead to those results > Summary > started update_or_insert (full-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 18.4528090954 sec > started update_or_insert (semi-auto-mode), 5000 UPDATE, 5000 INSERT > finished in 15.58761096 sec > started update_or_insert (explicit-mode), 10000 UPDATE, 0 INSERT > finished in 6.05162096024 sec > started update_or_insert (semi-auto-mode), 10000 UPDATE, 0 INSERT > finished in 15.5920169353 sec > started update_or_insert (full-auto-mode), 10000 UPDATE, 0 INSERT > finished in 19.1516299248 sec > started update_or_insert (explicit-mode), 5000 UPDATE, 5000 INSERT > finished in 10.5115959644 sec > > > > > --