On Tuesday, December 11, 2012, Tom Lane wrote: > Jeff Janes <[email protected] <javascript:;>> writes: > > On Tue, Dec 11, 2012 at 2:50 AM, Patryk Sidzina > > <[email protected] <javascript:;>> wrote: > >> The differences come up when you change the "INSERT" to "EXECUTE > 'INSERT'" ( > >> and i checked this time on 3 machines, one of which was Windows): > > >> FOR i IN 1..cnt LOOP > >> EXECUTE 'INSERT INTO test_table_md_speed(n) VALUES (' || i || ')'; > >> END LOOP; > > > The culprit is the commit below. I don't know exactly why this slows > > down your case. A preliminary oprofile analysis suggests that it most > > of the slowdown is that it calls AllocSetAlloc more often. I suspect > > that this slow-down will be considered acceptable trade-off for > > getting good parameterized plans. > > I'm having a hard time getting excited about optimizing the above case: > the user can do far more to make it fast than we can, simply by not > using EXECUTE, which is utterly unnecessary in this example. >
I assumed his example was an intentionally simplified test-case, not a real world use-case. For a more realistic use, see "[PERFORM] Performance on Bulk Insert to Partitioned Table". There too it would probably be best to get rid of the EXECUTE, but doing so in that case would certainly have a high cost in trigger-code complexity and maintainability. (In my test case of loading 1e7 narrow tuples to 100 partitions, the plan cache change lead to a 26% slow down) > Having said that, though, it's not real clear to me why the plancache > changes would have affected the speed of EXECUTE at all --- the whole > point of that command is we don't cache a plan for the query. > Doing a bottom level profile isn't helpful because all of the extra time is in very low level code that is called from everywhere. Doing call-counts with gprof, I see that there is big increase in the calls to copyObject (which indirectly leads to a big increase in AllocSetAlloc). Before the change, each EXECUTE had one top-level (i.e. nonrecursive) copyObject call, coming from _SPI_prepare_plan. After the change, each EXECUTE has 4 such top-level copyObject calls, one each from CreateCachedPlan and CompleteCachedPlan and two from BuildCachedPlan. Cheers, Jeff
