I did three back-to-back runs using the same settings as in http://archives.postgresql.org/pgsql-performance/2012-11/msg00007.php Except: - use no prepared statement - use 40 db connections - build source from postgresql.git on the server box using: REL9_1_7, REL9_2_2, REL9_2_2 + this patch
NOTPM results: REL9_1_7: 46512.66 REL9_2_2: 42828.66 REL9_2_2 + this patch: 46973.70 Thanks, Dong PS, the top 20 lines of oprofile of these runs attached. -----Original Message----- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Tuesday, January 01, 2013 6:48 PM To: Peter Eisentraut Cc: Heikki Linnakangas; Pavel Stehule; PostgreSQL Hackers; Dong Ye Subject: Re: [HACKERS] dynamic SQL - possible performance regression in 9.2 I wrote: > I'm inclined to think that Heikki's patch doesn't go far enough, if we > want to optimize behavior in this case. What we really want to happen > is that parsing, planning, and execution all happen in the caller's > memory context, with no copying of parse or plan trees at all - and we > could do without overhead such as dependency extraction and invalidation > checking, too. This would make SPI_execute a lot more comparable to the > behavior of exec_simple_query(). Here's a draft patch for that. My initial hack at it had a disadvantage, which was that because no invalidation checking happened, a SPI_execute query string containing a DDL command (such as ALTER TABLE) followed by a command affected by the DDL would fail to reparse/replan the second command properly. (I suspect that Heikki's version had a related defect, but haven't looked closely.) Now that's not a huge deal IMO, because in many common cases parse analysis of the second command would fail anyway. For instance, this has never worked in any PG release: do $$ begin execute 'create table foo(f1 int); insert into foo values(1);'; end $$; However it troubled me that there might be some regression there, and after a bit of reflection I decided the right fix would be to rearrange the code in spi.c so that parse analysis of later parsetrees follows execution of earlier ones. This makes the behavior of SPI_execute() even more like that of exec_simple_query(), and shouldn't cost anything noticeable given the other changes here. I'm not entirely sure about performance of this fix, though. I got numbers varying between roughly-on-par with 9.1 and 10% slower than 9.1 for Pavel's example, depending on seemingly not-performance-related rearrangements of the code in spi.c. I think this must be chance effects of cache line alignment, but it would be good to hear what other people get, both on Pavel's example and the other ones alluded to. In any case this seems better than unmodified HEAD, which was 40% slower than 9.1 for me. regards, tom lane
==> REL9_1_7 <== CPU: Intel Core/i7, speed 2.266e+06 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 samples % image name app name symbol name 701093 4.8657 postgres postgres SearchCatCache 646611 4.4876 postgres postgres AllocSetAlloc 565140 3.9222 postgres postgres hash_search_with_hash_value 466490 3.2375 postgres postgres base_yyparse 291208 2.0210 postgres postgres LWLockAcquire 232944 1.6167 postgres postgres PinBuffer 220879 1.5329 postgres postgres MemoryContextAllocZeroAligned 213494 1.4817 postgres postgres core_yylex 193402 1.3423 postgres postgres heap_hot_search_buffer 191454 1.3287 postgres postgres expression_tree_walker 191247 1.3273 postgres postgres _bt_compare 189949 1.3183 libc-2.14.1.so libc-2.14.1.so __strcmp_sse42 173674 1.2053 postgres postgres XLogInsert 163930 1.1377 postgres postgres FunctionCall2Coll 150816 1.0467 libc-2.14.1.so libc-2.14.1.so __memcpy_ssse3_back 147756 1.0255 postgres postgres tbm_iterate 146191 1.0146 postgres postgres MemoryContextAlloc 140754 0.9769 postgres postgres nocachegetattr 134809 0.9356 postgres postgres heap_page_prune_opt 124110 0.8614 postgres postgres hash_any ==> REL9_2_2 <== CPU: Intel Core/i7, speed 2.266e+06 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 samples % image name app name symbol name 1086187 6.4059 postgres postgres AllocSetAlloc 773733 4.5632 postgres postgres SearchCatCache 534903 3.1546 postgres postgres base_yyparse 468215 2.7613 postgres postgres copyObject 413929 2.4412 postgres postgres MemoryContextAllocZeroAligned 387373 2.2846 postgres postgres hash_search_with_hash_value 293138 1.7288 postgres postgres expression_tree_walker 266981 1.5745 libc-2.14.1.so libc-2.14.1.so __memcpy_ssse3_back 260617 1.5370 libc-2.14.1.so libc-2.14.1.so _int_malloc 255529 1.5070 postgres postgres MemoryContextAlloc 233867 1.3793 postgres postgres core_yylex 223767 1.3197 libc-2.14.1.so libc-2.14.1.so __strcmp_sse42 206801 1.2196 postgres postgres _bt_compare 203589 1.2007 postgres postgres XLogInsert 200657 1.1834 postgres postgres LWLockAcquire 184643 1.0890 postgres postgres PinBuffer 171459 1.0112 postgres postgres nocachegetattr 171056 1.0088 postgres postgres FunctionCall2Coll 159389 0.9400 postgres postgres check_stack_depth 147873 0.8721 libc-2.14.1.so libc-2.14.1.so __strlen_sse42 ==> REL9_2_2 + patch <== CPU: Intel Core/i7, speed 2.266e+06 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 samples % image name app name symbol name 966153 5.1987 postgres postgres SearchCatCache 908404 4.8880 postgres postgres AllocSetAlloc 625522 3.3658 postgres postgres base_yyparse 476507 2.5640 postgres postgres hash_search_with_hash_value 287013 1.5444 postgres postgres MemoryContextAllocZeroAligned 284729 1.5321 postgres postgres core_yylex 281957 1.5172 postgres postgres expression_tree_walker 273250 1.4703 postgres postgres XLogInsert 258647 1.3917 postgres postgres _bt_compare 251712 1.3544 postgres postgres LWLockAcquire 251396 1.3527 libc-2.14.1.so libc-2.14.1.so __strcmp_sse42 230653 1.2411 postgres postgres PinBuffer 213339 1.1479 postgres postgres FunctionCall2Coll 210845 1.1345 libc-2.14.1.so libc-2.14.1.so __memcpy_ssse3_back 205990 1.1084 postgres postgres MemoryContextAlloc 201418 1.0838 postgres postgres nocachegetattr 186821 1.0053 postgres postgres heap_hot_search_buffer 172356 0.9274 libc-2.14.1.so libc-2.14.1.so _int_malloc 167646 0.9021 postgres postgres fmgr_info_cxt_security 155884 0.8388 postgres postgres lappend
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers