Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
F.Y.I. here is the results I did on my laptop (Ubuntu 14, i7-4600U, 16GB mem, 512GB SSD). Unlike Josh, I used Unix domain sockets. In summary: 9.4.3: 943.439840 9.4.4: 429.953953 9.4 stable as of June 30: 929.804491 So comparing with 9.4.3, 9.4.4 is 54% slow, and 9.4-stable is 1.4% slow. I think the difference between 9.4.3 seems a measurement error but the difference between 9.4.3 and 9.4.4 is real. Good thing is 9.4 stable fixes the issue as expected. Thanks Tom! Below is the details. 9.4.3: t-ishii@localhost: pgbench -p 5434 -s 100 -j 2 -c 6 -r -C -S -T 1200 test Scale option ignored, using pgbench_branches table count = 100 starting vacuum...end. transaction type: SELECT only scaling factor: 100 query mode: simple number of clients: 6 number of threads: 2 duration: 1200 s number of transactions actually processed: 1132130 latency average: 6.360 ms tps = 943.439840 (including connections establishing) tps = 64573.165915 (excluding connections establishing) statement latencies in milliseconds: 0.002809\set naccounts 10 * :scale 0.001427\setrandom aid 1 :naccounts 4.254619SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 9.4.4: t-ishii@localhost: pgbench -p 5433 -s 100 -j 2 -c 6 -r -C -S -T 1200 test Scale option ignored, using pgbench_branches table count = 100 starting vacuum...end. transaction type: SELECT only scaling factor: 100 query mode: simple number of clients: 6 number of threads: 2 duration: 1200 s number of transactions actually processed: 515946 latency average: 13.955 ms tps = 429.953953 (including connections establishing) tps = 56235.664740 (excluding connections establishing) statement latencies in milliseconds: 0.003311\set naccounts 10 * :scale 0.001653\setrandom aid 1 :naccounts 9.320204SELECT abalance FROM pgbench_accounts WHERE aid = :aid 9.4-stable: t-ishii@localhost: pgbench -p 5435 -s 100 -j 2 -c 6 -r -C -S -T 1200 test Scale option ignored, using pgbench_branches table count = 100 starting vacuum...end. transaction type: SELECT only scaling factor: 100 query mode: simple number of clients: 6 number of threads: 2 duration: 1200 s number of transactions actually processed: 1115767 latency average: 6.453 ms tps = 929.804491 (including connections establishing) tps = 64676.863921 (excluding connections establishing) statement latencies in milliseconds: 0.002803\set naccounts 10 * :scale 0.001395\setrandom aid 1 :naccounts 4.316686SELECT abalance FROM pgbench_accounts WHERE aid = :aid; > OK, this is pretty bad in its real performance effects. On a workload > which is dominated by new connection creation, we've lost about 17% > throughput. > > To test it, I ran pgbench -s 100 -j 2 -c 6 -r -C -S -T 1200 against a > database which fits in shared_buffers on two different m3.large > instances on AWS (across the network, not on unix sockets). A typical > run on 9.3.6 looks like this: > > scaling factor: 100 > query mode: simple > number of clients: 6 > number of threads: 2 > duration: 1200 s > number of transactions actually processed: 252322 > tps = 210.267219 (including connections establishing) > tps = 31958.233736 (excluding connections establishing) > statement latencies in milliseconds: > 0.002515\set naccounts 10 * :scale > 0.000963\setrandom aid 1 :naccounts > 19.042859 SELECT abalance FROM pgbench_accounts WHERE aid > = :aid; > > Whereas a typical run on 9.3.9 looks like this: > > scaling factor: 100 > query mode: simple > number of clients: 6 > number of threads: 2 > duration: 1200 s > number of transactions actually processed: 208180 > tps = 173.482259 (including connections establishing) > tps = 31092.866153 (excluding connections establishing) > statement latencies in milliseconds: > 0.002518\set naccounts 10 * :scale > 0.000988\setrandom aid 1 :naccounts > 23.076961 SELECT abalance FROM pgbench_accounts WHERE aid > = :aid; > > Numbers are pretty consistent on four runs each on two different > instances (+/- 4%), so I don't think this is Amazon variability we're > seeing. I think the syscache invalidation is really costing us 17%. :-( > > -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On 6/25/15 12:51 PM, Josh Berkus wrote: On 06/25/2015 10:47 AM, Peter Geoghegan wrote: On Wed, Jun 24, 2015 at 2:52 PM, Josh Berkus wrote: OK, this is pretty bad in its real performance effects. On a workload which is dominated by new connection creation, we've lost about 17% throughput. Mistakes happen, but this is the kind of regression that automated performance testing could have caught. That's another area of testing that needs considerable improvement IMV. Well, I have a pgbench test script for it now. If only I had some place to check it in ... Well, there is https://github.com/slux/PostgreSQL-Performance-Farm; but it hasn't been touched in 5 years. :( -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On 06/25/2015 11:04 AM, Tom Lane wrote: > Josh Berkus writes: >> On 06/25/2015 08:12 AM, Tom Lane wrote: >>> I can't see doing a release just for this. If we were due for releases >>> anyway, sure, but we've considerably overstressed our poor packagers of >>> late. Previous discussion was to the effect that we'd anticipate another >>> set of releases in a month or so, after some more multixact fixes have >>> landed. > >> FWIW, I know users who will not update because of this regression. >> Consider applications which regularly need to spin up 200 new >> connections in 90 seconds due to usage peaks. > > TBH, if your app is critically dependent on backend startup time, > You're Doing It Wrong, because that means you're pissing away significant > performance by not using a connection pooler. So I don't have a huge > amount of sympathy. Most of these apps are using a connection pooler. But usage is very spiky, and one doesn't want to keep around 300 idle connections all the time for the 5 minutes when you need them. Indeed, pgbouncer is configured to spin down idle connections after a configured amount of time idle, as are most connection poolers. Regardless, I don't think it's a good idea to release again sooner than planned. It is a good reason to look at making connection creation time a regular test, though; you can be sure I'll be testing 9.4.5 and 9.5a! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
Josh Berkus writes: > On 06/25/2015 08:12 AM, Tom Lane wrote: >> I can't see doing a release just for this. If we were due for releases >> anyway, sure, but we've considerably overstressed our poor packagers of >> late. Previous discussion was to the effect that we'd anticipate another >> set of releases in a month or so, after some more multixact fixes have >> landed. > FWIW, I know users who will not update because of this regression. > Consider applications which regularly need to spin up 200 new > connections in 90 seconds due to usage peaks. TBH, if your app is critically dependent on backend startup time, You're Doing It Wrong, because that means you're pissing away significant performance by not using a connection pooler. So I don't have a huge amount of sympathy. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On 06/25/2015 10:47 AM, Peter Geoghegan wrote: > On Wed, Jun 24, 2015 at 2:52 PM, Josh Berkus wrote: >> OK, this is pretty bad in its real performance effects. On a workload >> which is dominated by new connection creation, we've lost about 17% >> throughput. > > Mistakes happen, but this is the kind of regression that automated > performance testing could have caught. That's another area of testing > that needs considerable improvement IMV. Well, I have a pgbench test script for it now. If only I had some place to check it in ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On Wed, Jun 24, 2015 at 2:52 PM, Josh Berkus wrote: > OK, this is pretty bad in its real performance effects. On a workload > which is dominated by new connection creation, we've lost about 17% > throughput. Mistakes happen, but this is the kind of regression that automated performance testing could have caught. That's another area of testing that needs considerable improvement IMV. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On 06/25/2015 08:12 AM, Tom Lane wrote: > Tatsuo Ishii writes: That means that load_relcache_init_file *always* decides that the init file is busted and silently(!) ignores it. So we're taking a nontrivial hit in backend startup speed as of the last set of minor releases. > >>> OK, this is pretty bad in its real performance effects. On a workload >>> which is dominated by new connection creation, we've lost about 17% >>> throughput. > >> Are we going to release 9.4.5 etc. soon? > > I can't see doing a release just for this. If we were due for releases > anyway, sure, but we've considerably overstressed our poor packagers of > late. Previous discussion was to the effect that we'd anticipate another > set of releases in a month or so, after some more multixact fixes have > landed. FWIW, I know users who will not update because of this regression. Consider applications which regularly need to spin up 200 new connections in 90 seconds due to usage peaks. On the other hand, do I want to do another update release right away? No. Hard place, meet rock. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
Tatsuo Ishii writes: >>> That means that load_relcache_init_file *always* decides that the init >>> file is busted and silently(!) ignores it. So we're taking a nontrivial >>> hit in backend startup speed as of the last set of minor releases. >> OK, this is pretty bad in its real performance effects. On a workload >> which is dominated by new connection creation, we've lost about 17% >> throughput. > Are we going to release 9.4.5 etc. soon? I can't see doing a release just for this. If we were due for releases anyway, sure, but we've considerably overstressed our poor packagers of late. Previous discussion was to the effect that we'd anticipate another set of releases in a month or so, after some more multixact fixes have landed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
> On 06/23/2015 04:44 PM, Tom Lane wrote: >> Chasing a problem identified by my Salesforce colleagues led me to the >> conclusion that my commit f3b5565dd ("Use a safer method for determining >> whether relcache init file is stale") is rather borked. It causes >> pg_trigger_tgrelid_tgname_index to be omitted from the relcache init file, >> because that index is not used by any syscache. I had been aware of that >> actually, but considered it a minor issue. It's not so minor though, >> because RelationCacheInitializePhase3 marks that index as nailed for >> performance reasons, and includes it in NUM_CRITICAL_LOCAL_INDEXES. >> That means that load_relcache_init_file *always* decides that the init >> file is busted and silently(!) ignores it. So we're taking a nontrivial >> hit in backend startup speed as of the last set of minor releases. > > OK, this is pretty bad in its real performance effects. On a workload > which is dominated by new connection creation, we've lost about 17% > throughput. Are we going to release 9.4.5 etc. soon? Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oh, this is embarrassing: init file logic is still broken
On 06/23/2015 04:44 PM, Tom Lane wrote: > Chasing a problem identified by my Salesforce colleagues led me to the > conclusion that my commit f3b5565dd ("Use a safer method for determining > whether relcache init file is stale") is rather borked. It causes > pg_trigger_tgrelid_tgname_index to be omitted from the relcache init file, > because that index is not used by any syscache. I had been aware of that > actually, but considered it a minor issue. It's not so minor though, > because RelationCacheInitializePhase3 marks that index as nailed for > performance reasons, and includes it in NUM_CRITICAL_LOCAL_INDEXES. > That means that load_relcache_init_file *always* decides that the init > file is busted and silently(!) ignores it. So we're taking a nontrivial > hit in backend startup speed as of the last set of minor releases. OK, this is pretty bad in its real performance effects. On a workload which is dominated by new connection creation, we've lost about 17% throughput. To test it, I ran pgbench -s 100 -j 2 -c 6 -r -C -S -T 1200 against a database which fits in shared_buffers on two different m3.large instances on AWS (across the network, not on unix sockets). A typical run on 9.3.6 looks like this: scaling factor: 100 query mode: simple number of clients: 6 number of threads: 2 duration: 1200 s number of transactions actually processed: 252322 tps = 210.267219 (including connections establishing) tps = 31958.233736 (excluding connections establishing) statement latencies in milliseconds: 0.002515\set naccounts 10 * :scale 0.000963\setrandom aid 1 :naccounts 19.042859 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; Whereas a typical run on 9.3.9 looks like this: scaling factor: 100 query mode: simple number of clients: 6 number of threads: 2 duration: 1200 s number of transactions actually processed: 208180 tps = 173.482259 (including connections establishing) tps = 31092.866153 (excluding connections establishing) statement latencies in milliseconds: 0.002518\set naccounts 10 * :scale 0.000988\setrandom aid 1 :naccounts 23.076961 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; Numbers are pretty consistent on four runs each on two different instances (+/- 4%), so I don't think this is Amazon variability we're seeing. I think the syscache invalidation is really costing us 17%. :-( -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers