Re: [PERFORM] Planner performance extremely affected by a hanging transaction (20-30 times)?
> As a matter of fact you get the same slow down after a rollback until
> autovacuum, and if autovacuum can't keep up...

Actually, this is not what we observe. The performance goes back to the normal level immediately after committing or aborting the transaction.

On Wed, Sep 25, 2013 at 1:30 AM, Jeff Janes jeff.ja...@gmail.com wrote:

> On Tue, Sep 24, 2013 at 11:03 AM, didier did...@gmail.com wrote:
>
>> Hi
>>
>> On Tue, Sep 24, 2013 at 5:01 PM, jes...@krogh.cc wrote:
>>
>>> Apparently it is waiting for locks; can't the check be made in a
>>> non-blocking way, so that if it ends up waiting for a lock it just
>>> assumes non-visible and moves on to the next?
>>
>> That's not the only reason; it's one reason, but you can get the same
>> slowdown with only one client and a bigger insert. The issue is that a
>> btree search for one value degenerates into a linear search over 1000 or
>> more tuples.
>>
>> As a matter of fact you get the same slow down after a rollback until
>> autovacuum, and if autovacuum can't keep up...
>
> Have you experimentally verified the last part? btree indices have some
> special kill-tuple code which should remove aborted tuples from the index
> the first time they are encountered, without need for a vacuum.
>
> Cheers,
> Jeff
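For anyone who wants to check Jeff's point directly, here is a rough sketch of such a test, not taken from the thread: it assumes psycopg2, the pgtest schema described later in this thread, and a made-up timing helper. It measures average query latency, then inserts into 'categories' from a second connection, rolls that insert back, and measures again before any (auto)vacuum has run.

# Rough sketch (not from the original thread): check whether the slowdown
# survives a ROLLBACK, before autovacuum has had a chance to run.
# Assumes the pgtest database/schema described elsewhere in this thread.
import time
import random
import psycopg2

conn = psycopg2.connect('host=localhost dbname=pgtest user=pgtest password=pgtest')
conn.autocommit = True
cur = conn.cursor()

QUERY = ('select offers.id, offers.name, categories.id, categories.name '
         'from offers left join categories on categories.id = offers.category_id '
         'where offers.id = %s')

def avg_query_time(n=1000):
    # Average wall-clock time per query over n random lookups.
    start = time.time()
    for _ in range(n):
        id = random.randrange(1, 1000 * 1000)
        cur.execute(QUERY, (id,))
        cur.fetchall()
    return (time.time() - start) / n

print 'baseline: %.6f s/query' % avg_query_time()

# Open a second connection, insert into categories, and roll it back.
conn2 = psycopg2.connect('host=localhost dbname=pgtest user=pgtest password=pgtest')
cur2 = conn2.cursor()
cur2.execute("insert into categories (name) "
             "select 'category_' || x from generate_series(1,1000) as x")
conn2.rollback()

# If the kill-tuple logic cleans up the aborted index entries as they are
# encountered, this should return to the baseline without waiting for vacuum.
print 'after rollback: %.6f s/query' % avg_query_time()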
[PERFORM] Planner performance extremely affected by a hanging transaction (20-30 times)?
Hi all

We're experiencing a very strange performance issue. Our setup is a bit more complicated, but we've managed to isolate and replicate the core problem. Here's what we observe:

We took a strong machine (128 GB RAM, 8-core CPU, SSD drives...) and installed a fresh copy of PostgreSQL 9.2 (Ubuntu Server 12.04 LTS, default configuration).

Then, we created a test database with the following schema:

CREATE TABLE offers (
    id bigserial NOT NULL,
    name character varying NOT NULL,
    category_id bigint NOT NULL,
    CONSTRAINT offers_pkey PRIMARY KEY (id)
);

CREATE TABLE categories (
    id bigserial NOT NULL,
    name character varying NOT NULL,
    CONSTRAINT categories_pkey PRIMARY KEY (id)
);

and populated it in the following way:

insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;
insert into offers (name, category_id) select 'offer_' || x, floor(random() * 1000) + 1 from generate_series(1,1000*1000) as x;

Finally, we created a python script to run simple queries in a loop:

while True:
    id = random.randrange(1, 1000 * 1000)
    db.execute('select offers.id, offers.name, categories.id, categories.name from offers left join categories on categories.id = offers.category_id where offers.id = %s', (id,))
    print db.fetchall()

We start 20 instances simultaneously and measure performance:

parallel -j 20 ./test.py -- $(seq 1 20) | pv -i1 -l > /dev/null

Normally we observe about 30k QPS, which is a satisfying result (without any tuning at all).

The problem occurs when we open a second console, start psql and type:

pgtest=# begin; insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;

We start a transaction and insert 1k records into the 'categories' table. After that, the performance of the benchmark described above immediately drops to about 1-2k QPS. That's 20-30 times slower! After closing the transaction (committing or aborting - doesn't matter) it immediately jumps back to 30k QPS.

Restarting the running script and other simple tricks do not change anything. The hanging, open transaction is causing a huge slowdown. What's more, when doing a similar insert (even a much larger one) into the 'offers' table, we do not observe this effect.

We analyzed the problem a bit deeper and found out that the actual query execution times are not affected that much. They stay consistently close to 0.5 ms. This can be observed in the server log (after enabling the appropriate option) and in the 'explain analyze ...' results. Also, the query plan returned does not change and still looks optimal (pkey scan for 'offers' + pkey scan for 'categories').

After a few random thoughts we finally turned on the 'log_planner_stats' option and found out that the planner execution times are highly affected by the hanging transaction. Here's the typical output in the initial situation:

2013-09-17 21:54:59 UTC LOG:  PLANNER STATISTICS
2013-09-17 21:54:59 UTC DETAIL:  ! system usage stats:
        !       0.000137 elapsed 0.00 user 0.00 system sec
        !       [2.169670 user 0.383941 sys total]
        !       0/0 [0/11520] filesystem blocks in/out
        !       0/0 [0/7408] page faults/reclaims, 0 [0] swaps
        !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
        !       0/0 [1362/7648] voluntary/involuntary context switches

And here's a typical output with a hanging transaction:

2013-09-17 21:56:12 UTC LOG:  PLANNER STATISTICS
2013-09-17 21:56:12 UTC DETAIL:  ! system usage stats:
        !       0.027251 elapsed 0.008999 user 0.001000 system sec
        !       [32.722025 user 3.550460 sys total]
        !       0/0 [0/115128] filesystem blocks in/out
        !       0/0 [0/7482] page faults/reclaims, 0 [0] swaps
        !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
        !       1/6 [12817/80202] voluntary/involuntary context switches

As you can see, planning takes over 100 times longer when the extra transaction is open! Any ideas why that is? It extremely affects total query time.

I know that using prepared statements solves the problem completely, so we can deal with it, but we've already spent a lot of time to get this far and I'm personally a bit curious what the fundamental reason for such behavior is.

I'll be very thankful for any explanation of what's going on here!

Thanks,
BR
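Since prepared statements are mentioned as a complete workaround, here is a rough sketch of what that variant of the test script could look like. This is not the poster's actual code: it assumes psycopg2, the statement name get_offer is made up, and PREPARE/EXECUTE is just one way to get a reusable plan.

# Hypothetical variant of test.py using a server-side prepared statement,
# so the plan can be reused instead of being rebuilt for every query.
import psycopg2
import random

connection = psycopg2.connect('host=localhost dbname=pgtest user=pgtest password=pgtest')
cursor = connection.cursor()

# PREPARE parses and plans the statement once for this session.
cursor.execute(
    "PREPARE get_offer (bigint) AS "
    "select offers.id, offers.name, categories.id, categories.name "
    "from offers left join categories on categories.id = offers.category_id "
    "where offers.id = $1")

while True:
    id = random.randrange(1, 1000 * 1000)
    # EXECUTE reuses the stored plan, so the per-query planning cost
    # (the part affected by the hanging transaction) is largely avoided.
    cursor.execute("EXECUTE get_offer (%s)", (id,))
    print cursor.fetchall()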
Re: [PERFORM] Planner performance extremely affected by a hanging transaction (20-30 times)?
Hi Julian,

Here's my complete python script:

import psycopg2
import random
import math
import time

connection = psycopg2.connect('host=localhost dbname=pgtest user=pgtest password=pgtest')
cursor = connection.cursor()

while True:
    id = random.randrange(1, 1000 * 1000)
    cursor.execute('select offers.id, offers.name, categories.id, categories.name from offers left join categories on categories.id = offers.category_id where offers.id = %s', (id,))
    print cursor.fetchall()

So I assume that each of the 20 instances opens and uses a single transaction. I've just tested the option with connection.autocommit = True at the beginning, but it does not change anything. Also in production (where we first noticed the problem) we use a new transaction for every select.

I start 20 instances of this python script (I use pv to measure performance):

parallel -j 20 ./test.py -- $(seq 1 20) | pv -i1 -l > /dev/null

And then in a second console, I open psql and type:

pgtest=# begin; insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;

The QPS displayed by the first command drops immediately 20-30 times and stays low as long as the transaction with the insert is open.

Here's the script that I use to initialize the database. You should be able to replicate the situation easily yourself if you want.

drop database pgtest;
drop user pgtest;

create user pgtest with password 'pgtest';
create database pgtest;
grant all privileges on database pgtest to pgtest;
\connect pgtest

CREATE TABLE categories (
    id bigserial NOT NULL,
    name character varying NOT NULL,
    CONSTRAINT categories_pkey PRIMARY KEY (id)
);

CREATE TABLE offers (
    id bigserial NOT NULL,
    name character varying NOT NULL,
    category_id bigint NOT NULL,
    CONSTRAINT offers_pkey PRIMARY KEY (id)
);

insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;
insert into offers (name, category_id) select 'offer_' || x, floor(random() * 1000) + 1 from generate_series(1,1000*1000) as x;

alter table categories owner to pgtest;
alter table offers owner to pgtest;

Thanks,
Bartek

On Fri, Sep 20, 2013 at 3:42 AM, Julian temp...@internode.on.net wrote:

> Hi Bart,
>
> You are doing heavy random reads in addition to a 1000k row insert within
> a single transaction. Also it is not clear whether your query within the
> python loop is itself within a transaction. (i.e. setting
> psycopg2.connection.autocommit to True disables transactional queries.)
> Depending on your pg adapter, it may open a transaction by default even
> though it may not be required. So please clarify whether this is the case.
>
> Regards,
> Julian.
>
> On 20/09/13 10:49, Bartłomiej Romański wrote:
>
>> Hi all
>>
>> We're experiencing a very strange performance issue. Our setup is a bit
>> more complicated, but we've managed to isolate and replicate the core
>> problem. Here's what we observe:
>>
>> We took a strong machine (128 GB RAM, 8-core CPU, SSD drives...) and
>> installed a fresh copy of PostgreSQL 9.2 (Ubuntu Server 12.04 LTS,
>> default configuration).
>>
>> Then, we created a test database with the following schema:
>>
>> CREATE TABLE offers (
>>     id bigserial NOT NULL,
>>     name character varying NOT NULL,
>>     category_id bigint NOT NULL,
>>     CONSTRAINT offers_pkey PRIMARY KEY (id)
>> );
>>
>> CREATE TABLE categories (
>>     id bigserial NOT NULL,
>>     name character varying NOT NULL,
>>     CONSTRAINT categories_pkey PRIMARY KEY (id)
>> );
>>
>> and populated it in the following way:
>>
>> insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;
>> insert into offers (name, category_id) select 'offer_' || x, floor(random() * 1000) + 1 from generate_series(1,1000*1000) as x;
>>
>> Finally, we created a python script to run simple queries in a loop:
>>
>> while True:
>>     id = random.randrange(1, 1000 * 1000)
>>     db.execute('select offers.id, offers.name, categories.id, categories.name from offers left join categories on categories.id = offers.category_id where offers.id = %s', (id,))
>>     print db.fetchall()
>>
>> We start 20 instances simultaneously and measure performance:
>>
>> parallel -j 20 ./test.py -- $(seq 1 20) | pv -i1 -l > /dev/null
>>
>> Normally we observe about 30k QPS, which is a satisfying result (without
>> any tuning at all).
>>
>> The problem occurs when we open a second console, start psql and type:
>>
>> pgtest=# begin; insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;
>>
>> We start a transaction and insert 1k records into the 'categories' table.
>> After that, the performance of the benchmark described above immediately
>> drops to about 1-2k QPS. That's 20-30 times slower! After closing the
>> transaction (committing or aborting - doesn't matter) it immediately
>> jumps back to 30k QPS.
>>
>> Restarting the running script and other simple tricks do not change
>> anything. The hanging, open
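For anyone reproducing this, here is a minimal sketch of the autocommit variant mentioned above (one implicit transaction per select instead of one long-lived transaction per script). This is not the poster's original test.py, just an illustration assuming psycopg2.

# Hypothetical autocommit variant of test.py: with autocommit enabled,
# psycopg2 does not wrap the executes in an explicit transaction, so every
# select runs in its own implicit transaction. As noted above, this does
# not change the observed slowdown.
import psycopg2
import random

connection = psycopg2.connect('host=localhost dbname=pgtest user=pgtest password=pgtest')
connection.autocommit = True
cursor = connection.cursor()

while True:
    id = random.randrange(1, 1000 * 1000)
    cursor.execute('select offers.id, offers.name, categories.id, categories.name '
                   'from offers left join categories on categories.id = offers.category_id '
                   'where offers.id = %s', (id,))
    print cursor.fetchall()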
[PERFORM] Re: Planner performance extremely affected by a hanging transaction (20-30 times)?
> What's more, doing a similar insert (even a much larger one) into the
> 'offers' table does not affect the benchmark results in any significant
> way...

Just to clarify myself here: an insert into the 'offers' table does not cause the slowdown. Only an insert into 'categories' causes the problem.