Re: [PERFORM] Delete performance
On 05/30/2011 08:08 PM, Jarrod Chesney wrote:
> My database uses joined table inheritance and my server version is 9.0.
> I have about 120,000 records in the table that everything else inherits
> from. If I truncate-cascade this table it happens almost instantly. If I
> run 30,000 prepared "DELETE FROM xxx WHERE "ID" = ?" commands it takes
> close to 10 minutes. My foreign keys to the base table are all set with
> "ON DELETE CASCADE".

You may also want to make them DEFERRABLE and then use "SET CONSTRAINTS ALL DEFERRED" so that the constraint checking all happens at one time. This will cause more memory to be used, but all the constraint-related work will happen in a batch.

You mentioned inheritance. That can cause some unexpected problems sometimes. You might want to do:

EXPLAIN DELETE FROM ...

to see how this is executing. EXPLAIN works fine on DELETE statements, too, and it may highlight something strange about how the deletion is happening. If you can, use EXPLAIN ANALYZE, but note that this will actually execute the statement--the deletion will happen; it's not just a test. There may be a problem with the query plan for the deletion that's actually causing the issue here, such as missing the right indexes. If you have trouble reading it, http://explain.depesz.com/ is a good web resource to help break down where the time is going.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
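[Editor's sketch of the deferred-constraint approach described above. All table and constraint names here are hypothetical; note that PostgreSQL 9.0 has no ALTER TABLE ... ALTER CONSTRAINT, so an existing FK must be dropped and recreated to make it DEFERRABLE.]

```sql
-- Recreate the FK as DEFERRABLE (names are made up for illustration):
ALTER TABLE child DROP CONSTRAINT child_base_id_fkey;
ALTER TABLE child ADD CONSTRAINT child_base_id_fkey
    FOREIGN KEY (base_id) REFERENCES base (id)
    ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE;

-- Then, inside the deleting transaction:
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
DELETE FROM base WHERE id = 1;
DELETE FROM base WHERE id = 2;
-- ... remaining deletes ...
COMMIT;  -- constraint checking happens here, in one batch
```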
Re: [PERFORM] Delete performance
I'm executing 30,000 single delete statements in one transaction. At this point I'm looking into combining the multiple deletes into one statement and breaking my big transaction into smaller ones of about 100 deletes or so.

On 01/06/2011, at 11:40 AM, Craig Ringer wrote:
> On 1/06/2011 7:11 AM, Pierre C wrote:
>>> If i run 30,000 prepared "DELETE FROM xxx WHERE "ID" = ?" commands it
>>> takes close to 10 minutes.
>>
>> Do you run those in a single transaction or do you use one transaction
>> per DELETE?
>>
>> In the latter case, postgres will ensure each transaction is committed to
>> disk at each commit. Since this involves waiting for the physical I/O
>> to happen, it is slow. If you do it 30,000 times, it will be 30,000
>> times slow.
>
> Not only that, but if you're doing it via some application the app has to
> wait for Pg to respond before it can send the next query. This adds even
> more delay, as do all the processor switches between Pg and your
> application.
>
> If you really must issue individual DELETE commands one-by-one, I *think*
> you can use synchronous_commit=off or
>
>   SET LOCAL synchronous_commit TO OFF;
>
> See:
>
> http://www.postgresql.org/docs/current/static/runtime-config-wal.html
>
> --
> Craig Ringer
>
> Tech-related writing at http://soapyfrogs.blogspot.com/
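[Editor's sketch of the batching plan described above, with hypothetical id values: grouping roughly 100 deletes per transaction amortizes the per-commit WAL flush that makes one-delete-per-transaction so slow.]

```sql
-- One batch: one COMMIT (one synchronous WAL flush) per ~100 rows
-- instead of one per row.
BEGIN;
DELETE FROM xxx WHERE "ID" IN (1, 2, 3 /* ... up to ~100 ids ... */);
COMMIT;
-- Repeat for the next batch of ids.
```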
Re: [PERFORM] Delete performance
On 1/06/2011 7:11 AM, Pierre C wrote:
>> If i run 30,000 prepared "DELETE FROM xxx WHERE "ID" = ?" commands it
>> takes close to 10 minutes.
>
> Do you run those in a single transaction or do you use one transaction
> per DELETE?
>
> In the latter case, postgres will ensure each transaction is committed to
> disk at each commit. Since this involves waiting for the physical I/O to
> happen, it is slow. If you do it 30,000 times, it will be 30,000 times
> slow.

Not only that, but if you're doing it via some application the app has to wait for Pg to respond before it can send the next query. This adds even more delay, as do all the processor switches between Pg and your application.

If you really must issue individual DELETE commands one-by-one, I *think* you can use synchronous_commit=off or

SET LOCAL synchronous_commit TO OFF;

See:

http://www.postgresql.org/docs/current/static/runtime-config-wal.html

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/
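[Editor's sketch of the synchronous_commit suggestion above, with the same hypothetical table. SET LOCAL only lasts until the end of the current transaction, so for one-statement-per-transaction deletes a session-level SET is needed. Turning off synchronous commit risks losing the most recently acknowledged transactions on a crash, but never corrupts data.]

```sql
-- Per session: subsequent commits no longer wait for the WAL flush.
SET synchronous_commit TO OFF;
DELETE FROM xxx WHERE "ID" = 1;
DELETE FROM xxx WHERE "ID" = 2;

-- Or scoped to a single multi-statement transaction:
BEGIN;
SET LOCAL synchronous_commit TO OFF;
DELETE FROM xxx WHERE "ID" = 3;
COMMIT;
```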
Re: [PERFORM] Hash Anti Join performance degradation
2011/5/31 Robert Haas:
> On Thu, May 26, 2011 at 8:33 AM, panam wrote:
>> Any third party confirmation?
>
> Yeah, it definitely looks like there is some kind of bug here.  Or if
> not a bug, then a very surprising feature.  EXPLAIN ANALYZE outputs
> from your proposed test attached.  Here's a unified diff of the two
> outputs:
>
>                             QUERY PLAN
> --
> - Seq Scan on box b  (cost=0.00..3669095.76 rows=128 width=8) (actual time=0.147..431517.693 rows=128 loops=1)
> + Seq Scan on box b  (cost=0.00..3669095.76 rows=128 width=8) (actual time=0.047..6938.165 rows=128 loops=1)
>    SubPlan 1
> -    ->  Hash Anti Join  (cost=14742.77..28664.79 rows=19239 width=8) (actual time=2960.176..3370.425 rows=1 loops=128)
> +    ->  Hash Anti Join  (cost=14742.77..28664.79 rows=19239 width=8) (actual time=48.385..53.361 rows=1 loops=128)
>          Hash Cond: (m1.box_id = m2.box_id)
>          Join Filter: (m1.id < m2.id)
> -          ->  Bitmap Heap Scan on message m1  (cost=544.16..13696.88 rows=28858 width=16) (actual time=2.320..6.204 rows=18487 loops=128)
> +          ->  Bitmap Heap Scan on message m1  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.928..5.502 rows=17875 loops=128)
>                Recheck Cond: (box_id = b.id)
> -              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=2.251..2.251 rows=18487 loops=128)
> +              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.797..1.797 rows=18487 loops=128)
>                Index Cond: (box_id = b.id)
> -          ->  Hash  (cost=13696.88..13696.88 rows=28858 width=16) (actual time=12.632..12.632 rows=19720 loops=120)
> -              Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 1787kB
> -              ->  Bitmap Heap Scan on message m2  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.668..6.619 rows=19720 loops=120)
> +          ->  Hash  (cost=13696.88..13696.88 rows=28858 width=16) (actual time=11.603..11.603 rows=20248 loops=113)
> +              Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 1423kB
> +              ->  Bitmap Heap Scan on message m2  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.838..6.886 rows=20248 loops=113)
>                Recheck Cond: (box_id = b.id)
> -              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.602..1.602 rows=19720 loops=120)
> +              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.743..1.743 rows=20903 loops=113)
>                Index Cond: (box_id = b.id)
> - Total runtime: 431520.186 ms
> + Total runtime: 6940.369 ms
>
> That's pretty odd.

Yes, and while looking at this I noticed that the query took a long time to be killed. I added a CHECK_FOR_INTERRUPTS() in the for(;;) loop in nodeHashjoin.c. It fixes the delay when trying to kill the query, but I don't know what performance impact it has in this place of the code.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 26da3b2..09c56f9 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
+#include "miscadmin.h"

 #include "executor/executor.h"
 #include "executor/hashjoin.h"
@@ -108,6 +109,7 @@ ExecHashJoin(HashJoinState *node)
 	 */
 	for (;;)
 	{
+		CHECK_FOR_INTERRUPTS();
 		switch (node->hj_JoinState)
 		{
 			case HJ_BUILD_HASHTABLE:
Re: [PERFORM] Delete performance
> If i run 30,000 prepared "DELETE FROM xxx WHERE "ID" = ?" commands it
> takes close to 10 minutes.

Do you run those in a single transaction or do you use one transaction per DELETE?

In the latter case, postgres will ensure each transaction is committed to disk at each commit. Since this involves waiting for the physical I/O to happen, it is slow. If you do it 30,000 times, it will be 30,000 times slow.

Note that you should really do:

DELETE FROM table WHERE id IN (huge list of ids);

or (DELETE has no JOIN clause, so the join form uses USING):

DELETE FROM table USING (VALUES (list of ids)) AS v (id) WHERE table.id = v.id;

Also, check that your foreign keys using cascading deletes have indexes on the referencing columns. Without an index, finding the rows to cascade-delete will be slow.
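[Editor's sketch spelling out the two bulk-delete forms suggested above, with a hypothetical table and id values filled in.]

```sql
-- IN-list form:
DELETE FROM xxx WHERE "ID" IN (1, 2, 3, 4);

-- USING form, convenient when the id list is generated programmatically
-- and very large:
DELETE FROM xxx
USING (VALUES (1), (2), (3), (4)) AS dead (id)
WHERE xxx."ID" = dead.id;
```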
Re: [PERFORM] Hash Anti Join performance degradation
On Thu, May 26, 2011 at 8:33 AM, panam wrote:
> Any third party confirmation?

Yeah, it definitely looks like there is some kind of bug here.  Or if not a bug, then a very surprising feature.  EXPLAIN ANALYZE outputs from your proposed test attached.  Here's a unified diff of the two outputs:

                            QUERY PLAN
--
- Seq Scan on box b  (cost=0.00..3669095.76 rows=128 width=8) (actual time=0.147..431517.693 rows=128 loops=1)
+ Seq Scan on box b  (cost=0.00..3669095.76 rows=128 width=8) (actual time=0.047..6938.165 rows=128 loops=1)
   SubPlan 1
-    ->  Hash Anti Join  (cost=14742.77..28664.79 rows=19239 width=8) (actual time=2960.176..3370.425 rows=1 loops=128)
+    ->  Hash Anti Join  (cost=14742.77..28664.79 rows=19239 width=8) (actual time=48.385..53.361 rows=1 loops=128)
         Hash Cond: (m1.box_id = m2.box_id)
         Join Filter: (m1.id < m2.id)
-          ->  Bitmap Heap Scan on message m1  (cost=544.16..13696.88 rows=28858 width=16) (actual time=2.320..6.204 rows=18487 loops=128)
+          ->  Bitmap Heap Scan on message m1  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.928..5.502 rows=17875 loops=128)
               Recheck Cond: (box_id = b.id)
-              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=2.251..2.251 rows=18487 loops=128)
+              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.797..1.797 rows=18487 loops=128)
               Index Cond: (box_id = b.id)
-          ->  Hash  (cost=13696.88..13696.88 rows=28858 width=16) (actual time=12.632..12.632 rows=19720 loops=120)
-              Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 1787kB
-              ->  Bitmap Heap Scan on message m2  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.668..6.619 rows=19720 loops=120)
+          ->  Hash  (cost=13696.88..13696.88 rows=28858 width=16) (actual time=11.603..11.603 rows=20248 loops=113)
+              Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 1423kB
+              ->  Bitmap Heap Scan on message m2  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.838..6.886 rows=20248 loops=113)
               Recheck Cond: (box_id = b.id)
-              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.602..1.602 rows=19720 loops=120)
+              ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.743..1.743 rows=20903 loops=113)
               Index Cond: (box_id = b.id)
- Total runtime: 431520.186 ms
+ Total runtime: 6940.369 ms

That's pretty odd.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

                            QUERY PLAN
--
 Seq Scan on box b  (cost=0.00..3669095.76 rows=128 width=8) (actual time=0.147..431517.693 rows=128 loops=1)
   SubPlan 1
     ->  Hash Anti Join  (cost=14742.77..28664.79 rows=19239 width=8) (actual time=2960.176..3370.425 rows=1 loops=128)
           Hash Cond: (m1.box_id = m2.box_id)
           Join Filter: (m1.id < m2.id)
           ->  Bitmap Heap Scan on message m1  (cost=544.16..13696.88 rows=28858 width=16) (actual time=2.320..6.204 rows=18487 loops=128)
                 Recheck Cond: (box_id = b.id)
                 ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=2.251..2.251 rows=18487 loops=128)
                       Index Cond: (box_id = b.id)
           ->  Hash  (cost=13696.88..13696.88 rows=28858 width=16) (actual time=12.632..12.632 rows=19720 loops=120)
                 Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 1787kB
                 ->  Bitmap Heap Scan on message m2  (cost=544.16..13696.88 rows=28858 width=16) (actual time=1.668..6.619 rows=19720 loops=120)
                       Recheck Cond: (box_id = b.id)
                       ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..536.94 rows=28858 width=0) (actual time=1.602..1.602 rows=19720 loops=120)
                             Index Cond: (box_id = b.id)
 Total runtime: 431520.186 ms

                            QUERY PLAN
--
 Seq Scan on box b  (cost=0.00..366
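[Editor's note: the plan shape above (a SubPlan running a hash anti join of message against itself, with Join Filter m1.id < m2.id) is consistent with a "latest message id per box" query along the following lines. This is a reconstruction guessed from the EXPLAIN output, not the poster's actual SQL.]

```sql
-- For each box, find the message id that no other message in the same
-- box exceeds, i.e. the highest id, via an anti join:
SELECT b.id,
       (SELECT m1.id
        FROM   message m1
        WHERE  m1.box_id = b.id
               AND NOT EXISTS (SELECT 1
                               FROM   message m2
                               WHERE  m2.box_id = m1.box_id
                                      AND m1.id < m2.id))
FROM box b;
```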
Re: [PERFORM] picking a filesystem
On Tue, May 31, 2011 at 8:35 AM, Robert Haas wrote:
> So if you're running a RHEL5.4 or RHEL5.5 system, are you basically
> stuck with ext3?  I'm not sure if I'm remembering correctly, but ISTM
> that you've been uncomfortable with BOTH ext4 and XFS prior to RHEL6;
> but OK with both beginning with RHEL6.
>
> Also, any tips on mount options for XFS/ext4/ext3?

Greg's book has a whole chapter that goes through the pros and cons of each type of fs and offers suggestions for configuring most of them for postgres. I haven't actually read the chapter in detail yet, so I won't try to summarize its content here, but it appeared to be pretty comprehensive during my quick scan.
Re: [PERFORM] picking a filesystem
On 05/31/2011 10:35 AM, Robert Haas wrote:
> So if you're running a RHEL5.4 or RHEL5.5 system, are you basically
> stuck with ext3?  I'm not sure if I'm remembering correctly, but ISTM
> that you've been uncomfortable with BOTH ext4 and XFS prior to RHEL6;
> but OK with both beginning with RHEL6.

We haven't had any problems (yet) running XFS on CentOS 5.5. Sure, it doesn't have a lot of the recent kernel advances that made it faster, but it out-performed our EXT3 filesystem in some cases by 40%.

> Also, any tips on mount options for XFS/ext4/ext3?

We got the best performance by increasing the agcount during formatting. But we also used some of the advanced logging options. I set the log size to 128m, enabled lazy-count to reduce logging overhead, and set version to 2 so we could use a bigger log buffer in the mount options. So:

mkfs.xfs -d agcount=256 -l size=128m,lazy-count=1,version=2

For mounting, aside from the usual noatime and nodiratime, we set the allocsize to 256m to reduce fragmentation, maxed out the logbufs at 8, set the logbsize to 256k to improve file deletion performance, and set the attr2 option to better handle inodes. So:

mount -o allocsize=256m,logbufs=8,noatime,nodiratime,attr2,logbsize=256k

Maybe more recent XFS kernels have other options we're not aware of, but we've had good luck with these so far.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com
[PERFORM] picking a filesystem
On Wed, May 25, 2011 at 4:41 PM, Greg Smith wrote:
> On 05/23/2011 06:16 PM, John Rouillard wrote:
>>
>> OS: centos 5.5
>> Filesystem: data - ext4 (note 4 not 3); 6.6T formatted
>>             wal  - ext4; 1.5T formatted
>> Raid: data - level 10, 8 disk wd2003; controller LSI MegaRAID SAS 9260-4i
>>       wal  - level 1, 2 disk wd2003; controller LSI MegaRAID SAS 9260-4i
>>
>> Could it be an ext4 issue? It seems that ext4 may still be at the
>> bleeding edge for postgres use.
>
> I would not trust ext4 on CentOS 5.5 at all.  ext4 support in 5.5 is labeled
> by RedHat as being in "Technology Preview" state.  I believe that if you had
> a real RedHat system instead of CentOS kernel, you'd discover it's hard to
> even get it installed--you need to basically say "yes, I know it's not for
> production, I want it anyway" to get preview packages.  It's not really
> intended for production use.
>
> What I'm hearing from people is that they run into the occasional ext4 bug
> with PostgreSQL, but the serious ones aren't happening very often now, on
> systems running RHEL6 or Debian Squeeze.  Those kernels are way, way ahead
> of the ext4 backport in RHEL5 based systems, and they're just barely stable.

So if you're running a RHEL5.4 or RHEL5.5 system, are you basically stuck with ext3?  I'm not sure if I'm remembering correctly, but ISTM that you've been uncomfortable with BOTH ext4 and XFS prior to RHEL6; but OK with both beginning with RHEL6.

Also, any tips on mount options for XFS/ext4/ext3?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[PERFORM] Strange behavior of child table.
Hi All,

I have created a partitioned setup on table Round_Action, which has 3 child tables. When I fire a simple select query with LIMIT on the parent table it takes a huge amount of time to execute, but when I fire the same query directly on a child table it takes a few milliseconds.

E.g.:
select * from Round_Action where action_id = 5 limit 100      → execution time 80 sec
select * from Round_Action_CH1 where action_id = 5 limit 100  → execution time 0.1 sec

Round_Action is the parent table and holds no records itself; all the records are in the child tables. The table has an index on action_id. Partitioning is trigger based.

Postgres version: (PostgreSQL) 8.4.6

Why is there a difference in execution time? What am I doing wrong?

--
Thanks & regards,
JENISH
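[Editor's sketch of the two most common causes of this symptom on 8.4, using assumed index and constraint names. Indexes are not inherited in PostgreSQL, so an index on the parent does nothing for child-table scans; and the planner can only skip irrelevant children when they carry CHECK constraints on the partition key and constraint_exclusion is enabled.]

```sql
-- Each child needs its own index on action_id (hypothetical names):
CREATE INDEX round_action_ch1_action_id_idx ON Round_Action_CH1 (action_id);
CREATE INDEX round_action_ch2_action_id_idx ON Round_Action_CH2 (action_id);
CREATE INDEX round_action_ch3_action_id_idx ON Round_Action_CH3 (action_id);

-- If the children are partitioned by a key, CHECK constraints let the
-- planner exclude children that cannot match:
-- ALTER TABLE Round_Action_CH1 ADD CONSTRAINT ch1_range
--     CHECK (action_id >= 1 AND action_id < 1000);

SHOW constraint_exclusion;  -- should be 'partition' (the 8.4 default) or 'on'
```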
Re: [PERFORM] The shared buffers challenge
On Fri, May 27, 2011 at 7:19 PM, Jeff Davis wrote:
> On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
>> Where they are most helpful is for masking of i/o if
>> a page gets dirtied >1 times before it's written out to the heap
>
> Another possible benefit of higher shared_buffers is that it may reduce
> WAL flushes.  A page cannot be evicted from shared_buffers until the WAL
> has been flushed up to the page's LSN ("WAL before data"); so if there
> is memory pressure to evict dirty buffers, it may cause extra WAL
> flushes.
>
> I'm not sure what the practical effects of this are, however, but it
> might be an interesting thing to test.

Hm, I bet it could make a fairly big difference if wal data is not on a separate volume.

merlin
Re: [PERFORM] Delete performance
9.0rc1? You know that the stable 9.0 release has been out for quite a while now. It's not going to affect the delete speed in any way, but I would generally advise you to upgrade to the latest 9.0.x.

As for the delete itself, check whether the tables that refer to the main table have indexes on the referring columns. Often that's the issue. The other thing to check is the number of triggers on the other tables.

--
GJ
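[Editor's sketch of the indexing advice above, with hypothetical table names. PostgreSQL automatically indexes the referenced (primary key) side of a foreign key but not the referencing side, so every cascaded or checked delete must otherwise scan the whole child table.]

```sql
-- "detail" references base(id) with ON DELETE CASCADE; without this index,
-- each DELETE on "base" sequentially scans "detail" to find dependent rows.
CREATE INDEX detail_base_id_idx ON detail (base_id);
```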