Re: [PERFORM] IDE x SAS RAID 0 on HP DL 380 G5 P400i controller performance problem
If you are using ext3, your performance on the RAID card may also improve if the postgres xlog is not on the same partition as the data. Otherwise, for every transaction commit, all of the data on the whole partition will have to be sync()'d not just the xlog. Also, what is the performance difference between all the inserts in one script if you do: * all your statements in the script vs. * first line is "BEGIN;" then all your statements, then "COMMIT;" at the end? If these two are about the same on your old IDE drive, then your I/O stack (file system + OS + hardware) is lying to you about fsync(). The latter should be a lot faster on your RAID card if write-back caching is not on. On Jul 19, 2010, at 12:53 PM, Daniel Ferreira de Lima wrote: That said, a look into the write-caching+BBU policy on your controller is worthwhile. If you're running this application successfully on some hardware but not others, that could be a source for the difference. I think it's really a BBU/BBWC problem. The tests that we made in the lab with HP Blade G5 (G6 doesn't support kernel version 2.4) turning the battery off show us the 'same' performance of the "pizza-box" HP DL 380 G5.. an old joke. 40 secs to 2000 insertions: like a chariot. We're finding the cache expansion and batteries (and.. why it's not default???). I think that this thread is now finished. Thanks! -- Daniel Ferreira
Re: [PERFORM] potential performance gain by query planner optimization
Hello 2010/7/20 Kneringer, Armin : > Hi there. > > I think I found a potential performance gain if the query planner would be > optimized. All Tests has been performed with 8.4.1 (and earlier versions) on > CentOS 5.3 (x64) > > The following query will run on my database (~250 GB) for ca. 1600 seconds > and the sort will result in a disk merge deploying ca. 200 GB of data to the > local disk (ca. 180.000 tmp-files) can you try show check explain with set enable_hashjoin to off; ? Regards Pavel Stehule > > explain SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > NOT EXISTS ( > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t4.objid AND > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = t4.objid AND > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > > > QUERY PLAN > -- > Unique (cost=2592137103.99..2592137104.00 rows=1 width=8) > -> Sort (cost=2592137103.99..2592137104.00 rows=1 width=8) > Sort Key: t4.objid > -> Nested Loop (cost=1105592553.38..2592137103.98 rows=1 width=8) > -> Hash Anti Join (cost=1105592553.38..2592137095.75 rows=1 > width=8) > Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = > t5.objid)) > -> Bitmap Heap Scan on fscsubfile t4 > (cost=154.42...14136.40 rows=5486 width=8) > Recheck Cond: ((fileresporgid = > 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND > (objid < 573936101807357952::bigint)) > -> Bitmap Index Scan on ind_fscsubfile_filerespons > (cost=0.00..153.05 rows=5486 width=0) > Index Cond: ((fileresporgid = > 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND > (objid < 573936101807357952::bigint)) > -> Hash (cost=11917516.57..11917516.57 rows=55006045159 > width=16) > -> Nested Loop (cost=0.00..11917516.57 > rows=55006045159 width=16) > -> Seq Scan on atdateval t5 > (cost=0.00...294152.40 rows=1859934 width=12) > Filter: (attrid = > 281479288456447::bigint) > -> Index Scan using ind_ataggval on ataggval > q1_1 (cost=0.00..6.20 rows=4 width=12) > Index Cond: ((q1_1.attrid = > 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid)) > Filter: (q1_1.aggrid = 0) > -> Index Scan using cooobjectix on cooobject t6 > (cost=0.00..8.22 rows=1 width=8) > Index Cond: (t6.objid = t4.objid) > Filter: (t6.objclassid = ANY > ('{285774255832590,285774255764301}'::bigint[])) > (20 rows) > > > As the disks pace is limited on my test system I can't provide the "explain > analyze" output > If I change the query as follows the query takes only 12 seconds and only > needs 2 tmp files for sorting. > (Changed lines are marked with [!] as I don't know HTML-Mails will be > delivered without conversion > > explain SELECT DISTINCT t4.objid > FROM fscsubfile t4, cooobject t6 > WHERE t6.objid = t4.objid AND > t4.fileresporgid = 573936067464397682 AND > NOT EXISTS ( > SELECT 1 > FROM ataggval q1_1, > atdateval t5 > WHERE q1_1.objid = t4.objid AND > q1_1.attrid = 281479288456451 AND > q1_1.aggrid = 0 AND > t5.aggrid = q1_1.aggval AND > t5.objid = q1_1.objid AND [!] > t5.attrid = 281479288456447 ) AND > ((t6.objclassid IN (285774255832590,285774255764301))) AND > ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) > ORDER BY t4.objid; > > QUERY PLAN > -- > Unique (cost=918320.29..971968.88 rows=1 width=8) > -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) > -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8) > Merge Cond: (t4.objid = q1_1.objid) > -> Index Scan using ind_fscsubfile_filerespons on fscsubfile > t4 (cost=0.00..19016.05 rows=5486 width=8) > Index Cond: ((fileresporgid = 573936067464397682::bigint) > AND (objid > 57
[PERFORM] potential performance gain by query planner optimization
Hi there. I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed with 8.4.1 (and earlier versions) on CentOS 5.3 (x64) The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge deploying ca. 200 GB of data to the local disk (ca. 180.000 tmp-files) explain SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject t6 NOT EXISTS ( WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682 AND NOT EXISTS ( SELECT 1 FROM ataggval q1_1, atdateval t5 WHERE q1_1.objid = t4.objid AND q1_1.attrid = 281479288456451 AND q1_1.aggrid = 0 AND t5.aggrid = q1_1.aggval AND t5.objid = t4.objid AND t5.attrid = 281479288456447 ) AND ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY t4.objid; QUERY PLAN -- Unique (cost=2592137103.99..2592137104.00 rows=1 width=8) -> Sort (cost=2592137103.99..2592137104.00 rows=1 width=8) Sort Key: t4.objid -> Nested Loop (cost=1105592553.38..2592137103.98 rows=1 width=8) -> Hash Anti Join (cost=1105592553.38..2592137095.75 rows=1 width=8) Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid)) -> Bitmap Heap Scan on fscsubfile t4 (cost=154.42...14136.40 rows=5486 width=8) Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) -> Bitmap Index Scan on ind_fscsubfile_filerespons (cost=0.00..153.05 rows=5486 width=0) Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) -> Hash (cost=11917516.57..11917516.57 rows=55006045159 width=16) -> Nested Loop (cost=0.00..11917516.57 rows=55006045159 width=16) -> Seq Scan on atdateval t5 (cost=0.00...294152.40 rows=1859934 width=12) Filter: (attrid = 281479288456447::bigint) -> Index Scan using ind_ataggval on ataggval q1_1 (cost=0.00..6.20 rows=4 width=12) Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid)) Filter: (q1_1.aggrid = 0) -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) Index Cond: (t6.objid = t4.objid) Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[])) (20 rows) As the disks pace is limited on my test system I can't provide the "explain analyze" output If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting. (Changed lines are marked with [!] as I don't know HTML-Mails will be delivered without conversion explain SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject t6 WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682 AND NOT EXISTS ( SELECT 1 FROM ataggval q1_1, atdateval t5 WHERE q1_1.objid = t4.objid AND q1_1.attrid = 281479288456451 AND q1_1.aggrid = 0 AND t5.aggrid = q1_1.aggval AND t5.objid = q1_1.objid AND [!] t5.attrid = 281479288456447 ) AND ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY t4.objid; QUERY PLAN -- Unique (cost=918320.29..971968.88 rows=1 width=8) -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8) Merge Cond: (t4.objid = q1_1.objid) -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint)) -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) Merge Cond: ((t5.objid = q1
Re: [PERFORM] dates and partitioning
Tuesday, July 20, 2010, 5:36:07 PM you wrote: > 2) the above query creates a plan that does a sequential scan & filter on > every partition. Why won't it only hit the correct partitions? Is it due to > the way the date was specified? or maybe the "at time zone" syntax? Quick guess: How is your 'constraint_exclusion'-setting? Which version of postgres? -- Jochen Erwied | home: joc...@erwied.eu +49-208-38800-18, FAX: -19 Sauerbruchstr. 17 | work: j...@mbs-software.de +49-2151-7294-24, FAX: -50 D-45470 Muelheim | mobile: jochen.erw...@vodafone.de +49-173-5404164 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] dates and partitioning
On Tue, 2010-07-20 at 09:36 -0600, Kevin Kempter wrote: > Hi All; > > we have a table partitioned by day, the check constraint on the child tables > looks like this (this is the may 31st partition): > > CHECK > (stime >= '2010-05-30 00:00:00+00'::timestamp with time zone > AND stime <= '2010-05-30 23:59:59+00'::timestamp with time zone) > > > We have a python based app that creates code like this: > > select > * > from > table_a a, > table_b b, > table_d d > where a.id = b.id > and b.id = d.id > and stime >= timestamp %s at time zone \'UTC\' > and stime < timestamp %s at time zone \'UTC\' > and stime >= timestamp %s at time zone d.name > and stime < timestamp %s at time zone d.name >... > > > so here's my questions: > > 1) the above app generated query pshows up like this in pg_stat_activity: > > and stime >= timestamp E'2010-07-17' at time zone 'UTC' > and stime < timestamp E'2010-07-21' at time zone 'UTC' > and stime >= timestamp E'2010-07-18' at time zone d.name > and stime < timestamp E'2010-07-19' at time zone d.name > > what's the E'date' from? and why does it show up this way? That's E is an escape character. Python is likely putting that in. See http://www.postgresql.org/docs/8.4/static/sql-syntax-lexical.html - section 4.1.2.2 > > 2) the above query creates a plan that does a sequential scan & filter on > every partition. Why won't it only hit the correct partitions? Is it due to > the way the date was specified? or maybe the "at time zone" syntax? Do you have constraint_exclusion turned on? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] dates and partitioning
Hi All; we have a table partitioned by day, the check constraint on the child tables looks like this (this is the may 31st partition): CHECK (stime >= '2010-05-30 00:00:00+00'::timestamp with time zone AND stime <= '2010-05-30 23:59:59+00'::timestamp with time zone) We have a python based app that creates code like this: select * from table_a a, table_b b, table_d d where a.id = b.id and b.id = d.id and stime >= timestamp %s at time zone \'UTC\' and stime < timestamp %s at time zone \'UTC\' and stime >= timestamp %s at time zone d.name and stime < timestamp %s at time zone d.name ... so here's my questions: 1) the above app generated query pshows up like this in pg_stat_activity: and stime >= timestamp E'2010-07-17' at time zone 'UTC' and stime < timestamp E'2010-07-21' at time zone 'UTC' and stime >= timestamp E'2010-07-18' at time zone d.name and stime < timestamp E'2010-07-19' at time zone d.name what's the E'date' from? and why does it show up this way? 2) the above query creates a plan that does a sequential scan & filter on every partition. Why won't it only hit the correct partitions? Is it due to the way the date was specified? or maybe the "at time zone" syntax? Thanks in advance... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance