Hello
2010/7/20 Kneringer, Armin <[email protected]>:
> Hi there.
>
> I think I found a potential performance gain if the query planner would be
> optimized. All Tests has been performed with 8.4.1 (and earlier versions) on
> CentOS 5.3 (x64)
>
> The following query will run on my database (~250 GB) for ca. 1600 seconds
> and the sort will result in a disk merge deploying ca. 200 GB of data to the
> local disk (ca. 180.000 tmp-files)
can you try show check explain with set enable_hashjoin to off; ?
Regards
Pavel Stehule
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> NOT EXISTS (
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=2592137103.99..2592137104.00 rows=1 width=8)
> -> Sort (cost=2592137103.99..2592137104.00 rows=1 width=8)
> Sort Key: t4.objid
> -> Nested Loop (cost=1105592553.38..2592137103.98 rows=1 width=8)
> -> Hash Anti Join (cost=1105592553.38..2592137095.75 rows=1
> width=8)
> Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid =
> t5.objid))
> -> Bitmap Heap Scan on fscsubfile t4
> (cost=154.42...14136.40 rows=5486 width=8)
> Recheck Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND
> (objid < 573936101807357952::bigint))
> -> Bitmap Index Scan on ind_fscsubfile_filerespons
> (cost=0.00..153.05 rows=5486 width=0)
> Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND
> (objid < 573936101807357952::bigint))
> -> Hash (cost=11917516.57..11917516.57 rows=55006045159
> width=16)
> -> Nested Loop (cost=0.00..11917516.57
> rows=55006045159 width=16)
> -> Seq Scan on atdateval t5
> (cost=0.00...294152.40 rows=1859934 width=12)
> Filter: (attrid =
> 281479288456447::bigint)
> -> Index Scan using ind_ataggval on ataggval
> q1_1 (cost=0.00..6.20 rows=4 width=12)
> Index Cond: ((q1_1.attrid =
> 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid))
> Filter: (q1_1.aggrid = 0)
> -> Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (20 rows)
>
>
> As the disks pace is limited on my test system I can't provide the "explain
> analyze" output
> If I change the query as follows the query takes only 12 seconds and only
> needs 2 tmp files for sorting.
> (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be
> delivered without conversion
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = q1_1.objid AND [!!!!!]
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=918320.29..971968.88 rows=1 width=8)
> -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8)
> -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8)
> Merge Cond: (t4.objid = q1_1.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile
> t4 (cost=0.00..19016.05 rows=5486 width=8)
> Index Cond: ((fileresporgid = 573936067464397682::bigint)
> AND (objid > 573936097512390656::bigint) AND (objid <
> 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> -> Merge Join (cost=912418.42..956372.47 rows=22689
> width=8)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid
> = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934
> width=12)
> Sort Key: t5.objid, t5.aggrid
> -> Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12)
> Recheck Cond: (attrid =
> 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval
> (cost=0.00..43284.08 rows=1859934 width=0)
> Index Cond: (attrid =
> 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97
> rows=1701738 width=12)
> -> Sort (cost=510392.25..514646.59
> rows=1701738 width=12)
> Sort Key: q1_1.objid, q1_1.aggval
> -> Bitmap Heap Scan on ataggval q1_1
> (cost=44666.00..305189.47 rows=1701738 width=12)
> Recheck Cond: (attrid =
> 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on
> ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0)
> Index Cond: (attrid =
> 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22
> rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = q1_1.objid AND [!!!!!]
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=918320.29..971968.88 rows=1 width=8) (actual
> time=12079.598..12083.048 rows=64 loops=1)
> -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) (actual
> time=12079.594..12083.010 rows=64 loops=1)
> -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8)
> (actual time=12037.524..12081.989 rows=108 loops=1)
> Merge Cond: (t4.objid = q1_1.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile
> t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.073..83.498
> rows=63436 loops=1)
> Index Cond: ((fileresporgid = 573936067464397682::bigint)
> AND (objid > 573936097512390656::bigint) AND (objid <
> 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> (actual time=8866.253..11753.055 rows=1299685 loops=1)
> -> Merge Join (cost=912418.42..956372.47 rows=22689
> width=8) (actual time=8866.246..11413.397 rows=1299685 loops=1)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid
> = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934
> width=12) (actual time=3133.362..3774.076 rows=1299685 loops=1)
> Sort Key: t5.objid, t5.aggrid
> Sort Method: external merge Disk: 47192kB
> -> Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
> time=282.454..1079.038 rows=1857906 loops=1)
> Recheck Cond: (attrid =
> 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval
> (cost=0.00..43284.08 rows=1859934 width=0) (actual time=258.749...258.749
> rows=1857906 loops=1)
> Index Cond: (attrid =
> 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97
> rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685 loops=1)
> -> Sort (cost=510392.25..514646.59
> rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685 loops=1)
> Sort Key: q1_1.objid, q1_1.aggval
> Sort Method: external merge Disk:
> 39920kB
> -> Bitmap Heap Scan on ataggval q1_1
> (cost=44666.00..305189.47 rows=1701738 width=12) (actual
> time=1644.983..3634.044 rows=1857906 loops=1)
> Recheck Cond: (attrid =
> 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on
> ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual
> time=1606.325..1606.325 rows=1877336 loops=1)
> Index Cond: (attrid =
> 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22
> rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=108)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12108.663 ms
> (29 rows)
>
>
> Another way to optimize my query is to change it as follows:
> (Once again changes are marked with [!!!!!]
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t5.objid AND [!!!!!]
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=916978.86..969139.72 rows=1 width=8)
> -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8)
> -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8)
> Merge Cond: (t4.objid = t5.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile
> t4 (cost=0.00..19016.05 rows=5486 width=8)
> Index Cond: ((fileresporgid = 573936067464397682::bigint)
> AND (objid > 573936097512390656::bigint) AND (objid <
> 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> -> Merge Join (cost=912418.42..956372.47 rows=22689
> width=8)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid
> = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934
> width=12)
> Sort Key: t5.objid, t5.aggrid
> -> Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12)
> Recheck Cond: (attrid =
> 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval
> (cost=0.00..43284.08 rows=1859934 width=0)
> Index Cond: (attrid =
> 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97
> rows=1701738 width=12)
> -> Sort (cost=510392.25..514646.59
> rows=1701738 width=12)
> Sort Key: q1_1.objid, q1_1.aggval
> -> Bitmap Heap Scan on ataggval q1_1
> (cost=44666.00..305189.47 rows=1701738 width=12)
> Recheck Cond: (attrid =
> 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on
> ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0)
> Index Cond: (attrid =
> 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22
> rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t5.objid AND [!!!!!]
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Unique (cost=916978.86..969139.72 rows=1 width=8) (actual
> time=12102.964..12106.409 rows=64 loops=1)
> -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8) (actual
> time=12102.959..12106.375 rows=64 loops=1)
> -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8)
> (actual time=12060.916..12105.374 rows=108 loops=1)
> Merge Cond: (t4.objid = t5.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile
> t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.080..81.397
> rows=63436 loops=1)
> Index Cond: ((fileresporgid = 573936067464397682::bigint)
> AND (objid > 573936097512390656::bigint) AND (objid <
> 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> (actual time=8874.492..11778.254 rows=1299685 loops=1)
> -> Merge Join (cost=912418.42..956372.47 rows=22689
> width=8) (actual time=8874.484..11437.175 rows=1299685 loops=1)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid
> = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934
> width=12) (actual time=3117.555..3756.062 rows=1299685 loops=1)
> Sort Key: t5.objid, t5.aggrid
> Sort Method: external merge Disk: 39920kB
> -> Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
> time=289.475..1079.624 rows=1857906 loops=1)
> Recheck Cond: (attrid =
> 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval
> (cost=0.00..43284.08 rows=1859934 width=0) (actual time=265.720...265.720
> rows=1857906 loops=1)
> Index Cond: (attrid =
> 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97
> rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685 loops=1)
> -> Sort (cost=510392.25..514646.59
> rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685 loops=1)
> Sort Key: q1_1.objid, q1_1.aggval
> Sort Method: external merge Disk:
> 39920kB
> -> Bitmap Heap Scan on ataggval q1_1
> (cost=44666.00..305189.47 rows=1701738 width=12) (actual
> time=1646.955..3628.918 rows=1857906 loops=1)
> Recheck Cond: (attrid =
> 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on
> ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual
> time=1608.233..1608.233 rows=1877336 loops=1)
> Index Cond: (attrid =
> 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22
> rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=108)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12129.613 ms
> (29 rows)
>
>
>
> As the query performs in roughly 12 seconds in both (changed) cases you might
> advise to change my queries :-)
> (In fact we are working on this)
> As the primary performance impact can also be reproduced in a small database
> (querytime > 1 minute) I checked this issue on MS-SQL server and Oracle. On
> MSSQL server there is no difference in the execution plan if you change the
> query an the performance is well. Oralce shows a slightly difference but the
> performance is also well.
> As I mentioned we are looking forward to change our query but in my opinion
> there could be a general performance gain if this issue is addressed.
> (especially if you don't know you run into this issue on the query
> performance is sufficient enough)
>
> greets
> Armin
>
> --
> Sent via pgsql-performance mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance