Re: [PERFORM] Slow query with joins
On Wed, Jan 11, 2006 at 10:30:58PM +0100, Bendik Rognlien Johansen wrote:
> The sort is definitively the culprit. When I removed it the query was
> instant. I tried setting work_mem = 131072 but it did not seem to
> help. I really don't understand this :-( Any other ideas?

What does explain analyze show with the sort in?
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend
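[Editor's note: the distinction Jim is after is that plain EXPLAIN shows only the planner's cost estimates, while EXPLAIN ANALYZE actually runs the query and reports real row counts and timings per node. A minimal sketch against the tables from the thread (the column list is trimmed for illustration):

    -- EXPLAIN ANALYZE executes the query and annotates every plan node
    -- with "actual time=... rows=..", so a mis-estimated row count or a
    -- slow on-disk sort shows up directly in the output.
    EXPLAIN ANALYZE
    SELECT r.id
    FROM people r
    WHERE NOT r.deleted AND r.original IS NULL
    ORDER BY r.id;

Comparing the "rows=" estimate on each node with the actual rows returned is usually the fastest way to find where a plan goes wrong.]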
Re: [PERFORM] Slow query with joins
The sort is definitively the culprit. When I removed it the query was
instant. I tried setting work_mem = 131072 but it did not seem to
help. I really don't understand this :-( Any other ideas?

Thanks!

On Jan 11, 2006, at 9:23 PM, Jim C. Nasby wrote:
> I'd try figuring out if the join is the culprit or the sort is (by
> dropping the ORDER BY). work_mem is probably forcing the sort to
> spill to disk, and if your drives are rather busy...
>
> You might also get a win if you re-order the joins to people,
> contacts, addresses, if you know it will have the same result. In
> this case LIMIT won't have any real effect, because you have to go
> all the way through with the ORDER BY anyway.
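[Editor's note: the work_mem experiment Bendik describes can be done per-session, without editing postgresql.conf or restarting; a sketch, using the same value from the thread (on 8.x the setting is in kilobytes, so 131072 is 128MB):

    -- Raise work_mem for this session only, then re-run the query and
    -- compare timings; other backends keep the postgresql.conf value.
    SET work_mem = 131072;   -- 128MB, expressed in KB on PostgreSQL 8.x
    EXPLAIN ANALYZE
    SELECT r.id
    FROM people r
    WHERE NOT r.deleted AND r.original IS NULL
    ORDER BY r.id;

If a 16x increase over the configured work_mem = 8192 makes no difference, the bottleneck is likely not the sort's memory budget, which is consistent with the disk-bound symptoms reported below.]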
Re: [PERFORM] Slow query with joins
I'd try figuring out if the join is the culprit or the sort is (by
dropping the ORDER BY). work_mem is probably forcing the sort to spill
to disk, and if your drives are rather busy...

You might also get a win if you re-order the joins to people,
contacts, addresses, if you know it will have the same result. In this
case LIMIT won't have any real effect, because you have to go all the
way through with the ORDER BY anyway.

On Wed, Jan 11, 2006 at 08:55:32PM +0100, Bendik Rognlien Johansen wrote:
> Yes, the rowcount estimates are real, however, it has been a long
> time since the last VACUUM FULL (there is never a good time).
>
> I have clustered the tables, reindexed, analyzed, vacuumed and the
> plan now looks like this:
>
> no_people=# explain SELECT r.id AS r_id, r.firstname || ' ' ||
> r.lastname AS r_name, ad.id AS ad_id, ad.type AS ad_type, ad.address
> AS ad_address, ad.postalcode AS ad_postalcode, ad.postalsite AS
> ad_postalsite, ad.priority AS ad_priority, ad.position[0] AS ad_lat,
> ad.position[1] AS ad_lon, ad.uncertainty AS ad_uncertainty, ad.extra
> AS ad_extra, ad.deleted AS ad_deleted, co.id AS co_id, co.type AS
> co_type, co.value AS co_value, co.description AS co_description,
> co.priority AS co_priority, co.visible AS co_visible, co.searchable
> AS co_searchable, co.deleted AS co_deleted FROM people r LEFT OUTER
> JOIN addresses ad ON (r.id = ad.record) LEFT OUTER JOIN contacts co
> ON (r.id = co.record) WHERE NOT r.deleted AND r.original IS NULL
> ORDER BY r.id;
>
>                                 QUERY PLAN
> ----------------------------------------------------------------------
>  Sort  (cost=182866.49..182943.12 rows=30655 width=587)
>    Sort Key: r.id
>    ->  Nested Loop Left Join  (cost=0.00..170552.10 rows=30655 width=587)
>          ->  Nested Loop Left Join  (cost=0.00..75054.96 rows=26325 width=160)
>                ->  Index Scan using people_deleted_original_is_null
>                      on people r  (cost=0.00..1045.47 rows=23861 width=27)
>                      Filter: ((NOT deleted) AND (original IS NULL))
>                ->  Index Scan using addresses_record_idx on addresses ad
>                      (cost=0.00..3.05 rows=4 width=137)
>                      Index Cond: ("outer".id = ad.record)
>          ->  Index Scan using contacts_record_idx on contacts co
>                (cost=0.00..3.32 rows=24 width=431)
>                Index Cond: ("outer".id = co.record)
> (10 rows)
>
> Looks faster, but still very slow. I added limit 1000 and it has been
> running for about 25 minutes now with no output. top shows:
>
>   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
> 29994 postgres 18  0 95768  78m  68m R 17.0  7.7 0:53.27 postmaster
>
> which is unusual, I usually get 99.9 %cpu for just about any query,
> which leads me to believe this is disk related.
>
> postgresql.conf:
> shared_buffers = 8192
> work_mem = 8192
> maintenance_work_mem = 524288
>
> Hardware: 2x2.8GHz cpu, 1GB ram
>
> Could this be an issue related to lack of VACUUM FULL? The tables get
> a lot of updates.
>
> Thank you very much so far!
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
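[Editor's note: Jim's join-reordering suggestion, sketched against the query from the thread; as he says, this only applies if you know the result is the same for your data. The column list is trimmed here for readability:

    -- Join contacts before addresses; both are left-joined to people,
    -- so the same rows come back, but the planner may cost this join
    -- order differently and pick a cheaper plan.
    SELECT r.id AS r_id, co.id AS co_id, ad.id AS ad_id
    FROM people r
    LEFT OUTER JOIN contacts  co ON (r.id = co.record)
    LEFT OUTER JOIN addresses ad ON (r.id = ad.record)
    WHERE NOT r.deleted AND r.original IS NULL
    ORDER BY r.id;

Running EXPLAIN on both orderings and comparing the top-level cost is the quick way to see whether the planner finds a difference.]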
Re: [PERFORM] Slow query with joins
Yes, the rowcount estimates are real, however, it has been a long time
since the last VACUUM FULL (there is never a good time).

I have clustered the tables, reindexed, analyzed, vacuumed and the
plan now looks like this:

no_people=# explain SELECT r.id AS r_id, r.firstname || ' ' ||
r.lastname AS r_name, ad.id AS ad_id, ad.type AS ad_type, ad.address
AS ad_address, ad.postalcode AS ad_postalcode, ad.postalsite AS
ad_postalsite, ad.priority AS ad_priority, ad.position[0] AS ad_lat,
ad.position[1] AS ad_lon, ad.uncertainty AS ad_uncertainty, ad.extra
AS ad_extra, ad.deleted AS ad_deleted, co.id AS co_id, co.type AS
co_type, co.value AS co_value, co.description AS co_description,
co.priority AS co_priority, co.visible AS co_visible, co.searchable
AS co_searchable, co.deleted AS co_deleted FROM people r LEFT OUTER
JOIN addresses ad ON (r.id = ad.record) LEFT OUTER JOIN contacts co
ON (r.id = co.record) WHERE NOT r.deleted AND r.original IS NULL
ORDER BY r.id;

                                QUERY PLAN
----------------------------------------------------------------------
 Sort  (cost=182866.49..182943.12 rows=30655 width=587)
   Sort Key: r.id
   ->  Nested Loop Left Join  (cost=0.00..170552.10 rows=30655 width=587)
         ->  Nested Loop Left Join  (cost=0.00..75054.96 rows=26325 width=160)
               ->  Index Scan using people_deleted_original_is_null
                     on people r  (cost=0.00..1045.47 rows=23861 width=27)
                     Filter: ((NOT deleted) AND (original IS NULL))
               ->  Index Scan using addresses_record_idx on addresses ad
                     (cost=0.00..3.05 rows=4 width=137)
                     Index Cond: ("outer".id = ad.record)
         ->  Index Scan using contacts_record_idx on contacts co
               (cost=0.00..3.32 rows=24 width=431)
               Index Cond: ("outer".id = co.record)
(10 rows)

Looks faster, but still very slow. I added limit 1000 and it has been
running for about 25 minutes now with no output. top shows:

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
29994 postgres 18  0 95768  78m  68m R 17.0  7.7 0:53.27 postmaster

which is unusual, I usually get 99.9 %cpu for just about any query,
which leads me to believe this is disk related.

postgresql.conf:
shared_buffers = 8192
work_mem = 8192
maintenance_work_mem = 524288

Hardware: 2x2.8GHz cpu, 1GB ram

Could this be an issue related to lack of VACUUM FULL? The tables get
a lot of updates.

Thank you very much so far!

On Jan 11, 2006, at 4:45 PM, Tom Lane wrote:
> Bendik Rognlien Johansen <[EMAIL PROTECTED]> writes:
>> Has anyone got any tips for speeding up this query? It currently
>> takes hours to start.
>
> Are the rowcount estimates close to reality? The plan doesn't look
> unreasonable to me if they are. It might help to increase work_mem
> to ensure that the hash tables don't spill to disk.
>
>> Indexes:
>>     "people_original_is_null" btree (original) WHERE original IS NULL
>
> This index seems poorly designed: the actual index entries are dead
> weight since all of them are necessarily NULL. You might as well make
> the index carry something that you frequently test in conjunction with
> "original IS NULL". For instance, if this particular query is a common
> case, you could replace this index with
>
> CREATE INDEX people_deleted_original_is_null ON people (deleted)
>     WHERE original IS NULL;
>
> This index is still perfectly usable for queries that only say
> "original IS NULL", but it can also filter out rows with the wrong
> value of deleted. Now, if there are hardly any rows with deleted =
> true, maybe this won't help much for your problem. But in any case you
> ought to consider whether you can make the index entries do something
> useful.
>
> 			regards, tom lane
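[Editor's note: the maintenance pass Bendik describes ("clustered, reindexed, analyzed, vacuumed") corresponds roughly to the following on PostgreSQL 8.x; the index name used with CLUSTER is illustrative, not from the thread. On a heavily updated table where routine VACUUM has fallen behind the free space map, only a rewriting operation such as VACUUM FULL or CLUSTER actually reclaims the bloat:

    -- One-time maintenance pass on a heavily updated table.
    CLUSTER people_pkey ON people;  -- 8.x syntax: rewrite the table in index order
    REINDEX TABLE people;           -- rebuild bloated indexes
    VACUUM ANALYZE people;          -- reclaim dead space and refresh planner stats

CLUSTER takes an exclusive lock and rewrites the whole table, so it needs a maintenance window, but it leaves both heap and indexes compact, which addresses the "lack of VACUUM FULL" concern raised here.]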
Re: [PERFORM] Slow query with joins
Bendik Rognlien Johansen <[EMAIL PROTECTED]> writes:
> Has anyone got any tips for speeding up this query? It currently
> takes hours to start.

Are the rowcount estimates close to reality?  The plan doesn't look
unreasonable to me if they are.  It might help to increase work_mem
to ensure that the hash tables don't spill to disk.

> Indexes:
>     "people_original_is_null" btree (original) WHERE original IS NULL

This index seems poorly designed: the actual index entries are dead
weight since all of them are necessarily NULL.  You might as well make
the index carry something that you frequently test in conjunction with
"original IS NULL".  For instance, if this particular query is a
common case, you could replace this index with

CREATE INDEX people_deleted_original_is_null ON people (deleted)
    WHERE original IS NULL;

This index is still perfectly usable for queries that only say
"original IS NULL", but it can also filter out rows with the wrong
value of deleted.  Now, if there are hardly any rows with deleted =
true, maybe this won't help much for your problem.  But in any case
you ought to consider whether you can make the index entries do
something useful.

			regards, tom lane
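[Editor's note: Tom's index replacement can be carried out without a window in which no index covers "original IS NULL" queries, by creating the new index before dropping the old one. A sketch, using the index names from the thread:

    -- Build the improved partial index first, then drop the old one,
    -- so queries filtering on "original IS NULL" always have an index.
    CREATE INDEX people_deleted_original_is_null
        ON people (deleted) WHERE original IS NULL;
    DROP INDEX people_original_is_null;
    ANALYZE people;  -- refresh statistics so the planner considers the new index

The new index is a partial index like the old one, but its entries now carry the deleted flag, so the planner can satisfy both the "original IS NULL" predicate and the "NOT deleted" filter from the index itself.]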