Tom Lane wrote:
> Why are the rowcount estimates so far off?  Maybe you need to increase
> the statistics target for this table.
>
> 			regards, tom lane

Tom,

How does one tell that the rowcount estimate is off in a query plan? I've never found a good reference on interpreting EXPLAIN ANALYZE output.

Upping the stats target made little difference in the estimated total row count, but a huge difference in the query plan. It also only mattered on the context_key column, not the other columns. Why would that be?

lyell5=> \pset format wrapped
lyell5=> select * from pg_stats where tablename='article_words';

 schemaname |   tablename   |   attname   | null_frac | avg_width | n_distinct | correlation
------------+---------------+-------------+-----------+-----------+------------+-------------
 public     | article_words | word_key    |         0 |         4 |       6361 |   0.0292482
 public     | article_words | context_key |         0 |         4 |       4671 |    0.867505

 word_key:
   most_common_vals:  {382,226,1155,2130,2246,3328,8999,733,1760,1796}
   most_common_freqs: {0.003,0.00233333,0.00233333,0.00233333,0.00233333,0.00233333,0.00233333,0.002,0.002,0.002}
   histogram_bounds:  {3,623,1376,2074,3154,4601,7269,12770,30017,212496,5820680}
 context_key:
   most_common_vals:  {639302,113013,133052,211978,508496,545123,590829,598595,649645,37658}
   most_common_freqs: {0.00233333,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00133333}
   histogram_bounds:  {28,42838,92697,140684,202950,248442,338954,403025,498952,584048,654070}

lyell5=> analyze verbose article_words;
INFO:  "article_words": scanned 3000 of 1125489 pages, containing 555000 live rows and 0 dead rows; 3000 rows in sample, 208215465 estimated total rows

lyell5=> set default_statistics_target to 500;
lyell5=> analyze verbose article_words;
INFO:  "article_words": scanned 150000 of 1125489 pages, containing 27749820 live rows and 180 dead rows; 150000 rows in sample, 208214114 estimated total rows

So the estimated total row count barely moves, but it makes a huge impact on this query:
Is there any good reason not to set the statistics target to 1000 always?

lyell5=> alter table article_words alter word_key set statistics 30;
lyell5=> analyze verbose article_words;
INFO:  "article_words": scanned 9000 of 1125489 pages, containing 1665000 live rows and 0 dead rows; 9000 rows in sample, 208215465 estimated total rows
lyell5=> explain analyze select words.* from article_words join words using (word_key) where context_key=535462;

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..160395.31 rows=16170 width=13) (actual time=0.034..4.025 rows=777 loops=1)
   ->  Index Scan using article_word_idx on article_words  (cost=0.00..18258.96 rows=16170 width=4) (actual time=0.023..0.344 rows=777 loops=1)
         Index Cond: (context_key = 535462)
   ->  Index Scan using words_pkey on words  (cost=0.00..8.78 rows=1 width=13) (actual time=0.003..0.004 rows=1 loops=777)
         Index Cond: (words.word_key = article_words.word_key)
 Total runtime: 4.231 ms

lyell5=> alter table article_words alter word_key set statistics 10;
lyell5=> analyze verbose article_words;
INFO:  "article_words": scanned 3000 of 1125489 pages, containing 555000 live rows and 0 dead rows; 3000 rows in sample, 208215465 estimated total rows
lyell5=> explain analyze select words.* from article_words join words using (word_key) where context_key=535462;

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=192089.54..265653.38 rows=44589 width=13) (actual time=18809.184..80685.239 rows=777 loops=1)
   Hash Cond: (article_words.word_key = words.word_key)
   ->  Index Scan using article_word_idx on article_words  (cost=0.00..44717.05 rows=44589 width=4) (actual time=0.025..0.597 rows=777 loops=1)
         Index Cond: (context_key = 535462)
   ->  Hash  (cost=93818.13..93818.13 rows=5653313 width=13) (actual time=18808.872..18808.872 rows=5651551 loops=1)
         ->  Seq Scan on words  (cost=0.00..93818.13 rows=5653313 width=13) (actual time=0.007..7845.824 rows=5651551 loops=1)
 Total runtime: 80686.217 ms
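[Editor's note, not part of the original exchange: the way to tell a rowcount estimate is off is to compare the `rows=` figure in the first parenthesized group of an EXPLAIN ANALYZE node (the planner's estimate) with the `rows=` figure in the `actual` group of the same node. A hedged sketch, using the table from this thread; the statistics value of 500 is illustrative, not a recommendation from the thread:]

```sql
-- In the slow plan above, the article_words index scan shows:
--   (cost=0.00..44717.05 rows=44589 ...) (actual ... rows=777 ...)
-- i.e. the planner expected 44589 rows but only 777 came back, an
-- overestimate of roughly 57x, which is what made the hash join over a
-- seq scan of words look cheaper than the nested-loop plan.

-- Raise the statistics target for only the badly estimated column,
-- rather than raising default_statistics_target for everything:
ALTER TABLE article_words ALTER COLUMN context_key SET STATISTICS 500;
ANALYZE article_words;

-- Verify the per-column setting; -1 means "use default_statistics_target":
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'article_words'::regclass
  AND attnum > 0;
```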
- [SQL] Seq scan on join, not on subselect? an... Bryce Nesbitt