Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Tom Lane wrote: Given that it estimated 1 row out of words (quite correctly) and 12264 rows out of each scan on article_words, you'd think that the join size estimate would be 12264, which would be off by only a factor of 3 from the true result. Instead it's 23, off by a factor of 200 :-(. Has anyone every proposed a learning query planner? One that eventually clues in to estimate mismatches like this? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
On Wed, Feb 10, 2010 at 8:52 PM, Bryce Nesbitt bry...@obviously.com wrote: If you guys succeed in making this class of query perform, you'll have beat out the professional consulting firm we hired, which was all but useless! The query is usually slow, but particular combinations of words seem to make it obscenely slow. Heh heh heh professional consulting firm. production=# EXPLAIN ANALYZE SELECT context_key FROM article_words WHERE word_key = 3675; --- Index Scan using article_words_wc on article_words (cost=0.00..21433.53 rows=11309 width=4) (actual time=0.025..7.579 rows=4003 loops=1) Index Cond: (word_key = 3675) Total runtime: 11.704 ms That's surprisingly inaccurate. Since this table is large: production=# explain analyze select count(*) from article_words; Aggregate (cost=263831.63..263831.64 rows=1 width=0) (actual time=35851.654..35851.655 rows=1 loops=1) - Seq Scan on words (cost=0.00..229311.30 rows=13808130 width=0) (actual time=0.043..21281.124 rows=13808184 loops=1) Total runtime: 35851.723 ms ...you may need to crank up the statistics target. I would probably try cranking it all the way up to the max, though there is a risk that might backfire, in which case you'll need to decrease it again. ALTER TABLE article_words ALTER COLUMN word_key SET STATISTICS 1000; That's probably not going to fix your whole problem, but it should be interesting to see whether it makes things better or worse and by how much. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Or, if you want to actually read that query plan, try: http://explain.depesz.com/s/qYq -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
2010/2/10 Bryce Nesbitt bry...@obviously.com: Or, if you want to actually read that query plan, try: http://explain.depesz.com/s/qYq hello, check your work_mem sesttings. Hash join is very slow in your case. Pavel -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
On Wed, Feb 10, 2010 at 3:29 AM, Bryce Nesbitt bry...@obviously.com wrote: Or, if you want to actually read that query plan, try: http://explain.depesz.com/s/qYq Much better, though I prefer a text attachment... anyhow, I think the root of the problem may be that both of the subquery scans under the append node are seeing hundreds of times more rows than they're expecting, which is causing the planner to choose nested loops higher up that it otherwise might have preferred to implement in some other way. I'm not quite sure why, though. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
That sure looks like the source of the problem to me too. I've seen similar behavior in queries not very different from that. It's hard to guess what the problem is exactly without having more knowledge of the data distribution in article_words though. Given the results of analyze, I'd try to run the deepest subquery and try to see if I could get the estimate to match reality, either by altering statistics targets, or tweaking the query to give more information to the planner. For example, i'd check if the number of expected rows from SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider' is much less accurate than the estimate for SELECT context_key FROM article_words WHERE word_key = (whatever the actual word_key for insider is) Robert Haas robertmh...@gmail.com 02/10/10 2:31 PM On Wed, Feb 10, 2010 at 3:29 AM, Bryce Nesbitt bry...@obviously.com wrote: Or, if you want to actually read that query plan, try: http://explain.depesz.com/s/qYq Much better, though I prefer a text attachment... anyhow, I think the root of the problem may be that both of the subquery scans under the append node are seeing hundreds of times more rows than they're expecting, which is causing the planner to choose nested loops higher up that it otherwise might have preferred to implement in some other way. I'm not quite sure why, though. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
If you guys succeed in making this class of query perform, you'll have beat out the professional consulting firm we hired, which was all but useless! The query is usually slow, but particular combinations of words seem to make it obscenely slow. The query plans are now attached (sorry I did not start there: many lists reject attachments). Or you can click on text at the query planner analysis site http://explain.depesz.com/s/qYq _Here's typical server load:_ Tasks: 166 total, 1 running, 165 sleeping, 0 stopped, 0 zombie Cpu(s): 1.2%us, 0.9%sy, 0.0%ni, 86.5%id, 11.2%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 32966936k total, 32873860k used,93076k free, 2080k buffers Swap: 33554424k total, 472k used, 33553952k free, 30572904k cached _ Configurations modified from Postgres 8.3 default are:_ listen_addresses = '10.100.2.11, 10.101.2.11' # what IP address(es) to listen on; port = 5432 # (change requires restart) max_connections = 400 # (change requires restart) shared_buffers = 4096MB # min 128kB or max_connections*16kB work_mem = 16MB # min 64kB max_fsm_pages = 50 # default:2 min:max_fsm_relations*16,6 bytes see:MAIN-5740 max_fsm_relations = 2700# min 100, ~70 bytes each checkpoint_segments = 20# in logfile segments, min 1, 16MB each random_page_cost = 2.0 # same scale as above effective_cache_size = 28672MB default_statistics_target = 150 # range 1-1000 log_destination = 'syslog' # Valid values are combinations of log_min_error_statement = error # values in order of decreasing detail: log_min_duration_statement = 5000 # -1 is disabled, 0 logs all statements log_checkpoints = on# default off autovacuum_naptime = 5min # time between autovacuum runs escape_string_warning = off # default:on (See bepress MAIN-4857) standard_conforming_strings = off # deafult:off (See bepress MAIN-4857) production=# EXPLAIN ANALYZE SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'; - Nested Loop (cost=0.00..23393.15 rows=23 width=4) (actual time=0.077..15.637 rows=4003 loops=1) - Index Scan using words_word on words (cost=0.00..5.47 rows=1 width=4) (actual time=0.049..0.051 rows=1 loops=1) Index Cond: ((word)::text = 'insider'::text) - Index Scan using article_words_wc on article_words (cost=0.00..23234.38 rows=12264 width=8) (actual time=0.020..7.237 rows=4003 loops=1) Index Cond: (article_words.word_key = words.word_key) Total runtime: 19.776 ms production=# EXPLAIN ANALYZE SELECT context_key FROM article_words WHERE word_key = 3675; --- Index Scan using article_words_wc on article_words (cost=0.00..21433.53 rows=11309 width=4) (actual time=0.025..7.579 rows=4003 loops=1) Index Cond: (word_key = 3675) Total runtime: 11.704 ms production=# explain analyze select count(*) from article_words; Aggregate (cost=263831.63..263831.64 rows=1 width=0) (actual time=35851.654..35851.655 rows=1 loops=1) - Seq Scan on words (cost=0.00..229311.30 rows=13808130 width=0) (actual time=0.043..21281.124 rows=13808184 loops=1) Total runtime: 35851.723 ms production=# select count(*) from words; 13,808,184 production=# explain analyze select count(*) from article_words; Aggregate (cost=5453242.40..5453242.41 rows=1 width=0) (actual time=776504.017..776504.018 rows=1 loops=1) - Seq Scan on article_words (cost=0.00..4653453.52 rows=319915552 width=0) (actual time=0.034..438969.347 rows=319956663 loops=1) Total runtime: 776504.177 ms production=# select count(*) from article_words; 319,956,720 Aggregate (cost=139279.85..139279.86 rows=1 width=4) (actual time=502493.000..502493.001 rows=1 loops=1) - Hash Join (cost=131314.31..139279.84 rows=2 width=4) (actual time=501787.397..502492.316 rows=622 loops=1) Hash Cond: (matview_82034.context_key = articles.context_key) - Seq Scan on matview_82034 (cost=0.00..6322.20 rows=438220 width=4) (actual time=0.014..462.312 rows=438220 loops=1) - Hash (cost=131314.30..131314.30 rows=1 width=16) (actual time=501553.755..501553.755 rows=622 loops=1) - Nested Loop IN Join (cost=46291.79..131314.30 rows=1 width=16) (actual time=467.546..501550.735 rows=622 loops=1) Join Filter: (a.context_key = articles.context_key) - Nested Loop (cost=46291.79..46323.15 rows=2 width=12) (actual time=179.760..303.474 rows=1473 loops=1) - Nested Loop (cost=46291.79..46314.35
Re: [PERFORM] Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Bryce Nesbitt bry...@obviously.com writes: The query plans are now attached (sorry I did not start there: many lists reject attachments). Or you can click on text at the query planner analysis site http://explain.depesz.com/s/qYq At least some of the problem is the terrible quality of the rowcount estimates in the IN subquery, as you extracted here: Nested Loop (cost=0.00..23393.15 rows=23 width=4) (actual time=0.077..15.637 rows=4003 loops=1) - Index Scan using words_word on words (cost=0.00..5.47 rows=1 width=4) (actual time=0.049..0.051 rows=1 loops=1) Index Cond: ((word)::text = 'insider'::text) - Index Scan using article_words_wc on article_words (cost=0.00..23234.38 rows=12264 width=8) (actual time=0.020..7.237 rows=4003 loops=1) Index Cond: (article_words.word_key = words.word_key) Total runtime: 19.776 ms Given that it estimated 1 row out of words (quite correctly) and 12264 rows out of each scan on article_words, you'd think that the join size estimate would be 12264, which would be off by only a factor of 3 from the true result. Instead it's 23, off by a factor of 200 :-(. Running a roughly similar test case here, I see that 8.4 gives significantly saner estimates, which I think is because of this patch: http://archives.postgresql.org/pgsql-committers/2008-10/msg00191.php At the time I didn't want to risk back-patching it, because there were a lot of other changes in the same general area in 8.4. But it would be interesting to see what happens with your example if you patch 8.3 similarly. (Note: I think only the first diff hunk is relevant to 8.3.) regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance