On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote: > I don't remember any clear resolution to the wild variations in plan > time mentioned here: > > http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php > > I think it would be prudent to try to figure out why small changes in > the query caused the large changes in the plan times Andres was > seeing. Has anyone else ever seen such behavior? Can we get > examples? (It should be enough to get the statistics and the schema, > since this is about planning time, not run time.)
With joins between statistically indistinguishable columns, I see planning times change by a factor of ~4 for each join added or removed (postgres 8.3). Varying join_collapse_limit in the neighborhood of the actual number of joins has a similar effect. See attachment with annotated timings. The example uses a single table joined to itself, but using distinct tables with identical contents yields the same figures. The expontential factor seems smaller for real queries. I have a query of sixteen joins that takes 71s to plan deterministically; it looks like this: SELECT 1 FROM fact JOIN dim0 ... JOIN dim6 JOIN t t0 ON fact.key = t.key AND t.x = MCV0 LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1 JOIN t t2 ON fact.key = t.key AND t.x = MCV2 LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0 LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1 LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2 LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3 LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4 For the real query, removing one join drops plan time to 26s, and removing two drops the time to 11s. I don't have a good theory for the multiplier changing from 4 for the trivial demonstration to ~2.5 for this real query. Re-enabling geqo drops plan time to .5s. These tests used default_statistics_target = 1000, but dropping that to 100 does not change anything dramatically. > I guess the question is whether there is anyone who has had a contrary > experience. (There must have been some benchmarks to justify adding > geqo at some point?) I have queries with a few more joins (19-21), and I cancelled attempts to plan them deterministically after 600+ seconds and 10+ GiB of memory usage. Even with geqo_effort = 10, they plan within 5-15s with good results. All that being said, I've never encountered a situation where a value other than 1 or <inf> for *_collapse_limit appeared optimal. nm
SET geqo = off; SET join_collapse_limit = 100; CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n); ANALYZE t; --- Vary join count -- 242.4s EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11 NATURAL JOIN t t12; -- 31.2s EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11; -- 8.1s EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10; -- 2.0s EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09; -- 0.5s EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08; --- Vary join_collapse_limit -- 8.1s SET join_collapse_limit = 100; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10; -- 8.0s SET join_collapse_limit = 11; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10; -- 2.2s SET join_collapse_limit = 10; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10; -- 0.5s SET join_collapse_limit = 9; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10; -- 0.1s SET join_collapse_limit = 8; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
pgpODtXjGVDUt.pgp
Description: PGP signature