Re: [PERFORM] Query planner wants to use seq scan
On 27.10.2015 23:56, Bertrand Paquet wrote:
> So,
>
> Tonight, the index on the three fields is used; maybe yesterday's vacuum
> updated the stats.
>
> Thanks for your help.
>
> Regards,
>
> Bertrand
>
> 2015-10-27 18:33 GMT+01:00 Bertrand Paquet:
>
>> Hi Tom,
>>
>> I did the test yesterday with an index on the three fields, and with a
>> partial index on organization and status with a WHERE handled IS NULL
>> condition. I saw no change in the query plan.
>> Maybe I forgot to run VACUUM ANALYZE afterwards. I will retry tonight.
>>
>> I use a btree index. Is it the right solution, even with the IN clause?
>>
>> Regards,
>>
>> Bertrand
>>
>> On Tuesday, 27 October 2015, Tom Lane wrote:
>>
>>> Bertrand Paquet writes:
>>>> We have a slow query. After analyzing it, the planner's decision seems
>>>> debatable: the query is faster when seqscan is disabled. See below the
>>>> two query plans, and an extract from pg_stats.
>>>>
>>>> Any idea about what to change to help the planner?
>>>
>>> Neither one of those plans is very good: you're just hoping that the
>>> Filter condition will let a tuple through sooner rather than later.
>>>
>>> If you care about the performance of this type of query, I'd consider
>>> creating an index on (organization_id, status, handled_by) so that all
>>> the conditions can be checked in the index.
>>>
>>> regards, tom lane

Hello Bertrand once again!
What's your status? Did the plan change after deploying the three-field
index?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [PERFORM] Query planner wants to use seq scan
Yes, the three-field index AND vacuum solved the issue.

Regards,

Bertrand

2015-10-29 13:27 GMT+01:00 Alex Ignatov:

> On 27.10.2015 23:56, Bertrand Paquet wrote:
>> So,
>>
>> Tonight, the index on the three fields is used; maybe yesterday's vacuum
>> updated the stats.
>>
>> Thanks for your help.
>>
>> Regards,
>>
>> Bertrand
>>
>> 2015-10-27 18:33 GMT+01:00 Bertrand Paquet:
>>
>>> Hi Tom,
>>>
>>> I did the test yesterday with an index on the three fields, and with a
>>> partial index on organization and status with a WHERE handled IS NULL
>>> condition. I saw no change in the query plan.
>>> Maybe I forgot to run VACUUM ANALYZE afterwards. I will retry tonight.
>>>
>>> I use a btree index. Is it the right solution, even with the IN clause?
>>>
>>> Regards,
>>>
>>> Bertrand
>>>
>>> On Tuesday, 27 October 2015, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>>
>>>> Bertrand Paquet writes:
>>>>> We have a slow query. After analyzing it, the planner's decision seems
>>>>> debatable: the query is faster when seqscan is disabled. See below the
>>>>> two query plans, and an extract from pg_stats.
>>>>>
>>>>> Any idea about what to change to help the planner?
>>>>
>>>> Neither one of those plans is very good: you're just hoping that the
>>>> Filter condition will let a tuple through sooner rather than later.
>>>>
>>>> If you care about the performance of this type of query, I'd consider
>>>> creating an index on (organization_id, status, handled_by) so that all
>>>> the conditions can be checked in the index.
>>>>
>>>> regards, tom lane
>
> Hello Bertrand once again!
> What's your status? Did the plan change after deploying the three-field
> index?
>
> --
> Alex Ignatov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Re: [PERFORM] Scalability to more than 64 cores With PG 9.4 and RHEL 7.1 Kernel 3.10
On 28.10.2015 8:04, Javier Muro wrote:
> Does PostgreSQL 9.4 officially support a system with more than 64
> legitimate cores? (72 cores, 4 x Intel(R) Xeon(R) CPU E7-8890)
>
> The work Robert Haas did to fix the CPU locking way back when showed
> significant improvements up to 64, but that is as far as I know.
>
> Thanks in advance.

Hello Javier!
Our tests show that PG 9.4 scales well up to 60 Intel cores: pgbench -S
with the database on tmpfs gave us 700,000 tps. Past 60 cores, s_lock
dominates CPU usage. 9.5 scales way better.

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Query optimizer plans with very small selectivity estimates
This is related to a post in the general bugs forum, but I found this forum
and it seems more appropriate. This is my second attempt to post; I believe
the first attempt last week did not work, apologies if I'm duplicating.

http://comments.gmane.org/gmane.comp.db.postgresql.bugs/39011

I have had several users encounter performance problems, which all seem to
come down to this problem: multiplying selectivity estimates can cause tuple
estimates to grow very small very quickly. Once the estimator gets to 1 row,
the planner may choose plans that are very good ONLY WHEN there is exactly
1 row (maybe even O(N^large) otherwise). Unfortunately, these may be the
worst plans if the estimate is even slightly off (even just returning 2 or
3 rows versus 1).

Using the patch below, I discovered that clamping relation tuple estimates
to a number as small as 2 seemed to avoid all the catastrophic query plans.
In the scenarios I'm seeing, I have several examples of queries that take
>1m to run that should run in <1s. The estimate of 1 row (versus thousands
actual) leads the planner to tee up several nested loop joins, which causes
thousands of table scans.

I have been working on a more complete patch which tracks uniqueness along
with selectivity, so that the optimizer can benefit from knowing when a
relation must have 1 (or fewer) tuples, while clamping all other relations
to 2 rather than 1.

    typedef struct
    {
        double selectivity;
        bool   unique;
    } Selectivity;

I am interested in hearing discussion about this problem, and whether the
community is open to a patch if I continue pursuing the development.
Matt

FIRST ARTIFACT: plan with an expensive (80 s) join and a join estimate of
1 row. Note the first Nested Loop join and the 81 s join time (I gave up
trying to post the full explain because of the 80-char limit).

"Sort  (cost=7000.04..7000.04 rows=1 width=49) (actual time=81739.426..81740.023 rows=5091 loops=1)"
"  Sort Key: c.ten DESC"
"  Sort Method: quicksort  Memory: 948kB"
"  CTE cte"
"    ->  Values Scan on "*VALUES*"  (cost=0.00..0.06 rows=5 width=4) (actual time=0.001..0.001 rows=5 loops=1)"
"  ->  Nested Loop  (cost=1.36..6999.97 rows=1 width=49) (actual time=0.059..81725.475 rows=5091 loops=1)"
"Planning time: 1.912 ms"
"Execution time: 81740.328 ms"

SECOND ARTIFACT: force the join row estimate to be a minimum of 2; the
query completes very quickly.

"Sort  (cost=7000.06..7000.06 rows=2 width=49) (actual time=84.610..85.192 rows=5142 loops=1)"
"  Sort Key: c.ten DESC"
"  Sort Method: quicksort  Memory: 956kB"
"  CTE cte"
"    ->  Values Scan on "*VALUES*"  (cost=0.00..0.06 rows=5 width=4) (actual time=0.002..0.003 rows=5 loops=1)"
"  ->  Hash Join  (cost=2518.99..6999.98 rows=2 width=49) (actual time=17.629..82.886 rows=5142 loops=1)"
"Planning time: 2.982 ms"
"Execution time: 85.514 ms"

THIRD ARTIFACT: the patch I used to make experimenting easier w/o
recompiling.

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 1b61fd9..444703c 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -68,6 +68,12 @@
  *-
  */
+
+
+/* These parameters are set by GUC */
+int join_row_estimate_clamp = 1;
+
+
 #include "postgres.h"
 
 #ifdef _MSC_VER
@@ -175,6 +181,17 @@ clamp_row_est(double nrows)
 }
 
+double
+clamp_join_row_est(double nrows)
+{
+	nrows = clamp_row_est(nrows);
+	if (nrows >= (double) join_row_estimate_clamp)
+		return nrows;
+	return (double) join_row_estimate_clamp;
+}
+
+
 /*
  * cost_seqscan
  *	  Determines and returns the cost of scanning a relation sequentially.
@@ -3886,7 +3903,7 @@ calc_joinrel_size_estimate(PlannerInfo *root,
 			break;
 	}
 
-	return clamp_row_est(nrows);
+	return clamp_join_row_est(nrows);
 }
 
 /*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 71090f2..fabb8ac 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2664,6 +2664,16 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"join_row_estimate_clamp", PGC_USERSET, QUERY_TUNING_OTHER,
+			gettext_noop("Set the minimum estimated size of a join result."),
+			NULL
+		},
+		&join_row_estimate_clamp,
+		1, 1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 25a7303..0161c4b 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -67,8 +67,10 @@ extern bool enable_material;
 extern bool enable_mergejoin;
 extern bool enable_hashjoin;
 extern int	constraint_exclusion;
+extern int	join_row_estimate_clamp;
 
 extern double clamp_row_est(double nrows);
+extern double clamp_join_row_est(double nrows);
 extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
 				 double index_pages, PlannerInfo *root);
 extern void cost_seqscan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
Re: [PERFORM] Query optimizer plans with very small selectivity estimates
Matthew Bellew writes:
> I have had several users encounter performance problems, which all
> seem to come down to this problem: multiplying selectivity estimates can
> cause tuple estimates to grow very small very quickly; once the estimator
> gets to 1 row, the planner may choose plans that are very good ONLY WHEN
> there is exactly 1 row (maybe even O(N^large)). Unfortunately, these may
> be the worst plans if the estimate is even slightly off (even just
> returning 2 or 3 rows versus 1).

Yeah, this is a well-known problem.  There has been prior discussion along
the same lines as you mention (only believe 1-row estimates when it's
provably true that there's at most one row), but it hasn't looked like an
easy change.  See the pgsql-hackers archives for previous threads.

			regards, tom lane
Re: [PERFORM] Query optimizer plans with very small selectivity estimates
On 10/29/2015 11:24 AM, Tom Lane wrote:
> Matthew Bellew writes:
>> I have had several users encounter performance problems, which all
>> seem to come down to this problem: multiplying selectivity estimates can
>> cause tuple estimates to grow very small very quickly; once the estimator
>> gets to 1 row, the planner may choose plans that are very good ONLY WHEN
>> there is exactly 1 row (maybe even O(N^large)). Unfortunately, these may
>> be the worst plans if the estimate is even slightly off (even just
>> returning 2 or 3 rows versus 1).
>
> Yeah, this is a well-known problem.  There has been prior discussion along
> the same lines as you mention (only believe 1-row estimates when it's
> provably true that there's at most one row), but it hasn't looked like an
> easy change.  See the pgsql-hackers archives for previous threads.

Also see Tomas's correlated stats patch submitted for 9.6.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com