Slow Planning Times
I have the following query: *explain* (*analyze*, costs, timing) *SELECT* rr.* *FROM* rpc rpc *INNER* *JOIN* rr rr *ON* rr.uuid = rpc.rr_id *INNER* *JOIN* rs rs *ON* rs.r_id = rpc.r_id *INNER* *JOIN* *role* r *ON* r.uuid = rs.r_id *LEFT* *JOIN* spc spc *ON* spc.rr_id = rpc.rr_id *WHERE* rs.s_id = 'caa767b8-8371-43a3-aa11-d1dba1893601' *and* spc.s_id = 'caa767b8-8371-43a3-aa11-d1dba1893601' *and* spc.rd_id = '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' *AND* rpc.rd_id = '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' *AND* rpc.c_id = '9fd29fdc-15fd-40bb-b85d-8cfe99734987' *and* spc.c_id = '9fd29fdc-15fd-40bb-b85d-8cfe99734987' *AND* rr.b_id = 'xyz' *AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr. p_id = 'GLOBAL')) *AND* spc.permission_type *IS* *null* *and* spc. is_active = *true* *AND* rpc.is_active = *true* *AND* rr.is_active = *true* *AND* rs.is_active = *true* *AND* r.is_active = *true* I don't think it is super complex. But when I run explain analyze on this I get the following: Planning Time: 578.068 ms Execution Time: 0.113 ms This is a huge deviation in planning vs. execution times. The explain plan looks good since the execution time is < 1ms. It doesn't matter though since the planning time is high. I don't see anything in the explain analyze output that tells me why the planning time is high. On average, the tables being joined have 3 indexes/table. How can I debug this? Been stuck on this for weeks. Any help is appreciated. Thank you! Saurabh
Re: Slow Planning Times
To clarify - I have run "vaccum full" and "vacuum analyze" on every single table involved in the query and the planning times are still around the same and were not impacted. On Wed, Apr 6, 2022 at 5:26 PM Saurabh Sehgal wrote: > > I have the following query: > > *explain* (*analyze*, costs, timing) *SELECT* rr.* *FROM* rpc rpc > >*INNER* *JOIN* rr rr > >*ON* rr.uuid = rpc.rr_id > >*INNER* *JOIN* rs rs > >*ON* rs.r_id = rpc.r_id > >*INNER* *JOIN* *role* r > >*ON* r.uuid = rs.r_id > >*LEFT* *JOIN* spc spc > >*ON* spc.rr_id = rpc.rr_id > >*WHERE* rs.s_id = > 'caa767b8-8371-43a3-aa11-d1dba1893601' > >*and* spc.s_id = > 'caa767b8-8371-43a3-aa11-d1dba1893601' > >*and* spc.rd_id = > '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' > >*AND* rpc.rd_id = > '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' > >*AND* rpc.c_id = > '9fd29fdc-15fd-40bb-b85d-8cfe99734987' > >*and* spc.c_id = > '9fd29fdc-15fd-40bb-b85d-8cfe99734987' > >*AND* rr.b_id = 'xyz' > >*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr. > p_id = 'GLOBAL')) > >*AND* spc.permission_type *IS* *null* *and* spc. > is_active = *true* > >*AND* rpc.is_active = *true* *AND* rr.is_active = > *true* *AND* rs.is_active = *true* *AND* r.is_active = *true* > > > I don't think it is super complex. But when I run explain analyze on this > I get the following: > > Planning Time: 578.068 ms > Execution Time: 0.113 ms > > This is a huge deviation in planning vs. execution times. The explain plan > looks good since the execution time is < 1ms. It doesn't matter though > since the planning time is high. I don't see anything in the explain > analyze output that tells me why the planning time is high. On average, the > tables being joined have 3 indexes/table. How can I debug this? > > Been stuck on this for weeks. Any help is appreciated. Thank you! > > Saurabh > -- Saurabh Sehgal E-mail: saurabh@gmail.com Phone: 425-269-1324 LinkedIn: https://www.linkedin.com/in/saurabh-s-4367a31/
Re: Slow Planning Times
On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal wrote: > > I have the following query: > > *explain* (*analyze*, costs, timing) *SELECT* rr.* *FROM* rpc rpc > >*INNER* *JOIN* rr rr > >*ON* rr.uuid = rpc.rr_id > >*INNER* *JOIN* rs rs > >*ON* rs.r_id = rpc.r_id > >*INNER* *JOIN* *role* r > >*ON* r.uuid = rs.r_id > >*LEFT* *JOIN* spc spc > >*ON* spc.rr_id = rpc.rr_id > >*WHERE* rs.s_id = > 'caa767b8-8371-43a3-aa11-d1dba1893601' > >*and* spc.s_id = > 'caa767b8-8371-43a3-aa11-d1dba1893601' > >*and* spc.rd_id = > '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' > >*AND* rpc.rd_id = > '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' > >*AND* rpc.c_id = > '9fd29fdc-15fd-40bb-b85d-8cfe99734987' > >*and* spc.c_id = > '9fd29fdc-15fd-40bb-b85d-8cfe99734987' > >*AND* rr.b_id = 'xyz' > >*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr. > p_id = 'GLOBAL')) > >*AND* spc.permission_type *IS* *null* *and* spc. > is_active = *true* > >*AND* rpc.is_active = *true* *AND* rr.is_active = > *true* *AND* rs.is_active = *true* *AND* r.is_active = *true* > > > I don't think it is super complex. But when I run explain analyze on this > I get the following: > > Planning Time: 578.068 ms > Execution Time: 0.113 ms > > This is a huge deviation in planning vs. execution times. The explain plan > looks good since the execution time is < 1ms. It doesn't matter though > since the planning time is high. I don't see anything in the explain > analyze output that tells me why the planning time is high. On average, the > tables being joined have 3 indexes/table. How can I debug this? > > Been stuck on this for weeks. Any help is appreciated. Thank you! > > The fundamental issue here is that you have basically 12 conditions across 5 tables that need to be evaluated to determine which one of the 1,680 possible join orders is the most efficient. The fact that you have 5 is_active checks and 3 pairs of matching UUID checks seems odd and if you could reduce those 11 to 4 I suspect you'd get a better planning time. Though it also may produce an inferior plan...thus consider the following option: Assuming the ideal plan shape for your data doesn't change you can read the following and basically tell the planner to stop trying so hard and just trust the join order that exists in the query. https://www.postgresql.org/docs/current/explicit-joins.html Lastly, if you can leverage prepared statements you can at least amortize the cost (depending on whether a generic plan performs sufficiently quickly). I'll admit I'm no expert at this. I'd probably just follow the join_collapse_limit advice and move on if it works. Maybe adding a periodic check to see if anything has changed. David J.
Re: Slow Planning Times
I added the additional where clauses to remove needing to join multiple columns which I guess didn't really help. This is the original query: *SELECT* rr.* *FROM* rpc rpc *INNER* *JOIN rr* rr *ON* rr.uuid = rpc.rr_id *INNER* *JOIN* rs rs *ON* rs.r_d = rpc.r_id *INNER* *JOIN* *role* r *ON* r.uuid = rs.r_id *inner* *JOIN* subject_permission_control spc *ON* spc.rr_id = rpc.rr_id *AND* spc.s_id = rs.s_id *AND* spc.c_id = rpc.c_id *AND* spc.is_active = *true* *WHERE* rs.s_id = 'caa767b8-8371-43a3-aa11-d1dba1893601' *AND* rpc.rr_id = '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' *AND* rpc.c_id = '9fd29fdc-15fd-40bb-b85d-8cfe99734987' *AND* rr.b_id = 'testb1' *AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr. p_id = 'GLOBAL')) *AND* spc.type *IS* *NULL* *AND* rpc.is_active = *true* *AND* rr.is_active = *true* *AND* rs.is_active = *true* *AND* r.is_active = *true* I tied prepared statements and I am stuck. Using prepared statement almost always chooses a crappy generic plan that runs slow. If I don't user prepared statement, the plan is efficient but the planning time is slow. I'll try the join_collapse_limit advice and see if that helps. Thank you! On Wed, Apr 6, 2022 at 5:54 PM David G. Johnston wrote: > On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal > wrote: > >> >> I have the following query: >> >> *explain* (*analyze*, costs, timing) *SELECT* rr.* *FROM* rpc rpc >> >>*INNER* *JOIN* rr rr >> >>*ON* rr.uuid = rpc.rr_id >> >>*INNER* *JOIN* rs rs >> >>*ON* rs.r_id = rpc.r_id >> >>*INNER* *JOIN* *role* r >> >>*ON* r.uuid = rs.r_id >> >>*LEFT* *JOIN* spc spc >> >>*ON* spc.rr_id = rpc.rr_id >> >>*WHERE* rs.s_id = >> 'caa767b8-8371-43a3-aa11-d1dba1893601' >> >>*and* spc.s_id = >> 'caa767b8-8371-43a3-aa11-d1dba1893601' >> >>*and* spc.rd_id = >> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' >> >>*AND* rpc.rd_id = >> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa' >> >>*AND* rpc.c_id = >> '9fd29fdc-15fd-40bb-b85d-8cfe99734987' >> >>*and* spc.c_id = >> '9fd29fdc-15fd-40bb-b85d-8cfe99734987' >> >>*AND* rr.b_id = 'xyz' >> >>*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr >> .p_id = 'GLOBAL')) >> >>*AND* spc.permission_type *IS* *null* *and* spc. >> is_active = *true* >> >>*AND* rpc.is_active = *true* *AND* rr.is_active = >> *true* *AND* rs.is_active = *true* *AND* r.is_active = *true* >> >> >> I don't think it is super complex. But when I run explain analyze on this >> I get the following: >> >> Planning Time: 578.068 ms >> Execution Time: 0.113 ms >> >> This is a huge deviation in planning vs. execution times. The explain >> plan looks good since the execution time is < 1ms. It doesn't matter though >> since the planning time is high. I don't see anything in the explain >> analyze output that tells me why the planning time is high. On average, the >> tables being joined have 3 indexes/table. How can I debug this? >> >> Been stuck on this for weeks. Any help is appreciated. Thank you! >> >> > The fundamental issue here is that you have basically 12 conditions across > 5 tables that need to be evaluated to determine which one of the 1,680 > possible join orders is the most efficient. The fact that you have 5 > is_active checks and 3 pairs of matching UUID checks seems odd and if you > could reduce those 11 to 4 I suspect you'd get a better planning time. > Though it also may produce an inferior plan...thus consider the following > option: > > Assuming the ideal plan shape for your data doesn't change you can read > the following and basically tell the planner to stop trying so hard and > just trust the join order that exists in the query. > > https://www.postgresql.org/docs/current/explicit-joins.html > > Lastly, if you can leverage prepared statements you can at least amortize > the cost (depending on whether a generic plan performs sufficiently > quickly). > > I'll admit I'm no expert at this. I'd probably just follow the > join_collapse_limit advice and move on if it works. Maybe adding a > periodic check to see if anything has changed. > David J. > > -- Saurabh Sehgal E-mail: saurabh@gmail.com Phone: 425-269-1324
Re: Slow Planning Times
"David G. Johnston" writes: > On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal wrote: >> I have the following query: >> I don't think it is super complex. But when I run explain analyze on this >> I get the following: >> Planning Time: 578.068 ms >> Execution Time: 0.113 ms > The fundamental issue here is that you have basically 12 conditions across > 5 tables that need to be evaluated to determine which one of the 1,680 > possible join orders is the most efficient. A 5-way join doesn't seem particularly outrageous. But I'm wondering if these are all plain tables or if some of them are actually complex views. Another possibility is that the statistics target has been cranked to the moon and the planner is spending all its time sifting through huge statistics arrays. It'd be interesting to see the actual schemas for the tables, as well as EXPLAIN's output for this query. I'm wondering exactly which PG version this is, too. regards, tom lane
Re: Slow Planning Times
I just figured this out. Someone had set the default_statistics_target to 5000 instead of 500 I think. I changed it to 500, ran analyze and planning time is much better. In case someone runs into this problem, sending this out here. Thank you all. On Wed, Apr 6, 2022 at 7:57 PM Tom Lane wrote: > "David G. Johnston" writes: > > On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal > wrote: > >> I have the following query: > >> I don't think it is super complex. But when I run explain analyze on > this > >> I get the following: > >> Planning Time: 578.068 ms > >> Execution Time: 0.113 ms > > > The fundamental issue here is that you have basically 12 conditions > across > > 5 tables that need to be evaluated to determine which one of the 1,680 > > possible join orders is the most efficient. > > A 5-way join doesn't seem particularly outrageous. But I'm wondering > if these are all plain tables or if some of them are actually complex > views. Another possibility is that the statistics target has been > cranked to the moon and the planner is spending all its time sifting > through huge statistics arrays. > > It'd be interesting to see the actual schemas for the tables, > as well as EXPLAIN's output for this query. I'm wondering > exactly which PG version this is, too. > > regards, tom lane > -- Saurabh Sehgal E-mail: saurabh@gmail.com Phone: 425-269-1324 LinkedIn: https://www.linkedin.com/in/saurabh-s-4367a31/