Slow Planning Times

2022-04-06 Thread Saurabh Sehgal
I have the following query:

 *explain* (*analyze*, costs, timing) *SELECT*  rr.* *FROM* rpc rpc

   *INNER* *JOIN* rr rr

   *ON* rr.uuid = rpc.rr_id

   *INNER* *JOIN* rs rs

   *ON* rs.r_id = rpc.r_id

   *INNER* *JOIN* *role* r

   *ON* r.uuid = rs.r_id

   *LEFT* *JOIN* spc spc

   *ON* spc.rr_id = rpc.rr_id

   *WHERE* rs.s_id = 'caa767b8-8371-43a3-aa11-d1dba1893601'

   *and* spc.s_id  =
'caa767b8-8371-43a3-aa11-d1dba1893601'

   *and* spc.rd_id  =
'9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'

   *AND* rpc.rd_id =
'9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'

   *AND* rpc.c_id =
'9fd29fdc-15fd-40bb-b85d-8cfe99734987'

   *and* spc.c_id  =
'9fd29fdc-15fd-40bb-b85d-8cfe99734987'

   *AND* rr.b_id = 'xyz'

   *AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr.
p_id = 'GLOBAL'))

   *AND* spc.permission_type *IS* *null* *and* spc.
is_active  = *true*

   *AND* rpc.is_active = *true* *AND* rr.is_active =
*true* *AND* rs.is_active = *true* *AND* r.is_active = *true*


I don't think it is super complex. But when I run explain analyze on this I
get the following:

Planning Time: 578.068 ms
Execution Time: 0.113 ms

This is a huge deviation in planning vs. execution times. The explain plan
looks good since the execution time is < 1ms. It doesn't matter though
since the planning time is high. I don't see anything in the explain
analyze output that tells me why the planning time is high. On average, the
tables being joined have 3 indexes/table. How can I debug this?

Been stuck on this for weeks. Any help is appreciated. Thank you!

Saurabh


Re: Slow Planning Times

2022-04-06 Thread Saurabh Sehgal
To clarify -  I have run "vaccum full" and "vacuum analyze" on every single
table involved in the query and the planning times are still around the
same and were not impacted.

On Wed, Apr 6, 2022 at 5:26 PM Saurabh Sehgal  wrote:

>
> I have the following query:
>
>  *explain* (*analyze*, costs, timing) *SELECT*  rr.* *FROM* rpc rpc
>
>*INNER* *JOIN* rr rr
>
>*ON* rr.uuid = rpc.rr_id
>
>*INNER* *JOIN* rs rs
>
>*ON* rs.r_id = rpc.r_id
>
>*INNER* *JOIN* *role* r
>
>*ON* r.uuid = rs.r_id
>
>*LEFT* *JOIN* spc spc
>
>*ON* spc.rr_id = rpc.rr_id
>
>*WHERE* rs.s_id =
> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>
>*and* spc.s_id  =
> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>
>*and* spc.rd_id  =
> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>
>*AND* rpc.rd_id =
> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>
>*AND* rpc.c_id =
> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>
>*and* spc.c_id  =
> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>
>*AND* rr.b_id = 'xyz'
>
>*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr.
> p_id = 'GLOBAL'))
>
>*AND* spc.permission_type *IS* *null* *and* spc.
> is_active  = *true*
>
>*AND* rpc.is_active = *true* *AND* rr.is_active =
> *true* *AND* rs.is_active = *true* *AND* r.is_active = *true*
>
>
> I don't think it is super complex. But when I run explain analyze on this
> I get the following:
>
> Planning Time: 578.068 ms
> Execution Time: 0.113 ms
>
> This is a huge deviation in planning vs. execution times. The explain plan
> looks good since the execution time is < 1ms. It doesn't matter though
> since the planning time is high. I don't see anything in the explain
> analyze output that tells me why the planning time is high. On average, the
> tables being joined have 3 indexes/table. How can I debug this?
>
> Been stuck on this for weeks. Any help is appreciated. Thank you!
>
> Saurabh
>


-- 
Saurabh Sehgal
E-mail: saurabh@gmail.com
Phone: 425-269-1324
LinkedIn: https://www.linkedin.com/in/saurabh-s-4367a31/


Re: Slow Planning Times

2022-04-06 Thread David G. Johnston
On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal  wrote:

>
> I have the following query:
>
>  *explain* (*analyze*, costs, timing) *SELECT*  rr.* *FROM* rpc rpc
>
>*INNER* *JOIN* rr rr
>
>*ON* rr.uuid = rpc.rr_id
>
>*INNER* *JOIN* rs rs
>
>*ON* rs.r_id = rpc.r_id
>
>*INNER* *JOIN* *role* r
>
>*ON* r.uuid = rs.r_id
>
>*LEFT* *JOIN* spc spc
>
>*ON* spc.rr_id = rpc.rr_id
>
>*WHERE* rs.s_id =
> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>
>*and* spc.s_id  =
> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>
>*and* spc.rd_id  =
> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>
>*AND* rpc.rd_id =
> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>
>*AND* rpc.c_id =
> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>
>*and* spc.c_id  =
> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>
>*AND* rr.b_id = 'xyz'
>
>*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr.
> p_id = 'GLOBAL'))
>
>*AND* spc.permission_type *IS* *null* *and* spc.
> is_active  = *true*
>
>*AND* rpc.is_active = *true* *AND* rr.is_active =
> *true* *AND* rs.is_active = *true* *AND* r.is_active = *true*
>
>
> I don't think it is super complex. But when I run explain analyze on this
> I get the following:
>
> Planning Time: 578.068 ms
> Execution Time: 0.113 ms
>
> This is a huge deviation in planning vs. execution times. The explain plan
> looks good since the execution time is < 1ms. It doesn't matter though
> since the planning time is high. I don't see anything in the explain
> analyze output that tells me why the planning time is high. On average, the
> tables being joined have 3 indexes/table. How can I debug this?
>
> Been stuck on this for weeks. Any help is appreciated. Thank you!
>
>
The fundamental issue here is that you have basically 12 conditions across
5 tables that need to be evaluated to determine which one of the 1,680
possible join orders is the most efficient.  The fact that you have 5
is_active checks and 3 pairs of matching UUID checks seems odd and if you
could reduce those 11 to 4 I suspect you'd get a better planning time.
Though it also may produce an inferior plan...thus consider the following
option:

Assuming the ideal plan shape for your data doesn't change you can read the
following and basically tell the planner to stop trying so hard and just
trust the join order that exists in the query.

https://www.postgresql.org/docs/current/explicit-joins.html

Lastly, if you can leverage prepared statements you can at least amortize
the cost (depending on whether a generic plan performs sufficiently
quickly).

I'll admit I'm no expert at this.  I'd probably just follow the
join_collapse_limit advice and move on if it works.  Maybe adding a
periodic check to see if anything has changed.
David J.


Re: Slow Planning Times

2022-04-06 Thread Saurabh Sehgal
I added the additional where clauses to remove needing to join multiple
columns which I guess didn't really help. This is the original query:

  *SELECT*  rr.* *FROM* rpc rpc

   *INNER* *JOIN rr* rr

   *ON* rr.uuid = rpc.rr_id

   *INNER* *JOIN* rs rs

   *ON* rs.r_d = rpc.r_id

   *INNER* *JOIN* *role* r

   *ON* r.uuid = rs.r_id

   *inner* *JOIN* subject_permission_control spc

   *ON* spc.rr_id = rpc.rr_id

   *AND* spc.s_id = rs.s_id

   *AND* spc.c_id = rpc.c_id

   *AND* spc.is_active = *true*

   *WHERE* rs.s_id = 'caa767b8-8371-43a3-aa11-d1dba1893601'

   *AND* rpc.rr_id =
'9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'

   *AND* rpc.c_id =
'9fd29fdc-15fd-40bb-b85d-8cfe99734987'

   *AND* rr.b_id = 'testb1'

   *AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr.
p_id = 'GLOBAL'))

   *AND* spc.type *IS* *NULL*

   *AND* rpc.is_active = *true* *AND* rr.is_active =
*true* *AND* rs.is_active = *true* *AND* r.is_active = *true*




I tied prepared statements and I am stuck. Using prepared statement almost
always chooses a crappy generic plan that runs slow. If I don't user
prepared statement, the plan is efficient but the planning time is slow.
I'll try the join_collapse_limit advice and see if that helps. Thank you!

On Wed, Apr 6, 2022 at 5:54 PM David G. Johnston 
wrote:

> On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal 
> wrote:
>
>>
>> I have the following query:
>>
>>  *explain* (*analyze*, costs, timing) *SELECT*  rr.* *FROM* rpc rpc
>>
>>*INNER* *JOIN* rr rr
>>
>>*ON* rr.uuid = rpc.rr_id
>>
>>*INNER* *JOIN* rs rs
>>
>>*ON* rs.r_id = rpc.r_id
>>
>>*INNER* *JOIN* *role* r
>>
>>*ON* r.uuid = rs.r_id
>>
>>*LEFT* *JOIN* spc spc
>>
>>*ON* spc.rr_id = rpc.rr_id
>>
>>*WHERE* rs.s_id =
>> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>>
>>*and* spc.s_id  =
>> 'caa767b8-8371-43a3-aa11-d1dba1893601'
>>
>>*and* spc.rd_id  =
>> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>>
>>*AND* rpc.rd_id =
>> '9f33c45a-90c2-4e05-a42e-048ec1f2b2fa'
>>
>>*AND* rpc.c_id =
>> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>>
>>*and* spc.c_id  =
>> '9fd29fdc-15fd-40bb-b85d-8cfe99734987'
>>
>>*AND* rr.b_id = 'xyz'
>>
>>*AND* (('GLOBAL' = ' NO_PROJECT_ID + "' ) *OR* (rr
>> .p_id = 'GLOBAL'))
>>
>>*AND* spc.permission_type *IS* *null* *and* spc.
>> is_active  = *true*
>>
>>*AND* rpc.is_active = *true* *AND* rr.is_active =
>> *true* *AND* rs.is_active = *true* *AND* r.is_active = *true*
>>
>>
>> I don't think it is super complex. But when I run explain analyze on this
>> I get the following:
>>
>> Planning Time: 578.068 ms
>> Execution Time: 0.113 ms
>>
>> This is a huge deviation in planning vs. execution times. The explain
>> plan looks good since the execution time is < 1ms. It doesn't matter though
>> since the planning time is high. I don't see anything in the explain
>> analyze output that tells me why the planning time is high. On average, the
>> tables being joined have 3 indexes/table. How can I debug this?
>>
>> Been stuck on this for weeks. Any help is appreciated. Thank you!
>>
>>
> The fundamental issue here is that you have basically 12 conditions across
> 5 tables that need to be evaluated to determine which one of the 1,680
> possible join orders is the most efficient.  The fact that you have 5
> is_active checks and 3 pairs of matching UUID checks seems odd and if you
> could reduce those 11 to 4 I suspect you'd get a better planning time.
> Though it also may produce an inferior plan...thus consider the following
> option:
>
> Assuming the ideal plan shape for your data doesn't change you can read
> the following and basically tell the planner to stop trying so hard and
> just trust the join order that exists in the query.
>
> https://www.postgresql.org/docs/current/explicit-joins.html
>
> Lastly, if you can leverage prepared statements you can at least amortize
> the cost (depending on whether a generic plan performs sufficiently
> quickly).
>
> I'll admit I'm no expert at this.  I'd probably just follow the
> join_collapse_limit advice and move on if it works.  Maybe adding a
> periodic check to see if anything has changed.
> David J.
>
>

-- 
Saurabh Sehgal
E-mail: saurabh@gmail.com
Phone: 425-269-1324

Re: Slow Planning Times

2022-04-06 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal  wrote:
>> I have the following query:
>> I don't think it is super complex. But when I run explain analyze on this
>> I get the following:
>> Planning Time: 578.068 ms
>> Execution Time: 0.113 ms

> The fundamental issue here is that you have basically 12 conditions across
> 5 tables that need to be evaluated to determine which one of the 1,680
> possible join orders is the most efficient.

A 5-way join doesn't seem particularly outrageous.  But I'm wondering
if these are all plain tables or if some of them are actually complex
views.  Another possibility is that the statistics target has been
cranked to the moon and the planner is spending all its time sifting
through huge statistics arrays.

It'd be interesting to see the actual schemas for the tables,
as well as EXPLAIN's output for this query.  I'm wondering
exactly which PG version this is, too.

regards, tom lane




Re: Slow Planning Times

2022-04-06 Thread Saurabh Sehgal
I just figured this out. Someone had set the default_statistics_target to
5000  instead of 500 I think. I changed it to 500, ran analyze and
planning time is much better. In case someone runs into this problem,
sending this out here. Thank you all.

On Wed, Apr 6, 2022 at 7:57 PM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Wed, Apr 6, 2022 at 5:27 PM Saurabh Sehgal 
> wrote:
> >> I have the following query:
> >> I don't think it is super complex. But when I run explain analyze on
> this
> >> I get the following:
> >> Planning Time: 578.068 ms
> >> Execution Time: 0.113 ms
>
> > The fundamental issue here is that you have basically 12 conditions
> across
> > 5 tables that need to be evaluated to determine which one of the 1,680
> > possible join orders is the most efficient.
>
> A 5-way join doesn't seem particularly outrageous.  But I'm wondering
> if these are all plain tables or if some of them are actually complex
> views.  Another possibility is that the statistics target has been
> cranked to the moon and the planner is spending all its time sifting
> through huge statistics arrays.
>
> It'd be interesting to see the actual schemas for the tables,
> as well as EXPLAIN's output for this query.  I'm wondering
> exactly which PG version this is, too.
>
> regards, tom lane
>


-- 
Saurabh Sehgal
E-mail: saurabh@gmail.com
Phone: 425-269-1324
LinkedIn: https://www.linkedin.com/in/saurabh-s-4367a31/