Re: [postgis-users] Nested loop join = very bad performance

2010-03-31 Thread Mike Leahy
Hi,

Thanks again for the suggestions.  The default values for join_collapse_limit 
and from_collapse_limit were set to 8 - I upped these both to 128, with no 
observable difference.

I guess this issue is better suited for the general PostgreSQL mailing 
lists...

Regards,
Mike

On Wednesday 31 March 2010 15:51:14 Paragon Corporation wrote:
> Mike,
> 
> Couple of  thoughts.  Given you have so many joins, could be you are
> reaching the join collapse limit and the planner is kicking out before
> making an optimal plan.
> 
> Try increasing the join_collapse_limit and from_collapse_limit
> 
> As was detailed in this thread
> 
> http://archives.postgresql.org/pgsql-performance/2009-04/msg00258.php
> 
> 
> 
> Alternatively could be your actual and estimated costs are out of wack and
> might help upping your default targets and reanalyzing data. You can
> probably get a sense of this by doing a an explain analyze of your query
>  and comparing the actual cost/row count with the estimated cost/row count
>  where its doing a nested loop.
> 
>  Admittedly this hasn't helped much for us.
> 
> http://archives.postgresql.org/pgsql-performance/2009-02/msg00336.php
> 
> Leo and Regina,
> 
> http:///www.postgis.us
> 
> 
> 
> 
> -Original Message-
> From: postgis-users-boun...@postgis.refractions.net
> [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mike
> Leahy
> Sent: Wednesday, March 31, 2010 1:17 PM
> To: Mark Cave-Ayland
> Cc: PostGIS Users Discussion
> Subject: Re: [postgis-users] Nested loop join = very bad performance
> 
> Hi Mark,
> 
> I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my
> system has a total of 4gb).  This only slightly (if at all) improves the
> performance, maybe reducing the query by somewhere around 500 ms (down to
> ~14700 ms).  All other parameters in the postgresql.conf are defaults.  I
> don't recall exactly what I changed before (I was just tried increasing
> memory limits and other things pretty much without knowing what I was
> doing), but none of that really seems to have a significant impact on the
> performance.
> 
> The challenge with trying to reduce this query is that the nested loop join
> only happens with the query as a whole (in general).  The briefest example
>  I could put together was presented in the thread last week (see the
>  attachment here:
>  http://postgis.org/pipermail/postgis-users/2010-March/026239.html). If I
>  pull any more parameters or parts out of the query, the nest loop (and the
>  resulting errors/crashes I was encountering at the time) would not happen.
> 
> Regards,
> Mike
> 
> On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote:
> > Mike Leahy wrote:
> > > Mark/List,
> > >
> > > I just replaced my postgresql.conf with the default copy that
> > > appears in /etc/postgresql/8.4/main/ after a fresh install.  The
> > > performance is pretty much the same as before (maybe even about 400
> > > ms worse than before).
> > >
> > > Is there anything else I should try?
> > >
> > > Mike
> >
> > Hi Mike,
> >
> > Which parameters did you change? effective_cache_size and
> > shared_buffers should be tweaked to suit the RAM available in your
> > machine but the rest of the defaults are fairly sensible.
> >
> > You probably want to set effective_cache_size to ~75% of your physical
> > RAM and shared_buffers to ~25%. Does that make any difference at all?
> >
> > Otherwise, you'll need to start breaking down your query into parts to
> > see which bit is causing the slowdown. Start with the innermost query
> > and then add one join at a time until you find the part which is
> > causing the slowdown.
> >
> >
> > ATB,
> >
> > Mark.
> 
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-31 Thread Paragon Corporation
Mike,

Couple of  thoughts.  Given you have so many joins, could be you are
reaching the join collapse limit and the planner is kicking out before
making an optimal plan.

Try increasing the join_collapse_limit and from_collapse_limit

As was detailed in this thread

http://archives.postgresql.org/pgsql-performance/2009-04/msg00258.php



Alternatively could be your actual and estimated costs are out of wack and
might help upping your default targets and reanalyzing data. You can
probably get a sense of this by doing a an explain analyze of your query and
comparing the actual cost/row count with the estimated cost/row count where
its doing a nested loop.

 Admittedly this hasn't helped much for us.

http://archives.postgresql.org/pgsql-performance/2009-02/msg00336.php

Leo and Regina,

http:///www.postgis.us


 

-Original Message-
From: postgis-users-boun...@postgis.refractions.net
[mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mike
Leahy
Sent: Wednesday, March 31, 2010 1:17 PM
To: Mark Cave-Ayland
Cc: PostGIS Users Discussion
Subject: Re: [postgis-users] Nested loop join = very bad performance

Hi Mark,

I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my
system has a total of 4gb).  This only slightly (if at all) improves the
performance, maybe reducing the query by somewhere around 500 ms (down to
~14700 ms).  All other parameters in the postgresql.conf are defaults.  I
don't recall exactly what I changed before (I was just tried increasing
memory limits and other things pretty much without knowing what I was
doing), but none of that really seems to have a significant impact on the
performance.

The challenge with trying to reduce this query is that the nested loop join
only happens with the query as a whole (in general).  The briefest example I
could put together was presented in the thread last week (see the attachment
here: http://postgis.org/pipermail/postgis-users/2010-March/026239.html).
If I pull any more parameters or parts out of the query, the nest loop (and
the resulting errors/crashes I was encountering at the time) would not
happen.

Regards,
Mike


On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote:
> Mike Leahy wrote:
> > Mark/List,
> >
> > I just replaced my postgresql.conf with the default copy that 
> > appears in /etc/postgresql/8.4/main/ after a fresh install.  The 
> > performance is pretty much the same as before (maybe even about 400 
> > ms worse than before).
> >
> > Is there anything else I should try?
> >
> > Mike
> 
> Hi Mike,
> 
> Which parameters did you change? effective_cache_size and 
> shared_buffers should be tweaked to suit the RAM available in your 
> machine but the rest of the defaults are fairly sensible.
> 
> You probably want to set effective_cache_size to ~75% of your physical 
> RAM and shared_buffers to ~25%. Does that make any difference at all?
> 
> Otherwise, you'll need to start breaking down your query into parts to 
> see which bit is causing the slowdown. Start with the innermost query 
> and then add one join at a time until you find the part which is 
> causing the slowdown.
> 
> 
> ATB,
> 
> Mark.
> 
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-31 Thread Mike Leahy
Hi Mark,

I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my 
system 
has a total of 4gb).  This only slightly (if at all) improves the performance, 
maybe reducing the query by somewhere around 500 ms (down to ~14700 ms).  All 
other parameters in the postgresql.conf are defaults.  I don't recall exactly 
what I changed before (I was just tried increasing memory limits and other 
things pretty much without knowing what I was doing), but none of that really 
seems to have a significant impact on the performance.

The challenge with trying to reduce this query is that the nested loop join 
only happens with the query as a whole (in general).  The briefest example I 
could put together was presented in the thread last week (see the attachment 
here: http://postgis.org/pipermail/postgis-users/2010-March/026239.html).  If 
I pull any more parameters or parts out of the query, the nest loop (and the 
resulting errors/crashes I was encountering at the time) would not happen.

Regards,
Mike


On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote:
> Mike Leahy wrote:
> > Mark/List,
> >
> > I just replaced my postgresql.conf with the default copy that appears in
> > /etc/postgresql/8.4/main/ after a fresh install.  The performance is
> > pretty much the same as before (maybe even about 400 ms worse than
> > before).
> >
> > Is there anything else I should try?
> >
> > Mike
> 
> Hi Mike,
> 
> Which parameters did you change? effective_cache_size and shared_buffers
> should be tweaked to suit the RAM available in your machine but the rest
> of the defaults are fairly sensible.
> 
> You probably want to set effective_cache_size to ~75% of your physical
> RAM and shared_buffers to ~25%. Does that make any difference at all?
> 
> Otherwise, you'll need to start breaking down your query into parts to
> see which bit is causing the slowdown. Start with the innermost query
> and then add one join at a time until you find the part which is causing
> the slowdown.
> 
> 
> ATB,
> 
> Mark.
> 
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-31 Thread Mark Cave-Ayland

Mike Leahy wrote:


Mark/List,

I just replaced my postgresql.conf with the default copy that appears in 
/etc/postgresql/8.4/main/ after a fresh install.  The performance is pretty 
much the same as before (maybe even about 400 ms worse than before).


Is there anything else I should try?

Mike


Hi Mike,

Which parameters did you change? effective_cache_size and shared_buffers 
should be tweaked to suit the RAM available in your machine but the rest 
of the defaults are fairly sensible.


You probably want to set effective_cache_size to ~75% of your physical 
RAM and shared_buffers to ~25%. Does that make any difference at all?


Otherwise, you'll need to start breaking down your query into parts to 
see which bit is causing the slowdown. Start with the innermost query 
and then add one join at a time until you find the part which is causing 
the slowdown.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-29 Thread Mike Leahy
Mark/List,

I just replaced my postgresql.conf with the default copy that appears in 
/etc/postgresql/8.4/main/ after a fresh install.  The performance is pretty 
much the same as before (maybe even about 400 ms worse than before).

Is there anything else I should try?

Mike

On Monday 29 March 2010 17:35:29 Mike Leahy wrote:
> Hi Mark,
> 
> I don't recall making any changes...at least not until the error I
>  encountered last week cropped up - but I think I reversed those changes (I
>  will double- check tonight).  But I think I have encountered performance
>  problems with this query before on other systems - I had originally
>  dismissed it as hardware limitations I was dealing in my testing system at
>  the time.  After recent hardware upgrades and after dealing with that
>  specific bug, that's when the slow performance of this query really stood
>  out.
> 
> Mike
> 
> On Monday 29 March 2010 07:53:50 Mark Cave-Ayland wrote:
> > Mike Leahy wrote:
> > > Is this something that should be looked into, or should I just
> > > incorporate the workaround of disabling nested loop joins into my code?
> >
> > I find that generally PostgreSQL is fairly good at its estimates unless
> > someone has already tried to tune the database. Have you changed any of
> > the settings in postgresql.conf from their defaults?
> >
> >
> > ATB,
> >
> > Mark.
> 
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-29 Thread Mike Leahy
Hi Mark,

I don't recall making any changes...at least not until the error I encountered 
last week cropped up - but I think I reversed those changes (I will double-
check tonight).  But I think I have encountered performance problems with this 
query before on other systems - I had originally dismissed it as hardware 
limitations I was dealing in my testing system at the time.  After recent 
hardware upgrades and after dealing with that specific bug, that's when the 
slow performance of this query really stood out.

Mike

On Monday 29 March 2010 07:53:50 Mark Cave-Ayland wrote:
> Mike Leahy wrote:
> > Is this something that should be looked into, or should I just
> > incorporate the workaround of disabling nested loop joins into my code?
> 
> I find that generally PostgreSQL is fairly good at its estimates unless
> someone has already tried to tune the database. Have you changed any of
> the settings in postgresql.conf from their defaults?
> 
> 
> ATB,
> 
> Mark.
> 
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-29 Thread Mark Cave-Ayland

Mike Leahy wrote:

Is this something that should be looked into, or should I just incorporate the 
workaround of disabling nested loop joins into my code?


I find that generally PostgreSQL is fairly good at its estimates unless 
someone has already tried to tune the database. Have you changed any of 
the settings in postgresql.conf from their defaults?



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Nested loop join = very bad performance

2010-03-28 Thread Mike Leahy
Sorry all...here's the explain output I promised in my previous message...

On Sunday 28 March 2010 11:57:01 Mike Leahy wrote:
> Hello list,
> 
> Following up from the bug that was successfully resolved last week
> (http://trac.osgeo.org/postgis/ticket/469), I've found that nested loop
>  joins are actually causing very bad performance with my queries.
> 
> Thanks to the observations of Leo and Regina in that discussion, I have
>  found that I can work around this by setting enable_nestloop = off.  Once
>  that is specified, queries start working about as fast.  With
>  enable_nestloop = on, a query may take ~15000 ms, while enable_nestloop =
>  off returns the same query in ~90 ms.
> 
> I have attached a textfile showing the different 'explain analzye' results
>  for the full query with/without nestloop enabled.  You could also refer to
>  the somewhat simplified queries I put together in discussion thread
>  related to bug #469 (see
>  http://postgis.refractions.net/pipermail/postgis-users/2010-
>  March/026235.html for some examples).
> 
> Is this something that should be looked into, or should I just incorporate
>  the workaround of disabling nested loop joins into my code?
> 
> Thanks,
> Mike
> 
enable_nestloop = on:



 QUERY PLAN  

 Sort  (cost=260.10..260.10 rows=1 width=385) (actual time=15084.727..15084.730 
rows=25 loops=1)
   Sort Key: cl.id
   Sort Method:  quicksort  Memory: 32kB
   ->  Limit  (cost=260.07..260.08 rows=1 width=360) (actual 
time=15084.700..15084.708 rows=25 loops=1)
 ->  Sort  (cost=260.07..260.08 rows=1 width=360) (actual 
time=15084.699..15084.704 rows=25 loops=1)
   Sort Key: cl.id
   Sort Method:  top-N heapsort  Memory: 32kB
   ->  Nested Loop Left Join  (cost=212.58..260.06 rows=1 
width=360) (actual time=68.629..15083.483 rows=204 loops=1)
 Join Filter: (csl1.chat_id = cl.id)
 ->  Nested Loop Left Join  (cost=159.35..202.70 rows=1 
width=328) (actual time=68.070..10418.323 rows=204 loops=1)
   Join Filter: (csl2.chat_id = cl.id)
   ->  Nested Loop Left Join  (cost=99.82..134.88 
rows=1 width=296) (actual time=7.868..18.477 rows=204 loops=1)
 Join Filter: (cl.id = chat_log.parent_id)
 ->  Hash Left Join  (cost=87.20..120.09 rows=1 
width=264) (actual time=0.040..1.787 rows=204 loops=1)
   Hash Cond: (cl.id = ctu.id)
   Filter: (((ctu.num_target_users IS NULL) 
AND (ctg.num_target_groups IS NULL)) OR ((ctu.num_target_users + 
ctg.num_target_groups) = 0) OR cl.username)::text = 'admin'::text) OR 
('admin'::text = ANY (ctu.target_users_array)) OR ('{admin,everyone}'::text[] 
&& ctg.target_groups_array)) AND (cl."time" > '2010-03-28 
10:54:44.382616-04'::timestamp with time zone)))
   ->  Hash Left Join  (cost=43.60..63.86 
rows=205 width=256) (actual time=0.030..1.302 rows=204 loops=1)
 Hash Cond: (cl.id = ctg.id)
 ->  Seq Scan on chat_log cl  
(cost=0.00..12.62 rows=205 width=184) (actual time=0.015..0.833 rows=204 
loops=1)
   Filter: ((NOT deleted) AND 
(id < 5090) AND (depth = 0))
 ->  Hash  (cost=41.10..41.10 
rows=200 width=76) (actual time=0.004..0.004 rows=0 loops=1)
   ->  Subquery Scan ctg  
(cost=34.60..41.10 rows=200 width=76) (actual time=0.004..0.004 rows=0 loops=1)
 ->  HashAggregate  
(cost=34.60..39.10 rows=200 width=36) (actual time=0.004..0.004 rows=0 loops=1)
   ->  Seq Scan on 
chat_target_groups  (cost=0.00..22.30 rows=1230 width=36) (actual 
time=0.001..0.001 rows=0 loops=1)
   ->  Hash  (cost=41.10..41.10 rows=200 
width=76) (actual time=0.003..0.003 rows=0 loops=1)
 ->  Subquery Scan ctu  
(cost=34.60..41.10 rows=200 width=76) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Has

[postgis-users] Nested loop join = very bad performance

2010-03-28 Thread Mike Leahy
Hello list,

Following up from the bug that was successfully resolved last week 
(http://trac.osgeo.org/postgis/ticket/469), I've found that nested loop joins 
are actually causing very bad performance with my queries.

Thanks to the observations of Leo and Regina in that discussion, I have found 
that I can work around this by setting enable_nestloop = off.  Once that is 
specified, queries start working about as fast.  With enable_nestloop = on, a 
query may take ~15000 ms, while enable_nestloop = off returns the same query in 
~90 ms.

I have attached a textfile showing the different 'explain analzye' results for 
the full query with/without nestloop enabled.  You could also refer to the 
somewhat simplified queries I put together in discussion thread related to bug 
#469 (see http://postgis.refractions.net/pipermail/postgis-users/2010-
March/026235.html for some examples).

Is this something that should be looked into, or should I just incorporate the 
workaround of disabling nested loop joins into my code?

Thanks,
Mike




___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users