Re: [postgis-users] Nested loop join = very bad performance
Hi, Thanks again for the suggestions. The default values for join_collapse_limit and from_collapse_limit were set to 8 - I upped these both to 128, with no observable difference. I guess this issue is better suited for the general PostgreSQL mailing lists... Regards, Mike On Wednesday 31 March 2010 15:51:14 Paragon Corporation wrote: > Mike, > > Couple of thoughts. Given you have so many joins, could be you are > reaching the join collapse limit and the planner is kicking out before > making an optimal plan. > > Try increasing the join_collapse_limit and from_collapse_limit > > As was detailed in this thread > > http://archives.postgresql.org/pgsql-performance/2009-04/msg00258.php > > > > Alternatively could be your actual and estimated costs are out of wack and > might help upping your default targets and reanalyzing data. You can > probably get a sense of this by doing a an explain analyze of your query > and comparing the actual cost/row count with the estimated cost/row count > where its doing a nested loop. > > Admittedly this hasn't helped much for us. > > http://archives.postgresql.org/pgsql-performance/2009-02/msg00336.php > > Leo and Regina, > > http:///www.postgis.us > > > > > -Original Message- > From: postgis-users-boun...@postgis.refractions.net > [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mike > Leahy > Sent: Wednesday, March 31, 2010 1:17 PM > To: Mark Cave-Ayland > Cc: PostGIS Users Discussion > Subject: Re: [postgis-users] Nested loop join = very bad performance > > Hi Mark, > > I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my > system has a total of 4gb). This only slightly (if at all) improves the > performance, maybe reducing the query by somewhere around 500 ms (down to > ~14700 ms). All other parameters in the postgresql.conf are defaults. I > don't recall exactly what I changed before (I was just tried increasing > memory limits and other things pretty much without knowing what I was > doing), but none of that really seems to have a significant impact on the > performance. > > The challenge with trying to reduce this query is that the nested loop join > only happens with the query as a whole (in general). The briefest example > I could put together was presented in the thread last week (see the > attachment here: > http://postgis.org/pipermail/postgis-users/2010-March/026239.html). If I > pull any more parameters or parts out of the query, the nest loop (and the > resulting errors/crashes I was encountering at the time) would not happen. > > Regards, > Mike > > On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote: > > Mike Leahy wrote: > > > Mark/List, > > > > > > I just replaced my postgresql.conf with the default copy that > > > appears in /etc/postgresql/8.4/main/ after a fresh install. The > > > performance is pretty much the same as before (maybe even about 400 > > > ms worse than before). > > > > > > Is there anything else I should try? > > > > > > Mike > > > > Hi Mike, > > > > Which parameters did you change? effective_cache_size and > > shared_buffers should be tweaked to suit the RAM available in your > > machine but the rest of the defaults are fairly sensible. > > > > You probably want to set effective_cache_size to ~75% of your physical > > RAM and shared_buffers to ~25%. Does that make any difference at all? > > > > Otherwise, you'll need to start breaking down your query into parts to > > see which bit is causing the slowdown. Start with the innermost query > > and then add one join at a time until you find the part which is > > causing the slowdown. > > > > > > ATB, > > > > Mark. > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Mike, Couple of thoughts. Given you have so many joins, could be you are reaching the join collapse limit and the planner is kicking out before making an optimal plan. Try increasing the join_collapse_limit and from_collapse_limit As was detailed in this thread http://archives.postgresql.org/pgsql-performance/2009-04/msg00258.php Alternatively could be your actual and estimated costs are out of wack and might help upping your default targets and reanalyzing data. You can probably get a sense of this by doing a an explain analyze of your query and comparing the actual cost/row count with the estimated cost/row count where its doing a nested loop. Admittedly this hasn't helped much for us. http://archives.postgresql.org/pgsql-performance/2009-02/msg00336.php Leo and Regina, http:///www.postgis.us -Original Message- From: postgis-users-boun...@postgis.refractions.net [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mike Leahy Sent: Wednesday, March 31, 2010 1:17 PM To: Mark Cave-Ayland Cc: PostGIS Users Discussion Subject: Re: [postgis-users] Nested loop join = very bad performance Hi Mark, I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my system has a total of 4gb). This only slightly (if at all) improves the performance, maybe reducing the query by somewhere around 500 ms (down to ~14700 ms). All other parameters in the postgresql.conf are defaults. I don't recall exactly what I changed before (I was just tried increasing memory limits and other things pretty much without knowing what I was doing), but none of that really seems to have a significant impact on the performance. The challenge with trying to reduce this query is that the nested loop join only happens with the query as a whole (in general). The briefest example I could put together was presented in the thread last week (see the attachment here: http://postgis.org/pipermail/postgis-users/2010-March/026239.html). If I pull any more parameters or parts out of the query, the nest loop (and the resulting errors/crashes I was encountering at the time) would not happen. Regards, Mike On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote: > Mike Leahy wrote: > > Mark/List, > > > > I just replaced my postgresql.conf with the default copy that > > appears in /etc/postgresql/8.4/main/ after a fresh install. The > > performance is pretty much the same as before (maybe even about 400 > > ms worse than before). > > > > Is there anything else I should try? > > > > Mike > > Hi Mike, > > Which parameters did you change? effective_cache_size and > shared_buffers should be tweaked to suit the RAM available in your > machine but the rest of the defaults are fairly sensible. > > You probably want to set effective_cache_size to ~75% of your physical > RAM and shared_buffers to ~25%. Does that make any difference at all? > > Otherwise, you'll need to start breaking down your query into parts to > see which bit is causing the slowdown. Start with the innermost query > and then add one join at a time until you find the part which is > causing the slowdown. > > > ATB, > > Mark. > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Hi Mark, I set effective_cache_size to 3072mb, and shared_buffers to 1024mb (as my system has a total of 4gb). This only slightly (if at all) improves the performance, maybe reducing the query by somewhere around 500 ms (down to ~14700 ms). All other parameters in the postgresql.conf are defaults. I don't recall exactly what I changed before (I was just tried increasing memory limits and other things pretty much without knowing what I was doing), but none of that really seems to have a significant impact on the performance. The challenge with trying to reduce this query is that the nested loop join only happens with the query as a whole (in general). The briefest example I could put together was presented in the thread last week (see the attachment here: http://postgis.org/pipermail/postgis-users/2010-March/026239.html). If I pull any more parameters or parts out of the query, the nest loop (and the resulting errors/crashes I was encountering at the time) would not happen. Regards, Mike On Wednesday 31 March 2010 04:40:09 Mark Cave-Ayland wrote: > Mike Leahy wrote: > > Mark/List, > > > > I just replaced my postgresql.conf with the default copy that appears in > > /etc/postgresql/8.4/main/ after a fresh install. The performance is > > pretty much the same as before (maybe even about 400 ms worse than > > before). > > > > Is there anything else I should try? > > > > Mike > > Hi Mike, > > Which parameters did you change? effective_cache_size and shared_buffers > should be tweaked to suit the RAM available in your machine but the rest > of the defaults are fairly sensible. > > You probably want to set effective_cache_size to ~75% of your physical > RAM and shared_buffers to ~25%. Does that make any difference at all? > > Otherwise, you'll need to start breaking down your query into parts to > see which bit is causing the slowdown. Start with the innermost query > and then add one join at a time until you find the part which is causing > the slowdown. > > > ATB, > > Mark. > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Mike Leahy wrote: Mark/List, I just replaced my postgresql.conf with the default copy that appears in /etc/postgresql/8.4/main/ after a fresh install. The performance is pretty much the same as before (maybe even about 400 ms worse than before). Is there anything else I should try? Mike Hi Mike, Which parameters did you change? effective_cache_size and shared_buffers should be tweaked to suit the RAM available in your machine but the rest of the defaults are fairly sensible. You probably want to set effective_cache_size to ~75% of your physical RAM and shared_buffers to ~25%. Does that make any difference at all? Otherwise, you'll need to start breaking down your query into parts to see which bit is causing the slowdown. Start with the innermost query and then add one join at a time until you find the part which is causing the slowdown. ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Mark/List, I just replaced my postgresql.conf with the default copy that appears in /etc/postgresql/8.4/main/ after a fresh install. The performance is pretty much the same as before (maybe even about 400 ms worse than before). Is there anything else I should try? Mike On Monday 29 March 2010 17:35:29 Mike Leahy wrote: > Hi Mark, > > I don't recall making any changes...at least not until the error I > encountered last week cropped up - but I think I reversed those changes (I > will double- check tonight). But I think I have encountered performance > problems with this query before on other systems - I had originally > dismissed it as hardware limitations I was dealing in my testing system at > the time. After recent hardware upgrades and after dealing with that > specific bug, that's when the slow performance of this query really stood > out. > > Mike > > On Monday 29 March 2010 07:53:50 Mark Cave-Ayland wrote: > > Mike Leahy wrote: > > > Is this something that should be looked into, or should I just > > > incorporate the workaround of disabling nested loop joins into my code? > > > > I find that generally PostgreSQL is fairly good at its estimates unless > > someone has already tried to tune the database. Have you changed any of > > the settings in postgresql.conf from their defaults? > > > > > > ATB, > > > > Mark. > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Hi Mark, I don't recall making any changes...at least not until the error I encountered last week cropped up - but I think I reversed those changes (I will double- check tonight). But I think I have encountered performance problems with this query before on other systems - I had originally dismissed it as hardware limitations I was dealing in my testing system at the time. After recent hardware upgrades and after dealing with that specific bug, that's when the slow performance of this query really stood out. Mike On Monday 29 March 2010 07:53:50 Mark Cave-Ayland wrote: > Mike Leahy wrote: > > Is this something that should be looked into, or should I just > > incorporate the workaround of disabling nested loop joins into my code? > > I find that generally PostgreSQL is fairly good at its estimates unless > someone has already tried to tune the database. Have you changed any of > the settings in postgresql.conf from their defaults? > > > ATB, > > Mark. > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Mike Leahy wrote: Is this something that should be looked into, or should I just incorporate the workaround of disabling nested loop joins into my code? I find that generally PostgreSQL is fairly good at its estimates unless someone has already tried to tune the database. Have you changed any of the settings in postgresql.conf from their defaults? ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Nested loop join = very bad performance
Sorry all...here's the explain output I promised in my previous message... On Sunday 28 March 2010 11:57:01 Mike Leahy wrote: > Hello list, > > Following up from the bug that was successfully resolved last week > (http://trac.osgeo.org/postgis/ticket/469), I've found that nested loop > joins are actually causing very bad performance with my queries. > > Thanks to the observations of Leo and Regina in that discussion, I have > found that I can work around this by setting enable_nestloop = off. Once > that is specified, queries start working about as fast. With > enable_nestloop = on, a query may take ~15000 ms, while enable_nestloop = > off returns the same query in ~90 ms. > > I have attached a textfile showing the different 'explain analzye' results > for the full query with/without nestloop enabled. You could also refer to > the somewhat simplified queries I put together in discussion thread > related to bug #469 (see > http://postgis.refractions.net/pipermail/postgis-users/2010- > March/026235.html for some examples). > > Is this something that should be looked into, or should I just incorporate > the workaround of disabling nested loop joins into my code? > > Thanks, > Mike > enable_nestloop = on: QUERY PLAN Sort (cost=260.10..260.10 rows=1 width=385) (actual time=15084.727..15084.730 rows=25 loops=1) Sort Key: cl.id Sort Method: quicksort Memory: 32kB -> Limit (cost=260.07..260.08 rows=1 width=360) (actual time=15084.700..15084.708 rows=25 loops=1) -> Sort (cost=260.07..260.08 rows=1 width=360) (actual time=15084.699..15084.704 rows=25 loops=1) Sort Key: cl.id Sort Method: top-N heapsort Memory: 32kB -> Nested Loop Left Join (cost=212.58..260.06 rows=1 width=360) (actual time=68.629..15083.483 rows=204 loops=1) Join Filter: (csl1.chat_id = cl.id) -> Nested Loop Left Join (cost=159.35..202.70 rows=1 width=328) (actual time=68.070..10418.323 rows=204 loops=1) Join Filter: (csl2.chat_id = cl.id) -> Nested Loop Left Join (cost=99.82..134.88 rows=1 width=296) (actual time=7.868..18.477 rows=204 loops=1) Join Filter: (cl.id = chat_log.parent_id) -> Hash Left Join (cost=87.20..120.09 rows=1 width=264) (actual time=0.040..1.787 rows=204 loops=1) Hash Cond: (cl.id = ctu.id) Filter: (((ctu.num_target_users IS NULL) AND (ctg.num_target_groups IS NULL)) OR ((ctu.num_target_users + ctg.num_target_groups) = 0) OR cl.username)::text = 'admin'::text) OR ('admin'::text = ANY (ctu.target_users_array)) OR ('{admin,everyone}'::text[] && ctg.target_groups_array)) AND (cl."time" > '2010-03-28 10:54:44.382616-04'::timestamp with time zone))) -> Hash Left Join (cost=43.60..63.86 rows=205 width=256) (actual time=0.030..1.302 rows=204 loops=1) Hash Cond: (cl.id = ctg.id) -> Seq Scan on chat_log cl (cost=0.00..12.62 rows=205 width=184) (actual time=0.015..0.833 rows=204 loops=1) Filter: ((NOT deleted) AND (id < 5090) AND (depth = 0)) -> Hash (cost=41.10..41.10 rows=200 width=76) (actual time=0.004..0.004 rows=0 loops=1) -> Subquery Scan ctg (cost=34.60..41.10 rows=200 width=76) (actual time=0.004..0.004 rows=0 loops=1) -> HashAggregate (cost=34.60..39.10 rows=200 width=36) (actual time=0.004..0.004 rows=0 loops=1) -> Seq Scan on chat_target_groups (cost=0.00..22.30 rows=1230 width=36) (actual time=0.001..0.001 rows=0 loops=1) -> Hash (cost=41.10..41.10 rows=200 width=76) (actual time=0.003..0.003 rows=0 loops=1) -> Subquery Scan ctu (cost=34.60..41.10 rows=200 width=76) (actual time=0.002..0.002 rows=0 loops=1) -> Has
[postgis-users] Nested loop join = very bad performance
Hello list, Following up from the bug that was successfully resolved last week (http://trac.osgeo.org/postgis/ticket/469), I've found that nested loop joins are actually causing very bad performance with my queries. Thanks to the observations of Leo and Regina in that discussion, I have found that I can work around this by setting enable_nestloop = off. Once that is specified, queries start working about as fast. With enable_nestloop = on, a query may take ~15000 ms, while enable_nestloop = off returns the same query in ~90 ms. I have attached a textfile showing the different 'explain analzye' results for the full query with/without nestloop enabled. You could also refer to the somewhat simplified queries I put together in discussion thread related to bug #469 (see http://postgis.refractions.net/pipermail/postgis-users/2010- March/026235.html for some examples). Is this something that should be looked into, or should I just incorporate the workaround of disabling nested loop joins into my code? Thanks, Mike ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users