Re: [PERFORM] Poor performance on a simple join
On 11/02/2011 09:04 PM, Scott Marlowe wrote:

> Take a look here:
> http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views

Not sure materialized views are the approach I would take here. We actually see a lot of these kinds of queries with giant result sets here. If they actually need all 12k rows for every execution (not likely, but possible) and 300ms is just too darn slow for that, there's always client-side caching.

We have a couple queries that we need to keep cached at all times. Stock quotes and positions, for example, have to be available in sub-ms time thanks to the level of parallelism involved. One query in particular effectively grabs the entire set of current positions, and every optimization in the book brings its execution time down to about two seconds. We can't have thousands of clients executing that all the time, so it gets shoved into a local memcached on each webserver.

But if he's getting back 12k rows even *after* specifying a contact ID, a materialized view is still going to return 12k rows, and still has to perform at least an index scan unless he creates an MV for each contact ID (eww). This doesn't really look like fact-table territory either.

I think the real question is: why isn't 300ms fast enough? Is it because the client executes this repeatedly? If so, what changes often enough that it must fetch all 12k rows every single time? Would implementing a timestamp and only grabbing newer rows work better? Is it because several connections each run it in parallel? Why not cache a local copy and refresh periodically? Do they actually need all 12k rows every time? Maybe some LIMIT and OFFSET clauses are in order.

There's very little a human can do with 12k results. An automated tool shouldn't be grabbing them either, unless the results are actually changing with every execution. If they're not, the tool really wants items since the last change, or it's doing too much work.
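The "implementing a timestamp and only grabbing newer rows" idea could be sketched like this; note the `updated_at` column and the client-remembered `:last_seen` parameter are hypothetical additions, not part of the schema discussed in the thread:

```sql
-- Hypothetical delta-fetch: track a last-modified timestamp on contab2
-- so the client can ask only for rows changed since its last pull.
ALTER TABLE contab2 ADD COLUMN updated_at timestamptz NOT NULL DEFAULT now();
CREATE INDEX index_contab2_contact_updated
    ON contab2 (from_contact_id, updated_at);

-- Client keeps the max(updated_at) it has seen and passes it back
-- as :last_seen on the next call:
SELECT contab2.contacts_tab, contab2.updated_at
FROM contab2
INNER JOIN sctab
        ON sctab.id = contab2.to_service_id
       AND sctab.type IN ('FService', 'FqService', 'LService', 'TService')
WHERE contab2.from_contact_id = 402513
  AND contab2.updated_at > :last_seen
ORDER BY contab2.updated_at;
```

This only works if every UPDATE actually touches `updated_at` (via a trigger or application discipline), and it shrinks the repeated fetch from 12k rows to just the delta.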
If it were a report, 300ms is nothing compared to most reporting queries, which can run for several minutes. I think we're missing a lot of details here.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com

______________________________________________

See http://www.peak6.com/email-disclaimer/ for terms and conditions related to this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Poor performance on a simple join
On Wed, Nov 2, 2011 at 3:53 PM, CS DBA wrote:
> On 11/02/2011 02:45 PM, Scott Marlowe wrote:
>> On Wed, Nov 2, 2011 at 2:21 PM, CS DBA wrote:
>>> Hi All;
>>>
>>> The below contab2 table contains ~400,000 rows. This query should not
>>> take this long. We've tweaked work_mem up to 50MB, ensured that the
>>> appropriate indexes are in place, etc...
>>>
>>> Thoughts?
>>>
>>> Thanks in advance
>>
>> How long should it take? 300 milliseconds is fairly fast for mushing
>> 129k rows up against 26k rows and getting 12k rows back. That's 40
>> rows / millisecond, which isn't too bad really.
>>
>> What pg version are you running? What evidence do you have that this
>> is slow? i.e. other machines you've run it on where it's faster? What
>> hardware (CPU, RAM, IO subsystem, OS) are you running on?
>>
>>> [quoted EXPLAIN ANALYZE output trimmed; see the original post]
>
> Agreed, but it's not fast enough for the client. I think we're going to
> look at creating an aggregate table or maybe partitioning.

Take a look here:
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
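The linked wiki page describes snapshot-style materialized views, which in the PostgreSQL of this era (before `CREATE MATERIALIZED VIEW` arrived in 9.3) were hand-rolled as a table plus a refresh function. A minimal sketch of that pattern, with illustrative names not taken from the thread:

```sql
-- Hand-rolled snapshot "materialized view" of the join in question.
CREATE TABLE matview_contact_services AS
SELECT contab2.from_contact_id, contab2.contacts_tab
FROM contab2
JOIN sctab ON sctab.id = contab2.to_service_id
          AND sctab.type IN ('FService', 'FqService', 'LService', 'TService');

CREATE INDEX matview_contact_services_idx
    ON matview_contact_services (from_contact_id);

-- Full refresh, run from cron or after bulk loads:
CREATE OR REPLACE FUNCTION refresh_matview_contact_services() RETURNS void AS $$
BEGIN
    TRUNCATE matview_contact_services;
    INSERT INTO matview_contact_services
    SELECT contab2.from_contact_id, contab2.contacts_tab
    FROM contab2
    JOIN sctab ON sctab.id = contab2.to_service_id
              AND sctab.type IN ('FService', 'FqService', 'LService',
                                 'TService');
END;
$$ LANGUAGE plpgsql;
```

Lookups then become a single index scan on `matview_contact_services`, at the cost of staleness between refreshes; as Shaun notes downthread, the query still returns ~12k rows per contact either way.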
Re: [PERFORM] Poor performance on a simple join
On 11/02/2011 02:45 PM, Scott Marlowe wrote:
> On Wed, Nov 2, 2011 at 2:21 PM, CS DBA wrote:
>> Hi All;
>>
>> The below contab2 table contains ~400,000 rows. This query should not
>> take this long. We've tweaked work_mem up to 50MB, ensured that the
>> appropriate indexes are in place, etc...
>>
>> Thoughts?
>>
>> Thanks in advance
>
> How long should it take? 300 milliseconds is fairly fast for mushing
> 129k rows up against 26k rows and getting 12k rows back. That's 40
> rows / millisecond, which isn't too bad really.
>
> What pg version are you running? What evidence do you have that this
> is slow? i.e. other machines you've run it on where it's faster? What
> hardware (CPU, RAM, IO subsystem, OS) are you running on?
>
>> [quoted EXPLAIN ANALYZE output trimmed; see the original post]

Agreed, but it's not fast enough for the client. I think we're going to look at creating an aggregate table or maybe partitioning.

--
-
Kevin Kempter - Consistent State
A PostgreSQL Professional Services Company
www.consistentstate.com
-
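The "maybe partitioning" option in this era of PostgreSQL meant inheritance-based partitioning. A rough sketch, with entirely illustrative partition boundaries on `from_contact_id` (the thread gives no real distribution to go on):

```sql
-- Inheritance partitioning sketch (pre-declarative-partitioning style).
-- Boundary value 500000 is purely illustrative.
CREATE TABLE contab2_p0
    (CHECK (from_contact_id <  500000)) INHERITS (contab2);
CREATE TABLE contab2_p1
    (CHECK (from_contact_id >= 500000)) INHERITS (contab2);

CREATE INDEX contab2_p0_from_idx ON contab2_p0 (from_contact_id);
CREATE INDEX contab2_p1_from_idx ON contab2_p1 (from_contact_id);

-- With constraint_exclusion = partition, a query constrained by
-- WHERE from_contact_id = 402513 scans only contab2_p0.
```

With only ~400k rows and a query already hitting an index on `from_contact_id`, it is not obvious partitioning would beat the existing bitmap scan; the aggregate-table route is closer to the materialized-view suggestion made downthread.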
Re: [PERFORM] Poor performance on a simple join
On Wed, Nov 2, 2011 at 2:21 PM, CS DBA wrote:
> Hi All;
>
> The below contab2 table contains ~400,000 rows. This query should not take
> this long. We've tweaked work_mem up to 50MB, ensured that the appropriate
> indexes are in place, etc...
>
> Thoughts?
>
> Thanks in advance

How long should it take? 300 milliseconds is fairly fast for mushing 129k rows up against 26k rows and getting 12k rows back. That's 40 rows / millisecond, which isn't too bad really.

What pg version are you running? What evidence do you have that this is slow? i.e. other machines you've run it on where it's faster? What hardware (CPU, RAM, IO subsystem, OS) are you running on?

> [quoted EXPLAIN ANALYZE output trimmed; see the original post]

--
To understand recursion, one must first understand recursion.
[PERFORM] Poor performance on a simple join
Hi All;

The below contab2 table contains ~400,000 rows. This query should not take this long. We've tweaked work_mem up to 50MB, ensured that the appropriate indexes are in place, etc...

Thoughts?

Thanks in advance

Explain analyze:

SELECT contab2.contacts_tab
FROM contab2
INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
    AND sctab.type IN ('FService', 'FqService', 'LService', 'TService')
WHERE contab2.from_contact_id=402513;

                                    QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=16904.28..25004.54 rows=26852 width=4) (actual time=302.621..371.599 rows=12384 loops=1)
   Hash Cond: (contab2.to_service_id = sctab.id)
   ->  Bitmap Heap Scan on contab2  (cost=1036.49..8566.14 rows=26852 width=20) (actual time=5.191..32.701 rows=26963 loops=1)
         Recheck Cond: (from_contact_id = 402513)
         ->  Bitmap Index Scan on index_contab2_on_from_user_id  (cost=0.00..1029.78 rows=26852 width=0) (actual time=4.779..4.779 rows=26963 loops=1)
               Index Cond: (from_contact_id = 402513)
   ->  Hash  (cost=14445.19..14445.19 rows=113808 width=16) (actual time=297.332..297.332 rows=129945 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 6092kB
         ->  Bitmap Heap Scan on sctab  (cost=2447.07..14445.19 rows=113808 width=16) (actual time=29.480..187.166 rows=129945 loops=1)
               Recheck Cond: ((type)::text = ANY ('{FService,FqService,LService,TService}'::text[]))
               ->  Bitmap Index Scan on index_sctab_on_type  (cost=0.00..2418.62 rows=113808 width=0) (actual time=27.713..27.713 rows=130376 loops=1)
                     Index Cond: ((type)::text = ANY ('{FService,FqService,LService,TService}'::text[]))
 Total runtime: 382.514 ms
(13 rows)

--
-
Kevin Kempter - Consistent State
A PostgreSQL Professional Services Company
www.consistentstate.com
-
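Reading the plan above, most of the 382ms goes into building the hash from ~130k sctab rows (297ms), not into the contab2 side. One speculative thing to try, not suggested anywhere in the thread, is a partial index restricted to the four service types so the probe side comes from a narrower scan:

```sql
-- Speculative indexes; whether they help depends on what fraction of
-- sctab actually has these four types.
CREATE INDEX index_sctab_service_types ON sctab (id)
    WHERE type IN ('FService', 'FqService', 'LService', 'TService');

-- A composite index matching both predicates on the contab2 side:
CREATE INDEX index_contab2_from_to
    ON contab2 (from_contact_id, to_service_id);
```

Since rows=130376 out of a ~400k-row-era dataset match the type filter, the partial index may not change much; re-running EXPLAIN ANALYZE after each addition is the only way to know.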