Re: [PERFORM] Poor performance on a simple join

2011-11-03 Thread Shaun Thomas

On 11/02/2011 09:04 PM, Scott Marlowe wrote:


Take a look here:
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views


Not sure materialized views are the approach I would take here. We 
actually see a lot of these kinds of queries with giant result sets here. 
If the client actually needs all 12k rows on every execution (not likely, 
but possible) and 300ms is just too darn slow for that, there's always 
client-side caching.


We have a couple of queries that we need to keep cached at all times. Stock 
quotes and positions, for example, have to be available in sub-ms time 
thanks to the level of parallelism involved. One query in particular 
effectively grabs the entire set of current positions, and even with every 
optimization in the book, its execution time only gets down to about two 
seconds. We can't have thousands of clients executing that all the time, 
so it gets shoved into a local memcached on each webserver.


But if he's getting back 12k rows even *after* specifying a contact ID, 
a materialized view is still going to return 12k rows, and still has to 
perform at least an index scan unless he creates an MV for each contact 
ID (eww). This doesn't really look like fact-table territory either.


I think the real question is: why isn't 300ms fast enough? Is it because 
the client executes this repeatedly? If so, what changes often enough that 
it must fetch all 12k rows every single time? Would implementing a 
timestamp and only grabbing newer rows work better? Is it because several 
connections each run it in parallel? Why not cache a local copy and 
refresh it periodically? Do they actually need all 12k rows every 
time? Maybe some LIMIT and OFFSET clauses are in order.
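
For the timestamp idea, a rough sketch of what I mean (assuming a 
hypothetical updated_at column on contab2, maintained by the application 
or a trigger, plus an index to keep the range check cheap):

-- Hypothetical: lets each client pull only the delta since its last poll.
CREATE INDEX index_contab2_on_contact_updated
    ON contab2 (from_contact_id, updated_at);

SELECT contab2.contacts_tab
FROM contab2
INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
    AND sctab.type IN ('FService', 'FqService', 'LService', 'TService')
WHERE contab2.from_contact_id = 402513
  AND contab2.updated_at > '2011-11-02 14:00:00';  -- client's last fetch time

Each poll then returns only the rows that actually changed, which is 
usually a handful instead of 12k.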


There's very little a human can do with 12k results. An automated tool 
shouldn't be grabbing them either, unless they're actually changing with 
every execution. If they're not, the tool really wants only the items 
changed since the last fetch, or it's doing too much work. And if this 
were a report, 300ms would be nothing compared to most reporting queries, 
which can run for several minutes.


I think we're missing a lot of details here.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com




Re: [PERFORM] Poor performance on a simple join

2011-11-02 Thread Scott Marlowe
On Wed, Nov 2, 2011 at 2:21 PM, CS DBA cs_...@consistentstate.com wrote:
 Hi All;

 The below contab2 table contains ~400,000 rows. This query should not take
 this long.  We've tweaked work_mem up to 50MB, ensured that the appropriate
 indexes are in place, etc...

 Thoughts?

 Thanks in advance

How long should it take?  300 milliseconds is fairly fast for mushing
129k rows up against 26k rows and getting 12k rows back.  That's 40
rows / millisecond, which isn't too bad really.


What pg version are you running?  What evidence do you have that this
is slow, i.e. other machines you've run it on where it's faster?  What
hardware (CPU, RAM, IO subsystem, OS) are you running on?



 Explain analyze:
 SELECT contab2.contacts_tab
 FROM contab2
 INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
        AND sctab.type IN ('FService', 'FqService', 'LService', 'TService')
 WHERE contab2.from_contact_id=402513;
                                   QUERY PLAN
 ---
  Hash Join  (cost=16904.28..25004.54 rows=26852 width=4) (actual
 time=302.621..371.599 rows=12384 loops=1)
    Hash Cond: (contab2.to_service_id = sctab.id)
    ->  Bitmap Heap Scan on contab2  (cost=1036.49..8566.14 rows=26852
 width=20) (actual time=5.191..32.701 rows=26963 loops=1)
          Recheck Cond: (from_contact_id = 402513)
          ->  Bitmap Index Scan on index_contab2_on_from_user_id
  (cost=0.00..1029.78 rows=26852 width=0) (actual time=4.779..4.779
 rows=26963 loops=1)
                Index Cond: (from_contact_id = 402513)
    ->  Hash  (cost=14445.19..14445.19 rows=113808 width=16) (actual
 time=297.332..297.332 rows=129945 loops=1)
          Buckets: 16384  Batches: 1  Memory Usage: 6092kB
          ->  Bitmap Heap Scan on sctab  (cost=2447.07..14445.19 rows=113808
 width=16) (actual time=29.480..187.166 rows=129945 loops=1)
                Recheck Cond: ((type)::text = ANY
 ('{FService,FqService,LService,TService}'::text[]))
                ->  Bitmap Index Scan on index_sctab_on_type
  (cost=0.00..2418.62 rows=113808 width=0) (actual time=27.713..27.713
 rows=130376 loops=1)
                      Index Cond: ((type)::text = ANY
 ('{FService,FqService,LService,TService}'::text[]))
  Total runtime: 382.514 ms
 (13 rows)

 --
 -
 Kevin Kempter       -       Consistent State
 A PostgreSQL Professional Services Company
          www.consistentstate.com
 -






-- 
To understand recursion, one must first understand recursion.



Re: [PERFORM] Poor performance on a simple join

2011-11-02 Thread CS DBA

On 11/02/2011 02:45 PM, Scott Marlowe wrote:

On Wed, Nov 2, 2011 at 2:21 PM, CS DBA cs_...@consistentstate.com wrote:

Hi All;

The below contab2 table contains ~400,000 rows. This query should not take
this long.  We've tweaked work_mem up to 50MB, ensured that the appropriate
indexes are in place, etc...

Thoughts?

Thanks in advance

How long should it take?  300 milliseconds is fairly fast for mushing
129k rows up against 26k rows and getting 12k rows back.  That's 40
rows / millisecond, which isn't too bad really.


What pg version are you running?  What evidence do you have that this
is slow, i.e. other machines you've run it on where it's faster?  What
hardware (CPU, RAM, IO subsystem, OS) are you running on?



Explain analyze:
SELECT contab2.contacts_tab
FROM contab2
INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
        AND sctab.type IN ('FService', 'FqService', 'LService', 'TService')
WHERE contab2.from_contact_id=402513;
                                   QUERY PLAN
---
 Hash Join  (cost=16904.28..25004.54 rows=26852 width=4) (actual
time=302.621..371.599 rows=12384 loops=1)
   Hash Cond: (contab2.to_service_id = sctab.id)
   ->  Bitmap Heap Scan on contab2  (cost=1036.49..8566.14 rows=26852
width=20) (actual time=5.191..32.701 rows=26963 loops=1)
         Recheck Cond: (from_contact_id = 402513)
         ->  Bitmap Index Scan on index_contab2_on_from_user_id
 (cost=0.00..1029.78 rows=26852 width=0) (actual time=4.779..4.779
rows=26963 loops=1)
               Index Cond: (from_contact_id = 402513)
   ->  Hash  (cost=14445.19..14445.19 rows=113808 width=16) (actual
time=297.332..297.332 rows=129945 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 6092kB
         ->  Bitmap Heap Scan on sctab  (cost=2447.07..14445.19 rows=113808
width=16) (actual time=29.480..187.166 rows=129945 loops=1)
               Recheck Cond: ((type)::text = ANY
('{FService,FqService,LService,TService}'::text[]))
               ->  Bitmap Index Scan on index_sctab_on_type
 (cost=0.00..2418.62 rows=113808 width=0) (actual time=27.713..27.713
rows=130376 loops=1)
                     Index Cond: ((type)::text = ANY
('{FService,FqService,LService,TService}'::text[]))
 Total runtime: 382.514 ms
(13 rows)

--
-
Kevin Kempter   -   Consistent State
A PostgreSQL Professional Services Company
  www.consistentstate.com
-






Agreed, but it's not fast enough for the client.  I think we're going to 
look at creating an aggregate table or maybe partitioning.
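
Roughly what we have in mind, as a sketch (hypothetical names; we'd 
refresh it from cron as often as the client can tolerate staleness):

-- Pre-join the two tables into a flat lookup table.
CREATE TABLE contact_service_agg AS
SELECT contab2.from_contact_id, contab2.contacts_tab
FROM contab2
INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
    AND sctab.type IN ('FService', 'FqService', 'LService', 'TService');

CREATE INDEX index_agg_on_from_contact_id
    ON contact_service_agg (from_contact_id);

-- The hot query then becomes a single index scan, no hash join at runtime:
SELECT contacts_tab FROM contact_service_agg WHERE from_contact_id = 402513;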




--
-
Kevin Kempter   -   Consistent State
A PostgreSQL Professional Services Company
  www.consistentstate.com
-




Re: [PERFORM] Poor performance on a simple join

2011-11-02 Thread Scott Marlowe
On Wed, Nov 2, 2011 at 3:53 PM, CS DBA cs_...@consistentstate.com wrote:
 On 11/02/2011 02:45 PM, Scott Marlowe wrote:

 On Wed, Nov 2, 2011 at 2:21 PM, CS DBA cs_...@consistentstate.com wrote:

 Hi All;

 The below contab2 table contains ~400,000 rows. This query should not
 take
 this long.  We've tweaked work_mem up to 50MB, ensured that the
 appropriate
 indexes are in place, etc...

 Thoughts?

 Thanks in advance

 How long should it take?  300 milliseconds is fairly fast for mushing
 129k rows up against 26k rows and getting 12k rows back.  That's 40
 rows / millisecond, which isn't too bad really.


 What pg version are you running?  What evidence do you have that this
 is slow, i.e. other machines you've run it on where it's faster?  What
 hardware (CPU, RAM, IO subsystem, OS) are you running on?


 Explain analyze:
 SELECT contab2.contacts_tab
 FROM contab2
 INNER JOIN sctab
    ON sctab.id = contab2.to_service_id
        AND sctab.type IN ('FService', 'FqService', 'LService',
 'TService')
 WHERE contab2.from_contact_id=402513;

                                   QUERY PLAN
 ---
  Hash Join  (cost=16904.28..25004.54 rows=26852 width=4) (actual
 time=302.621..371.599 rows=12384 loops=1)
    Hash Cond: (contab2.to_service_id = sctab.id)
    ->  Bitmap Heap Scan on contab2  (cost=1036.49..8566.14 rows=26852
 width=20) (actual time=5.191..32.701 rows=26963 loops=1)
          Recheck Cond: (from_contact_id = 402513)
          ->  Bitmap Index Scan on index_contab2_on_from_user_id
  (cost=0.00..1029.78 rows=26852 width=0) (actual time=4.779..4.779
 rows=26963 loops=1)
                Index Cond: (from_contact_id = 402513)
    ->  Hash  (cost=14445.19..14445.19 rows=113808 width=16) (actual
 time=297.332..297.332 rows=129945 loops=1)
          Buckets: 16384  Batches: 1  Memory Usage: 6092kB
          ->  Bitmap Heap Scan on sctab  (cost=2447.07..14445.19 rows=113808
 width=16) (actual time=29.480..187.166 rows=129945 loops=1)
                Recheck Cond: ((type)::text = ANY
 ('{FService,FqService,LService,TService}'::text[]))
                ->  Bitmap Index Scan on index_sctab_on_type
  (cost=0.00..2418.62 rows=113808 width=0) (actual time=27.713..27.713
 rows=130376 loops=1)
                      Index Cond: ((type)::text = ANY
 ('{FService,FqService,LService,TService}'::text[]))
  Total runtime: 382.514 ms
 (13 rows)

 --
 -
 Kevin Kempter       -       Consistent State
 A PostgreSQL Professional Services Company
          www.consistentstate.com
 -



 Agreed, but it's not fast enough for the client.  I think we're going to
 look at creating an aggregate table or maybe partitioning.

Take a look here:
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
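
The snapshot flavor described there is basically the aggregate table 
you're proposing, with the rebuild wrapped in a function you can fire 
from cron. A rough sketch (hypothetical names, and assuming an aggregate 
table like the one sketched upthread, here called contact_service_agg):

CREATE OR REPLACE FUNCTION refresh_contact_service_agg() RETURNS void AS $$
BEGIN
    -- TRUNCATE + INSERT run in one transaction, so readers block briefly
    -- during the rebuild but never see a half-empty table after it commits.
    TRUNCATE contact_service_agg;
    INSERT INTO contact_service_agg
        SELECT contab2.from_contact_id, contab2.contacts_tab
        FROM contab2
        INNER JOIN sctab
            ON sctab.id = contab2.to_service_id
            AND sctab.type IN ('FService', 'FqService', 'LService', 'TService');
END;
$$ LANGUAGE plpgsql;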
