Re: [GENERAL] Poor select count(*) performance

2009-02-24 Thread Sam Mason
On Mon, Feb 23, 2009 at 11:21:16PM -0800, Mike Ivanov wrote:
 On Mon, Feb 23, 2009 at 6:54 PM, Sam Mason s...@samason.me.uk wrote:
  Depending on where these are on disk and how fast your disks are this
  could take up to 30 seconds.
 
 This does not sound very inspiring :-)

It was calculated with a pessimistic seek time of 10ms * 3000 seeks.
Real worst case would be even worse as you'd have to factor in potential
misses of the index as well but that's unlikely.  In practice, a table
is likely to be stored close together on the disk and hence assuming
average seek time is not accurate.  If it's having to go off and read
the index then you may loose this spacial clustering and performance
will suffer.

 Would throwing more hardware (memory, faster CPU) at the server improve the
 situation?

You're IO bound not CPU bound; faster disks would help or if your
dataset's small enough more memory.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Poor select count(*) performance

2009-02-23 Thread Mike Ivanov
Hi there,

I'm sorry for a stupid question but I'm really stuck.

A query:

SELECT COUNT(*) FROM lingq_card WHERE lingq_card.context_id = ...;

An hour ago it took 8 seconds, one minute ago the same query took just only
7 milliseconds.

Any ideas why the execution time varies so wildly?

Explain Analyze gives:

Aggregate  (cost=2000.08..2000.09 rows=1 width=0) (actual time=6.962..6.963
rows=1 loops=1)
   -  Index Scan using lingq_card_context_id on lingq_card
(cost=0.00..1998.68 rows=561 width=0) (actual time=0.025..5.045 rows=2830
loops=1)
 Index Cond: (context_id = 68672)
 Total runtime: 7.011 ms

The lingq_cards table contains about 1.4 million rows.

Thanks,
Mike


Re: [GENERAL] Poor select count(*) performance

2009-02-23 Thread Scott Marlowe
On Mon, Feb 23, 2009 at 6:44 PM, Mike Ivanov m...@thelinguist.com wrote:
 Hi there,

 I'm sorry for a stupid question but I'm really stuck.

 A query:

 SELECT COUNT(*) FROM lingq_card WHERE lingq_card.context_id = ...;

 An hour ago it took 8 seconds, one minute ago the same query took just only
 7 milliseconds.

The two common causes are caching and changing query plans.

How many rows did it have to hit, did it use an index, which index did
it use, and were the rows it needed already in the pg shared_buffers
OR the OS / kernel file system cache when retrieved?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Poor select count(*) performance

2009-02-23 Thread Sam Mason
On Mon, Feb 23, 2009 at 05:44:05PM -0800, Mike Ivanov wrote:
 An hour ago it took 8 seconds, one minute ago the same query took just only
 7 milliseconds.
 
 Any ideas why the execution time varies so wildly?
 
 Explain Analyze gives:
 
 Aggregate  (cost=2000.08..2000.09 rows=1 width=0) (actual time=6.962..6.963 
 rows=1 loops=1)
-  Index Scan using lingq_card_context_id on lingq_card
 (cost=0.00..1998.68 rows=561 width=0) (actual time=0.025..5.045 rows=2830 
 loops=1)
  Index Cond: (context_id = 68672)
  Total runtime: 7.011 ms

If you're unlucky in the example above, none of those 2830 rows will be
in memory and you'll have to wait for the disk to bring them all back.
Depending on where these are on disk and how fast your disks are this
could take up to 30 seconds.

If you want this sort of thing to go quicker you could try CLUSTERing
the table on this index, but then this will slow down other queries that
want data to come off the disk in a specific order.  It's a balancing
act!

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Poor select count(*) performance

2009-02-23 Thread Mike Ivanov
On Mon, Feb 23, 2009 at 6:54 PM, Sam Mason s...@samason.me.uk wrote:

Depending on where these are on disk and how fast your disks are this
 could take up to 30 seconds.


This does not sound very inspiring :-)

Would throwing more hardware (memory, faster CPU) at the server improve the
situation?


 If you want this sort of thing to go quicker you could try CLUSTERing


This can help because all other queries to this table alway filtered/grouped
by context_id field. I will try that.

Thanks a lot, Sam!