Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-29 Thread Gregory Stark
Phoenix Kiula phoenix.ki...@gmail.com writes:

  Index Cond: ((user_id)::text = 'superman'::text)
  Filter: (status = 'Y'::bpchar)

 Of course for unregistered users we use user_known = 0, so they are
 excluded from this index. Is this not a useful partial index? I think
 in this SQL, the user_id is always 'superman' and the user_known is
 always 1, which is why the guesstimate from the planner may be off?

Well, the histograms are for each column separately, so the planner will take
the selectivity estimates for user_id = 'superman' and status = 'Y' and
multiply them.

If the status of 'superman' records is very different from the status of the
records as a whole, then this will give poor results.

If that's not the case then raising the statistics target for those two
columns might help. And of course if the table hasn't been analyzed recently
then analyzing it more often is always good.
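As a sketch (the target value of 1000, the maximum in that era of PostgreSQL, is an arbitrary choice; tune to taste):

```sql
-- Raise the per-column statistics targets, then re-analyze so the
-- planner has richer histograms for these two columns.
ALTER TABLE testimonials ALTER COLUMN user_id SET STATISTICS 1000;
ALTER TABLE testimonials ALTER COLUMN status SET STATISTICS 1000;
ANALYZE testimonials;
```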

There isn't really a good solution for cross-column stats. You could perhaps
create a functional index (it could still be partial too) on an expression like

  CASE WHEN status = 'Y' THEN user_id END

which will make Postgres build stats for the result of that expression
specifically. Then, if you use that exact expression in the query, the
planner should use those statistics. I think. I haven't tried this... Tell us
how it goes :)
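For the archives, the suggestion might look something like this (untested, as the author says; the index name is made up):

```sql
-- Expression index: Postgres will gather statistics on the CASE result.
-- It can also be made partial with a WHERE clause, as noted above.
CREATE INDEX idx_status_user
    ON testimonials ((CASE WHEN status = 'Y' THEN user_id END));
ANALYZE testimonials;

-- The query has to use the exact same expression for the planner to
-- match it against the expression's statistics:
SELECT id, url
FROM testimonials
WHERE (CASE WHEN status = 'Y' THEN user_id END) = 'superman';
```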

I wonder if we should look at building partial histograms for the columns in
partial indexes effectively equivalent to this... hm...

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-28 Thread Phoenix Kiula
On Wed, Jan 28, 2009 at 2:37 AM, Alban Hertroys
dal...@solfertje.student.utwente.nl wrote:

 Does that query plan look any better without the select count(id) from
 testimonials?

 If so you may be better off keeping track of those counts in a separate
 table updated by triggers on the testimonials table. Whether that really
 helps depends on how variable your selectors are to determine those counts.
 If those counts are generally very low the benefit will probably be minimal.




Thanks Alban. We have now made all the triggers and such. That part is
working. I suppose not having the count(id) is helping by just a few
seconds, but the query is still taking about 15 seconds in some cases.

Here are the query and its exec plan again fyi. Any other ideas for tweaking?




explain analyze SELECT
   testimonials.url
   ,testimonials.alias
   ,testimonials.aliasEntered
   ,testimonials.title
   ,testimonials.modify_date
   ,testimonials.id
   ,visitcount.visit_count
   ,visitcount.unique_count
   ,visitcount.modify_date
   ,coalesce(  extract(epoch from now()) -  extract(epoch
from visitcount.modify_date), 0)
   ,(select count(id) from testimonials WHERE
testimonials.user_id = 'superman' and testimonials.user_known = 1 and
testimonials.status = 'Y' ) AS total
   FROM testimonials
   LEFT JOIN visitcount ON testimonials.id = visitcount.id
   WHERE
testimonials.user_id = 'superman'
and testimonials.user_known = 1
and testimonials.status = 'Y'
   ORDER BY testimonials.modify_date desc
   OFFSET 0 LIMIT 10
;




QUERY PLAN

 Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
   InitPlan
     ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
           ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                 Index Cond: ((user_id)::text = 'superman'::text)
                 Filter: (status = 'Y'::bpchar)
   ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
         Sort Key: testimonials.modify_date
         ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
               ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                     Index Cond: ((user_id)::text = 'superman'::text)
                     Filter: (status = 'Y'::bpchar)
               ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)
                     Index Cond: (testimonials.id = visitcount.id)
 Total runtime: 461.682 ms
(15 rows)



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-28 Thread Alban Hertroys

On Jan 28, 2009, at 1:38 PM, Phoenix Kiula wrote:


Thanks Alban. We have now made all the triggers and such. That part is
working. I suppose not having the count(id) is helping just with a few
seconds, but the query is still taking about 15 seconds in some cases.

Here are the query and its exec plan again fyi. Any other ideas for  
tweaking?


Ah I see, that's the original query and its plan again, not the one  
after implementing those triggers! You had me scratching my head for a  
bit there, wondering why the count() subquery was still there.


A few things in this query appear to take relatively much time:

- The index scans on new_idx_userknown; What's worrying there is that  
the planner expects only a few rows (42) while in actuality there are  
quite many (10149). This scan is performed twice, too! It seems that  
the statistics for the columns that index covers are off. That may mean  
raising the statistics targets on the columns involved or increasing  
the frequency with which autovacuum analyzes them.


- The nested loop left join is expensive; That's probably also due to  
the incorrect assumptions the planner makes about the index scans I  
mentioned above. It expects to have to loop 42 times, but ends up  
doing so 10149 times instead! I believe nested loops aren't particularly  
efficient; they'll only beat other methods when there are few rows to  
loop through.
The loop is taking 395 - 89 = 306 ms for 10149 rows, while the planner  
expected it to take 306 * (42/10149) = 1.3 ms. Quite a difference!
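The arithmetic above can be checked directly (figures taken from the plan):

```sql
-- Actual loop time vs. what the planner's row estimate implied.
SELECT round(395.008 - 89.384, 1)                AS actual_loop_ms,    -- 305.6
       round((395.008 - 89.384) * 42 / 10149, 1) AS expected_loop_ms;  -- 1.3
```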


You probably need to do something about new_idx_userknown. A partial  
index (as suggested elsewhere) may help make it smaller (easier to fit  
in RAM, fewer branches required to find a node), but the bad  
statistics are likely to be the real problem here. Without knowing  
anything about that particular index and the tables it's indexing it's  
hard to tell how to improve it.



explain analyze SELECT
  testimonials.url
  ,testimonials.alias
  ,testimonials.aliasEntered
  ,testimonials.title
  ,testimonials.modify_date
  ,testimonials.id
  ,visitcount.visit_count
  ,visitcount.unique_count
  ,visitcount.modify_date
  ,coalesce(  extract(epoch from now()) -  extract(epoch
from visitcount.modify_date), 0)
  ,(select count(id) from testimonials WHERE
testimonials.user_id = 'superman' and testimonials.user_known = 1 and
testimonials.status = 'Y' ) AS total
  FROM testimonials
  LEFT JOIN visitcount ON testimonials.id = visitcount.id
  WHERE
   testimonials.user_id = 'superman'
   and testimonials.user_known = 1
   and testimonials.status = 'Y'
  ORDER BY testimonials.modify_date desc
  OFFSET 0 LIMIT 10
;




   QUERY PLAN

 Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
   InitPlan
     ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
           ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                 Index Cond: ((user_id)::text = 'superman'::text)
                 Filter: (status = 'Y'::bpchar)
   ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
         Sort Key: testimonials.modify_date
         ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
               ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                     Index Cond: ((user_id)::text = 'superman'::text)
                     Filter: (status = 'Y'::bpchar)
               ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)
                     Index Cond: (testimonials.id = visitcount.id)
 Total runtime: 461.682 ms
(15 rows)







Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-28 Thread Phoenix Kiula
On Thu, Jan 29, 2009 at 2:25 AM, Alban Hertroys
dal...@solfertje.student.utwente.nl wrote:


 Ah I see, that's the original query and its plan again, not the one after
 implementing those triggers! You had me scratching my head for a bit there,
 wondering why the count() subquery was still there.



Yes, that was just for info. Here is the new query without the count()
in there:


explain analyze SELECT
  testimonials.url
  ,testimonials.alias
  ,testimonials.aliasEntered
  ,testimonials.title
  ,testimonials.modify_date
  ,testimonials.id
  ,visitcount.visit_count
  ,visitcount.unique_count
  ,visitcount.modify_date
  ,coalesce(  extract(epoch from now()) -  extract(epoch from
visitcount.modify_date), 0)
  FROM testimonials
  LEFT OUTER JOIN visitcount USING (id)
  WHERE
   testimonials.user_id = 'superman'
   and testimonials.user_known = 1
   and testimonials.status = 'Y'
  ORDER BY testimonials.modify_date desc
  OFFSET 0 LIMIT 10
;




QUERY PLAN
---
 Limit  (cost=61.42..61.44 rows=10 width=162) (actual time=105.400..105.499 rows=10 loops=1)
   ->  Sort  (cost=61.42..61.46 rows=16 width=162) (actual time=105.392..105.425 rows=10 loops=1)
         Sort Key: testimonials.modify_date
         ->  Nested Loop Left Join  (cost=0.00..61.10 rows=16 width=162) (actual time=0.092..94.516 rows=2027 loops=1)
               ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..24.29 rows=16 width=146) (actual time=0.058..10.983 rows=2027 loops=1)
                     Index Cond: ((user_id)::text = 'superman'::text)
                     Filter: (status = 'Y'::bpchar)
               ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.024..0.026 rows=1 loops=2027)
                     Index Cond: (testimonials.id = visitcount.id)
 Total runtime: 105.652 ms
(10 rows)




Note that I have an index on user_id, but because this is a website,
there are several user_ids where we only have the IP. The above query
is only ever needed for registered users, so for just the registered
users we created another partial index called

 new_idx_userknown btree (user_id) WHERE user_known = 1

Of course for unregistered users we use user_known = 0, so they are
excluded from this index. Is this not a useful partial index? I think
in this SQL, the user_id is always 'superman' and the user_known is
always 1, which is why the guesstimate from the planner may be off?

Love to hear thoughts.

THANKS!



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-28 Thread Alban Hertroys

On Jan 29, 2009, at 1:35 AM, Phoenix Kiula wrote:


On Thu, Jan 29, 2009 at 2:25 AM, Alban Hertroys
dal...@solfertje.student.utwente.nl wrote:

Ah I see, that's the original query and its plan again, not the one  
after
implementing those triggers! You had me scratching my head for a  
bit there,

wondering why the count() subquery was still there.


Yes that was just for info. Here are the new query without the count()
in there:


explain analyze SELECT
 testimonials.url
 ,testimonials.alias
 ,testimonials.aliasEntered
 ,testimonials.title
 ,testimonials.modify_date
 ,testimonials.id
 ,visitcount.visit_count
 ,visitcount.unique_count
 ,visitcount.modify_date
 ,coalesce(  extract(epoch from now()) -  extract(epoch from
visitcount.modify_date), 0)
 FROM testimonials
 LEFT OUTER JOIN visitcount USING (id)
 WHERE
  testimonials.user_id = 'superman'
  and testimonials.user_known = 1
  and testimonials.status = 'Y'
 ORDER BY testimonials.modify_date desc
 OFFSET 0 LIMIT 10
;


QUERY PLAN
---
Limit  (cost=61.42..61.44 rows=10 width=162) (actual time=105.400..105.499 rows=10 loops=1)
  ->  Sort  (cost=61.42..61.46 rows=16 width=162) (actual time=105.392..105.425 rows=10 loops=1)
        Sort Key: testimonials.modify_date
        ->  Nested Loop Left Join  (cost=0.00..61.10 rows=16 width=162) (actual time=0.092..94.516 rows=2027 loops=1)
              ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..24.29 rows=16 width=146) (actual time=0.058..10.983 rows=2027 loops=1)
                    Index Cond: ((user_id)::text = 'superman'::text)
                    Filter: (status = 'Y'::bpchar)
              ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.024..0.026 rows=1 loops=2027)
                    Index Cond: (testimonials.id = visitcount.id)
Total runtime: 105.652 ms
(10 rows)

Note that I have an index on user_id, but because this is a website,
there are several user_ids where we only have the IP. The above query
is only ever needed for registered users, so for just the registered
users we created another partial index called

new_idx_userknown btree (user_id) WHERE user_known = 1

Of course for unregistered users we use user_known = 0, so they are
excluded from this index. Is this not a useful partial index? I think
in this SQL, the user_id is always 'superman' and the user_known is
always 1, which is why the guesstimate from the planner may be off?

Love to hear thoughts.



Well, that seems to have gotten rid of the somewhat expensive index  
scans on new_idx_userknown as well (the duplicate entry for the scan  
being due to the subquery, of course).


What's remaining is the left join.
If I understand correctly you have a PK on visitcount.id, and that  
table only contains records for people who have a visitcount > 0? That  
table gets updated a lot, I'd think? The query plan still shows a bad  
estimate on that join; it has improved, but not enough.


Does the plan look better right after you ANALYZE visitcount? I'm  
suspecting you either need to autovacuum visitcount more frequently or  
you need to increase the statistics target on visitcount.id. You're  
updating that table a lot, I think, which creates one new dead row for  
every update. Letting vacuum mark the dead ones as reusable more  
frequently should also help keep that table and its indexes cleaner,  
although the records the indexes are pointing to will be all over the  
place.
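A minimal sketch of both suggestions (the target of 500 is an arbitrary value):

```sql
ANALYZE visitcount;  -- refresh the statistics right away

-- If that helps only temporarily, give the planner more detail to work with:
ALTER TABLE visitcount ALTER COLUMN id SET STATISTICS 500;
ANALYZE visitcount;
```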




I'm wondering... In highly updated tables it's probably more efficient  
to leave the dead rows alone (just marking them dead) and only append  
the updated ones at the end of the table? The dead rows would  
accumulate at the start of the table while the new ones go to the end.  
After a while a large section of the start of the table could just be  
removed, as it'd only contain dead rows... This may already be in place,  
of course; I don't have time now to look into the design specifics and  
it seems kind of an obvious thing to do!


Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-27 Thread Alban Hertroys

On Jan 26, 2009, at 4:41 AM, Phoenix Kiula wrote:


Appreciate any thoughts.

My query is:


explain analyze SELECT
   testimonials.url
   ,testimonials.alias
   ,testimonials.aliasEntered
   ,testimonials.title
   ,testimonials.modify_date
   ,testimonials.id
   ,visitcount.visit_count
   ,visitcount.unique_count
   ,visitcount.modify_date
   ,coalesce(  extract(epoch from now()) -  extract(epoch
from visitcount.modify_date), 0)
   ,(select count(id) from testimonials WHERE
testimonials.user_id = 'superman' and testimonials.user_known = 1 and
testimonials.status = 'Y' ) AS total
   FROM testimonials
   LEFT JOIN visitcount ON testimonials.id = visitcount.id
   WHERE
testimonials.user_id = 'superman'
and testimonials.user_known = 1
and testimonials.status = 'Y'
   ORDER BY testimonials.modify_date desc
   OFFSET 0 LIMIT 10
;




QUERY PLAN

Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
  InitPlan
    ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
          ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                Index Cond: ((user_id)::text = 'superman'::text)
                Filter: (status = 'Y'::bpchar)
    ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
          Sort Key: testimonials.modify_date
          ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
                ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                      Index Cond: ((user_id)::text = 'superman'::text)
                      Filter: (status = 'Y'::bpchar)
                ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)
                      Index Cond: (testimonials.id = visitcount.id)
Total runtime: 461.682 ms
(15 rows)



Does that query plan look any better without the select count(id) from  
testimonials subquery?


If so you may be better off keeping track of those counts in a  
separate table updated by triggers on the testimonials table. Whether  
that really helps depends on how variable your selectors are to  
determine those counts. If those counts are generally very low the  
benefit will probably be minimal.
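A hypothetical sketch of such a trigger-maintained count table (all names are invented; it only handles INSERT and DELETE, and the counts table must first be seeded from the existing rows):

```sql
CREATE TABLE testimonial_counts (
    user_id varchar PRIMARY KEY,
    total   bigint NOT NULL DEFAULT 0
);

CREATE OR REPLACE FUNCTION testimonials_count_trg() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' AND NEW.user_known = 1 AND NEW.status = 'Y' THEN
        UPDATE testimonial_counts
           SET total = total + 1
         WHERE user_id = NEW.user_id;
    ELSIF TG_OP = 'DELETE' AND OLD.user_known = 1 AND OLD.status = 'Y' THEN
        UPDATE testimonial_counts
           SET total = total - 1
         WHERE user_id = OLD.user_id;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER testimonials_count
    AFTER INSERT OR DELETE ON testimonials
    FOR EACH ROW EXECUTE PROCEDURE testimonials_count_trg();

-- The expensive count(id) subquery then becomes a single-row lookup:
-- SELECT total FROM testimonial_counts WHERE user_id = 'superman';
```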


Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







[GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Phoenix Kiula
Hope someone can shed light on this.

I have a well-performing query that uses the index. Without OFFSET etc
it returns about 11,000 records.

I only need about 20 of these records at any given time, which is why
my LIMIT and OFFSET try to pull only the 20 records or so.

My queries are fast in general *except* the first time. The first time
I pull my 20 records, it takes quite a lot of time -- about 8-10
seconds or so, which is unacceptable in our case.

My guesses:

1. This first-query slowness may be because the data is being read
from disk into memory the first time?

2. Because the query uses an ORDER BY DESC on another indexed (date)
column, the sort may well be slower the first
time?

Appreciate any thoughts.

My query is:


explain analyze SELECT
testimonials.url
,testimonials.alias
,testimonials.aliasEntered
,testimonials.title
,testimonials.modify_date
,testimonials.id
,visitcount.visit_count
,visitcount.unique_count
,visitcount.modify_date
,coalesce(  extract(epoch from now()) -  extract(epoch
from visitcount.modify_date), 0)
,(select count(id) from testimonials WHERE
testimonials.user_id = 'superman' and testimonials.user_known = 1 and
testimonials.status = 'Y' ) AS total
FROM testimonials
LEFT JOIN visitcount ON testimonials.id = visitcount.id
WHERE
 testimonials.user_id = 'superman'
 and testimonials.user_known = 1
 and testimonials.status = 'Y'
ORDER BY testimonials.modify_date desc
OFFSET 0 LIMIT 10
;




 QUERY PLAN

 Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
   InitPlan
     ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
           ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                 Index Cond: ((user_id)::text = 'superman'::text)
                 Filter: (status = 'Y'::bpchar)
   ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
         Sort Key: testimonials.modify_date
         ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
               ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                     Index Cond: ((user_id)::text = 'superman'::text)
                     Filter: (status = 'Y'::bpchar)
               ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)
                     Index Cond: (testimonials.id = visitcount.id)
 Total runtime: 461.682 ms
(15 rows)



It's using the following indexes on the testimonials table.


new_idx_modify_date btree (modify_date) WITH (fillfactor=75)
new_idx_userknown btree (user_id) WITH (fillfactor=70) WHERE user_known = 1



THANKS!



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Scott Marlowe
On Sun, Jan 25, 2009 at 8:41 PM, Phoenix Kiula phoenix.ki...@gmail.com wrote:

 My query is:


 explain analyze SELECT
testimonials.url
,testimonials.alias
,testimonials.aliasEntered
,testimonials.title
,testimonials.modify_date
,testimonials.id
,visitcount.visit_count
,visitcount.unique_count
,visitcount.modify_date
,coalesce(  extract(epoch from now()) -  extract(epoch
 from visitcount.modify_date), 0)
,(select count(id) from testimonials WHERE
 testimonials.user_id = 'superman' and testimonials.user_known = 1 and
 testimonials.status = 'Y' ) AS total
FROM testimonials
LEFT JOIN visitcount ON testimonials.id = visitcount.id
WHERE
 testimonials.user_id = 'superman'
 and testimonials.user_known = 1
 and testimonials.status = 'Y'
ORDER BY testimonials.modify_date desc
OFFSET 0 LIMIT 10


 QUERY PLAN
 
  Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
    InitPlan
      ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
            ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                  Index Cond: ((user_id)::text = 'superman'::text)
                  Filter: (status = 'Y'::bpchar)
    ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
          Sort Key: testimonials.modify_date
          ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
                ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                      Index Cond: ((user_id)::text = 'superman'::text)
                      Filter: (status = 'Y'::bpchar)
                ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)

Have you analyzed these tables?  The estimates and real row counts are
quite different.



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Phoenix Kiula
On Mon, Jan 26, 2009 at 2:26 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Sun, Jan 25, 2009 at 8:41 PM, Phoenix Kiula phoenix.ki...@gmail.com 
 wrote:

 My query is:


 explain analyze SELECT
testimonials.url
,testimonials.alias
,testimonials.aliasEntered
,testimonials.title
,testimonials.modify_date
,testimonials.id
,visitcount.visit_count
,visitcount.unique_count
,visitcount.modify_date
,coalesce(  extract(epoch from now()) -  extract(epoch
 from visitcount.modify_date), 0)
,(select count(id) from testimonials WHERE
 testimonials.user_id = 'superman' and testimonials.user_known = 1 and
 testimonials.status = 'Y' ) AS total
FROM testimonials
LEFT JOIN visitcount ON testimonials.id = visitcount.id
WHERE
 testimonials.user_id = 'superman'
 and testimonials.user_known = 1
 and testimonials.status = 'Y'
ORDER BY testimonials.modify_date desc
OFFSET 0 LIMIT 10


 QUERY PLAN
 
  Limit  (cost=224.68..224.71 rows=10 width=187) (actual time=453.429..453.539 rows=10 loops=1)
    InitPlan
      ->  Aggregate  (cost=63.52..63.53 rows=1 width=8) (actual time=89.268..89.271 rows=1 loops=1)
            ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=8) (actual time=0.039..49.968 rows=10149 loops=1)
                  Index Cond: ((user_id)::text = 'superman'::text)
                  Filter: (status = 'Y'::bpchar)
    ->  Sort  (cost=161.16..161.26 rows=42 width=187) (actual time=453.420..453.464 rows=10 loops=1)
          Sort Key: testimonials.modify_date
          ->  Nested Loop Left Join  (cost=0.00..160.02 rows=42 width=187) (actual time=89.384..395.008 rows=10149 loops=1)
                ->  Index Scan using new_idx_userknown on testimonials  (cost=0.00..63.41 rows=42 width=171) (actual time=0.061..50.990 rows=10149 loops=1)
                      Index Cond: ((user_id)::text = 'superman'::text)
                      Filter: (status = 'Y'::bpchar)
                ->  Index Scan using visitcount_pkey1 on visitcount  (cost=0.00..2.28 rows=1 width=24) (actual time=0.007..0.010 rows=1 loops=10149)

 Have you analyzed these tables?  The estimates and real row counts are
 quite different.




Hi Scott. Yes, there is an autovacuum on both the tables. Should I
additionally do a manual vacuum?



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Scott Marlowe
On Sun, Jan 25, 2009 at 11:58 PM, Phoenix Kiula phoenix.ki...@gmail.com wrote:
 Hi Scott. Yes, there is an autovacuum on both the tables. Should i
 additionally do a manual vacuum too?

Nah, just an analyze.  What version of pgsql is this, btw?



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Phoenix Kiula
On Mon, Jan 26, 2009 at 3:04 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Sun, Jan 25, 2009 at 11:58 PM, Phoenix Kiula phoenix.ki...@gmail.com 
 wrote:
 Hi Scott. Yes, there is an autovacuum on both the tables. Should i
 additionally do a manual vacuum too?

 Nah, just an analyze.  what version of pgsql is this, btw?



Actually both vacuum and analyze happen at regular intervals. Are they
not doing their job?

PG 8.2.9



Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause

2009-01-25 Thread Scott Marlowe
On Mon, Jan 26, 2009 at 12:06 AM, Phoenix Kiula phoenix.ki...@gmail.com wrote:
 On Mon, Jan 26, 2009 at 3:04 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:
 On Sun, Jan 25, 2009 at 11:58 PM, Phoenix Kiula phoenix.ki...@gmail.com 
 wrote:
 Hi Scott. Yes, there is an autovacuum on both the tables. Should i
 additionally do a manual vacuum too?

 Nah, just an analyze.  what version of pgsql is this, btw?



 Actually both vacuum and analyze happen at regular intervals. Are they
 not doing their job?

Hard to say.  You could try increasing your stats target on the fields
where the approximation is way off.
