Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-23 Thread Davenport, Julie
Here is the explain plan of the new query (same as original but with changes to 
the Date subquery to use ::date instead of to_char to truncate the time 
portion), when it is run after doing these 2 sets first:

set work_mem='8MB';

set enable_nestloop = false;

explain plan (8.4):  http://explain.depesz.com/s/tw8

thanks again for the help.
Julie




-Original Message-
From: t...@fuzzy.cz [mailto:t...@fuzzy.cz] 
Sent: Monday, March 21, 2011 11:33 AM
To: Davenport, Julie
Cc: Merlin Moncure; Tomas Vondra; pgsql-general@postgresql.org
Subject: RE: [GENERAL] query taking much longer since Postgres 8.4 upgrade

 Incredible!  Setting enable_nestloop off temporarily for the run of this
 script made it run in less than a minute (had been running in 10 or 11
 minutes).  I think you have found a solution for many of my slow running
 scripts that use these same type of joins.  Thanks again.
 Julie

Nice. Can you post EXPLAIN ANALYZE again, so that we can see why this plan
was evaluated as as more expensive before disabling nested loops?

regards
Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-22 Thread Alban Hertroys
On 22 Mar 2011, at 24:20, F. BROUARD / SQLpro wrote:

 Try this :
 
 1) rewrite your query as is :

Indeed, a join is probably more efficient than a big IN-list. Good point ;)

 
 select course_id AS EXTERNAL_COURSE_KEY,
   user_id AS EXTERNAL_PERSON_KEY,
   'Student' AS ROLE,
   'Y' AS AVAILABLE_IND
 from   course_user_link AS CUL
   INNER JOIN course_control AS CC
 ON CUL.course_id = CC.course_id
 where  CUL.instructor = false
  AND  CC.course_begin_date::date IN ( 
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
  )
  and  CC.course_delivery LIKE 'O%'
  and  CC.course_cross_section IS NULL

  and  NOT EXISTS(SELECT *
  FROM   instr_as_stutemp AS IAS
  WHERE  C.user_id = IAS.user_id)

You could write this last condition as a LEFT OUTER JOIN even, where valid 
records match IAS.user_id IS NULL.

 2) prefix all tables by your SQL schema (public by default)

I don't think that will matter much, it might shave off a tiny bit of planner 
execution time if tables aren't in the first schema in the search_path, but 
otherwise not worth the hassle.

 3) create theses indexes (if not) :

I think the usual convention is to suffix with _idx instead of prefixing with 
x_. It's what automatically created indexes do anyway. That's a matter of 
personal preference though.

 CREATE INDEX X_CUL_INS_CRS_UID
ON course_user_link (instructor,
 course_id,
 user_id);
 CREATE INDEX X_CC_CDV_CCS_CBD_CID
ON course_control (course_delivery,
   course_cross_section,
   course_begin_date,
   course_id);

If queries where cross_section IS NULL (especially in combination with the 
other fields in this index) are very common, while the opposite is quite rare, 
you may want to add a WHERE-clause with that condition to this index.

 CREATE INDEX X_IAS ON IAS_UID
ON instr_as_stutemp (user_id);

 4) beware of using reserved words for the name of a database object like ROLE 
 !

Good advise, but not really needed in the case of aliases I think. There's also 
the possibility to quote those fields as identifiers (which also makes them 
case-sensitive, so beware!) - in this case that would be ROLE.

 Le 16/03/2011 16:49, Davenport, Julie a écrit :
 select
 
 course_id AS EXTERNAL_COURSE_KEY,
 
 user_id AS EXTERNAL_PERSON_KEY,
 
 'Student' AS ROLE,
 
 'Y' AS AVAILABLE_IND
 
 from course_user_link
 
 where instructor = false
 
 and course_id in
 
   (
 
   select course_id
 
   from course_control
 
   where to_char(course_begin_date,'MMDD') IN (
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )
 
   and course_delivery LIKE 'O%'
 
   and course_cross_section IS NULL
 
   )
 
 and user_id not in (select user_id from instr_as_stutemp)
 
 
 
 -- 
 Frédéric BROUARD - expert SGBDR et SQL - MVP SQL Server - 06 11 86 40 66
 Le site sur le langage SQL et les SGBDR  :  http://sqlpro.developpez.com
 Enseignant Arts  Métiers PACA, ISEN Toulon et CESI/EXIA Aix en Provence
 Audit, conseil, expertise, formation, modélisation, tuning, optimisation
 *** http://www.sqlspot.com *
 
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4d8852ad651346607679948!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-21 Thread tv
 Incredible!  Setting enable_nestloop off temporarily for the run of this
 script made it run in less than a minute (had been running in 10 or 11
 minutes).  I think you have found a solution for many of my slow running
 scripts that use these same type of joins.  Thanks again.
 Julie

Nice. Can you post EXPLAIN ANALYZE again, so that we can see why this plan
was evaluated as as more expensive before disabling nested loops?

regards
Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-21 Thread Merlin Moncure
On Mon, Mar 21, 2011 at 11:32 AM,  t...@fuzzy.cz wrote:
 Incredible!  Setting enable_nestloop off temporarily for the run of this
 script made it run in less than a minute (had been running in 10 or 11
 minutes).  I think you have found a solution for many of my slow running
 scripts that use these same type of joins.  Thanks again.
 Julie

 Nice. Can you post EXPLAIN ANALYZE again, so that we can see why this plan
 was evaluated as as more expensive before disabling nested loops?

well the problem is obvious -- the planner is estimating ~ 250 loops,
when it in fact has to do ~ 60k.  That's a two orders of magnitude
miss.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-21 Thread tv
 On Mon, Mar 21, 2011 at 11:32 AM,  t...@fuzzy.cz wrote:
 Incredible!  Setting enable_nestloop off temporarily for the run of
 this
 script made it run in less than a minute (had been running in 10 or 11
 minutes).  I think you have found a solution for many of my slow
 running
 scripts that use these same type of joins.  Thanks again.
 Julie

 Nice. Can you post EXPLAIN ANALYZE again, so that we can see why this
 plan
 was evaluated as as more expensive before disabling nested loops?

 well the problem is obvious -- the planner is estimating ~ 250 loops,
 when it in fact has to do ~ 60k.  That's a two orders of magnitude
 miss.

Yeah, you're right, although I think the estimate is 1 loop vs. 2882 loops
in reality. The 250 vs. 60k is related to the result set.

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-21 Thread Davenport, Julie
Incredible!  Setting enable_nestloop off temporarily for the run of this script 
made it run in less than a minute (had been running in 10 or 11 minutes).  I 
think you have found a solution for many of my slow running scripts that use 
these same type of joins.  Thanks again.
Julie


-Original Message-
From: Merlin Moncure [mailto:mmonc...@gmail.com] 
Sent: Friday, March 18, 2011 1:18 PM
To: Davenport, Julie
Cc: Tomas Vondra; pgsql-general@postgresql.org
Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

On Fri, Mar 18, 2011 at 10:42 AM, Davenport, Julie jdavenp...@ctcd.edu 
wrote:
 This helped, is now down from 14.9 min to 10.9 min to run the entire 
 script.  Thanks.

can you try disabling nestloop and see what happens?   In the session,
before running the query, isssue:
set enable_nestloop = false;

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-21 Thread F. BROUARD / SQLpro

Try this :

1) rewrite your query as is :

select course_id AS EXTERNAL_COURSE_KEY,
   user_id AS EXTERNAL_PERSON_KEY,
   'Student' AS ROLE,
   'Y' AS AVAILABLE_IND
from   course_user_link AS CUL
   INNER JOIN course_control AS CC
 ON CUL.course_id = CC.course_id
where  CUL.instructor = false
  AND  CC.course_begin_date::date IN ( 
'20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307' 
)

  and  CC.course_delivery LIKE 'O%'
  and  CC.course_cross_section IS NULL
  and  NOT EXISTS(SELECT *
  FROM   instr_as_stutemp AS IAS
  WHERE  C.user_id = IAS.user_id)

2) prefix all tables by your SQL schema (public by default)

3) create theses indexes (if not) :

CREATE INDEX X_CUL_INS_CRS_UID
ON course_user_link (instructor,
 course_id,
 user_id);
CREATE INDEX X_CC_CDV_CCS_CBD_CID
ON course_control (course_delivery,
   course_cross_section,
   course_begin_date,
   course_id);
CREATE INDEX X_IAS ON IAS_UID
ON instr_as_stutemp (user_id);

4) beware of using reserved words for the name of a database object like 
ROLE !


A +

Le 16/03/2011 16:49, Davenport, Julie a écrit :

select

course_id AS EXTERNAL_COURSE_KEY,

user_id AS EXTERNAL_PERSON_KEY,

'Student' AS ROLE,

'Y' AS AVAILABLE_IND

from course_user_link

where instructor = false

and course_id in

   (

   select course_id

   from course_control

   where to_char(course_begin_date,'MMDD') IN (
'20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
)

   and course_delivery LIKE 'O%'

   and course_cross_section IS NULL

   )

and user_id not in (select user_id from instr_as_stutemp)




--
Frédéric BROUARD - expert SGBDR et SQL - MVP SQL Server - 06 11 86 40 66
Le site sur le langage SQL et les SGBDR  :  http://sqlpro.developpez.com
Enseignant Arts  Métiers PACA, ISEN Toulon et CESI/EXIA Aix en Provence
Audit, conseil, expertise, formation, modélisation, tuning, optimisation
*** http://www.sqlspot.com *


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-19 Thread Tomas Vondra
Dne 18.3.2011 19:18, Merlin Moncure napsal(a):
 On Fri, Mar 18, 2011 at 10:42 AM, Davenport, Julie jdavenp...@ctcd.edu 
 wrote:
 This helped, is now down from 14.9 min to 10.9 min to run the entire script. 
  Thanks.
 
 can you try disabling nestloop and see what happens?   In the session,
 before running the query, isssue:
 set enable_nestloop = false;

Yes, that'd be interesting. And provide 'exaplain analyze' as before
(using explain.depesz.com), please.

And a bit unrelated recommendation - based on the settings (cost
constants, work_mem etc.) it seems guess you have a default untuned
postgresql.conf. Is that right, Julie? In this case you can
significantly improve the load performance by several settings:

1) increase checkpoint_segments (default is 3, use 12 or something like
that - depends on the if there are warnings about checkpoint segments in
the log)

2) increase wal_buffers (just set it to 16MB and forget it)

The effect depends on the amount of data loaded and other things, but
it's worth a try.

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-18 Thread Davenport, Julie
This helped, is now down from 14.9 min to 10.9 min to run the entire script.  
Thanks.


Dne 17.3.2011 19:29, Davenport, Julie napsal(a):
 I have not yet had time to try Tomas' suggestion of bumping up the 
 work_mem first (trying to figure out how to do that from within a 
 coldfusion script).  Many thanks for all your help guys!

Well, just execute this 'SQL query' just like the other ones

set work_mem='8MB'

and it will increase the amount of memory for that connection.

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-18 Thread Tomas Vondra
Dne 18.3.2011 16:42, Davenport, Julie napsal(a):
 This helped, is now down from 14.9 min to 10.9 min to run the entire script.  
 Thanks.

Still, it's way slower than the 8.0 :-(

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-18 Thread Merlin Moncure
On Fri, Mar 18, 2011 at 10:42 AM, Davenport, Julie jdavenp...@ctcd.edu wrote:
 This helped, is now down from 14.9 min to 10.9 min to run the entire script.  
 Thanks.

can you try disabling nestloop and see what happens?   In the session,
before running the query, isssue:
set enable_nestloop = false;

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-17 Thread Davenport, Julie
FYI, I implemented Pavel's suggestion to use:

 course_begin_date::date IN ( 
'20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )

instead of 

to_char(course_begin_date,'MMDD') IN ( 
'20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )

and it did help significantly.  The overall script (where there are several 
queries like this one) was taking 7.5 mins on Postgres 8.0 and initially took 
20 mins on 8.4; but now after this change (::date) it only takes 14.9 mins.  
Progress!  I have not yet had time to try Tomas' suggestion of bumping up the 
work_mem first (trying to figure out how to do that from within a coldfusion 
script).  Many thanks for all your help guys!

Julie


-Original Message-
From: Pavel Stehule [mailto:pavel.steh...@gmail.com] 
Sent: Thursday, March 17, 2011 12:13 AM
To: Davenport, Julie
Cc: Tomas Vondra; pgsql-general@postgresql.org
Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011/3/16 Davenport, Julie jdavenp...@ctcd.edu:
 Yes, the column course_begin_date is a timestamp, so that would not work in 
 this instance, but I will keep that in mind for future use elsewhere.  I 
 agree, there are ways to rewrite this query, just wondering which is best to 
 take advantage of 8.4.
 Thanks much.



ok, sorry, do column_course_begin::date = ...

:)

Pavel



 -Original Message-
 From: Tomas Vondra [mailto:t...@fuzzy.cz]
 Sent: Wednesday, March 16, 2011 4:40 PM
 To: Pavel Stehule
 Cc: pgsql-general@postgresql.org; Davenport, Julie
 Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

 Dne 16.3.2011 22:31, Pavel Stehule napsal(a):
 2011/3/16 Tomas Vondra t...@fuzzy.cz:
 Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output 
 instead) and used the tool you suggested.

 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ

 Great, that's exactly what I asked for. I'll repost that to the mailing
 list so that the others can check it too.

 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple 
 hours later and some minor changes make sense.

 Hm, obviously both versions got the row estimates wrong, but the 8.4
 difference (200x) is much bigger that the 8.0 (10x). This might be one
 of the reasons why a different plan is chosen.

 the expression

 to_char(course_begin_date, 'MMDD'::text) = '20101025'::text

 should be a problem

 much better is test on equality in date domain like:

 course_begin_date = to_date('20101025', 'MMDD')

 this is faster and probably better estimated

 Which is not going to work if the course_begin_date column is a
 timestamp, because of the time part.

 But yes, there are several ways to improve this query, yet it does not
 explain why the 8.4 is so much slower.

 Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-17 Thread Tomas Vondra
Dne 17.3.2011 19:29, Davenport, Julie napsal(a):
 I have not yet had time to try Tomas' suggestion of bumping up the work_mem 
 first (trying to figure out how to do that from within a coldfusion script).  
 Many thanks for all your help guys!

Well, just execute this 'SQL query' just like the other ones

set work_mem='8MB'

and it will increase the amount of memory for that connection.

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Davenport, Julie
When I run the following query in Postgres 8.0, it runs in 61,509.372 ms

When I run it in Postgres 8.4, it runs in 397,857.472 ms

Here is the query:

select
course_id AS EXTERNAL_COURSE_KEY,
user_id AS EXTERNAL_PERSON_KEY,
'Student' AS ROLE,
'Y' AS AVAILABLE_IND
from course_user_link
where instructor = false
and course_id in
  (
  select course_id
  from course_control
  where to_char(course_begin_date,'MMDD') IN ( 
'20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )
  and course_delivery LIKE 'O%'
  and course_cross_section IS NULL
  )
and user_id not in (select user_id from instr_as_stutemp)

(table instr_as_stutemp has just one column and only 4 rows)

What new feature of Postgres 8.4 would be making the query run so much more 
slowly?  Is there a better way to rewrite the query for 8.4 to make it run 
faster?

Many thanks,
Julie


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Andrew Sullivan
Output of explain (and as likely, explain analyze) for this would be
helpful.

A

On Wed, Mar 16, 2011 at 10:49:24AM -0500, Davenport, Julie wrote:
 When I run the following query in Postgres 8.0, it runs in 61,509.372 ms
 
 When I run it in Postgres 8.4, it runs in 397,857.472 ms
 
 Here is the query:
 
 select
 course_id AS EXTERNAL_COURSE_KEY,
 user_id AS EXTERNAL_PERSON_KEY,
 'Student' AS ROLE,
 'Y' AS AVAILABLE_IND
 from course_user_link
 where instructor = false
 and course_id in
   (
   select course_id
   from course_control
   where to_char(course_begin_date,'MMDD') IN ( 
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
  )
   and course_delivery LIKE 'O%'
   and course_cross_section IS NULL
   )
 and user_id not in (select user_id from instr_as_stutemp)
 
 (table instr_as_stutemp has just one column and only 4 rows)
 
 What new feature of Postgres 8.4 would be making the query run so much more 
 slowly?  Is there a better way to rewrite the query for 8.4 to make it run 
 faster?
 
 Many thanks,
 Julie

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread tv
 When I run the following query in Postgres 8.0, it runs in 61,509.372 ms

 When I run it in Postgres 8.4, it runs in 397,857.472 ms

As Andrew already pointed out, we need to se EXPLAIN ANALYZE output from
both machines to see why this happens. Are you running both queries on the
same data, or is there much more data in 8.4? Have you analyzed the tables
recently (or is autovacuum running)?

BTW I see two possible issues with this query:

1) There's not a suitable index on course_begin_date, i.e. there's no
index at all or the index is not on the expression used in the query.

2) There's not a suitable index on course_delivery (it needs to use the
proper operator class).

regards
Tomas



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Merlin Moncure
On Wed, Mar 16, 2011 at 10:49 AM, Davenport, Julie jdavenp...@ctcd.edu wrote:
 When I run the following query in Postgres 8.0, it runs in 61,509.372 ms



 When I run it in Postgres 8.4, it runs in 397,857.472 ms



 Here is the query:



 select

 course_id AS EXTERNAL_COURSE_KEY,

 user_id AS EXTERNAL_PERSON_KEY,

 'Student' AS ROLE,

 'Y' AS AVAILABLE_IND

 from course_user_link

 where instructor = false

 and course_id in

   (

   select course_id

   from course_control

   where to_char(course_begin_date,'MMDD') IN (
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )

   and course_delivery LIKE 'O%'

   and course_cross_section IS NULL

   )

 and user_id not in (select user_id from instr_as_stutemp)



 (table instr_as_stutemp has just one column and only 4 rows)



 What new feature of Postgres 8.4 would be making the query run so much more
 slowly?  Is there a better way to rewrite the query for 8.4 to make it run
 faster?

another common problem following upgrades are locale issues -- what is
your setting for lc_collate?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Merlin Moncure
On Wed, Mar 16, 2011 at 2:14 PM, Davenport, Julie jdavenp...@ctcd.edu wrote:
 Hello Merlin,
 Thank you very much for your reply.
 I don't see any setting for lc_collate.  I assume it would be in 
 postgresql.conf file if it were there?  These are the only lc_... settings I 
 see in postgresql.conf:

 lc_messages = 'en_US.UTF-8'     # locale for system error message
 lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
 lc_numeric = 'en_US.UTF-8'     # locale for number formatting
 lc_time = 'en_US.UTF-8'        # locale for time formatting

 Am I looking in the wrong place?  Thanks much,
 Julie


 Julie A. Davenport
 julie.davenp...@ctcd.edu




 -Original Message-
 From: Merlin Moncure [mailto:mmonc...@gmail.com]
 Sent: Wednesday, March 16, 2011 1:37 PM
 To: Davenport, Julie
 Cc: pgsql-general@postgresql.org
 Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

 On Wed, Mar 16, 2011 at 10:49 AM, Davenport, Julie jdavenp...@ctcd.edu 
 wrote:
 When I run the following query in Postgres 8.0, it runs in 61,509.372 ms



 When I run it in Postgres 8.4, it runs in 397,857.472 ms



 Here is the query:



 select

 course_id AS EXTERNAL_COURSE_KEY,

 user_id AS EXTERNAL_PERSON_KEY,

 'Student' AS ROLE,

 'Y' AS AVAILABLE_IND

 from course_user_link

 where instructor = false

 and course_id in

   (

   select course_id

   from course_control

   where to_char(course_begin_date,'MMDD') IN (
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )

   and course_delivery LIKE 'O%'

   and course_cross_section IS NULL

   )

 and user_id not in (select user_id from instr_as_stutemp)



 (table instr_as_stutemp has just one column and only 4 rows)



 What new feature of Postgres 8.4 would be making the query run so much more
 slowly?  Is there a better way to rewrite the query for 8.4 to make it run
 faster?

 another common problem following upgrades are locale issues -- what is
 your setting for lc_collate?


from psql, do:
show lc_collate;
more than likely, your lc_collate is set to UTF8, that means that
where a like 'foo%' will not use index, which is starting to sound
like your problem.

unfortunately, database collation is only settable when database is
created (or more typically with initdb).

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Tomas Vondra
Dne 16.3.2011 20:32, Merlin Moncure napsal(a):
 On Wed, Mar 16, 2011 at 2:14 PM, Davenport, Julie jdavenp...@ctcd.edu wrote:
 Hello Merlin,
 Thank you very much for your reply.
 I don't see any setting for lc_collate.  I assume it would be in 
 postgresql.conf file if it were there?  These are the only lc_... settings I 
 see in postgresql.conf:

 lc_messages = 'en_US.UTF-8' # locale for system error message
 lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
 lc_numeric = 'en_US.UTF-8' # locale for number formatting
 lc_time = 'en_US.UTF-8'# locale for time formatting

 Am I looking in the wrong place?  Thanks much,
 Julie


 Julie A. Davenport
 julie.davenp...@ctcd.edu




 -Original Message-
 From: Merlin Moncure [mailto:mmonc...@gmail.com]
 Sent: Wednesday, March 16, 2011 1:37 PM
 To: Davenport, Julie
 Cc: pgsql-general@postgresql.org
 Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

 On Wed, Mar 16, 2011 at 10:49 AM, Davenport, Julie jdavenp...@ctcd.edu 
 wrote:
 When I run the following query in Postgres 8.0, it runs in 61,509.372 ms



 When I run it in Postgres 8.4, it runs in 397,857.472 ms



 Here is the query:



 select

 course_id AS EXTERNAL_COURSE_KEY,

 user_id AS EXTERNAL_PERSON_KEY,

 'Student' AS ROLE,

 'Y' AS AVAILABLE_IND

 from course_user_link

 where instructor = false

 and course_id in

   (

   select course_id

   from course_control

   where to_char(course_begin_date,'MMDD') IN (
 '20100412','20100510','20100607','20100712','20100830','20100927','20101025','20101122','20101213','20110110','20110207','20110307'
 )

   and course_delivery LIKE 'O%'

   and course_cross_section IS NULL

   )

 and user_id not in (select user_id from instr_as_stutemp)



 (table instr_as_stutemp has just one column and only 4 rows)



 What new feature of Postgres 8.4 would be making the query run so much more
 slowly?  Is there a better way to rewrite the query for 8.4 to make it run
 faster?

 another common problem following upgrades are locale issues -- what is
 your setting for lc_collate?
 
 
 from psql, do:
 show lc_collate;
 more than likely, your lc_collate is set to UTF8, that means that
 where a like 'foo%' will not use index, which is starting to sound
 like your problem.

That probably is not the root cause here, according to a description
I've received off the list. There are no indexes at all (which is an
issue on it's own), but it means this kind of issue is not possible.

Or maybe there's a different lc_collate with much more complex rules?
That might result in very CPU-expensive operations, and as there's a lot
of sequential scans etc. (i.e. a lot of rows to process).

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Tomas Vondra
Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output instead) 
 and used the tool you suggested.  
 
 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ

Great, that's exactly what I asked for. I'll repost that to the mailing
list so that the others can check it too.

 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple hours 
 later and some minor changes make sense.

Hm, obviously both versions got the row estimates wrong, but the 8.4
difference (200x) is much bigger that the 8.0 (10x). This might be one
of the reasons why a different plan is chosen.

Anyway both versions underestimate the course_control subquery, as they
believe there will be 1 row only, but in reality there's 2882 of them :-(


 After your first email I did a vacuum full analyze on the 8.4 side on each of 
 the tables in the schema that the views cover, then ran the query again, and 
 it took even longer - up from 397,857 ms to 412,862 ms.  Another query that 
 was taking 597248 ms before the vacuum/analyze took 617526 ms after. I don't 
 understand why, but this is generally the experience we've had with 
 vacuum/analyze on these particular tables. We do large numbers of deletes and 
 inserts to them every day, so I would think they would benefit from it.

OK, so the tables were in a quite good shape - not bloated etc. The
slight increase is negligible I guess, the vacuum probably removed the
data from shared buffers or something like that.

 I did do a vacuum full analyze on instr_as_stutemp before the explain 
 analyze, as you suggested.

OK, now the row estimate is correct

Seq Scan on instr_as_stutemp (cost=0.00..1.04 rows=4 width=9) (actual
time=0.052..0.098 rows=4 loops=1)

 I will consider the indexes and do some benchmark testing (I have considered 
 the 'drop-load-reindex' trick in the past). I'm sure increasing 
 maintenance_work_mem will help.

OK. But the question why the plan changed this way still remains
unanswered (or maybe it does and I don't see it).

One thing I've noticed is this difference in estimates:

8.0:

Subquery Scan course_control (cost=9462700.13..9492043.53 rows=1
width=32) (actual time=43368.204..45795.239 rows=2882 loops=1)
* Filter: (((to_char(course_begin_date, 'MMDD'::text) =
'20100412'::text) OR (to_char(course_begin_date, 'MMDD'::text) =
...
'20110307'::text)) AND ((course_delivery)::text ~~ 'O%'::text) AND
(course_cross_section IS NULL))

8.4:

Subquery Scan course_control (cost=18710.12..548966.51 rows=1 width=32)
(actual time=1632.403..4438.949 rows=2882 loops=1)
* Filter: ((course_control.course_cross_section IS NULL) AND
((course_control.course_delivery)::text ~~ 'O%'::text) AND
(to_char(course_control.course_begin_date, 'MMDD'::text) = ANY
('{20100412,20100510,...,20110110,20110207,20110307}'::text[])))

I.e. both verions use seqscan, both estimate the same number of rows
(incorrectly), yet the estimated cost is very different (9492043 vs.
548966).

Maybe the cost estimation really changed between 8.0 and 8.4, but just
for sure - what are the cost values? I mean what is set for those config
values:

cpu_index_tuple_cost
cpu_operator_cost
cpu_tuple_cost
random_page_cost
seq_page_cost
work_mem

(use 'show' to see the actual value).

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Pavel Stehule
2011/3/16 Tomas Vondra t...@fuzzy.cz:
 Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output 
 instead) and used the tool you suggested.

 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ

 Great, that's exactly what I asked for. I'll repost that to the mailing
 list so that the others can check it too.

 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple hours 
 later and some minor changes make sense.

 Hm, obviously both versions got the row estimates wrong, but the 8.4
 difference (200x) is much bigger that the 8.0 (10x). This might be one
 of the reasons why a different plan is chosen.

the expression

to_char(course_begin_date, 'MMDD'::text) = '20101025'::text

should be a problem

much better is test on equality in date domain like:

course_begin_date = to_date('20101025', 'MMDD')

this is faster and probably better estimated

Regards

Pavel Stehule

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Tomas Vondra
Dne 16.3.2011 22:31, Pavel Stehule napsal(a):
 2011/3/16 Tomas Vondra t...@fuzzy.cz:
 Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output 
 instead) and used the tool you suggested.

 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ

 Great, that's exactly what I asked for. I'll repost that to the mailing
 list so that the others can check it too.

 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple 
 hours later and some minor changes make sense.

 Hm, obviously both versions got the row estimates wrong, but the 8.4
 difference (200x) is much bigger that the 8.0 (10x). This might be one
 of the reasons why a different plan is chosen.
 
 the expression
 
 to_char(course_begin_date, 'MMDD'::text) = '20101025'::text
 
 should be a problem
 
 much better is test on equality in date domain like:
 
 course_begin_date = to_date('20101025', 'MMDD')
 
 this is faster and probably better estimated

Which is not going to work if the course_begin_date column is a
timestamp, because of the time part.

But yes, there are several ways to improve this query, yet it does not
explain why the 8.4 is so much slower.

Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Tomas Vondra
OK, so the cost constants are equal in both versions (the only
difference is due to change of the default value).

Just out of curiosity, have you tried to throw a bit more work_mem at
the query? Try something like 8MB or 16MB so - just do this

db=$ set work_mem=8192

and then run the query (the change is valid in that session only, the
other sessions will still use 1MB).

Most of the sorts was performed on-disk insted of in memory, and it
might result in better plan.

regards
Tomas

Dne 16.3.2011 22:40, Davenport, Julie napsal(a):
 Tomas,
 Here are the settings on the 8.0 side:
 
 srn_mst=# show cpu_index_tuple_cost;
  cpu_index_tuple_cost
 --
  0.001
 (1 row)
 
 srn_mst=# show cpu_operator_cost;
  cpu_operator_cost
 ---
  0.0025
 (1 row)
 
 srn_mst=# show cpu_tuple_cost;
  cpu_tuple_cost
 
  0.01
 (1 row)
 
 srn_mst=# show random_page_cost;
  random_page_cost
 --
  4
 (1 row)
 
 srn_mst=# show seq_page_cost;
 ERROR:  unrecognized configuration parameter seq_page_cost
 
 srn_mst=# show work_mem;
  work_mem
 --
  1024
 (1 row)
 
 Here are the settings on the 8.4 side:
 
 srn_mst=# show cpu_index_tuple_cost;
  cpu_index_tuple_cost
 --
  0.005
 (1 row)
 
 srn_mst=# show cpu_operator_cost;
  cpu_operator_cost
 ---
  0.0025
 (1 row)
 
 srn_mst=# show cpu_tuple_cost;
  cpu_tuple_cost
 
  0.01
 (1 row)
 
 srn_mst=# show random_page_cost;
  random_page_cost
 --
  4
 (1 row)
 
 srn_mst=# show seq_page_cost;
  seq_page_cost
 ---
  1
 (1 row)
 
 srn_mst=# show work_mem;
  work_mem
 --
  1MB
 (1 row)
 
 Thanks,
 Julie
 
 
 
 -Original Message-
 From: Tomas Vondra [mailto:t...@fuzzy.cz] 
 Sent: Wednesday, March 16, 2011 4:23 PM
 To: pgsql-general@postgresql.org
 Cc: Davenport, Julie
 Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade
 
 Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output 
 instead) and used the tool you suggested.  

 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ
 
 Great, that's exactly what I asked for. I'll repost that to the mailing
 list so that the others can check it too.
 
 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple hours 
 later and some minor changes make sense.
 
 Hm, obviously both versions got the row estimates wrong, but the 8.4
 difference (200x) is much bigger that the 8.0 (10x). This might be one
 of the reasons why a different plan is chosen.
 
 Anyway both versions underestimate the course_control subquery, as they
 believe there will be 1 row only, but in reality there's 2882 of them :-(
 
 
 After your first email I did a vacuum full analyze on the 8.4 side on each 
 of the tables in the schema that the views cover, then ran the query again, 
 and it took even longer - up from 397,857 ms to 412,862 ms.  Another query 
 that was taking 597248 ms before the vacuum/analyze took 617526 ms after. I 
 don't understand why, but this is generally the experience we've had with 
 vacuum/analyze on these particular tables. We do large numbers of deletes 
 and inserts to them every day, so I would think they would benefit from it.
 
 OK, so the tables were in a quite good shape - not bloated etc. The
 slight increase is negligible I guess, the vacuum probably removed the
 data from shared buffers or something like that.
 
 I did do a vacuum full analyze on instr_as_stutemp before the explain 
 analyze, as you suggested.
 
 OK, now the row estimate is correct
 
 Seq Scan on instr_as_stutemp (cost=0.00..1.04 rows=4 width=9) (actual
 time=0.052..0.098 rows=4 loops=1)
 
 I will consider the indexes and do some benchmark testing (I have considered 
 the 'drop-load-reindex' trick in the past). I'm sure increasing 
 maintenance_work_mem will help.
 
 OK. But the question why the plan changed this way still remains
 unanswered (or maybe it does and I don't see it).
 
 One thing I've noticed is this difference in estimates:
 
 8.0:
 
 Subquery Scan course_control (cost=9462700.13..9492043.53 rows=1
 width=32) (actual time=43368.204..45795.239 rows=2882 loops=1)
 * Filter: (((to_char(course_begin_date, 'MMDD'::text) =
 '20100412'::text) OR (to_char(course_begin_date, 'MMDD'::text) =
 ...
 '20110307'::text)) AND ((course_delivery)::text ~~ 'O%'::text) AND
 (course_cross_section IS NULL))
 
 8.4:
 
 Subquery Scan course_control (cost=18710.12..548966.51 rows=1 width=32)
 (actual time=1632.403..4438.949 rows=2882 loops=1)
 * Filter: ((course_control.course_cross_section IS NULL) AND
 ((course_control.course_delivery)::text ~~ 'O%'::text) AND
 (to_char

Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

2011-03-16 Thread Pavel Stehule
2011/3/16 Davenport, Julie jdavenp...@ctcd.edu:
 Yes, the column course_begin_date is a timestamp, so that would not work in 
 this instance, but I will keep that in mind for future use elsewhere.  I 
 agree, there are ways to rewrite this query, just wondering which is best to 
 take advantage of 8.4.
 Thanks much.



ok, sorry, do column_course_begin::date = ...

:)

Pavel



 -Original Message-
 From: Tomas Vondra [mailto:t...@fuzzy.cz]
 Sent: Wednesday, March 16, 2011 4:40 PM
 To: Pavel Stehule
 Cc: pgsql-general@postgresql.org; Davenport, Julie
 Subject: Re: [GENERAL] query taking much longer since Postgres 8.4 upgrade

 Dne 16.3.2011 22:31, Pavel Stehule napsal(a):
 2011/3/16 Tomas Vondra t...@fuzzy.cz:
 Dne 16.3.2011 21:38, Davenport, Julie napsal(a):
 OK, I did the explain analyze on both sides (using a file for output 
 instead) and used the tool you suggested.

 8.0 - http://explain.depesz.com/s/Wam
 8.4 - http://explain.depesz.com/s/asJ

 Great, that's exactly what I asked for. I'll repost that to the mailing
 list so that the others can check it too.

 When I run the queries I get 59,881 rows on the 8.0 side and 59,880 on the 
 8.4 side, which is what I expect because 8.4 side was updated a couple 
 hours later and some minor changes make sense.

 Hm, obviously both versions got the row estimates wrong, but the 8.4
 difference (200x) is much bigger that the 8.0 (10x). This might be one
 of the reasons why a different plan is chosen.

 the expression

 to_char(course_begin_date, 'MMDD'::text) = '20101025'::text

 should be a problem

 much better is test on equality in date domain like:

 course_begin_date = to_date('20101025', 'MMDD')

 this is faster and probably better estimated

 Which is not going to work if the course_begin_date column is a
 timestamp, because of the time part.

 But yes, there are several ways to improve this query, yet it does not
 explain why the 8.4 is so much slower.

 Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general