Re: [PERFORM] Please Help: PostgreSQL performance Optimization

2006-01-12 Thread Jaime Casanova
On 1/12/06, Jamal Ghaffour <[EMAIL PROTECTED]> wrote:
> Jamal Ghaffour wrote:
> Hi,
>
> I'm working on a project, whose implementation deals with PostgreSQL. A
> brief description of our application is given below.
>
> I'm running version 8.0 on a dedicated server with 1Gb of RAM.
> My database isn't complex; it contains just 2 simple tables.
>
> CREATE TABLE cookies (
>     domain varchar(50) NOT NULL,
>     path varchar(50) NOT NULL,
>     name varchar(50) NOT NULL,
>     principalid varchar(50) NOT NULL,
>     host text NOT NULL,
>     value text NOT NULL,
>     secure bool NOT NULL,
>     timestamp timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP+TIME '04:00:00',
>     PRIMARY KEY (domain,path,name,principalid)
> )
>
> CREATE TABLE liberty (
>     principalid varchar(50) NOT NULL,
>     requestid varchar(50) NOT NULL,
>     spassertionurl text NOT NULL,
>     libertyversion varchar(50) NOT NULL,
>     relaystate varchar(50) NOT NULL,
>     PRIMARY KEY (principalid)
> )
>
> I'm developing an application that uses libpqxx to execute
> psql queries on the database and has to execute 500 requests at the same time.
>
> UPDATE cookies SET host='ping.icap-elios.com',
> value='54E5B5491F27C0177083795F2E09162D', secure=FALSE,
> timestamp=CURRENT_TIMESTAMP+INTERVAL '14400 SECOND' WHERE
> domain='ping.icap-elios.com' AND path='/tfs' AND principalid='192.168.8.219' AND
> name='jsessionid'
>
> SELECT path, upper(name) AS name, value FROM cookies WHERE timestamp
> principalid='192.168.8.219' AND secure=FALSE AND
> (domain='ping.icap-elios.com' OR domain='.icap-elios.com')
>
> I have to note that the performance is extremely variable and irregular.
> I can also see that the server process uses almost 100% of a CPU.
>
> I'm using the default configuration file, and I am asking if I have to
> change some parameters to get good performance.
>
> Any help would be greatly appreciated.
>
> Thanks,
>
> Hi,
>
> There are some results that can give you a concrete idea about my problem:
> when I'm launching my test that executes the SELECT and UPDATE queries
> described above in a loop, I'm obtaining these results:
>
> UPDATE Time execution :0s: 5225 us
> SELECT Time execution :0s: 6908 us
>
> 5 minutes later:
>
> UPDATE Time execution :0s: 6125 us
> SELECT Time execution :0s: 10928 us
>
> 5 minutes later:
>
> UPDATE Time execution :0s: 5825 us
> SELECT Time execution :0s: 14978 us
>
> As you can see, the execution time of the SELECT request grows over time,
> while the UPDATE execution time does not.
> I note that to stop the explosion of the SELECT execution time, I am
> frequently running VACUUM on the cookies table.
> Setting the autovacuum parameter in the configuration file to on wasn't
> able to replace the use of the VACUUM command, and I don't know if this
> behaviour is normal?
>
> Thanks,
> Jamal
>

please execute

EXPLAIN ANALYZE <your query>

and show the results; it's the only way to know what's happening.
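
For example, something like this (a sketch built from the statements in your
mail; add back the timestamp condition on the SELECT as it appears in your
application, and note that EXPLAIN ANALYZE really executes the statement, so
the UPDATE will actually run):

EXPLAIN ANALYZE
SELECT path, upper(name) AS name, value
FROM cookies
WHERE principalid = '192.168.8.219'
  AND secure = FALSE
  AND (domain = 'ping.icap-elios.com' OR domain = '.icap-elios.com');

EXPLAIN ANALYZE
UPDATE cookies
SET host = 'ping.icap-elios.com',
    value = '54E5B5491F27C0177083795F2E09162D',
    secure = FALSE,
    timestamp = CURRENT_TIMESTAMP + INTERVAL '14400 SECOND'
WHERE domain = 'ping.icap-elios.com' AND path = '/tfs'
  AND principalid = '192.168.8.219' AND name = 'jsessionid';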

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)



Re: [PERFORM] Extremely irregular query performance

2006-01-12 Thread Jean-Philippe Cote


Can I actually know whether a given plan is executed with GEQO on?
In other words, if I launch 'explain <query>', I'll get a given plan, but if I
re-launch the <query> (without the 'explain' keyword), could I get a different
plan given that GEQO induces some randomness?

>Is it the plan that is different in the fastest case with GEQO or is it
>the time needed to plan that is causing the GEQO to beat the exhaustive
>search?
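
One way to check from psql is a sketch like this (assuming the stock planner
settings; geqo and geqo_threshold are the standard GUCs):

-- GEQO is only considered once the number of FROM items in the query
-- reaches geqo_threshold, so first see whether the query is even a candidate:
SHOW geqo;
SHOW geqo_threshold;

-- For a reproducible plan, disable GEQO for the session and EXPLAIN again.
-- With geqo = on and enough FROM items, the chosen plan can indeed differ
-- from one planning run to the next, EXPLAIN or not:
SET geqo TO off;
EXPLAIN <your query>;
RESET geqo;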





Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Ottó Havasvölgyi
Hi,

If the join is on a primary key or on NOT NULL unique column(s), then an
inner join is also OK. But of course the left join is the simpler case.
An example:

create table person (id serial primary key, name varchar not null);
create table pet (id serial primary key, name varchar not null,
person_id int not null references person(id));
create view v_pet_person as select pet.id as pet_id, pet.name as
pet_name, pet.person_id as person_id, person.name as person_name from pet
join person on (pet.person_id=person.id);

At this point we know that optimization may be possible because of the
primary key on person. The optimization depends on the primary key
constraint: a kind of internal dependency.
We can find out which "from-element" is a given field's source, as long
as they are simple column references. This can be stored.
Then query the view:

select pet_name, person_id from v_pet_person where person_id=2;

In this case we don't need the join.
These queries are usually generated dynamically; the select list
and the WHERE condition are the dynamic parts.
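
For illustration only (this is the rewrite such a planner could make, not
something PostgreSQL currently does): because pet.person_id is NOT NULL and
references person(id), and person.id is the primary key, the join matches
exactly one person row per pet, so the query above is equivalent to:

select pet.name as pet_name, pet.person_id as person_id
from pet
where pet.person_id = 2;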

Best Regards,
Otto


2006/1/12, Scott Marlowe <[EMAIL PROTECTED]>:
> On Thu, 2006-01-12 at 11:00, Ottó Havasvölgyi wrote:
> > Hi,
> >
> > I think it would be sufficient only for views. In other cases the
> > programmer can optimize himself. But a view can be a join of other
> > tables, and it is not sure that all of them are always needed. It all
> > depends on what I select from the view.
>
> The idea that you could throw away joins only works for outer joins.
> I.e. if you did:
>
> select a.x, a.y, a.z from a left join b (on a.id=b.aid)
>
> then you could throw away the join to b.  But if it was a regular inner
> join then you couldn't know whether or not you needed to join to b
> without actually joining to b...
>



Re: [PERFORM] Please Help: PostgreSQL performance Optimization

2006-01-12 Thread Andrew Lazarus

Jamal Ghaffour wrote:


CREATE TABLE cookies (
   domain varchar(50) NOT NULL,
   path varchar(50) NOT NULL,
   name varchar(50) NOT NULL,
   principalid varchar(50) NOT NULL,
   host text NOT NULL,
   value text NOT NULL,
   secure bool NOT NULL,
   timestamp timestamp with time zone NOT NULL DEFAULT 
CURRENT_TIMESTAMP+TIME '04:00:00',

   PRIMARY KEY  (domain,path,name,principalid)
)

[snip]
SELECT path, upper(name) AS name, value FROM cookies WHERE timestamp
principalid='192.168.8.219' AND secure=FALSE AND
(domain='ping.icap-elios.com' OR domain='.icap-elios.com')


I think the problem here is that the column order in the index doesn't 
match the columns used in the WHERE clause criteria. Try adding an index 
on (domain,principalid) or (domain,principalid,timestamp). If these are 
your only queries, you can get the same effect by re-ordering the 
columns in the table so that this is the column order used by the 
primary key and its implicit index.
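
For example (a sketch, assuming the cookies table definition quoted above;
the index name is arbitrary):

CREATE INDEX cookies_domain_principalid_ts_idx
    ON cookies (domain, principalid, "timestamp");
-- or just (domain, principalid) if the timestamp column turns out not to help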


You should check up on EXPLAIN and EXPLAIN ANALYZE to help you debug 
slow queries.




Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Scott Marlowe
On Thu, 2006-01-12 at 11:00, Ottó Havasvölgyi wrote:
> Hi,
> 
> I think it would be sufficient only for views. In other cases the
> programmer can optimize himself. But a view can be a join of other
> tables, and it is not sure that all of them are always needed. It all
> depends on what I select from the view.

The idea that you could throw away joins only works for outer joins. 
I.e. if you did:

select a.x, a.y, a.z from a left join b (on a.id=b.aid) 

then you could throw away the join to b.  But if it was a regular inner
join then you couldn't know whether or not you needed to join to b
without actually joining to b...
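
A tiny illustration of the difference (hypothetical tables, and assuming
b.aid is unique so the left join cannot add duplicate rows):

-- The inner join changes which rows of a survive (and how many times they
-- appear), so it cannot be discarded without looking at b:
select a.x, a.y, a.z from a join b on (a.id = b.aid);

-- The left join with b.aid unique returns exactly the rows of a, once each,
-- so when no column of b is referenced the join could be thrown away:
select a.x, a.y, a.z from a left join b on (a.id = b.aid);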



Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Ottó Havasvölgyi
Hi,

I think it would be sufficient only for views. In other cases the
programmer can optimize himself. But a view can be a join of other
tables, and it is not sure that all of them are always needed. It all
depends on what I select from the view.
This information could even be calculated at view creation time. Of
course this requires that views are handled in a somewhat more special way,
not just as a view definition that is substituted into the original query
(as far as I know the current implementation is similar to this; sorry
if not).
What do you think about this idea? Of course it is not trivial to
implement, but the result would be really great.

Postgres could determine at creation time whether this kind of
optimization is possible at all or not. It can happen, though, that not
all information is available (I mean a unique index or foreign key) at
that time. So this optimization info could be refreshed later by a
command, "ALTER VIEW <viewname> ANALYZE", or simply "ANALYZE <viewname>".
Postgres could also establish at creation time which source table(s) are
required for a given column in the selection list. This is probably not
sufficient, but I haven't thought it through thoroughly yet. And I am not
that familiar with the current optimizer internals.
And one should be able to turn off this optimization, so that view
creation takes no longer than now. If the optimizer finds no
optimization info in the catalog, it behaves as it does now.
I hope you see the worth of this.
This is all analogous to statistics collection.

Thanks for reading,
Otto


2006/1/12, Tom Lane <[EMAIL PROTECTED]>:
> Ottó Havasvölgyi <[EMAIL PROTECTED]> writes:
> > As far as I know SQL Server has some similar feature. It does not join
> > if not necessary, more exactly: if the result would be the same if it
> > joined the table.
>
> I find it really really hard to believe that such cases arise often
> enough to justify having the planner spend cycles checking for them.
>
>regards, tom lane
>



Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Szűcs Gábor

Dear Tom,

Not sure about Otto's exact problem, but he did mention views, and I'd feel 
more comfortable if you told me that view-based queries are re-planned based 
on actual conditions etc. Are they?


Also, if you find it unlikely (or very rare), then it might be a configurable
parameter. If someone finds it drastically improving (some of) their
queries, it'd be possible to enable this feature at the expense of extra
planner cycles (on all queries).


What I'd be concerned about is whether the developers' time spent on this
feature would be worth it. :)


--
G.


On 2006.01.12. 16:53, Tom Lane wrote:

Ottó Havasvölgyi <[EMAIL PROTECTED]> writes:

As far as I know SQL Server has some similar feature. It does not join
if not necessary, more exactly: if the result would be the same if it
joined the table.


I find it really really hard to believe that such cases arise often
enough to justify having the planner spend cycles checking for them.

regards, tom lane





Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Tom Lane
Ottó Havasvölgyi <[EMAIL PROTECTED]> writes:
> As far as I know SQL Server has some similar feature. It does not join
> if not necessary, more exactly: if the result would be the same if it
> joined the table.

I find it really really hard to believe that such cases arise often
enough to justify having the planner spend cycles checking for them.

regards, tom lane



Re: [PERFORM] Please Help: PostgreSQL performance Optimization

2006-01-12 Thread Jamal Ghaffour




Jamal Ghaffour wrote:

Hi,

I'm working on a project, whose implementation deals with PostgreSQL. A brief description of our application is given below.

I'm running version 8.0 on a dedicated server with 1Gb of RAM.
My database isn't complex; it contains just 2 simple tables.

CREATE TABLE cookies (
domain varchar(50) NOT NULL,
path varchar(50) NOT NULL,
name varchar(50) NOT NULL,
principalid varchar(50) NOT NULL,
host text NOT NULL,
value text NOT NULL,
secure bool NOT NULL,
timestamp timestamp with time zone NOT NULL DEFAULT 
CURRENT_TIMESTAMP+TIME '04:00:00',
PRIMARY KEY  (domain,path,name,principalid)
)

CREATE TABLE liberty (
principalid varchar(50) NOT NULL,
requestid varchar(50) NOT NULL,
spassertionurl text NOT NULL,
libertyversion  varchar(50) NOT NULL,
relaystate  varchar(50) NOT NULL,
PRIMARY KEY  (principalid)
)

I'm developing an application that uses libpqxx to execute
psql queries on the database and has to execute 500 requests at the same time.


UPDATE cookies SET host='ping.icap-elios.com', value= '54E5B5491F27C0177083795F2E09162D', secure=FALSE, 
timestamp=CURRENT_TIMESTAMP+INTERVAL '14400 SECOND' WHERE 
domain='ping.icap-elios.com' AND path='/tfs' AND principalid='192.168.8.219' AND 
name='jsessionid'

SELECT path, upper(name) AS name, value FROM cookies WHERE timestamp
principalid='192.168.8.219' AND secure=FALSE AND
(domain='ping.icap-elios.com' OR domain='.icap-elios.com')

I have to note that the performance is extremely variable and irregular.
I can also see that the server process uses almost 100% of a CPU.

I'm using the default configuration file, and I am asking if I have to change some parameters to get good performance.

Any help would be greatly appreciated.

Thanks,
  

Hi,

There are some results that can give you a concrete idea about my problem:
when I'm launching my test that executes the SELECT and UPDATE queries
described above in a loop, I'm obtaining these results:

UPDATE Time execution :0s: 5225 us
SELECT Time execution  :0s: 6908 us

5 minutes Later: 

UPDATE Time execution :0s: 6125 us
SELECT Time execution  :0s: 10928 us

5 minutes Later: 

UPDATE Time execution :0s: 5825 us
SELECT Time execution  :0s: 14978 us

As you can see, the execution time of the SELECT request grows over
time, while the UPDATE execution time does not.
I note that to stop the explosion of the SELECT execution time, I am
frequently running VACUUM on the cookies table.
Setting the autovacuum parameter in the configuration file to on wasn't
able to replace the use of the VACUUM command, and I don't know if
this behaviour is normal?
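
For reference, the manual maintenance described above boils down to something
like this (a sketch, assuming the cookies table from this thread and
PostgreSQL 8.0, where regular vacuuming of a heavily updated table is the
usual remedy):

-- Reclaim the dead row versions left behind by the constant UPDATEs and
-- refresh the planner statistics; run this regularly (e.g. from cron):
VACUUM ANALYZE cookies;

-- VERBOSE reports how many dead tuples were removed, which shows how often
-- the table really needs vacuuming (and whether the free space map settings
-- are large enough):
VACUUM VERBOSE ANALYZE cookies;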

Thanks,
Jamal






[PERFORM] query slower on 8.1 than 7.3

2006-01-12 Thread Robert Treat
Porting an app from 7.3 to 8.1, I have hit a query that is slower. Everything
is analyzed / vacuumed appropriately. I've managed to pare the query
down into something manageable that still gives me a problem; it looks
like this:

SELECT 
* 
FROM
(
SELECT
software_download.*
FROM
(
SELECT
host_id, max(mtime) as mtime
FROM   
software_download
JOIN software_binary USING (software_binary_id)
WHERE
binary_type_id IN (3,5,6) AND bds_status_id not in (6,17,18)
GROUP BY
host_id, software_binary_id
 ) latest_download
 JOIN software_download using (host_id,mtime)
) ld
LEFT JOIN
(
SELECT
entityid, rmsbinaryid, rmsbinaryid as software_binary_id, timestamp 
as downloaded, ia.host_id
FROM
(
SELECT
entityid, rmsbinaryid,max(msgid) as msgid
FROM
msg306u
WHERE
downloadstatus=1
GROUP BY entityid,rmsbinaryid
) a1
JOIN myapp_app ia on (entityid=myapp_app_id)
JOIN (
SELECT 
*
FROM 
msg306u
WHERE
downloadstatus != 0
  ) a2 USING(entityid,rmsbinaryid,msgid)
 ) aa USING (host_id,software_binary_id)



The problem seems to stem from 8.1's thinking that using a nested loop
left join is a good idea. The 7.3 explain plan looks like this:

 Hash Join  (cost=10703.38..11791.64 rows=1 width=150) (actual 
time=2550.23..2713.26 rows=475 loops=1)
   Hash Cond: ("outer".host_id = "inner".host_id)
   Join Filter: ("outer".software_binary_id = "inner".rmsbinaryid)
   ->  Merge Join  (cost=1071.80..2160.07 rows=1 width=110) (actual 
time=93.16..252.12 rows=475 loops=1)
 Merge Cond: ("outer".host_id = "inner".host_id)
 Join Filter: ("inner".mtime = "outer".mtime)
 ->  Index Scan using software_download_host_id on software_download  
(cost=0.00..973.16 rows=18513 width=98) (actual time=0.05..119.89 rows=15587 
loops=1)
 ->  Sort  (cost=1071.80..1072.81 rows=403 width=20) (actual 
time=90.82..94.97 rows=7328 loops=1)
   Sort Key: latest_download.host_id
   ->  Subquery Scan latest_download  (cost=1014.00..1054.34 
rows=403 width=20) (actual time=85.60..90.12 rows=475 loops=1)
 ->  Aggregate  (cost=1014.00..1054.34 rows=403 width=20) 
(actual time=85.59..89.27 rows=475 loops=1)
   ->  Group  (cost=1014.00..1044.26 rows=4034 
width=20) (actual time=85.55..87.61 rows=626 loops=1)
 ->  Sort  (cost=1014.00..1024.09 rows=4034 
width=20) (actual time=85.54..85.86 rows=626 loops=1)
   Sort Key: software_download.host_id, 
software_download.software_binary_id
   ->  Hash Join  (cost=21.64..772.38 
rows=4034 width=20) (actual time=1.06..84.14 rows=626 loops=1)
 Hash Cond: 
("outer".software_binary_id = "inner".software_binary_id)
 ->  Seq Scan on software_download  
(cost=0.00..565.98 rows=17911 width=16) (actual time=0.06..67.26 rows=15364 
loops=1)
   Filter: ((bds_status_id <> 
6) AND (bds_status_id <> 17) AND (bds_status_id <> 18))
 ->  Hash  (cost=21.59..21.59 
rows=20 width=4) (actual time=0.94..0.94 rows=0 loops=1)
   ->  Seq Scan on 
software_binary  (cost=0.00..21.59 rows=20 width=4) (actual time=0.32..0.91 
rows=23 loops=1)
 Filter: 
((binary_type_id = 3) OR (binary_type_id = 5) OR (binary_type_id = 6))
   ->  Hash  (cost=9631.57..9631.57 rows=1 width=40) (actual 
time=2457.04..2457.04 rows=0 loops=1)
 ->  Merge Join  (cost=9495.38..9631.57 rows=1 width=40) (actual 
time=2345.77..2456.74 rows=240 loops=1)
   Merge Cond: (("outer".rmsbinaryid = "inner".rmsbinaryid) AND 
("outer".msgid = "inner".msgid) AND ("outer".entityid = "inner".entityid))
   ->  Sort  (cost=4629.24..4691.15 rows=24761 width=20) (actual 
time=514.19..539.04 rows=25544 loops=1)
 Sort Key: msg306u.rmsbinaryid, msg306u.msgid, 
msg306u.entityid
 ->  Seq Scan on msg306u  (cost=0.00..2556.22 rows=24761 
width=20) (actual time=0.08..228.09 rows=25544 loops=1)
   Filter: (downloadstatus <> '0'::text)
   ->  Sort  (cost=4866.14..4872.33 rows=2476 width=20) (actual 
time=1831.55..1831.68 rows=241 loops=1)

Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Ottó Havasvölgyi
Hi,
As far as I know SQL Server has some similar feature. It does not join
if not necessary, more exactly: if the result would be the same if it
joined the table.
Here is another example:
http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0902/en/html/dbugen9/0468.htm
This would be a fantastic feature.
Best Regards,
Otto


2006/1/12, Alessandro Baretta <[EMAIL PROTECTED]>:
> Ottó Havasvölgyi wrote:
> > Hi all,
> >
> > Is PostgreSQL able to throw unnecessary joins?
> > For example I have two tables, and I join then with their primary keys,
> > say type of bigint . In this case if I don't reference to one of the
> > tables anywhere except the join condition, then the join can be eliminated.
> > Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N :
> > 1 relationship), and I don't reference table2 anywhere else, then it is
> > unnecessary.
>
> It cannot possibly remove "unnecessary joins", simply because the join
> influences whether a tuple in the referenced table gets selected and how many 
> times.
>
> Alex
>
>
> --
> *
> http://www.barettadeit.com/
> Baretta DE&IT
> A division of Baretta SRL
>
> tel. +39 02 370 111 55
> fax. +39 02 370 111 54
>
> Our technology:
>
> The Application System/Xcaml (AS/Xcaml)
> 
>
> The FreerP Project
> 
>



Re: [PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Alessandro Baretta

Ottó Havasvölgyi wrote:

Hi all,
 
Is PostgreSQL able to throw unnecessary joins?
For example I have two tables, and I join then with their primary keys, 
say type of bigint . In this case if I don't reference to one of the 
tables anywhere except the join condition, then the join can be eliminated.
Or if I do a "table1 left join table2 (table1.referer=table2.id)"  (N : 
1 relationship), and I don't reference table2 anywhere else, then it is 
unnecessary.


It cannot possibly remove "unnecessary joins", simply because the join 
influences whether a tuple in the referenced table gets selected and how many times.


Alex


--
*
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)


The FreerP Project




[PERFORM] Throwing unnecessary joins away

2006-01-12 Thread Ottó Havasvölgyi
Hi all,
 
Is PostgreSQL able to throw away unnecessary joins?
For example, I have two tables, and I join them on their primary keys, say of type bigint. In this case, if I don't reference one of the tables anywhere except the join condition, then the join can be eliminated.

Or if I do a "table1 left join table2 on (table1.referer=table2.id)" (N : 1 relationship), and I don't reference table2 anywhere else, then it is unnecessary.
Primary key - primary key joins are often generated by O/R mappers. These generated queries could be optimized even more by not joining when it is not necessary.

You may say that I should not write such queries. The truth is that the O/R mapper is generating queries on views, and it does not use every field every time, but even so the query on the view is executed with the same plan by PostgreSQL, although some joins are unnecessary.

 
So basically this all is relevant only with views.
 
Best Regards,
Otto


Re: [PERFORM] Extremely irregular query performance

2006-01-12 Thread Simon Riggs
On Wed, 2006-01-11 at 22:23 -0500, Tom Lane wrote:
> Jean-Philippe Côté <[EMAIL PROTECTED]> writes:
> > Thanks a lot for this info, I was indeed exceeding the genetic
> > optimizer's threshold.  Now that it is turned off, I get
> > a very stable response time of 435ms (more or less 5ms) for
> > the same query. It is about three times slower than the best
> > I got with the genetic optimizer on, but the overall average
> > is much lower.
> 
> Hmm.  It would be interesting to use EXPLAIN ANALYZE to confirm that the
> plan found this way is the same as the best plan found by GEQO, and
> the extra couple hundred msec is the price you pay for the exhaustive
> plan search.  If GEQO is managing to find a plan better than the regular
> planner then we need to look into why ...

It seems worth noting in the EXPLAIN whether GEQO has been used to find
the plan, possibly along with other factors influencing the plan such as
enable_* settings.
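
Until something like that exists, the comparison can be done by hand (a
sketch using the standard geqo and geqo_threshold settings; GEQO's result can
differ between runs, so repeat it a few times):

-- Exhaustive (standard) search:
SET geqo TO off;
EXPLAIN ANALYZE <query>;

-- Genetic search, forced on even for smaller join counts:
SET geqo TO on;
SET geqo_threshold = 2;
EXPLAIN ANALYZE <query>;
RESET geqo_threshold;
RESET geqo;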

Best Regards, Simon Riggs

