Re: [PERFORM] Nested Loop trouble : Execution time increases more

2005-09-22 Thread Simon Riggs
On Sat, 2005-09-17 at 17:47 +0200, Antoine Bajolet wrote:

 There are more tables around, but the heart of  the search engine is 
 made of three tables :
 
 fiches (f_id int4, f_title varchar)  52445 rows
 engine (f_id int4, k_id int4, weight )11761700 rows
 keywords(k_id, keyword)1072600 rows
 
 A fiche is linked to any kind of document.
 The engine table counts how many times a keyword appears in a document.
 
 A query to search on one or two keywords is quick to execute (the 
 front-end creates thoses queries):
 

 Is there a specific reason the planner chooses this way ?

Yes, you have an additional join for each new keyword, so there is more
work to do.

Recode your SQL with an IN subselect that retrieves all possible
keywords before it accesses the larger table.

That way you should have only one join for each new keyword.

 Can whe do something on the postgresql configuration to avoid this ?
 Can whe force the planner to use a hash join as it does for the first 
 joins ?

Not required, IMHO.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:

 Investigating further on this problem I brought up in June, the following 
 query with pg 8.0.3 on Windows scans all 1743 data records for a player:
 
 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc 
 LIMIT 1);
 

   Total runtime: 51.133 ms
 
 Using a static value in the subquery produces the desired result below, but 
 since we use views for our queries (see last part of this email), we cannot 
 push the static value into the subquery:
 
 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

   Total runtime: 0.149 ms
 
 The Player table has a primary key on PlayerID, AtDate. Is there a way to 
 stop the inner-most index scan looping all 1743 data records for that 
 player?  Is that a bug or known issue?

Currently the planner can't tell whether a subquery is correlated or not
until it has planned the query. So it is unable to push down the
qualification automatically in the way you have achieved manually. The
new min() optimisation doesn't yet work with GROUP BY which is what you
would use to reformulate the query that way, so no luck that way either.

If you don't want to do this in a view, calculate the values for all
players at once and store the values in a summary table for when you
need them.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau
We use similar views as base views throughout our OLTP system to get the 
latest time-based record(s). So it is quite impossible to use summary 
tables etc. Are there other ways to do it?


The subquery would pinpoint the record(s) with the composite primary key. 
Both MS Sql and Oracle do not have such performance problem. So this 
problem is effectively stopping us from migrating to PostgreSQL.


Any suggestions would be most appreciated.

Best regards,
KC.

At 16:40 05/09/22, Simon Riggs wrote:

On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:

 Investigating further on this problem I brought up in June, the following
 query with pg 8.0.3 on Windows scans all 1743 data records for a player:

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
 LIMIT 1);


   Total runtime: 51.133 ms

 Using a static value in the subquery produces the desired result below, 
but
 since we use views for our queries (see last part of this email), we 
cannot

 push the static value into the subquery:

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc 
LIMIT 1);


   Total runtime: 0.149 ms

 The Player table has a primary key on PlayerID, AtDate. Is there a way to
 stop the inner-most index scan looping all 1743 data records for that
 player?  Is that a bug or known issue?

Currently the planner can't tell whether a subquery is correlated or not
until it has planned the query. So it is unable to push down the
qualification automatically in the way you have achieved manually. The
new min() optimisation doesn't yet work with GROUP BY which is what you
would use to reformulate the query that way, so no luck that way either.

If you don't want to do this in a view, calculate the values for all
players at once and store the values in a summary table for when you
need them.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
 We use similar views as base views throughout our OLTP system to get the 
 latest time-based record(s). So it is quite impossible to use summary 
 tables etc. Are there other ways to do it?
 
 The subquery would pinpoint the record(s) with the composite primary key. 
 Both MS Sql and Oracle do not have such performance problem. So this 
 problem is effectively stopping us from migrating to PostgreSQL.
 
 Any suggestions would be most appreciated.

Even if this were fixed for 8.1, which seems unlikely, would you be able
to move to that release immediately?

ISTM you have two choices, in priority, complexity and time/cost order
1) custom mods to your app
2) custom mods to PostgreSQL

Maybe its possible to reconstruct your query with sub-sub-selects so
that you have a correlated query with manually pushed down clauses,
which also references a more constant base view?

Is a 51ms query really such a problem for you?

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Merlin Moncure
 I previously posted the following as a sequel to my SELECT DISTINCT
 Performance Issue question. We would most appreciate any clue or
 suggestions on how to overcome this show-stopping issue. We are using
 8.0.3 on Windows.
 
 Is it a known limitation when using a view with SELECT ... LIMIT 1?
 
 Would the forthcoming performance enhancement with MAX help when used
 within a view, as in:
 
 create or replace view VCurPlayer as select * from Player a
 where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID
=
 b.PlayerID);

Here is a trick I use sometimes with views, etc.  This may or may not be
effective to solve your problem but it's worth a shot.  Create one small
SQL function taking date, etc. and returning the values and define it
immutable.  Now in-query it is treated like a constant.

Another useful application for this feature is when you have nested
views (view 1 queries view 2) and you need to filter records based on
fields from view 2 which are not returned in view 1.  Impossible? 

in view 2 add clause where v2.f between  f_min() and f_max(), them being
immutable functions which can grab filter criteria based on inputs or
values from a table.

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau

At 20:48 05/09/22, Simon Riggs wrote:

On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
 We use similar views as base views throughout our OLTP system to get the
 latest time-based record(s). So it is quite impossible to use summary
 tables etc. Are there other ways to do it?

 The subquery would pinpoint the record(s) with the composite primary key.
 Both MS Sql and Oracle do not have such performance problem. So this
 problem is effectively stopping us from migrating to PostgreSQL.

 Any suggestions would be most appreciated.

Even if this were fixed for 8.1, which seems unlikely, would you be able
to move to that release immediately?


Yes. In fact when we first developed our system a few years ago, we tested 
on MS7.0, Oracle 8 and PG 7.1.1 and we did not hit that problem. When we 
try again with PG 8.0, the performance becomes unbearable, but other areas 
appear ok and other queries are often faster than MS Sql2k.



Maybe its possible to reconstruct your query with sub-sub-selects so
that you have a correlated query with manually pushed down clauses,
which also references a more constant base view?


We would be most happy to try them if we have some example views or pointers.


Is a 51ms query really such a problem for you?


Unfortunately yes, as our target performance is in the high hundreds of 
transactions per sec. And 51 ms is already the best case for a single 
select, with everything cached in memory immediately after the same select 
which took 390 ms on a quiet system.



Best Regards, Simon Riggs


Best regards,
KC. 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau

At 22:37 05/09/22, Merlin Moncure wrote:


 create or replace view VCurPlayer as select * from Player a
 where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID=
 b.PlayerID);

Here is a trick I use sometimes with views, etc.  This may or may not be
effective to solve your problem but it's worth a shot.  Create one small
SQL function taking date, etc. and returning the values and define it
immutable.  Now in-query it is treated like a constant.


We don't use functions as a rule, but I would be glad to give it a try.
I would most appreciate if you could define a sample function and rewrite 
the VCurPlayer view above. Both PlayerID and AtDate are varchar fields.



Another useful application for this feature is when you have nested
views (view 1 queries view 2) and you need to filter records based on
fields from view 2 which are not returned in view 1.  Impossible?

in view 2 add clause where v2.f between  f_min() and f_max(), them being
immutable functions which can grab filter criteria based on inputs or
values from a table.

Merlin


Best regards,
KC. 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Tom Lane
K C Lau [EMAIL PROTECTED] writes:
 At 20:48 05/09/22, Simon Riggs wrote:
 Even if this were fixed for 8.1, which seems unlikely, would you be able
 to move to that release immediately?

 Yes. In fact when we first developed our system a few years ago, we tested 
 on MS7.0, Oracle 8 and PG 7.1.1 and we did not hit that problem.

It's really not credible that PG 7.1 did any better with this than
current sources do.  The subplan mechanism hasn't changed materially
since about 6.5.  It could be that 7.1's performance was simply so
bad across the board that you didn't notice ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 22:39 +0800, K C Lau wrote:
 Is a 51ms query really such a problem for you?
 
 Unfortunately yes, as our target performance is in the high hundreds of 
 transactions per sec. And 51 ms is already the best case for a single 
 select, with everything cached in memory immediately after the same select 
 which took 390 ms on a quiet system.

If the current value is used so often, use two tables - one with a
current view only of the row maintained using UPDATE. Different
performance issues maybe, but at least not correlated subquery ones.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Nested Loop trouble : Execution time increases more 1000 time (long)

2005-09-22 Thread Tom Lane
Antoine Bajolet [EMAIL PROTECTED] writes:
 We are using postgresql in a search engine on an intranet handling 
 throusand of documents.
 But we ave a big problem when users use more than two search key.

I think you need to increase the statistics targets for your keywords
table --- the estimates of numbers of matching rows are much too small:

 -  Index Scan using keyword_pattern_key on keywords 
 k2  (cost=0.00..3.51 rows=1 width=4) (actual time=0.078..1.887 rows=75 
 loops=1)
   Index Cond: (((keyword)::text ~=~ 
 'exploitation'::character varying) AND ((keyword)::text ~~ 
 'exploitatioo'::character varying))
   Filter: ((keyword)::text ~~ 'exploitation%'::text)

A factor-of-75 error is quite likely to mislead the planner into
choosing a bad join plan.

BTW, have you looked into using a real full-text-search engine (eg,
tsearch2) instead of rolling your own like this?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Nested Loop trouble : Execution time increases more

2005-09-22 Thread Antoine Bajolet

Hello,

Tom Lane a écrit :


Antoine Bajolet [EMAIL PROTECTED] writes:
 

We are using postgresql in a search engine on an intranet handling 
throusand of documents.

But we ave a big problem when users use more than two search key.
   



I think you need to increase the statistics targets for your keywords
table --- the estimates of numbers of matching rows are much too small:
 

What value you think i could put into a ALTER TABLE SET STATISTICS 
statment ?


Also, the solution given by Simon Riggs works well.
quote

Recode your SQL with an IN subselect that retrieves all possible 
keywords before it accesses the larger table.

/quote

But i will try the old ones increasing the statistics parameter and compare 
performance.


 

   -  Index Scan using keyword_pattern_key on keywords 
k2  (cost=0.00..3.51 rows=1 width=4) (actual time=0.078..1.887 rows=75 
loops=1)
 Index Cond: (((keyword)::text ~=~ 
'exploitation'::character varying) AND ((keyword)::text ~~ 
'exploitatioo'::character varying))

 Filter: ((keyword)::text ~~ 'exploitation%'::text)
   



A factor-of-75 error is quite likely to mislead the planner into
choosing a bad join plan.

BTW, have you looked into using a real full-text-search engine (eg,
tsearch2) instead of rolling your own like this?
 


It seems a quite good contrib, but...
The first version of this search engine was developped in 2000... 
tsearch2 nor tsearch existed at this time.
Also, there are some developpement works around this search engine 
(pertinence algorithm, filtering with users rights, ponderating keywords 
with specific rules to each type of document, etc.) and adapting all to 
work in the similar way with tsearch2 seems to be a bit heavy.
At the end, each document indexed are quite big and the choosen method 
reduces disk storage : 1 Go of text content traduces to ~100 Mo of table 
space.


Best Regards,
Antoine Bajolet



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Nested Loop trouble : Execution time increases more

2005-09-22 Thread Antoine Bajolet

Re,

With modifing parameters like this :

ALTER TABLE keywords ALTER keyword SET STATISTICS 100;
ALTER TABLE keywords ALTER k_id SET STATISTICS 100;
ALTER TABLE engine ALTER k_id SET STATISTICS 100;
ALTER TABLE engine ALTER f_id SET STATISTICS 100;

vacuuming both tables
and rewriting the queries using sub-selects :

select count (distinct f.f_id) as results
FROM
fiches f
INNER JOIN (SELECT distinct f_id FROM keywords,engine WHERE engine.k_id 
= keywords.k_id AND keyword like 'exploitation%') as e1 USING(f_id)
INNER JOIN (SELECT distinct f_id FROM keywords,engine WHERE engine.k_id 
= keywords.k_id AND keyword like 'maintenance%') as e2 USING(f_id)
INNER JOIN (SELECT distinct f_id FROM keywords,engine WHERE engine.k_id 
= keywords.k_id AND keyword like 'numerique%') as e3 USING(f_id)


The query time is less than 600 ms, and increases only a little adding 
more keywords.


Thanks to Tom Lane and Simon Riggs.

Best regards,
Antoine Bajolet

Antoine Bajolet a écrit :


Hello,

Tom Lane a écrit :


Antoine Bajolet [EMAIL PROTECTED] writes:
 

We are using postgresql in a search engine on an intranet handling 
throusand of documents.

But we ave a big problem when users use more than two search key.
  



I think you need to increase the statistics targets for your keywords
table --- the estimates of numbers of matching rows are much too small:
 

What value you think i could put into a ALTER TABLE SET STATISTICS 
statment ?


Also, the solution given by Simon Riggs works well.
quote

Recode your SQL with an IN subselect that retrieves all possible 
keywords before it accesses the larger table.

/quote

But i will try the old ones increasing the statistics parameter and 
compare performance.





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Merlin Moncure
 Here is a trick I use sometimes with views, etc.  This may or may not
be
 effective to solve your problem but it's worth a shot.  Create one
small
 SQL function taking date, etc. and returning the values and define it
 immutable.  Now in-query it is treated like a constant.
 
 We don't use functions as a rule, but I would be glad to give it a
try.
 I would most appreciate if you could define a sample function and
rewrite
 the VCurPlayer view above. Both PlayerID and AtDate are varchar
fields.

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc
LIMIT 1
try:

create function player_max_at_date (varchar) returns date as 
$$
select atdate from player where playerid = $1 order by playerid
desc, AtDate desc limit 1;
$$ language sql immutable;

create view v as select playerid, player_max_at_date(playerid) from
player;
select * from v where playerid = 'x'; --etc

note: this function is not really immutable.  try with both 'immutable'
and 'stable' if performance is same, do stable.

You're welcome in advance, ;)
Merlin



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Kevin Grittner
Have you tried the best choice pattern -- where you select the set of
candidate rows and then exclude those for which a better choice
exists within the set?  I often get better results with this pattern than
with the alternatives.  Transmuting your query to use this patter gives:
 
select PlayerID,AtDate from Player a where PlayerID='0'
  and not exists
(select * from Player b
  where b.PlayerID = a.PlayerID and b.AtDate  a.AtDate);
 
 K C Lau [EMAIL PROTECTED] 09/21/05 11:21 PM 

select PlayerID,AtDate from Player a
  where PlayerID='0' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc 
LIMIT 1);


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)

2005-09-22 Thread Gurpreet Aulakh
Hi,

Here is the information that you requested.

The sub query that I am using is

EXPLAIN ANALYZE SELECT doc.doc_documentid FROM document AS doc
LEFT JOIN document as root
ON doc.doc_internalRootXref = root.doc_documentId
LEFT JOIN folder_document ON doc.doc_documentid =
folder_document.doc_documentId
LIMIT 500 OFFSET 0


The column doc_documentid is character varying(48) on both tables (document,
folder_document).
The column doc_internalRootXref is also character varying(48)
doc_documentid and doc_internalRootXref are UUIDs that is 36 chars long.

The document table has 58 columns.
31 columns are varchar ranging from size 8 to 80
7 booleans
4 numeric(12,2)
8 timestamp with time zone
1 integer
1 bigint
5 text

The folder_documen table has 6 columns
4 varchar (2 of length 16 2 of length 48)

The following indexes are on the document table
   pk_document primary key btree (doc_documentid),
 document_pk unique btree (doc_documentid),
 doc_deliverydate_index btree (doc_deliverydate),
 doc_externalxref_index btree (doc_externalxref),
 doc_internalparentomxref_index btree (doc_internalparentomxref),
 doc_internalrootxref_index btree (doc_internalrootxref)
The following indexes are on the folder_document table
pk_folder_document primary key btree (doc_documentid)
  fk_folder_document1 FOREIGN KEY (fld_folderid) REFERENCES
folder(fld_folderid)
ON UPDATE RESTRICT ON DELETE CASCADE,
fk_folder_document2 FOREIGN KEY (doc_documentid) REFERENCES
document(doc_documentid)
ON UPDATE RESTRICT ON DELETE CASCADE

After reading your hint about locale settings, I reinstalled postgres and
made sure the locale was set
to C and that the encoding was SQL_ASCII. (these are the settings on the
cygwin installation).

I still get the same results in the last post.


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: September 21, 2005 8:13 PM
To: Gurpreet Aulakh
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)


Gurpreet Aulakh [EMAIL PROTECTED] writes:
 What is really interesting is the time it takes for the Hash to occur. For
 the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
 47ms.

You haven't told us a thing about the column datatypes involved (much
less what the query actually is) ... but I wonder if this is a textual
datatype and the 8.0 installation is using a non-C locale where the 7.3
installation is using C locale.  That could account for a considerable
slowdown in text comparison speeds.

regards, tom lane




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

Hi,

I've got many queries running much slower on 8.1 beta2 than on 8.0.1
Here is a simplified one that takes 484 ms on 8.1 and 32 ms on 8.0.1.

select
  0
from
  Content C

  left outer join Supplier S
  on  C.SupplierId = S.SupplierId

  left outer join Price P
  on C.PriceId = P.PriceId;

Any ideas why it's slower?

Thanks
Jean-Pierre Pelletier
e-djuster

==

create table Price (
  PriceId   INTEGER  NOT NULL DEFAULT NEXTVAL('PriceId'),
  ItemIdINTEGER  NOT NULL,
  SupplierIdINTEGER  NOT NULL,
  LocationIdSMALLINT NULL,
  FromDate  DATE NOT NULL DEFAULT CURRENT_DATE,
  UnitValue DECIMAL  NOT NULL,
  InsertedByPersonIdINTEGER  NOT NULL,
  LastUpdatedByPersonId INTEGER  NULL,
  InsertTimestamp   TIMESTAMP(0) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  LastUpdateTimeStamp   TIMESTAMP(0) NULL
);

alter table price add primary key (priceid);

create table Supplier (
 SupplierId  INTEGER NOT NULL DEFAULT NEXTVAL('SupplierId'),
 SupplierDescription VARCHAR(50) NOT NULL,
 InsertTimestamp TIMESTAMP(0)NULL DEFAULT CURRENT_TIMESTAMP,
 ApprovalDateDATENULL
);

alter table supplier add primary key (supplierid);

-- I've only put one row in table Content because it was sufficient to
produce
-- the slowdown

create table content (contentid integer not null, supplierid integer,
priceid integer);
insert into content VALUES (148325, 12699, 388026);

vacuum analyze content; -- 1 row
vacuum analyze price; -- 581475 rows
vacuum analyze supplier; -- 10139 rows

==
Here are the query plans:

On PostgreSQL 8.1beta2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.2 (mingw-special)

explain select0 fromContent C  LEFT OUTER JOIN Supplier SON
C.SupplierId = S.SupplierId  LEFT OUTER JOIN Price PON C.PriceId =
P.PriceId;
QUERY PLAN

Nested Loop Left Join  (cost=0.00..18591.77 rows=1 width=0)
  Join Filter: (outer.priceid = inner.priceid)
  -  Nested Loop Left Join  (cost=0.00..5.59 rows=1 width=4)
-  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8)
-  Index Scan using Supplier Id on supplier s  (cost=0.00..4.56
rows=1 width=4)
  Index Cond: (outer.supplierid = s.supplierid)
  -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)


PostgreSQL 8.0.1 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)

explain select0 fromContent C  LEFT OUTER JOIN Supplier SON
C.SupplierId = S.SupplierId  LEFT OUTER JOIN Price PON C.PriceId =
P.PriceId;
QUERY PLAN

Nested Loop Left Join  (cost=0.00..11.08 rows=1 width=0)
  -  Nested Loop Left Join  (cost=0.00..5.53 rows=1 width=4)
-  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8)
-  Index Scan using Supplier Id on supplier s  (cost=0.00..4.51
rows=1 width=4)
  Index Cond: (outer.supplierid = s.supplierid)
  -  Index Scan using price_pkey on price p  (cost=0.00..5.53 rows=1
width=4)
Index Cond: (outer.priceid = p.priceid)


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Gavin M. Roy
What stood out to me the most was:On Sep 22, 2005, at 2:20 PM, Jean-Pierre Pelletier wrote:  -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4) a) is the index there, b) have you analyzed, c) perhaps the planners have different default values for when to use an index vrs a seqscan...  if you turn off seqscan, are the timings similar? Gavin M. Roy800 Pound Gorilla[EMAIL PROTECTED] 

Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier



All indexes are there, and I've analyzed the three 
tables.

I turned off seq scan, the query plans became 
identical but the performance
was not better.

- Original Message - 

  From: 
  Gavin M. Roy 
  To: Jean-Pierre Pelletier 
  Cc: pgsql-performance@postgresql.org 
  
  Sent: Thursday, September 22, 2005 5:32 
  PM
  Subject: Re: [PERFORM] Queries 15 times 
  slower on 8.1 beta 2 than on 8.0
  What stood out to me the most was:
  
  
  On Sep 22, 2005, at 2:20 PM, Jean-Pierre Pelletier wrote:
  
 - Seq Scan on price p (cost=0.00..11317.75 rows=581475 
width=4)
  a) is the index there, b) have you analyzed, c) perhaps the planners have 
  different default values for when to use an index vrs a seqscan... if 
  you turn off seqscan, are the timings similar?
  
  Gavin M. Roy
  800 Pound Gorilla
  [EMAIL PROTECTED]
  


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread John Arbash Meinel
Jean-Pierre Pelletier wrote:
 Hi,
 
 I've got many queries running much slower on 8.1 beta2 than on 8.0.1
 Here is a simplified one that takes 484 ms on 8.1 and 32 ms on 8.0.1.
 
 select
   0
 from
   Content C
 
   left outer join Supplier S
   on  C.SupplierId = S.SupplierId
 
   left outer join Price P
   on C.PriceId = P.PriceId;
 
 Any ideas why it's slower?

You really have to post the results of EXPLAIN ANALYZE not just
explain. So that we can tell what the planner is expecting, versus what
really happened.

John
=:-

 
 Thanks
 Jean-Pierre Pelletier
 e-djuster
 


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

Here are the explain analyze:

On 8.1 beta2:

Nested Loop Left Join  (cost=0.00..18591.77 rows=1 width=0) (actual 
time=1320.302..2439.066 rows=1 loops=1)

  Join Filter: (outer.priceid = inner.priceid)
  -  Nested Loop Left Join  (cost=0.00..5.59 rows=1 width=4) (actual 
time=0.044..0.058 rows=1 loops=1)
-  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) (actual 
time=0.009..0.011 rows=1 loops=1)
-  Index Scan using Supplier Id on supplier s  (cost=0.00..4.56 
rows=1 width=4) (actual time=0.016..0.022 rows=1 loops=1)

  Index Cond: (outer.supplierid = s.supplierid)
  -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4) 
(actual time=0.004..1143.720 rows=581475 loops=1)

Total runtime: 2439.211 ms

On 8.0.1:

Nested Loop Left Join  (cost=0.00..11.02 rows=1 width=0) (actual 
time=0.000..0.000 rows=1 loops=1)
  -  Nested Loop Left Join  (cost=0.00..5.48 rows=1 width=4) (actual 
time=0.000..0.000 rows=1 loops=1)
-  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) (actual 
time=0.000..0.000 rows=1 loops=1)
-  Index Scan using Supplier Id on supplier s  (cost=0.00..4.46 
rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)

  Index Cond: (outer.supplierid = s.supplierid)
  -  Index Scan using Price Id on price p  (cost=0.00..5.53 rows=1 
width=4) (actual time=0.000..0.000 rows=1 loops=1)

Index Cond: (outer.priceid = p.priceid)
Total runtime: 0.000 ms

- Original Message - 
From: John Arbash Meinel [EMAIL PROTECTED]

To: Jean-Pierre Pelletier [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Thursday, September 22, 2005 5:48 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread John Arbash Meinel
Jean-Pierre Pelletier wrote:
 Here are the explain analyze:

What is the explain analyze if you use set enable_seqscan to off?

Also, can you post the output of:
\d supplier
\d price
\d content

Mostly I just want to see what the indexes are, in the case that you
don't want to show us your schema.

John
=:-


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Josh Berkus
Jean-Pierre,

First off, you're on Windows?

   -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)
 (actual time=0.004..1143.720 rows=581475 loops=1)

Well, this is your pain point.   Can we see the index scan plan on 8.1?  
Given that it's *expecting* only one row, I can't understand why it's 
using a seq scan ...

 Nested Loop Left Join  (cost=0.00..11.02 rows=1 width=0) (actual
 time=0.000..0.000 rows=1 loops=1)
   -  Nested Loop Left Join  (cost=0.00..5.48 rows=1 width=4) (actual
 time=0.000..0.000 rows=1 loops=1)
 Total runtime: 0.000 ms

Feh, this looks like the windows does not report times bug, which makes 
it hard to compare ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

With enable-seq-scan = off, it runs in 350 ms so better than 484 ms
but still much slower than 32 ms in 8.0.1.

==

   Table public.content
  Column   |  Type   | Modifiers
+-+---
contentid  | integer | not null
supplierid | integer |
priceid| integer |

 Table public.price
   Column |  Type  | Modifiers
---++---
priceid   | integer| not null
itemid| integer|
supplierid| integer|
locationid| smallint   |
fromdate  | date   |
unitvalue | numeric|
insertedbypersonid| integer|
lastupdatedbypersonid | integer|
inserttimestamp   | timestamp(0) without time zone |
lastupdatetimestamp   | timestamp(0) without time zone |
Indexes:
   price_pkey PRIMARY KEY, btree (priceid)

  Table public.supplier
  Column|  Type  | 
Modifie

rs
-++-
-
supplierid  | integer| not null default 
nextval

('SupplierId'::text)
supplierdescription | character varying(50)  | not null
inserttimestamp | timestamp(0) without time zone | default now()
approvaldate| date   |
Indexes:
   Supplier Id PRIMARY KEY, btree (supplierid)
   Supplier Description UNIQUE, btree (upper(supplierdescription::text))
   Supplier.InsertTimestamp btree (inserttimestamp)
Check constraints:
   Supplier Name cannot be empty CHECK (btrim(supplierdescription::text) 
 ''::tex





Explan analyze with enable-seq-scan = off on 8.1 beta2
QUERY PLAN



Merge Left Join  (cost=10005.60..101607964.74 rows=1 width=0) (actual 
time=

729.067..729.078 rows=1 loops=1)
  Merge Cond: (outer.priceid = inner.priceid)
  -  Sort  (cost=10005.60..10005.60 rows=1 width=4) (actual 
time=0.064

..0.067 rows=1 loops=1)
Sort Key: c.priceid
-  Nested Loop Left Join  (cost=1.00..10005.59 rows=1 
widt

h=4) (actual time=0.038..0.049 rows=1 loops=1)
  -  Seq Scan on content c  (cost=1.00..10001.01 
ro

ws=1 width=8) (actual time=0.008..0.011 rows=1 loops=1)
  -  Index Scan using Supplier Id on supplier s 
(cost=0.00..4.5

6 rows=1 width=4) (actual time=0.016..0.019 rows=1 loops=1)
Index Cond: (outer.supplierid = s.supplierid)
  -  Index Scan using Price Id on price p  (cost=0.00..1606505.44 
rows=58147

5 width=4) (actual time=0.008..370.854 rows=164842 loops=1)
Total runtime: 729.192 ms

- Original Message - 
From: John Arbash Meinel [EMAIL PROTECTED]

To: Jean-Pierre Pelletier [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Thursday, September 22, 2005 6:03 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Tomeh, Husam
 
Have tried adjusting the effective_cache_size so that you don't the
planner may produce a better explain plan for you and not needing to set
seqscan to off.


-- 
 Husam 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jean-Pierre
Pelletier
Sent: Thursday, September 22, 2005 3:28 PM
To: John Arbash Meinel
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

With enable-seq-scan = off, it runs in 350 ms so better than 484 ms
but still much slower than 32 ms in 8.0.1.

==

Table public.content
   Column   |  Type   | Modifiers
+-+---
 contentid  | integer | not null
 supplierid | integer |
 priceid| integer |

  Table public.price
Column |  Type  | Modifiers
---++---
 priceid   | integer| not null
 itemid| integer|
 supplierid| integer|
 locationid| smallint   |
 fromdate  | date   |
 unitvalue | numeric|
 insertedbypersonid| integer|
 lastupdatedbypersonid | integer|
 inserttimestamp   | timestamp(0) without time zone |
 lastupdatetimestamp   | timestamp(0) without time zone |
Indexes:
price_pkey PRIMARY KEY, btree (priceid)

   Table public.supplier
   Column|  Type  | 
Modifie
rs
-++-

-
 supplierid  | integer| not null default

nextval
('SupplierId'::text)
 supplierdescription | character varying(50)  | not null
 inserttimestamp | timestamp(0) without time zone | default now()
 approvaldate| date   |
Indexes:
Supplier Id PRIMARY KEY, btree (supplierid)
Supplier Description UNIQUE, btree
(upper(supplierdescription::text))
Supplier.InsertTimestamp btree (inserttimestamp)
Check constraints:
Supplier Name cannot be empty CHECK
(btrim(supplierdescription::text) 
 ''::tex





Explan analyze with enable-seq-scan = off on 8.1 beta2
 QUERY
PLAN




 Merge Left Join  (cost=10005.60..101607964.74 rows=1 width=0)
(actual 
time=
729.067..729.078 rows=1 loops=1)
   Merge Cond: (outer.priceid = inner.priceid)
   -  Sort  (cost=10005.60..10005.60 rows=1 width=4) (actual 
time=0.064
..0.067 rows=1 loops=1)
 Sort Key: c.priceid
 -  Nested Loop Left Join  (cost=1.00..10005.59
rows=1 
widt
h=4) (actual time=0.038..0.049 rows=1 loops=1)
   -  Seq Scan on content c
(cost=1.00..10001.01 
ro
ws=1 width=8) (actual time=0.008..0.011 rows=1 loops=1)
   -  Index Scan using Supplier Id on supplier s 
(cost=0.00..4.5
6 rows=1 width=4) (actual time=0.016..0.019 rows=1 loops=1)
 Index Cond: (outer.supplierid = s.supplierid)
   -  Index Scan using Price Id on price p  (cost=0.00..1606505.44 
rows=58147
5 width=4) (actual time=0.008..370.854 rows=164842 loops=1)
 Total runtime: 729.192 ms

- Original Message - 
From: John Arbash Meinel [EMAIL PROTECTED]
To: Jean-Pierre Pelletier [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Thursday, September 22, 2005 6:03 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

**
This message contains confidential information intended only for the 
use of the addressee(s) named above and may contain information that 
is legally privileged.  If you are not the addressee, or the person 
responsible for delivering it to the addressee, you are hereby 
notified that reading, disseminating, distributing or copying this 
message is strictly prohibited.  If you have received this message by 
mistake, please immediately notify us by replying to the message and 
delete the original message immediately thereafter.

Thank you.   FADLD Tag

Fw: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier
- Original Message - 
From: Jean-Pierre Pelletier [EMAIL PROTECTED]

To: josh@agliodbs.com
Sent: Thursday, September 22, 2005 6:37 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



How do I produce an Index scan plan ?

- Original Message - 
From: Josh Berkus josh@agliodbs.com

To: pgsql-performance@postgresql.org
Cc: Jean-Pierre Pelletier [EMAIL PROTECTED]; John Arbash 
Meinel [EMAIL PROTECTED]

Sent: Thursday, September 22, 2005 6:19 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



Jean-Pierre,

First off, you're on Windows?


  -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)
(actual time=0.004..1143.720 rows=581475 loops=1)


Well, this is your pain point.   Can we see the index scan plan on 8.1?
Given that it's *expecting* only one row, I can't understand why it's
using a seq scan ...


Nested Loop Left Join  (cost=0.00..11.02 rows=1 width=0) (actual
time=0.000..0.000 rows=1 loops=1)
  -  Nested Loop Left Join  (cost=0.00..5.48 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Total runtime: 0.000 ms


Feh, this looks like the windows does not report times bug, which makes
it hard to compare ...

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Michael Fuhr
On Thu, Sep 22, 2005 at 03:19:05PM -0700, Josh Berkus wrote:
-  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)
  (actual time=0.004..1143.720 rows=581475 loops=1)
 
 Well, this is your pain point.   Can we see the index scan plan on 8.1?  
 Given that it's *expecting* only one row, I can't understand why it's 
 using a seq scan ...

I've created a simplified, self-contained test case for this:

CREATE TABLE price (
priceid  integer PRIMARY KEY
);

CREATE TABLE supplier (
supplierid  integer PRIMARY KEY
);

CREATE TABLE content (
contentid   integer PRIMARY KEY,
supplierid  integer NOT NULL REFERENCES supplier,
priceid integer NOT NULL REFERENCES price
);

INSERT INTO price (priceid) SELECT * FROM generate_series(1, 5);
INSERT INTO supplier (supplierid) SELECT * FROM generate_series(1, 1);
INSERT INTO content (contentid, supplierid, priceid) VALUES (1, 1, 5);

ANALYZE price;
ANALYZE supplier;
ANALYZE content;

EXPLAIN ANALYZE
SELECT 0
FROM content c
LEFT OUTER JOIN supplier s ON c.supplierid = s.supplierid
LEFT OUTER JOIN price pON c.priceid = p.priceid;

Here's the EXPLAIN ANALYZE from 8.0.3:

 Nested Loop Left Join  (cost=0.00..7.06 rows=1 width=0) (actual 
time=0.180..0.232 rows=1 loops=1)
   -  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) (actual 
time=0.105..0.133 rows=1 loops=1)
 -  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) (actual 
time=0.021..0.029 rows=1 loops=1)
 -  Index Scan using supplier_pkey on supplier s  (cost=0.00..3.01 
rows=1 width=4) (actual time=0.052..0.059 rows=1 loops=1)
   Index Cond: (outer.supplierid = s.supplierid)
   -  Index Scan using price_pkey on price p  (cost=0.00..3.01 rows=1 width=4) 
(actual time=0.046..0.055 rows=1 loops=1)
 Index Cond: (outer.priceid = p.priceid)
 Total runtime: 0.582 ms

Here it is from 8.1beta2:

 Merge Right Join  (cost=4.05..1054.06 rows=1 width=0) (actual 
time=676.863..676.895 rows=1 loops=1)
   Merge Cond: (outer.priceid = inner.priceid)
   -  Index Scan using price_pkey on price p  (cost=0.00..925.00 rows=5 
width=4) (actual time=0.035..383.345 rows=5 loops=1)
   -  Sort  (cost=4.05..4.05 rows=1 width=4) (actual time=0.152..0.159 rows=1 
loops=1)
 Sort Key: c.priceid
 -  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) (actual 
time=0.082..0.111 rows=1 loops=1)
   -  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) 
(actual time=0.016..0.024 rows=1 loops=1)
   -  Index Scan using supplier_pkey on supplier s  
(cost=0.00..3.01 rows=1 width=4) (actual time=0.039..0.047 rows=1 loops=1)
 Index Cond: (outer.supplierid = s.supplierid)
 Total runtime: 677.563 ms

If we change content's priceid then we get the same plan but faster results:

UPDATE content SET priceid = 1;

 Merge Right Join  (cost=4.05..1054.06 rows=1 width=0) (actual 
time=0.268..0.303 rows=1 loops=1)
   Merge Cond: (outer.priceid = inner.priceid)
   -  Index Scan using price_pkey on price p  (cost=0.00..925.00 rows=5 
width=4) (actual time=0.049..0.061 rows=2 loops=1)
   -  Sort  (cost=4.05..4.05 rows=1 width=4) (actual time=0.187..0.192 rows=1 
loops=1)
 Sort Key: c.priceid
 -  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) (actual 
time=0.099..0.128 rows=1 loops=1)
   -  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) 
(actual time=0.025..0.033 rows=1 loops=1)
   -  Index Scan using supplier_pkey on supplier s  
(cost=0.00..3.01 rows=1 width=4) (actual time=0.046..0.053 rows=1 loops=1)
 Index Cond: (outer.supplierid = s.supplierid)
 Total runtime: 0.703 ms

-- 
Michael Fuhr

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

I don't know if it makes a difference but in my tables,
content.supplierid and content.priceid were nullable.

- Original Message - 
From: Michael Fuhr [EMAIL PROTECTED]

To: Josh Berkus josh@agliodbs.com
Cc: pgsql-performance@postgresql.org; Jean-Pierre Pelletier 
[EMAIL PROTECTED]; John Arbash Meinel [EMAIL PROTECTED]

Sent: Thursday, September 22, 2005 6:54 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



On Thu, Sep 22, 2005 at 03:19:05PM -0700, Josh Berkus wrote:

   -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)
 (actual time=0.004..1143.720 rows=581475 loops=1)

Well, this is your pain point.   Can we see the index scan plan on 8.1?
Given that it's *expecting* only one row, I can't understand why it's
using a seq scan ...


I've created a simplified, self-contained test case for this:

CREATE TABLE price (
   priceid  integer PRIMARY KEY
);

CREATE TABLE supplier (
   supplierid  integer PRIMARY KEY
);

CREATE TABLE content (
   contentid   integer PRIMARY KEY,
   supplierid  integer NOT NULL REFERENCES supplier,
   priceid integer NOT NULL REFERENCES price
);

INSERT INTO price (priceid) SELECT * FROM generate_series(1, 5);
INSERT INTO supplier (supplierid) SELECT * FROM generate_series(1, 1);
INSERT INTO content (contentid, supplierid, priceid) VALUES (1, 1, 5);

ANALYZE price;
ANALYZE supplier;
ANALYZE content;

EXPLAIN ANALYZE
SELECT 0
FROM content c
LEFT OUTER JOIN supplier s ON c.supplierid = s.supplierid
LEFT OUTER JOIN price pON c.priceid = p.priceid;

Here's the EXPLAIN ANALYZE from 8.0.3:

Nested Loop Left Join  (cost=0.00..7.06 rows=1 width=0) (actual 
time=0.180..0.232 rows=1 loops=1)
  -  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) (actual 
time=0.105..0.133 rows=1 loops=1)
-  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) 
(actual time=0.021..0.029 rows=1 loops=1)
-  Index Scan using supplier_pkey on supplier s  (cost=0.00..3.01 
rows=1 width=4) (actual time=0.052..0.059 rows=1 loops=1)

  Index Cond: (outer.supplierid = s.supplierid)
  -  Index Scan using price_pkey on price p  (cost=0.00..3.01 rows=1 
width=4) (actual time=0.046..0.055 rows=1 loops=1)

Index Cond: (outer.priceid = p.priceid)
Total runtime: 0.582 ms

Here it is from 8.1beta2:

Merge Right Join  (cost=4.05..1054.06 rows=1 width=0) (actual 
time=676.863..676.895 rows=1 loops=1)

  Merge Cond: (outer.priceid = inner.priceid)
  -  Index Scan using price_pkey on price p  (cost=0.00..925.00 
rows=5 width=4) (actual time=0.035..383.345 rows=5 loops=1)
  -  Sort  (cost=4.05..4.05 rows=1 width=4) (actual time=0.152..0.159 
rows=1 loops=1)

Sort Key: c.priceid
-  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) 
(actual time=0.082..0.111 rows=1 loops=1)
  -  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) 
(actual time=0.016..0.024 rows=1 loops=1)
  -  Index Scan using supplier_pkey on supplier s 
(cost=0.00..3.01 rows=1 width=4) (actual time=0.039..0.047 rows=1 loops=1)

Index Cond: (outer.supplierid = s.supplierid)
Total runtime: 677.563 ms

If we change content's priceid then we get the same plan but faster 
results:


UPDATE content SET priceid = 1;

Merge Right Join  (cost=4.05..1054.06 rows=1 width=0) (actual 
time=0.268..0.303 rows=1 loops=1)

  Merge Cond: (outer.priceid = inner.priceid)
  -  Index Scan using price_pkey on price p  (cost=0.00..925.00 
rows=5 width=4) (actual time=0.049..0.061 rows=2 loops=1)
  -  Sort  (cost=4.05..4.05 rows=1 width=4) (actual time=0.187..0.192 
rows=1 loops=1)

Sort Key: c.priceid
-  Nested Loop Left Join  (cost=0.00..4.04 rows=1 width=4) 
(actual time=0.099..0.128 rows=1 loops=1)
  -  Seq Scan on content c  (cost=0.00..1.01 rows=1 width=8) 
(actual time=0.025..0.033 rows=1 loops=1)
  -  Index Scan using supplier_pkey on supplier s 
(cost=0.00..3.01 rows=1 width=4) (actual time=0.046..0.053 rows=1 loops=1)

Index Cond: (outer.supplierid = s.supplierid)
Total runtime: 0.703 ms

--
Michael Fuhr 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Tom Lane
Michael Fuhr [EMAIL PROTECTED] writes:
 I've created a simplified, self-contained test case for this:

I see the problem --- I broke best_inner_indexscan() for some cases
where the potential indexscan clause is an outer-join ON clause.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Josh Berkus
Jean-Pierre,

 effective_cache_size = 1000

Try setting this to 16,384 as a test.

 random_page_cost = 4

Try setting this to 2.5 as a test.

 work_mem = 2

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Michael Fuhr
On Thu, Sep 22, 2005 at 07:07:41PM -0400, Jean-Pierre Pelletier wrote:
 I don't know if it makes a difference but in my tables,
 content.supplierid and content.priceid were nullable.

That makes no difference in the tests I've done.

Tom Lane says he's found the problem; I expect he'll be committing
a fix shortly.

-- 
Michael Fuhr

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Tomeh, Husam
 
The recommendation for effective_cache_size is about 2/3 of your
server's physical RAM (if the server is dedicated only for postgres).
This should have a significant impact on whether Postgres planner
chooses indexes over sequential scans. 

-- 
 Husam 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jean-Pierre
Pelletier
Sent: Thursday, September 22, 2005 4:10 PM
To: josh@agliodbs.com
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

my settings are:

effective_cache_size = 1000
random_page_cost = 4
work_mem = 2

- Original Message - 
From: Josh Berkus josh@agliodbs.com
To: Jean-Pierre Pelletier [EMAIL PROTECTED]
Sent: Thursday, September 22, 2005 6:58 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0


 Jean-Pierre,
 
 How do I produce an Index scan plan ?
 
 You just did.   What's your effective_cache_size set to?   
 random_page_cost?  work_mem?
 
 -- 
 --Josh
 
 Josh Berkus
 Aglio Database Solutions
 San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

**
This message contains confidential information intended only for the 
use of the addressee(s) named above and may contain information that 
is legally privileged.  If you are not the addressee, or the person 
responsible for delivering it to the addressee, you are hereby 
notified that reading, disseminating, distributing or copying this 
message is strictly prohibited.  If you have received this message by 
mistake, please immediately notify us by replying to the message and 
delete the original message immediately thereafter.

Thank you.   FADLD Tag
**


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Josh Berkus
Jean-Pierre,

 Thanks everybody for your help, I'll be awaiting the fix.

 I've also noticed that pg_stat_activity is always empty even if
 stats_start_collector = on

Yes, I believe that this is a know Windows issue.   Not sure if it's fixed 
in 8.1.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

Thanks everybody for your help, I'll be awaiting the fix.

I've also noticed that pg_stat_activity is always empty even if
stats_start_collector = on

- Original Message - 
From: Michael Fuhr [EMAIL PROTECTED]

To: Jean-Pierre Pelletier [EMAIL PROTECTED]
Cc: Josh Berkus josh@agliodbs.com; pgsql-performance@postgresql.org; 
John Arbash Meinel [EMAIL PROTECTED]

Sent: Thursday, September 22, 2005 7:17 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



On Thu, Sep 22, 2005 at 07:07:41PM -0400, Jean-Pierre Pelletier wrote:

I don't know if it makes a difference but in my tables,
content.supplierid and content.priceid were nullable.


That makes no difference in the tests I've done.

Tom Lane says he's found the problem; I expect he'll be committing
a fix shortly.

--
Michael Fuhr

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Tom Lane
Michael Fuhr [EMAIL PROTECTED] writes:
 Tom Lane says he's found the problem; I expect he'll be committing
 a fix shortly.

The attached patch allows it to generate the expected plan, at least
in the test case I tried.

regards, tom lane

*** src/backend/optimizer/path/indxpath.c.orig  Sun Aug 28 18:47:20 2005
--- src/backend/optimizer/path/indxpath.c   Thu Sep 22 19:17:41 2005
***
*** 955,969 
/*
 * Examine each joinclause in the joininfo list to see if it matches any
 * key of any index.  If so, add the clause's other rels to the result.
-* (Note: we consider only actual participants, not extraneous rels
-* possibly mentioned in required_relids.)
 */
foreach(l, rel-joininfo)
{
RestrictInfo *joininfo = (RestrictInfo *) lfirst(l);
Relids  other_rels;
  
!   other_rels = bms_difference(joininfo-clause_relids, 
rel-relids);
if (matches_any_index(joininfo, rel, other_rels))
outer_relids = bms_join(outer_relids, other_rels);
else
--- 955,967 
/*
 * Examine each joinclause in the joininfo list to see if it matches any
 * key of any index.  If so, add the clause's other rels to the result.
 */
foreach(l, rel-joininfo)
{
RestrictInfo *joininfo = (RestrictInfo *) lfirst(l);
Relids  other_rels;
  
!   other_rels = bms_difference(joininfo-required_relids, 
rel-relids);
if (matches_any_index(joininfo, rel, other_rels))
outer_relids = bms_join(outer_relids, other_rels);
else

---(end of broadcast)---
TIP 6: explain analyze is your friend


FW: [PERFORM] Deadlock Issue with PostgreSQL

2005-09-22 Thread Anu Kucharlapati








Hello Tom,



Thanks a lot for your quick response. Which version do you think is the
more stable one that we should upgrade to?



Please provide us with the Upgrade instructions/documentation to be
followed for both red hat and PostgreSQL. 



Thanks and Best Regards,

Anu





-Original Message-

From: Tom Lane
[mailto:[EMAIL PROTECTED] 

Sent: Wednesday, September 21, 2005 12:15 PM

To: Anu Kucharlapati

Cc: pgsql-performance@postgresql.org; Owen Blizzard

Subject: Re: [PERFORM] Deadlock Issue with PostgreSQL 



Anu Kucharlapati [EMAIL PROTECTED] writes:

 Red Hat Linux release 7.3

 Apache 1.3.20

 PostgreSQL 7.1.3



I'm not sure about Apache, but both the RHL and Postgres versions

you are using are stone age --- *please* update. Red Hat stopped

supporting that release years ago, and the PG community isn't

supporting 7.1.* anymore either. There are too many known problems

in 7.1.* that are unfixable without a major-version upgrade.



 regards, tom lane










Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread Jean-Pierre Pelletier

Explain analyze on my 8.0.1 installation does report the time for
slower queries but for this small query it reports 0.000 ms

- Original Message - 
From: Josh Berkus josh@agliodbs.com

To: pgsql-performance@postgresql.org
Cc: Jean-Pierre Pelletier [EMAIL PROTECTED]; John Arbash 
Meinel [EMAIL PROTECTED]

Sent: Thursday, September 22, 2005 6:19 PM
Subject: Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0



Jean-Pierre,

First off, you're on Windows?


  -  Seq Scan on price p  (cost=0.00..11317.75 rows=581475 width=4)
(actual time=0.004..1143.720 rows=581475 loops=1)


Well, this is your pain point.   Can we see the index scan plan on 8.1?
Given that it's *expecting* only one row, I can't understand why it's
using a seq scan ...


Nested Loop Left Join  (cost=0.00..11.02 rows=1 width=0) (actual
time=0.000..0.000 rows=1 loops=1)
  -  Nested Loop Left Join  (cost=0.00..5.48 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Total runtime: 0.000 ms


Feh, this looks like the windows does not report times bug, which makes
it hard to compare ...

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: FW: [PERFORM] Deadlock Issue with PostgreSQL

2005-09-22 Thread Josh Berkus
Anu,

 Thanks a lot for your quick response. Which version do you think is the
 more stable one that we should upgrade to?

8.0.3

 Please provide us with the Upgrade instructions/documentation to be
 followed for both red hat and PostgreSQL.

See the PostgreSQL documentation for upgrade instructions.   Given how old 
your version is, you might need to go through an intermediate version, 
like 7.3.

Red Hat upgrades are between you and Red Hat.  They sell support for a 
reason ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] VACUUM FULL vs CLUSTER

2005-09-22 Thread Markus Benne
I have a table that is purged by 25% each night.  I'd like to do a
vacuum nightly after the purge to reclaim the space, but I think I'll
still need to do a vacuum full weekly.

Would there be any benefit to doing a cluster instead of the vacuum?


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] tsearch2 seem very slow

2005-09-22 Thread Ahmad Fajar








I have about 419804 rows in my article table. I have
installed tsearch2 and its gist index correctly. 

My table structure is:

CREATE TABLE tbarticles

(

 articleid int4 NOT NULL,

 title varchar(250),

 mediaid int4,

 datee date,

 content text,

 contentvar text,

 mmcol float4 NOT NULL,

 sirkulasi float4,

 page varchar(10),

 tglisidata date,

 namapc varchar(12),

 usere varchar(12),

 file_pdf varchar(255),

 file_pdf2 varchar(50),

 kolom int4,

 size_jpeg int4,

 journalist varchar(120),

 ratebw float4,

 ratefc float4,

 fti tsvector, 

 CONSTRAINT pk_tbarticles PRIMARY KEY (articleid)

) WITHOUT OIDS;

Create index fti_idx1 on tbarticles using gist (fti);

Create index fti_idx2 on tbarticles using gist (datee, fti);



But when I search something like:

Select articleid, title, datee from tbarticles where fti @@
to_tsquery(susilobambangyudhoyonojusufkalla);

It takes about 30 sec. I run explain analyze and the index
is used correctly.



Then I try multi column index to filter by date, and my
query something like:

Select articleid, title, datee from tbarticles where fti @@
to_tsquery(susilobambangyudhoyonojusufkalla)
and datee = '2002-01-01' and datee = current_date

An it still run about 25 sec. I do run explain analyze and
my multicolumn index is used correctly.

This is not acceptable if want to publish my website if the
search took very longer.



I have run vacuum full analyze before doing such query. What
going wrong with my query?? Is there any way to make this faster?

I have try to tune my postgres configuration, but it seem
helpless. My linux box is Redhat 4 AS, and 

the hardware: 2 GB RAM DDR 400, 2x200 GB Serial ATA 7200RPM and
configure as RAID0 (just for postgres data), my sistem run at EIDE 80GB 7200
RPM.



Pleasehelphelp








[PERFORM] optimization downgrade perfomance?

2005-09-22 Thread eVl
Hello!

Got a DB with traffic statictics stored. And a SELECT statement which 
shows traffic volume per days also divided by regions - local traffic and 
global.
   Thus SELECT statement returns about some (in about 10-20) rows paired 
like this:

ttype (text)| volume (int)| tdate (date)

local   | x   | some-date
global  | x   | some-date

When executing this SELECT (see SELECT.A above) it executes in about 
700 ms, but when I want wipe out all info about local traffic, with query like 
this:
  SELECT * FROM ( SELECT.A ) a WHERE type = 'global';
It executes about 1 ms - more then 10 TIMES SLOWER!

 Why this could be?



-
Initial Query - SELECT.A (executes about 700 ms)

SELECT 
  CASE is_local(aa.uaix) WHEN true THEN 'local' 
 ELSE 'global' END AS TType, 
  aa.cDate AS TDate,
  SUM(aa.data) AS Value 
FROM (
  SELECT 
a.uaix AS uaix, 
cDate AS cDate, 
SUM(a.data) AS data 
  FROM (
 (
 SELECT toIP AS uaix, 
cDate AS cDate, 
SUM(packetSize) AS data
  FROM vw_stats
 WHERE interface  'inet'
  AND cdate = '01.09.2005'
  AND fromIP  '192.168.0.0/16'
  AND NOT (toIP  '192.168.0.0/16')
 GROUP BY 1,2
)
   UNION 
   (
SELECT fromIP AS uaix, 
   cDate AS cDate, 
   SUM(packetSize) AS data
 FROM vw_stats
 WHERE interface  'inet'
   AND cdate = '01.09.2005'
   AND toIP  '192.168.0.0/16'
   AND NOT (fromIP  '192.168.0.0/16')
 GROUP BY 1,2
)
  ) a
  GROUP BY 1,2
) aa
GROUP BY 1,2
ORDER BY 1,2

---
Query with local info filtered (executes about 1 ms)

SELECT * FROM (
HERE PLACED SELECT.A
) aaa WHERE aaa.TType = 'global';


---

Running Postgresql 8.0.3 on FreeBSD 5.3

  

-- 
Best regards,
 eVl  mailto:[EMAIL PROTECTED]



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq