Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-10-26 Thread Tom Lane
Anton <[EMAIL PROTECTED]> writes:
> I want to ask about a problem with partitioned tables (it was discussed some
> time ago, see below). Is it fixed somehow in 8.2.5?

No.  The patch you mention never was considered at all, since it
consisted of a selective quote from Greenplum source code.  It would
not even compile in community Postgres, because it adds calls to half a
dozen Greenplum routines that we've never seen.  Not to mention that
the base of the diff is Greenplum proprietary code, so the patch itself
wouldn't even apply successfully.

As to whether it would work if we had the full story ... well, not
having the full story, I don't want to opine.

regards, tom lane



Re: [PERFORM] Speed difference between select ... union select ... and select from partitioned_table

2007-10-26 Thread Pablo Alcaraz

These are the EXPLAIN ANALYZE results.

I ran both queries on CLUSTERed and ANALYZEd tables:

UNION QUERY
explain analyze
select e, p,  sum( c) as c
from (
   select e, p, count( *) as c
   from tt_3
   group by e, p
   union
   select e, p, count( *) as c
   from tt_6
   group by e, p
   union
   select e, p, count( *) as c
   from tt_9
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00012
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00015
   group by e, p
) as t
group by e, p
order by e, p desc;

"Sort  (cost=2549202.87..2549203.37 rows=200 width=16) (actual 
time=263593.182..263593.429 rows=207 loops=1)"

"  Sort Key: t.e, t.p"
"  ->  HashAggregate  (cost=2549192.73..2549195.23 rows=200 width=16) 
(actual time=263592.469..263592.763 rows=207 loops=1)"
"->  Unique  (cost=2549172.54..2549179.88 rows=734 width=8) 
(actual time=263590.481..263591.764 rows=356 loops=1)"
"  ->  Sort  (cost=2549172.54..2549174.38 rows=734 width=8) 
(actual time=263590.479..263590.891 rows=356 loops=1)"

"Sort Key: e, p, c"
"->  Append  (cost=1307131.88..2549137.60 rows=734 
width=8) (actual time=132862.176..263589.774 rows=356 loops=1)"
"  ->  HashAggregate  
(cost=1307131.88..1307133.03 rows=92 width=8) (actual 
time=132862.173..132862.483 rows=200 loops=1)"
"->  Seq Scan on tt_3  
(cost=0.00..1081550.36 rows=30077536 width=8) (actual 
time=10.135..83957.424 rows=3000 loops=1)"
"  ->  HashAggregate  
(cost=1241915.64..1241916.16 rows=42 width=8) (actual 
time=130726.219..130726.457 rows=156 loops=1)"
"->  Seq Scan on tt_6  
(cost=0.00..1028793.22 rows=28416322 width=8) (actual 
time=11.389..87338.730 rows=28351293 loops=1)"
"  ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"->  Seq Scan on tt_9  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 
loops=1)"
"  ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.004..0.004 rows=0 loops=1)"
"->  Seq Scan on tt_00012  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 
loops=1)"
"  ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"->  Seq Scan on tt_00015  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 
loops=1)"

"Total runtime: 263594.381 ms"


PARTITIONED QUERY

explain analyze
select e, p,  count( *) as c
from tt
group by e, p
order by e, p desc;

"GroupAggregate  (cost=13256958.67..13842471.95 rows=4 width=8) 
(actual time=899391.384..1065585.531 rows=207 loops=1)"
"  ->  Sort  (cost=13256958.67..13403211.99 rows=58501328 width=8) 
(actual time=899391.364..989749.914 rows=58351293 loops=1)"

"Sort Key: public.tt.e, public.tt.p"
"->  Append  (cost=0.00..2110508.28 rows=58501328 width=8) 
(actual time=14.031..485211.466 rows=58351293 loops=1)"
"  ->  Seq Scan on tt  (cost=0.00..18.30 rows=830 width=8) 
(actual time=0.002..0.002 rows=0 loops=1)"
"  ->  Seq Scan on tt_3 tt  (cost=0.00..1081550.36 
rows=30077536 width=8) (actual time=14.024..178657.738 rows=3000 
loops=1)"
"  ->  Seq Scan on tt_6 tt  (cost=0.00..1028793.22 
rows=28416322 width=8) (actual time=39.852..168307.030 rows=28351293 
loops=1)"
"  ->  Seq Scan on tt_9 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"  ->  Seq Scan on tt_00012 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"  ->  Seq Scan on tt_00015 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"  ->  Seq Scan on tt_00018 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"  ->  Seq Scan on tt_00021 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"  ->  Seq Scan on tt_00024 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"  ->  Seq Scan on tt_00027 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"  ->  Seq Scan on tt_00030 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"

"Total runtime: 1066301.084 ms"

Any idea?

Regards

Pablo


Jeff Davis wrote:

On Fri, 2007-10-26 at 16:37 -0400, Pablo Alcaraz wrote:
  

Hi List!

I executed two equivalent queries. The first one uses a union structure.
The second uses a partitioned table. The tables are the same with 3

Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-10-26 Thread Anton
I want to ask about a problem with partitioned tables (it was discussed some
time ago, see below). Is it fixed somehow in 8.2.5?

2007/8/24, Luke Lonergan <[EMAIL PROTECTED]>:
> Below is a patch against 8.2.4 (more or less), Heikki can you take a look at
> it?
>
> This enables the use of index scan of a child table by recognizing sort
> order of the append node.  Kurt Harriman did the work.
...
>
> On 8/24/07 3:38 AM, "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
>
> > Anton wrote:
> >>>>  =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
> >>>>                               QUERY PLAN
> >>>>  -----------------------------------------------------------------------
> >>>>   Limit  (cost=824637.69..824637.69 rows=1 width=32)
> >>>>     ->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
> >>>>           Sort Key: public.n_traf.date_time
> >>>>           ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
> >>>>                 ->  Append  (cost=0.00..100877.99 rows=5643499 width=32)
> >>>>                       ->  Seq Scan on n_traf  (cost=0.00..22.30 rows=1230 width=32)
> >>>>                       ->  Seq Scan on n_traf_y2007m01 n_traf  (cost=0.00..22.30 rows=1230 width=32)
> >>>> ...
> >>>>                       ->  Seq Scan on n_traf_y2007m12 n_traf  (cost=0.00..22.30 rows=1230 width=32)
> >>>>  (18 rows)
> >>>>
> >>>>  Why doesn't it use indexes at all?
> >>>>  ---
> >>> I'm no expert but I'd guess that the planner doesn't know which
> >>> partition holds the latest time so it has to read them all.
> >>
> >> Agreed. But why does it not use indexes when reading them?
> >
> > The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
> > below the append node. Therefore it needs to fetch all rows from all the
> > tables, and the fastest way to do that is a seq scan.

-- 
engineer



Re: [PERFORM] Suggestions on an update query

2007-10-26 Thread Joshua D. Drake
On Sat, 27 Oct 2007 03:04:47 +0100
Gregory Stark <[EMAIL PROTECTED]> wrote:


> > O.k. first you might be grinding through your 20 checkpoint segments
> > but in reality what I think is happening is you are doing foreign
> > key checks against all of it and slowing things down.
> 
> If you're going to berate someone about top-posting perhaps you
> should attach your own commentary to relevant bits of context :P

It was hardly berating, Greg; I even said please.


> I
> suspect you've guessed it right though. In fact I suspect what's
> happening is he doesn't have an index on the referencing column so
> the foreign key checks are doing sequential scans.
> 

Sincerely,

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/





Re: [PERFORM] Suggestions on an update query

2007-10-26 Thread Gregory Stark

"Joshua D. Drake" <[EMAIL PROTECTED]> writes:

> On Fri, 26 Oct 2007 15:31:44 -0500
> "Campbell, Lance" <[EMAIL PROTECTED]> wrote:
>
>> I forgot to include an additional parameter I am using in
>> Postgresql.conf: 
>> 
>
> O.k. first, just to get it out of the way (and then I will try and
> help). Please do not top post, it makes replying contextually very
> difficult.
>
>> PostgreSQL version 8.2.4
>> 
>> Memory = 8 GB
>> 
>> CPUs: 1 dual-core Xeon running at 3.0 GHz
>> 
>
> O.k. first you might be grinding through your 20 checkpoint segments
> but in reality what I think is happening is you are doing foreign key
> checks against all of it and slowing things down.

If you're going to berate someone about top-posting perhaps you should attach
your own commentary to relevant bits of context :P

But the original post didn't include any foreign key constraints. I suspect
you've guessed it right though. In fact I suspect what's happening is he
doesn't have an index on the referencing column, so the foreign key checks
are doing sequential scans.
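As a sketch of the kind of index Greg is describing (the table and column
names here are hypothetical, since the original post shows no declared
foreign keys):

-- Hypothetical: suppose some table answer_log declares
--   FOREIGN KEY (result_entry_id) REFERENCES result_entry (id)
-- An index on the referencing column lets each foreign key check be an
-- index probe instead of a sequential scan of answer_log.
CREATE INDEX answer_log_result_entry_id_idx ON answer_log (result_entry_id);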

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah


I changed CLOG buffers to 16.

Running the test again:
# ./read.d
dtrace: script './read.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0   1027                       :tick-5sec

  /export/home0/igen/pgdata/pg_clog/0024       -2753028219296        1
  /export/home0/igen/pgdata/pg_clog/0025       -2753028211104        1

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1   1027                       :tick-5sec

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1   1027                       :tick-5sec

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0   1027                       :tick-5sec

  /export/home0/igen/pgdata/pg_clog/0025       -2753028194720        1



So Tom seems to be correct that it is a case of CLOG buffer thrashing.
But since I saw the same problem with two different workloads, I think
the chance of people hitting this problem is pretty high.


Also, I am a bit surprised that CLogControlFile did not show up as being
hot. Maybe that is because not many writes are going on, or maybe because
I did not trace all 500 users to see their hot lock status.



Dmitri has another workload to test; I might try that out later on to
see whether it causes a similar impact or not.


Of course I haven't seen my throughput go up yet since I am already CPU
bound... But this is good, since the number of IOPS to the disk is
reduced (and hence system calls).



If I take this as my baseline number, I can then proceed to hunt other
bottlenecks.



What's the view of the community?

Hunt down CPU utilizations or Lock waits next?

Your votes are crucial on where I put my focus.

Another thing Josh B told me to check out was the wal_writer_delay setting:

I have tried two settings with almost equal performance (with the CLOG 16
setting): one with 100ms and the other at the default of 200ms. Based on the
runs it seemed that 100ms was slightly better than the default
(plus the risk of losing data is reduced from 600ms to 300ms).
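For reference, the postgresql.conf line being compared here is simply the
following (200ms is the server default mentioned above):

wal_writer_delay = 100ms        # compared against the 200ms default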


Thanks.

Regards,
Jignesh




Tom Lane wrote:

"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
  

So the ratio of reads vs writes to clog files is pretty huge..



It looks to me that the issue is simply one of not having quite enough
CLOG buffers.  Your first run shows 8 different pages being fetched and
the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
older pages, so what we've got here is thrashing for the available
slots.

Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.

regards, tom lane

  




Re: [PERFORM] Suggestions on an update query

2007-10-26 Thread Joshua D. Drake
On Fri, 26 Oct 2007 15:31:44 -0500
"Campbell, Lance" <[EMAIL PROTECTED]> wrote:

> I forgot to include an additional parameter I am using in
> Postgresql.conf: 
> 

O.k. first, just to get it out of the way (and then I will try and
help). Please do not top post, it makes replying contextually very
difficult.
> 
> PostgreSQL version 8.2.4
> 
> Memory = 8 GB
> 
> CPUs: 1 dual-core Xeon running at 3.0 GHz
> 

O.k. first you might be grinding through your 20 checkpoint segments
but in reality what I think is happening is you are doing foreign key
checks against all of it and slowing things down.


> 
> The table result_entry contains 17,767,240 rows and the table
> question_number contains 40,787.  Each row from the result_entry table
> will match to one and only one row in the table question_number using
> the fk_question_id field.  Each row from the question_number table
> matches to an average of 436 rows on the result_entry table.
> 
>  


You could disable the foreign key for the update and then reapply it.
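A minimal sketch of that suggestion, assuming a hypothetical foreign key
constraint on result_entry (the schema as posted does not actually declare
one, so the constraint name and definition below are placeholders):

-- Drop the (hypothetical) constraint, run the big update, then re-add it.
ALTER TABLE result_entry
    DROP CONSTRAINT result_entry_fk_question_id_fkey;

UPDATE result_entry
   SET question_number = question_number.question_number
  FROM question_number
 WHERE result_entry.fk_question_id = question_number.fk_question_id;

ALTER TABLE result_entry
    ADD CONSTRAINT result_entry_fk_question_id_fkey
    FOREIGN KEY (fk_question_id)
    REFERENCES question_number (fk_question_id);

Re-adding the constraint validates the whole table in one pass, which is
normally much cheaper than firing the check row by row during the update.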

Joshua D. Drake


> 
> CREATE TABLE question_number
> 
> (
> 
>   fk_form_idinteger not null,
> 
>   fk_question_idinteger not null,
> 
>   question_number   integer not null,
> 
>   sequence_id   integer not null
> 
> );
> 
>  
> 
> ALTER TABLE ONLY question_number ADD CONSTRAINT question_number_pkey
> PRIMARY KEY (fk_question_id);
> 
> CREATE INDEX question_number_index1 ON question_number USING btree
> (question_number);
> 
>  
> 
>  
> 
> CREATE TABLE result_entry (
> 
> fk_result_submission_id integer NOT NULL,
> 
> fk_question_id integer NOT NULL,
> 
> fk_option_order_id integer NOT NULL,  
> 
> value character varying,
> 
> order_id integer NOT NULL,
> 
> question_number integer
> 
> );
> 
>  
> 
> CREATE INDEX result_entery_index1 ON result_entry USING btree
> (fk_question_id);
> 
>  
> 
>  
> 
> update result_entry set
> question_number=question_number.question_number
> 
> 
> from question_number where
> result_entry.fk_question_id=question_number.fk_question_id;
> 
>  
> 
>  
> 
>  
> 
> explain update result_entry set
> question_number=question_number.question_number  
> 
> from question_number where
> result_entry.fk_question_id=question_number.fk_question_id;
> 
>  
> 
>                                     QUERY PLAN
> ------------------------------------------------------------------------------
> 
>  Hash Join  (cost=1437.71..1046983.94 rows=17333178 width=32)
> 
>Hash Cond: (result_entry.fk_question_id =
> question_number.fk_question_id)
> 
>->  Seq Scan on result_entry  (cost=0.00..612216.78 rows=17333178
> width=28)
> 
>->  Hash  (cost=927.87..927.87 rows=40787 width=8)
> 
>  ->  Seq Scan on question_number  (cost=0.00..927.87
> rows=40787 width=8)
> 
> (5 rows)
> 
>  
> 
>  
> 
>  
> 
> Postgresql.conf settings:
> 
>  
> 
> shared_buffers = 1GB
> 
> work_mem = 10MB
> 
> max_fsm_pages = 204800
> 
> random_page_cost = 1.0
> 
> effective_cache_size = 8GB
> 
>  
> 
>  
> 
> Thanks for any help!
> 
>  
> 
>  
> 
> Lance Campbell
> 
> Project Manager/Software Architect
> 
> Web Services at Public Affairs
> 
> University of Illinois
> 
> 217.333.0382
> 
> http://webservices.uiuc.edu
> 
>  
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/





Re: [PERFORM] Speed difference between select ... union select ... and select from partitioned_table

2007-10-26 Thread Gregory Stark
"Pablo Alcaraz" <[EMAIL PROTECTED]> writes:

> Hi List!
>
> I executed two equivalent queries. The first one uses a union structure. The
> second uses a partitioned table. The tables are the same, with 30 million
> rows each, and the returned rows are the same.
>
> But the union query performs faster than the partitioned query.
>
> My question is: why? :)
>
> [EMAIL PROTECTED] testeo]$ cat query-union.sql
> select e, p,  sum( c) as c
> from (
>select e, p, count( *) as c
>from tt_3
>group by e, p
>union
>select e, p, count( *) as c
>from tt_6
>group by e, p
>union
...


You should send along the "explain analyze" results for both queries,
otherwise we're just guessing.

Also, you should consider using UNION ALL instead of plain UNION.

Finally you should consider removing all the intermediate GROUP BYs and just
group the entire result. In theory it should be faster but in practice I'm not
sure it works out that way.
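Putting those two suggestions together, the union query could be written along
these lines (a sketch based on the query Pablo posted, not something that has
been run against his data):

select e, p, count(*) as c
from (
    select e, p from tt_3
    union all
    select e, p from tt_6
    union all
    select e, p from tt_9
    union all
    select e, p from tt_00012
    union all
    select e, p from tt_00015
) as t
group by e, p
order by e, p desc;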

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PERFORM] Speed difference between select ... union select ... and select from partitioned_table

2007-10-26 Thread Pablo Alcaraz

I forgot to post the times:

query-union: 21:59
query-heritage: 1:31:24

Regards

Pablo

Pablo Alcaraz wrote:

Hi List!

I executed two equivalent queries. The first one uses a union
structure. The second uses a partitioned table. The tables are the
same, with 30 million rows each, and the returned rows are the same.


But the union query performs faster than the partitioned query.

My question is: why? :)

[EMAIL PROTECTED] testeo]$ cat query-union.sql
select e, p,  sum( c) as c
from (
   select e, p, count( *) as c
   from tt_3
   group by e, p
   union
   select e, p, count( *) as c
   from tt_6
   group by e, p
   union
   select e, p, count( *) as c
   from tt_9
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00012
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00015
   group by e, p
) as t
group by e, p
order by e, p desc;



[EMAIL PROTECTED] testeo]$ cat query-heritage.sql
select e, p,  count( *) as c
from tt
group by e, p
order by e, p desc;


The server is an Athlon 64 X2 6000+ with 2 GB RAM, running PostgreSQL 8.2.5.

The table structures are:

CREATE TABLE tt_3
(
-- Inherited:   idtt bigint NOT NULL,
-- Inherited:   idttp bigint NOT NULL,
-- Inherited:   e integer NOT NULL,
-- Inherited:   dmodi timestamp without time zone NOT NULL DEFAULT now(),
-- Inherited:   p integer NOT NULL DEFAULT 0,
-- Inherited:   m text NOT NULL,
 CONSTRAINT tt_3_pkey PRIMARY KEY (idtt),
 CONSTRAINT tt_3_idtt_check CHECK (idtt >= 1::bigint AND idtt <= 
3000::bigint)

) INHERITS (tt)
WITHOUT OIDS;
ALTER TABLE tt_3 ;

CREATE INDEX tt_3_e
 ON tt_3
 USING btree
 (e);












Re: [PERFORM] Suggestions on an update query

2007-10-26 Thread Gregory Stark
"Campbell, Lance" <[EMAIL PROTECTED]> writes:

>                                     QUERY PLAN
> ------------------------------------------------------------------------------
>  Hash Join  (cost=1437.71..1046983.94 rows=17333178 width=32)
>Hash Cond: (result_entry.fk_question_id = question_number.fk_question_id)
>->  Seq Scan on result_entry  (cost=0.00..612216.78 rows=17333178 width=28)
>->  Hash  (cost=927.87..927.87 rows=40787 width=8)
>  ->  Seq Scan on question_number  (cost=0.00..927.87 rows=40787 
> width=8)
>
> (5 rows)

That looks like an entirely reasonable plan. Is it possible some other session
was blocking this update with a lock on a record? Was there lots of I/O at the
time? You could peek in pg_locks while the update seems frozen.
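A quick way to take that peek (column names are from the standard pg_locks
view; ungranted entries are the sessions stuck waiting):

SELECT pid, relation::regclass AS relation, locktype, mode, granted
  FROM pg_locks
 ORDER BY granted, relation;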

This looks like a one-time administrative job to add a new column, is that it?
You might also consider creating a new table with the new data and replacing
the old table with the new one with something like:

CREATE TABLE new_result_entry AS
  SELECT fk_result_submission_id, fk_question_id, fk_option_order_id,
         value, order_id,
         question_number.question_number
    FROM result_entry
    JOIN question_number USING (fk_question_id);

CREATE INDEX result_entery_index1n ON new_result_entry USING btree
    (fk_question_id);

ALTER TABLE result_entry RENAME TO old_result_entry;
ALTER TABLE new_result_entry RENAME TO result_entry;

Unfortunately (for this use case) any views, triggers, etc which reference the
old table will continue to reference the old table after the renames. You'll
have to drop and recreate them.

That may not be an option if the data is actively being used though. But if it
is an option there are a few advantages: 1) it'll be a bit faster; 2) you can
build the indexes on the new data at the end of the creation; 3) the resulting
table and indexes won't have all the old versions taking up space waiting for
a vacuum.


> Postgresql.conf settings:
> shared_buffers = 1GB
> work_mem = 10MB
> max_fsm_pages = 204800
> random_page_cost = 1.0
> effective_cache_size = 8GB

I would suggest keeping random_page_cost at least slightly above 1.0 and
effective_cache_size should probably be about 6GB rather than 8 since the
shared buffers and other things which use memory reduce the memory available
for cache. Also, work_mem could be larger at least for large batch queries
like this.
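Spelled out as postgresql.conf entries, that advice might look something like
this (example values only, to be adjusted for the actual workload):

random_page_cost = 2.0          # keep it at least slightly above 1.0
effective_cache_size = 6GB      # leave room for shared_buffers and other memory users
work_mem = 128MB                # more room for big batch queries; allocated per sort/hash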

None of this is relevant for this query though. Actually, I think a larger
work_mem can avoid problems with hash joins, so you might try that, but I
don't think that is what is happening here -- and the estimates all look
accurate.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PERFORM] Speed difference between select ... union select ... and select from partitioned_table

2007-10-26 Thread Jeff Davis
On Fri, 2007-10-26 at 16:37 -0400, Pablo Alcaraz wrote:
> Hi List!
> 
> I executed two equivalent queries. The first one uses a union structure.
> The second uses a partitioned table. The tables are the same, with 30
> million rows each, and the returned rows are the same.
> 
> But the union query performs faster than the partitioned query.
> 

I think you mean to use UNION ALL here. UNION forces a DISTINCT, which
results in a sort operation. What surprises me is that the UNION is
actually faster than the partitioning using inheritance.

I suspect it has something to do with the GROUP BYs, but we won't know
until you post EXPLAIN ANALYZE results.

Regards,
Jeff Davis




[PERFORM] Speed difference between select ... union select ... and select from partitioned_table

2007-10-26 Thread Pablo Alcaraz

Hi List!

I executed two equivalent queries. The first one uses a union structure.
The second uses a partitioned table. The tables are the same, with 30
million rows each, and the returned rows are the same.


But the union query performs faster than the partitioned query.

My question is: why? :)

[EMAIL PROTECTED] testeo]$ cat query-union.sql
select e, p,  sum( c) as c
from (
   select e, p, count( *) as c
   from tt_3
   group by e, p
   union
   select e, p, count( *) as c
   from tt_6
   group by e, p
   union
   select e, p, count( *) as c
   from tt_9
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00012
   group by e, p
   union
   select e, p, count( *) as c
   from tt_00015
   group by e, p
) as t
group by e, p
order by e, p desc;



[EMAIL PROTECTED] testeo]$ cat query-heritage.sql
select e, p,  count( *) as c
from tt
group by e, p
order by e, p desc;


The server is an Athlon 64 X2 6000+ with 2 GB RAM, running PostgreSQL 8.2.5.

The table structures are:

CREATE TABLE tt_3
(
-- Inherited:   idtt bigint NOT NULL,
-- Inherited:   idttp bigint NOT NULL,
-- Inherited:   e integer NOT NULL,
-- Inherited:   dmodi timestamp without time zone NOT NULL DEFAULT now(),
-- Inherited:   p integer NOT NULL DEFAULT 0,
-- Inherited:   m text NOT NULL,
 CONSTRAINT tt_3_pkey PRIMARY KEY (idtt),
 CONSTRAINT tt_3_idtt_check CHECK (idtt >= 1::bigint AND idtt <= 
3000::bigint)

) INHERITS (tt)
WITHOUT OIDS;
ALTER TABLE tt_3 ;

CREATE INDEX tt_3_e
 ON tt_3
 USING btree
 (e);








Re: [PERFORM] Suggestions on an update query

2007-10-26 Thread Campbell, Lance
I forgot to include an additional parameter I am using in
Postgresql.conf: 

 

checkpoint_segments = 30

 

Thanks,

 

Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Campbell,
Lance
Sent: Friday, October 26, 2007 3:27 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Suggestions on an update query

 

PostgreSQL version 8.2.4

Memory = 8 GB

CPUs: 1 dual-core Xeon running at 3.0 GHz

 

I have a problem with an update query taking over 10 hours to run. I
rebooted my server and ran the SQL command ANALYZE. Could you please help
me with any suggestions? I have included the two tables involved in the
update below, as well as the indexes I am using.

 

The table result_entry contains 17,767,240 rows and the table
question_number contains 40,787.  Each row from the result_entry table
will match to one and only one row in the table question_number using
the fk_question_id field.  Each row from the question_number table
matches to an average of 436 rows on the result_entry table.

 

CREATE TABLE question_number

(

  fk_form_idinteger not null,

  fk_question_idinteger not null,

  question_number   integer not null,

  sequence_id   integer not null

);

 

ALTER TABLE ONLY question_number ADD CONSTRAINT question_number_pkey
PRIMARY KEY (fk_question_id);

CREATE INDEX question_number_index1 ON question_number USING btree
(question_number);

 

 

CREATE TABLE result_entry (

fk_result_submission_id integer NOT NULL,

fk_question_id integer NOT NULL,

fk_option_order_id integer NOT NULL,  

value character varying,

order_id integer NOT NULL,

question_number integer

);

 

CREATE INDEX result_entery_index1 ON result_entry USING btree
(fk_question_id);

 

 

update result_entry set question_number=question_number.question_number


from question_number where
result_entry.fk_question_id=question_number.fk_question_id;

 

 

 

explain update result_entry set
question_number=question_number.question_number  

from question_number where
result_entry.fk_question_id=question_number.fk_question_id;

 

                                   QUERY PLAN
------------------------------------------------------------------------------

 Hash Join  (cost=1437.71..1046983.94 rows=17333178 width=32)

   Hash Cond: (result_entry.fk_question_id =
question_number.fk_question_id)

   ->  Seq Scan on result_entry  (cost=0.00..612216.78 rows=17333178
width=28)

   ->  Hash  (cost=927.87..927.87 rows=40787 width=8)

 ->  Seq Scan on question_number  (cost=0.00..927.87 rows=40787
width=8)

(5 rows)

 

 

 

Postgresql.conf settings:

 

shared_buffers = 1GB

work_mem = 10MB

max_fsm_pages = 204800

random_page_cost = 1.0

effective_cache_size = 8GB

 

 

Thanks for any help!

 

 

Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu

 



[PERFORM] Suggestions on an update query

2007-10-26 Thread Campbell, Lance
PostgreSQL version 8.2.4

Memory = 8 GB

CPUs: 1 dual-core Xeon running at 3.0 GHz

 

I have a problem with an update query taking over 10 hours to run. I
rebooted my server and ran the SQL command ANALYZE. Could you please help
me with any suggestions? I have included the two tables involved in the
update below, as well as the indexes I am using.

 

The table result_entry contains 17,767,240 rows and the table
question_number contains 40,787.  Each row from the result_entry table
will match to one and only one row in the table question_number using
the fk_question_id field.  Each row from the question_number table
matches to an average of 436 rows on the result_entry table.

 

CREATE TABLE question_number

(

  fk_form_idinteger not null,

  fk_question_idinteger not null,

  question_number   integer not null,

  sequence_id   integer not null

);

 

ALTER TABLE ONLY question_number ADD CONSTRAINT question_number_pkey
PRIMARY KEY (fk_question_id);

CREATE INDEX question_number_index1 ON question_number USING btree
(question_number);

 

 

CREATE TABLE result_entry (

fk_result_submission_id integer NOT NULL,

fk_question_id integer NOT NULL,

fk_option_order_id integer NOT NULL,  

value character varying,

order_id integer NOT NULL,

question_number integer

);

 

CREATE INDEX result_entery_index1 ON result_entry USING btree
(fk_question_id);

 

 

update result_entry set question_number=question_number.question_number


from question_number where
result_entry.fk_question_id=question_number.fk_question_id;

 

 

 

explain update result_entry set
question_number=question_number.question_number  

from question_number where
result_entry.fk_question_id=question_number.fk_question_id;

 

                                   QUERY PLAN
------------------------------------------------------------------------------

 Hash Join  (cost=1437.71..1046983.94 rows=17333178 width=32)

   Hash Cond: (result_entry.fk_question_id =
question_number.fk_question_id)

   ->  Seq Scan on result_entry  (cost=0.00..612216.78 rows=17333178
width=28)

   ->  Hash  (cost=927.87..927.87 rows=40787 width=8)

 ->  Seq Scan on question_number  (cost=0.00..927.87 rows=40787
width=8)

(5 rows)

 

 

 

Postgresql.conf settings:

 

shared_buffers = 1GB

work_mem = 10MB

max_fsm_pages = 204800

random_page_cost = 1.0

effective_cache_size = 8GB

 

 

Thanks for any help!

 

 

Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu

 



Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> So the ratio of reads vs writes to clog files is pretty huge..

It looks to me that the issue is simply one of not having quite enough
CLOG buffers.  Your first run shows 8 different pages being fetched and
the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
older pages, so what we've got here is thrashing for the available
slots.

Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.

regards, tom lane



Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah

Also, to give perspective on the equivalent writes on CLOG:

I used the following script, which runs for 10 seconds to track all writes to
the clog directory, and here is what it came up with (this is with 500
users running).


# cat write.d
#!/usr/sbin/dtrace -s
syscall::write:entry
/execname=="postgres" && 
dirname(fds[arg0].fi_pathname)=="/export/home0/igen/pgdata/pg_clog"/

{
   @write[fds[arg0].fi_pathname,arg1] = count();
}
tick-10sec
{
exit(0);
}

# ./write.d
dtrace: script './write.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  3   1026                      :tick-10sec

  /export/home0/igen/pgdata/pg_clog/001E       -2753028277088        1

#
I modified read.d to do a 5-second read.
# ./read.d
dtrace: script './read.d' matched 3 probes
CPU     ID                    FUNCTION:NAME
  0      1                           :BEGIN
  0   1027                       :tick-5sec

  /export/home0/igen/pgdata/pg_clog/001F       -2753028268896        1
  /export/home0/igen/pgdata/pg_clog/001F       -2753028252512        1
  /export/home0/igen/pgdata/pg_clog/001F       -2753028285280        2
  /export/home0/igen/pgdata/pg_clog/001F       -2753028277088        3
  /export/home0/igen/pgdata/pg_clog/001F       -2753028236128        3
  /export/home0/igen/pgdata/pg_clog/001E       -2753028285280        5
  /export/home0/igen/pgdata/pg_clog/001E       -2753028236128        9
  /export/home0/igen/pgdata/pg_clog/001E       -2753028277088       13
  /export/home0/igen/pgdata/pg_clog/001E       -2753028268896       15
  /export/home0/igen/pgdata/pg_clog/001E       -2753028252512       27

#

So the ratio of reads vs writes to clog files is pretty huge..


-Jignesh



Jignesh K. Shah wrote:

Tom,

Here is what I did:

I started aggregating all read information:

First I had also added a group by pid (arg0, arg1, pid) and the counts
were all coming back as 1.


Then I just grouped by filename and location (arg0, arg1 of reads) and
the counts came back as:


# cat read.d
#!/usr/sbin/dtrace -s
syscall::read:entry
/execname=="postgres"/
{
   @read[fds[arg0].fi_pathname, arg1] = count();
}


# ./read.d
dtrace: script './read.d' matched 1 probe
^C

  /export/home0/igen/pgdata/pg_clog/0014       -2753028293472        1
  /export/home0/igen/pgdata/pg_clog/0014       -2753028277088        1
  /export/home0/igen/pgdata/pg_clog/0015       -2753028244320        2
  /export/home0/igen/pgdata/pg_clog/0015       -2753028268896       14
  /export/home0/igen/pgdata/pg_clog/0015       -2753028260704       25
  /export/home0/igen/pgdata/pg_clog/0015       -2753028252512       27
  /export/home0/igen/pgdata/pg_clog/0015       -2753028277088       28
  /export/home0/igen/pgdata/pg_clog/0015       -2753028293472       37



FYI, I pressed Ctrl-C within less than a second.

So to me this seems to show that multiple processes are reading the same page
from different pids (this was with about 600 users active).


Apparently we do have a problem in that we are reading the same buffer
address again (the same as it not being cached anywhere, or not being found
in the cache anywhere).


I reran the lock wait script on a couple of processes and did not see
CLogControlFileLock as a problem.


# ./83_lwlock_wait.d 14341

Lock Id                    Mode        Count
  WALInsertLock   Exclusive   1
  ProcArrayLock   Exclusive  16

Lock Id   Combined Time (ns)
  WALInsertLock   383109
  ProcArrayLock198866236

# ./83_lwlock_wait.d 14607

Lock Id                    Mode        Count
  WALInsertLock   Exclusive   2
  ProcArrayLock   Exclusive  15

Lock Id   Combined Time (ns)
  WALInsertLock55243
  ProcArrayLock 69700140

#

What will help you find out why it is reading the same page again?


-Jignesh



Jignesh K. Shah wrote:
I agree with Tom. Somehow I think increasing NUM_CLOG_BUFFERS is
just pushing the symptom out to a later value. I promise to look more
into it before making any recommendations to increase NUM_CLOG_BUFFERS.



Though "iGen" showed improvements in that area from increasing
num_clog_buffers, EAStress had shown no improvements. Plus, I think this
is not the problem in 8.3beta1, since the lock output clearly does not
show CLOGControlFile to be the issue which I had seen in the earlier
case. So I don't think that increasing NUM_CLOG_BUFFERS will change
things here.


Now, I don't understand the code very well yet, but I see three hotspots
and am not sure if they are related to each other:
* ProcArrayLock waits - causing waits as reported by the
83_lockwait.d script

Re: [PERFORM] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah


Hi George,

Actually, I first saw the 4M/sec problem during an EAStress-type
run with only 150 connections.


I will try to do the additional testing that Tom has requested today.

Regards,
Jignesh


Gregory Stark wrote:

"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:

  

CLOG data is not cached in any PostgreSQL shared memory segments and hence
becomes the bottleneck as it has to constantly go to the filesystem to get
the read data.



This is the same bottleneck you discussed earlier. CLOG reads are cached in
the Postgres shared memory segment, but only NUM_CLOG_BUFFERS pages are,
which defaults to 8 buffers of 8kB each. With 1,000 clients and the
transaction rate you're running you needed a larger number of buffers.

Using the filesystem buffer cache is also an entirely reasonable solution
though. That's surely part of the logic behind not trying to keep more of the
clog in shared memory. Do you have any measurements of how much time is being
spent just doing the logical I/O to the buffer cache for the clog pages? 4MB/s
seems like it's not insignificant but your machine is big enough that perhaps
I'm thinking at the wrong scale.

I'm really curious whether you see any benefit from the vxid read-only
transactions. I'm not sure how to get an apples to apples comparison though.
Ideally just comparing it to CVS HEAD from immediately prior to the vxid patch
going in. Perhaps calling some function which forces an xid to be allocated
and seeing how much it slows down the benchmark would be a good substitute.

  




Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah

Tom,

Here is what I did:

I started aggregating all read information:

First I had also added a group by pid (arg0, arg1, pid) and the counts
were all coming back as 1.


Then I just grouped by filename and location (arg0, arg1 of reads) and
the counts came back as:


# cat read.d
#!/usr/sbin/dtrace -s
syscall::read:entry
/execname=="postgres"/
{
   @read[fds[arg0].fi_pathname, arg1] = count();
}


# ./read.d
dtrace: script './read.d' matched 1 probe
^C

  /export/home0/igen/pgdata/pg_clog/0014       -2753028293472        1
  /export/home0/igen/pgdata/pg_clog/0014       -2753028277088        1
  /export/home0/igen/pgdata/pg_clog/0015       -2753028244320        2
  /export/home0/igen/pgdata/pg_clog/0015       -2753028268896       14
  /export/home0/igen/pgdata/pg_clog/0015       -2753028260704       25
  /export/home0/igen/pgdata/pg_clog/0015       -2753028252512       27
  /export/home0/igen/pgdata/pg_clog/0015       -2753028277088       28
  /export/home0/igen/pgdata/pg_clog/0015       -2753028293472       37



FYI, I pressed Ctrl-C within less than a second.

So to me this seems to show that multiple processes are reading the same page
from different pids (this was with about 600 users active).


Apparently we do have a problem in that we are reading the same buffer
address again (the same as it not being cached anywhere, or not being found
in the cache anywhere).


I reran the lock wait script on a couple of processes and did not see
CLogControlFileLock as a problem.


# ./83_lwlock_wait.d 14341

Lock Id                    Mode        Count
  WALInsertLock   Exclusive   1
  ProcArrayLock   Exclusive  16

Lock Id   Combined Time (ns)
  WALInsertLock   383109
  ProcArrayLock198866236

# ./83_lwlock_wait.d 14607

Lock Id                    Mode        Count
  WALInsertLock   Exclusive   2
  ProcArrayLock   Exclusive  15

Lock Id   Combined Time (ns)
  WALInsertLock55243
  ProcArrayLock 69700140

#

What will help you find out why it is reading the same page again?


-Jignesh



Jignesh K. Shah wrote:
I agree with Tom. Somehow I think increasing NUM_CLOG_BUFFERS is
just pushing the symptom out to a later value. I promise to look more
into it before making any recommendations to increase NUM_CLOG_BUFFERS.



Though "iGen" showed improvements in that area from increasing
num_clog_buffers, EAStress had shown no improvements. Plus, I think this
is not the problem in 8.3beta1, since the lock output clearly does not
show CLOGControlFile to be the issue which I had seen in the earlier
case. So I don't think that increasing NUM_CLOG_BUFFERS will change
things here.


Now, I don't understand the code very well yet, but I see three hotspots
and am not sure if they are related to each other:
* ProcArrayLock waits - causing waits as reported by the
83_lockwait.d script
* SimpleLRUReadPage - causing read IOs as reported by
iostat/rsnoop.d

* GetSnapshotData - causing CPU utilization as reported by hotuser

But I will shut up and do more testing.

Regards,
Jignesh



Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
 
Actually, 32 made a significant difference as I recall ... do you 
still have the figures for that, Jignesh?



I'd want to see a new set of test runs backing up any call for a change
in NUM_CLOG_BUFFERS --- we've changed enough stuff around this area that
benchmarks using code from a few months back shouldn't carry a lot of
weight.

regards, tom lane
  






Re: [PERFORM] Bunching "transactions"

2007-10-26 Thread Greg Smith

On Fri, 26 Oct 2007, Jean-David Beyer wrote:

I think it was Jon Louis Bentley who wrote (in his book, "Writing 
Efficient Programs") something to the effect, "Premature optimization is 
the root of all evil."


That quote originally comes from Tony Hoare, popularized by a paper 
written by Donald Knuth in 1974.  The full statement is "We should forget 
about small efficiencies, say about 97% of the time: premature 
optimization is the root of all evil.  Yet we should not pass up our 
opportunities in that critical 3%."


My favorite sound-bite on this topic is from William Wulf:  "More 
computing sins are committed in the name of efficiency (without 
necessarily achieving it) than for any other single reason - including 
blind stupidity."  That was back in 1972.  Both his and Knuth's papers 
centered on abusing GOTO, which typically justified at the time via 
performance concerns.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Bunching "transactions"

2007-10-26 Thread Jean-David Beyer
Chris Browne wrote:

> Further, the Right Thing is to group related data together, and come
> up with a policy that is driven primarily by the need for data
> consistency.  If things work well enough, then don't go off trying to
> optimize something that doesn't really need optimization, and perhaps
> break the logic of the application.

Right. I think it was Jon Louis Bentley who wrote (in his book, "Writing
Efficient Programs") something to the effect, "Premature optimization is the
root of all evil." Just because so much of it broke the logic of the
application (and did not help anyway). (Gotta profile first, for one thing.)

I had a boss once who insisted we write everything in assembly language for
efficiency. We did not even know what algorithms we needed for the
application. And at the time (System 360 days), IBM did not even publish the
execution times for the instruction set of the machine we were using because
so many executed in zero-time -- overlapped with other instructions, local
caching in the processor, locality of memory reference, and so on. To get
efficiency, you must first get your algorithms right, including getting the
best ones for the problem at hand.

-- 
  .~.  Jean-David Beyer  Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A Registered Machine   241939.
 /( )\ Shrewsbury, New Jerseyhttp://counter.li.org
 ^^-^^ 10:05:01 up 3 days, 2:23, 1 user, load average: 4.10, 4.24, 4.18



Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah
I agree with Tom. Somehow I think increasing NUM_CLOG_BUFFERS is just
pushing the symptom out to a later value. I promise to look more into it
before making any recommendations to increase NUM_CLOG_BUFFERS.



Though "iGen" showed improvements in that area from increasing
num_clog_buffers, EAStress had shown no improvements. Plus, I think this
is not the problem in 8.3beta1, since the lock output clearly does not
show CLOGControlFile to be the issue which I had seen in the earlier
case. So I don't think that increasing NUM_CLOG_BUFFERS will change
things here.


Now, I don't understand the code very well yet, but I see three hotspots and
am not sure if they are related to each other:
* ProcArrayLock waits - causing waits as reported by the
83_lockwait.d script
* SimpleLRUReadPage - causing read IOs as reported by
iostat/rsnoop.d

* GetSnapshotData - causing CPU utilization as reported by hotuser

But I will shut up and do more testing.

Regards,
Jignesh



Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
  
Actually, 32 made a significant difference as I recall ... do you still have 
the figures for that, Jignesh?



I'd want to see a new set of test runs backing up any call for a change
in NUM_CLOG_BUFFERS --- we've changed enough stuff around this area that
benchmarks using code from a few months back shouldn't carry a lot of
weight.

regards, tom lane
  




Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah
The problem I saw was first highlighted by EAStress runs with PostgreSQL 
on Solaris with 120-150 users. I just replicated that via my smaller 
internal benchmark that we use here to recreate that problem.


EAStress should be just fine to highlight it.. Just put pg_clog on 
O_DIRECT or something so that all IOs go to disk making it easier to 
observe.


In the meanwhile I will try to get more information.


Regards,
Jignesh


Tom Lane wrote:

Gregory Stark <[EMAIL PROTECTED]> writes:
  

Didn't we already go through this? He and Simon were pushing to bump up
NUM_CLOG_BUFFERS and you were arguing that the test wasn't representative and
that clog.c would have to be reengineered to scale well to larger
values.



AFAIR we never did get any clear explanation of what the test case is.
I guess it must be write-mostly, else lazy XID assignment would have
helped this by reducing the rate of XID consumption.

It's still true that I'm leery of a large increase in the number of
buffers without reengineering slru.c.  That code was written on the
assumption that there were few enough buffers that a linear search
would be fine.  I'd hold still for 16, or maybe even 32, but I dunno
how much impact that will have for such a test case.

regards, tom lane

  




Re: [PERFORM] Bunching "transactions"

2007-10-26 Thread Jean-David Beyer
Heikki Linnakangas wrote:
> Jean-David Beyer wrote:
> 
>> My IO system has two Ultra/320 LVD SCSI controllers and 6 10,000rpm SCSI
>> hard drives. The dual SCSI controller is on its own PCI-X bus (the machine
>> has 5 independent PCI-X busses). Two hard drives are on one SCSI controller
>> and the other four are on the other. The WAL is on the first controller,
>> most of the rest is on the other controller. Once in a while, I get 144
>> Megabytes/sec transfers for a few seconds at a time to the hard drive
>> system, where I have an advertizing-maximum of 640 Megabytes/second. Each
>> hard drive claims to take a sustained data rate of about 80
>> Megabytes/second. When I test it, I can get 55 and sometimes a little more
>> for a single drive.
> 
> You might find that you get better performance by just putting all the
> drives on a single RAID array. Or not :-). I'm not a hardware guy
> myself, but having read this mailing list for some time, I've seen
> different people come to different conclusions on that one. I guess it
> depends on the hardware and the application.

In the old days, I was a "hardware guy." But not in the last 15 years or so
(although I did put this machine together from parts). Right now, I do not
think I would get more performance with a single RAID array. Certainly not
if it were software RAID. Right now, I have the WAL on one drive that is not
heavily used when doing bulk loading of the database, and the main data on
the other 4 drives on a different SCSI controller. Measurements revealed
that THE bottleneck was the writing to the WAL.

The indices for any one table are on a different drive from the data itself
to minimize seek contention (and IO transmission contention, too, but that
does not seem to be an issue). Note that now the machine is only in IO-WAIT
state less than 1% of the time, and I no longer notice the main postgres
server process in D state. It used to be in D state a lot of the time before
I started bunching transactions. The IO to the drive with the WAL dropped
from a little over 3000 sectors per second to about 700 sectors per second,
for one thing. And the IO bandwidth consumed dropped, IIRC, about 50%.
> 
>> Likewise, I seem to have enough processing power.
>>
>> top - 12:47:22 up 2 days,  5:06,  4 users,  load average: 1.40, 3.13, 4.20
>> Tasks: 168 total,   3 running, 165 sleeping,   0 stopped,   0 zombie
>> Cpu0  : 29.5%us,  3.3%sy,  0.0%ni, 67.0%id,  0.2%wa,  0.0%hi,  0.0%si,
>> Cpu1  : 21.8%us,  3.1%sy,  0.0%ni, 73.7%id,  1.4%wa,  0.0%hi,  0.0%si,
>> Cpu2  : 24.6%us,  3.6%sy,  0.0%ni, 71.7%id,  0.1%wa,  0.0%hi,  0.0%si,
>> Cpu3  : 23.1%us,  2.7%sy,  0.0%ni, 74.0%id,  0.1%wa,  0.1%hi,  0.0%si,
>> Mem:   8185340k total,  5112656k used,  3072684k free,32916k buffers
>> Swap:  4096496k total,  384k used,  4096112k free,  4549536k cached
>>
 > Actually it looks like you've saturated the CPU.

How do you figure that? There are two or four (depending on how you count
them) CPUs. The CPUs seem to be running at 75% idle. If I let BOINC
processes run (nice 19), I can soak up most of this idle time. I turned them
off for the purpose of these measurements because they hide the io-wait times.

> Postgres backends are
> single-threaded, so a single bulk load like that won't use more than one
> CPU at a time. If you add up the usr percentages above, it's ~100%.

If you add up the idle percentages, it is about 300%. Recall that there are
two hyperthreaded processors here. That is more than two processors (but
less than four). If I examine the postgres processes, one of them used to
get to 100% once in a while when I did things like DELETE FROM tablename;
but I do a TRUNCATE now and it is much faster. Now any single process peaks
at 80% of a CPU and usually runs at less than 50%. The postgres processes
run on multiple CPUs. Looking at the top command, normally my client runs at
around 20% on one CPU, the main postgres server runs on a second at between
30% and 80% (depends on which tables I am loading), and the writer runs on
yet another. The two loggers wander around more. But these last three run at
around 1% each. In fact, the writer is idle much of the time.
> 
> You should switch to using COPY if you can.
> 
Someone else posted that I should not get neurotic about squeezing the last
little bit out of this (not his exact words), and I agree. This is only for
doing an initial load of the database after all. And as long as the total
time is acceptable, that is good enough. When I first started this (using
DB2), one of the loads used to take something like 10 hours. Redesigning my
basic approach got that time down to about 2 hours without too much
programming effort. As the amount of data has increased, that started
creeping up, and one of the tables, that has about 6,000,000 entries at the
moment, took overnight to load. That is why I looked into bunching these
transactions, with gratifying results.

To use COPY, I would have to write a bunch of special purpose programs.
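For completeness, a bare-bones sketch of the COPY alternative Heikki mentions
(table and file names are placeholders):

-- Load pre-formatted, tab-delimited rows in a single command.
COPY my_big_table (col1, col2, col3) FROM '/path/to/my_big_table.dat';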

Re: [PERFORM] Finalizing commit taking very long

2007-10-26 Thread Giulio Cesare Solaroli
Hello Tom,

I can confirm that adding the indexes used by the deferred constraint
triggers solved the issue.

Thank you very much for your suggestions.

Best regards,

Giulio Cesare


On 10/24/07, Giulio Cesare Solaroli <[EMAIL PROTECTED]> wrote:
> On 10/24/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Giulio Cesare Solaroli" <[EMAIL PROTECTED]> writes:
> > > How can I try to isolate the trigger taking so long, in oder to
> > > understand which is/are the missing index(es)?
> >
> > Try SET CONSTRAINTS ALL IMMEDIATE and then EXPLAIN ANALYZE the
> > delete.  This should cause all the triggers to run within the
> > scope of the EXPLAIN ANALYZE, and you'll be able to see which
> > one(s) are slow.  (This assumes you're running a recent release
> > of PG; I think EXPLAIN shows trigger times since 8.1 or so.)
>
> I was thinking about something similar after writing the last message.
>
> Thank you very much for your attention!!
>
> Giulio Cesare
>



Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Gregory Stark
"Josh Berkus" <[EMAIL PROTECTED]> writes:

> Actually, 32 made a significant difference as I recall ... do you still have 
> the figures for that, Jignesh?

Well it made a difference but it didn't remove the bottleneck, it just moved
it. IIRC under that benchmark Jignesh was able to run with x sessions
efficiently with 8 clog buffers, x + 100 or so sessions with 16 clog buffers
and x + 200 or so sessions with 32 clog buffers.

It happened that x + 200 was > the number of sessions he wanted to run the
benchmark at so it helped the benchmark results quite a bit. But that was just
an artifact of how many sessions the benchmark needed. A user who needs 1200
sessions or who has a different transaction load might find he needs more clog
buffers to alleviate the bottleneck. And of course most (all?) normal users
use far fewer sessions and won't run into this bottleneck at all.

Raising NUM_CLOG_BUFFERS just moves around the arbitrary bottleneck. This
benchmark is useful in that it gives us an idea where the bottleneck lies for
various values of NUM_CLOG_BUFFERS but it doesn't tell us what value realistic
users are likely to bump into.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
