Re: [PERFORM] Delete performance again

2008-10-03 Thread Marc Mamin
Hi,
 
Maybe you can try this syntax. I'm not sure, but it may well perform better:
 
 
delete from company_alias USING comprm
where company_alias.company_id = comprm.id;
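
A quick way to check whether the planner likes this form better, without
actually deleting anything, is to look at the plan first:

explain delete from company_alias USING comprm
where company_alias.company_id = comprm.id;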


Cheers,

Marc

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] 7.4 - basic tuning question

2008-10-03 Thread Richard Huxton
Simon Waters wrote:

The best advice is to "upgrade at your earliest convenience" with
performance questions and 7.4 - you're missing a *lot* of improvements.
You say you're planning to anyway, and I'd recommend putting effort into
the upgrade rather than wasting effort on tuning a system you're leaving.

> I assume that the histogram_bounds for strings are alphabetical in order, so 
> that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even on a worst case 
> of including both these common values, the planner ought to have assumed that 
> less than <10% of records were likely covered by the value selected, so it 
> seems unlikely to me that not using the index would be a good idea.

Well, the real question is how many blocks need to be read to find those
DEMOSTART rows. At some point around 5-10% of the table it's easier just
to read the whole table than go back and forth between index and table.
The precise point will depend on how much RAM you have, disk speeds etc.

> => SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
> (...lots of time passes...)
>  count
> ---
>   1432
> (1 row)

OK, not many. The crucial bit is below though. These are the 10 values
it will hold stats on, and all it knows is that DEMOSTART has less than
57000 entries. OK, it's more complicated than that, but basically there
are the values it tracks, and everything else. So it assumes that all other
values have the same chance of occurring.
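
For instance, this shows exactly what it tracks for that column; it is
essentially what your BEFORE/AFTER pg_stats dumps contain:

SELECT n_distinct, most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 'log' AND attname = 'event';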

> => SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;
> 
>   count |   event
> +---
[snip]
>   57022 | NEWUSR
>   64907 | PUBREC0
>   65449 | UNPUBLISH
>   92843 | LOGOUT
>   99018 | KILLSESS
>  128900 | UPLOAD
>  134994 | LOGIN
>  137608 | NEWPAGE
>  447556 | PUBREC1
>  489572 | PUBLISH

Which is why it guesses 20436 rows below. If you'd done "SET
enable_seqscan = off" and then run the explain again, it should have
estimated a cost for the index scan of more than 54317.14.
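
To put numbers on that: your BEFORE pg_stats shows seven most-common values
covering roughly 81% of the table and n_distinct = 25, so any untracked value
gets a selectivity of roughly (1 - 0.813) / (25 - 7) = 0.0104, and that times
the planner's row count for the table (just under 2 million) is where the
20436 in the plan below comes from. The enable_seqscan test would be
something like this (session-local only, not something to leave set):

SET enable_seqscan = off;
EXPLAIN SELECT * FROM log WHERE event = 'DEMOSTART';
SET enable_seqscan = on;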

> => EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
>  QUERY PLAN
> 
>  Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
>Filter: (event = 'DEMOSTART'::text)
> (2 rows)
> 
> 
> => ALTER TABLE log ALTER COLUMN events SET STATISTICS 50; ANALYSE
> LOG(event);
> ALTER TABLE
> ANALYZE
> 
> 
> => EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
> QUERY PLAN
> 
> ---
>  Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
>->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310
> width=0)
>  Index Cond: (event = 'DEMOSTART'::text)
> (3 rows)

Not bad - now it knows how many rows it will find, and it sees that the
index is cheaper. It's not completely accurate - it uses a statistical
sampling (and of course it's out of date as soon as you update the table).
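
If the table keeps growing it is worth re-running the analyse from time to
time so the estimates stay reasonably current, e.g.:

ANALYSE log (event);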

HTH

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] 7.4 - basic tuning question

2008-10-03 Thread Simon Waters
Hi,

we have a log table on one server with 1.9 million records.

One column "event" (type text) in that table is a string that (currently) 
takes a small number of distinct values (~43) (hmm that could have been 
normalised better).

We noted, when querying for events of a specific type, that the queries were
slower than expected. It simply wasn't using the index (btree, default
settings) on this column on this server (the test server, with fewer records,
was fine).

Using "ALTER TABLE SET STATISTICS" to increase the number of buckets to 50 
resolved the issue, we went pretty much straight there on discovering there 
are no "HINTS".

However, we aren't quite sure why this case was pathological, and my brain
doesn't quite grok the documentation.

I assume that the histogram_bounds for strings are alphabetical in order, so 
that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even on a worst case 
of including both these common values, the planner ought to have assumed that 
less than 10% of records were likely covered by the value selected, so it
seems unlikely to me that not using the index would be a good idea.

What am I missing? (and yes there is a plan to upgrade!).


=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...lots of time passes...)
 count
---
  1432
(1 row)


=> SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;

  count |   event
+---
  6 | DNRFAIL
 14 | ADMDNR
 14 | UPGRADE
 18 | FOCRENEW
 21 | AUTOCN
 25 | ADMCC
 27 | TEMPIN
 31 | DNRCANCEL
 43 | EXPIRED
128 | DIRECTBUY
130 | CANCEL
130 | CANCELQ
154 | FOCBUY
173 | EXPCCWARN
179 | OFFER
209 | DNROK
214 | TEMPRE
356 | CCWARN
429 | ADMLOGIN
719 | SUBSCRIBE
787 | CCSUCCESS
988 | CCFAILURE
   1217 | TEMPNEW
   1298 | PAYPAL
   1431 | DEMOSTART
   1776 | CCREQUEST
   2474 | ACCTUPD
  15169 | SYSMAINT
  42251 | IDEMAIL
  46964 | DELETE
  50764 | RELOGIN
  57022 | NEWUSR
  64907 | PUBREC0
  65449 | UNPUBLISH
  92843 | LOGOUT
  99018 | KILLSESS
 128900 | UPLOAD
 134994 | LOGIN
 137608 | NEWPAGE
 447556 | PUBREC1
 489572 | PUBLISH


=> EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
 QUERY PLAN

 Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
   Filter: (event = 'DEMOSTART'::text)
(2 rows)


=> ALTER TABLE log ALTER COLUMN event SET STATISTICS 50; ANALYSE LOG(event);
ALTER TABLE
ANALYZE


=> EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
QUERY PLAN

---
 Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
   ->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310
width=0)
 Index Cond: (event = 'DEMOSTART'::text)
(3 rows)


=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...almost no time passes...)
 count
---
  1432
(1 row)


BEFORE
pajax=> select * from pg_stats where tablename = 'log' and attname='event';
 schemaname        | public
 tablename         | log
 attname           | event
 null_frac         | 0
 avg_width         | 10
 n_distinct        | 25
 most_common_vals  | {PUBLISH,PUBREC1,NEWPAGE,UPLOAD,LOGIN,KILLSESS,LOGOUT}
 most_common_freqs | {0.257333,0.248333,0.072,0.0696667,0.061,0.054,0.0506667}
 histogram_bounds  | {ACCTUPD,DELETE,IDEMAIL,NEWUSR,NEWUSR,PUBREC0,PUBREC0,RELOGIN,SYSMAINT,UNPUBLISH,UNPUBLISH}
 correlation       | 0.120881
(1 row)

AFTER
pajax=> select * from pg_stats where tablename='log' and attname='event';
 schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+---------+-----------+-----------+------------+------------------+-------------------+------------------+-------------

Re: [PERFORM] Slow Inserts on large tables

2008-10-03 Thread Tom Lane
"Peter Childs" <[EMAIL PROTECTED]> writes:
> 2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>:
>> Then show us your checkpointing-related parameters.

> I've currently got them set to

> checkpoint_segments = 3
> checkpoint_timeout = 180s
> checkpoint_completion_target = 0.5

> after reading that doing more smaller checkpoints might make each
> checkpoint work quicker and hence less of a performance hit when they
> actually happen.

That concept is actually pretty obsolete in 8.3: with spread-out
checkpoints it basically shouldn't hurt to increase the checkpoint
interval, and could actually help because the bgwriter doesn't have
such a tight deadline to finish the checkpoint.  In any case you
*definitely* need to increase checkpoint_segments --- the value
you've got could be forcing a checkpoint every few seconds not
every few minutes.

What I would suggest is turning on log_checkpoints and then seeing
if there's any correlation between your slow insert commands and the
checkpoints.  I'm suspicious that the problem is somewhere else.
(For instance, have you got anything that might take a lock on the
table?  Maybe enabling log_lock_waits would be a good idea too.)
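
A minimal postgresql.conf sketch of the above (the exact numbers are only
illustrative, tune them to your workload):

checkpoint_segments = 16        # was 3; stops WAL volume forcing a checkpoint every few seconds
checkpoint_timeout = 5min       # let the timeout, not the segment count, drive checkpoints
log_checkpoints = on            # so you can line checkpoints up with the slow inserts
log_lock_waits = on             # logs any wait longer than deadlock_timeout for a lock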

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] dedicated server & postgresql 8.1 conf tunning

2008-10-03 Thread Thomas Spreng


On 2. Oct, 2008, at 10:00, <[EMAIL PROTECTED]> wrote:
> Unfortunately, I can't update pgsql to 8.3 since it's not in debian
> stable.


Did you consider using backport packages (http://www.backports.org) for
Debian Etch? They provide postgresql 8.3.3 packages for Etch.
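
The usual recipe is roughly the following (check backports.org for the exact
current line and the archive keyring before copying it):

# /etc/apt/sources.list
deb http://www.backports.org/debian etch-backports main contrib non-free

# then
apt-get update
apt-get -t etch-backports install postgresql-8.3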


Cheers.

PS: We are also running backported postgresql packages using Debian Etch
on our production servers without any problems.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow Inserts on large tables

2008-10-03 Thread Iulian Dragan
Peter,

(please take this with a pinch of salt as I am no expert)

Here is a possible scenario:
Each of your checkpoints is spread over about 90 seconds (that is what you
told it to do with checkpoint_completion_target).
If your inserts fill 3 checkpoint segments (48 MB) in less than 90 seconds,
then a new checkpoint request is issued while the previous one is still in
progress. And maybe a third one, and so on. I imagine that this can flood the
disk cache with write requests at some point, although I can't explain
exactly how.
Have a look at the log, see the interval between the checkpoint requests and 
try to make this (a lot) larger than the checkpoint duration.
Start by increasing your checkpoint_segments (to, say, 16). If this doesn't 
work, maybe the timeout is too short, or the 90-second target too generous.
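
For reference, the arithmetic behind the 48 MB and ~90 second figures above,
assuming the default 16 MB WAL segment size:

  3 checkpoint_segments x 16 MB per segment = 48 MB of WAL before a checkpoint is forced
  0.5 checkpoint_completion_target x 180 s checkpoint_timeout = ~90 s to spread each checkpoint over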

Regards,

Iulian

--- On Fri, 10/3/08, Peter Childs <[EMAIL PROTECTED]> wrote:
From: Peter Childs <[EMAIL PROTECTED]>
Subject: Re: [PERFORM] Slow Inserts on large tables
To: 
Cc: "Postgresql Performance" 
Date: Friday, October 3, 2008, 9:47 AM

2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>:
> Peter Childs wrote:
>>
>> I have a problem where by an insert on a "large" table will sometimes
>> take longer than usual.
>
>> I think the problem might have something to do with checkpoints,
>
> Then show us your checkpointing-related parameters.  Or try to set them to a
> lot higher values so checkpoints happen more rarely and see if that makes a
> difference.
>
>

More often or less often?

I've currently got them set to

checkpoint_segments = 3
checkpoint_timeout = 180s
checkpoint_completion_target = 0.5

after reading that doing more smaller checkpoints might make each
checkpoint work quicker and hence less of a performance hit when they
actually happen.

Regards

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



  

Re: [PERFORM] Slow Inserts on large tables

2008-10-03 Thread Peter Childs
2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>:
> Peter Childs wrote:
>>
>> I have a problem where by an insert on a "large" table will sometimes
>> take longer than usual.
>
>> I think the problem might have something to do with checkpoints,
>
> Then show us your checkpointing-related parameters.  Or try to set them to a
> lot higher values so checkpoints happen more rarely and see if that makes a
> difference.
>
>

More often or less often?

I've currently got them set to

checkpoint_segments = 3
checkpoint_timeout = 180s
checkpoint_completion_target = 0.5

after reading that doing more smaller checkpoints might make each
checkpoint work quicker and hence less of a performance hit when they
actually happen.

Regards

Peter

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow Inserts on large tables

2008-10-03 Thread Peter Eisentraut

Peter Childs wrote:
> I have a problem where by an insert on a "large" table will sometimes
> take longer than usual.
>
> I think the problem might have something to do with checkpoints,

Then show us your checkpointing-related parameters.  Or try to set them
to a lot higher values so checkpoints happen more rarely and see if that
makes a difference.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Slow Inserts on large tables

2008-10-03 Thread Peter Childs
I have a problem whereby an insert on a "large" table will sometimes
take longer than usual.

Usually the inserts are quick, but from time to time one will take a
long time, sometimes as much as 10 seconds or longer. (They are usually
under 500 ms, which is the threshold at which I start logging them.)
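
For what it's worth, log_min_duration_statement is the usual way to get that
kind of logging, e.g. in postgresql.conf:

log_min_duration_statement = 500    # log every statement that runs longer than 500 ms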

The data is slow drip fed, so bulk loading really is not an option;
it's logging data, used mostly for analysis and historical purposes.

I think the problem might have something to do with checkpoints. I'm
relatively sure it's not the table expanding, as I've run a VACUUM
VERBOSE straight away after a longer insert and not found loads of
free space in the FSM.

I'm using 8.3.1 (I thought I'd upgraded to 8.3.3, but it does not look
like the upgrade worked). I'm more than happy to upgrade; I just have to
find the downtime (even a few seconds can be difficult).
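
Incidentally, the definitive check of which binaries are actually answering,
regardless of what packages are on disk, is simply

select version();

run against the live server.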

Any help would be appreciated.

Regards

Peter Childs

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance