Re: [PERFORM] Query plan for heavy SELECT with lite sub-SELECTs

2006-11-03 Thread Arjen van der Meijden

Alvaro Herrera wrote:

Performance analysis of strange queries is useful, but the input queries
have to be meaningful as well.  Otherwise you end up optimizing bizarre
and useless cases.



I had a similar one a few weeks ago. I did some batch-processing over a 
bunch of documents and discovered postgresql was faster if I let it 
process just 1000 documents, in stead of all 45000 at the same time. But 
with 1000 it was faster than 1000x one document.


So I started with a query like:
SELECT docid, (SELECT work to be done for each document)
FROM documents
ORDER BY docid
LIMIT 1000
OFFSET ?

And I noticed the 44th iteration was much slower than the first.

Rewriting it to something like this made the last iteration about as 
fast as the first:

SELECT docid, (SELECT work to be done for each document)
FROM documents
WHERE docid IN (SELECT docid FROM documents
ORDER BY docid
LIMIT 1000
OFFSET ?
)

I know something like that isn't very set-based thinking, but then again 
the query's structure did come from a iterative algoritm, but turned out 
to be faster (less query-overhead) and easier to scale in PostgreSQL. 
I've tried a few more set-like structures, but those were all slower 
than this aproach probably because they would be were a little more 
complex. Some of them took more than 10x the amount of time...


Another real-life example would be to display the amount of replies to a 
topic in a topic listing of a forum or the name of the author of the 
last message. You probably don't want to count all the replies for each 
topic if you're only going to display headings 100 - 200.
And there are a few more examples to think of where a join+group by 
isn't going to work, but a subquery in the selectlist just does what you 
want.

Of course most of the time you won't be using a OFFSET then.

Best regards,

Arjen

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] profiling PL/pgSQL?

2006-11-03 Thread Drew Wilson
I have 700 lines of non-performant pgSQL code that I'd like to  
profile to see what's going on.


What's the best way to profile stored procedures?

Thanks,

Drew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] profiling PL/pgSQL?

2006-11-03 Thread A. Kretschmer
am  Fri, dem 03.11.2006, um  3:12:14 -0800 mailte Drew Wilson folgendes:
 I have 700 lines of non-performant pgSQL code that I'd like to  
 profile to see what's going on.
 
 What's the best way to profile stored procedures?

RAISE NOTICE, you can raise the aktual time within a transaction with
timeofday()


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47215,   D1: 0160/7141639 (mehr: - Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] profiling PL/pgSQL?

2006-11-03 Thread Richard Huxton

A. Kretschmer wrote:

am  Fri, dem 03.11.2006, um  3:12:14 -0800 mailte Drew Wilson folgendes:
I have 700 lines of non-performant pgSQL code that I'd like to  
profile to see what's going on.


What's the best way to profile stored procedures?


RAISE NOTICE, you can raise the aktual time within a transaction with
timeofday()


Of course you only have very small values of best available with 
plpgsql debugging.


There's a GUI debugger from EnterpriseDB I believe, but I've no idea how 
good it is. Any users/company bods care to let us know?


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Context switch storm

2006-11-03 Thread creimer


Hi,

We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow.

The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25.

CPU and memory usage are ok.

What is producing this context switching storms?

It is a box with 16GB RAM and 4 XEONprocessors running RedHat Enterprise Linux AS.

Should I disable Hyperthreading?

Thank you in advance!

Reimer


Re: [PERFORM] Context switch storm

2006-11-03 Thread Gregory S. Williamson
Based on what other people have posted, hyperthreading seems not to be 
beneficial for postgres -- try searching through the archives of this list. 
(And then turn it off and see if it helps.)

You might also post a few details:

config settings (shared_buffers, work_mem, maintenance_work_mem, wal and 
checkpoint settings, etc.)

are you using autovacuum ?

all tables are vacuumed and analyzed regularly ? How big are they ? Do they and 
indexes fit in RAM ?

any particular queries that running and might be related (explain analyze 
results of them would be useful)

disk configuration

Other processes on this box ?

# of connections to it (I've seen this alone push servers over the edge)

HTH,

Greg Williamson
DBA
GlobeXplorer LLC

-Original Message-
From:   [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
Sent:   Fri 11/3/2006 2:32 AM
To: pgsql-performance@postgresql.org
Cc: 
Subject:[PERFORM] Context switch storm

Hi,
 
We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 
hours) the server start doing a lot of context switching and all transactions 
become very slow.
 
The average context switching for this server as vmstat shows is 1 but when the 
problem occurs it goes to 25.
 
CPU and memory usage are ok.
 
What is producing this context switching storms?
 
It is a box with 16GB RAM and 4 XEON processors running RedHat Enterprise Linux 
AS.
 
Should I disable Hyperthreading?
 
Thank you in advance!
 
Reimer



---
Click link below if it is SPAM [EMAIL PROTECTED]
https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=454b34ac206028992556831[EMAIL
 PROTECTED]retrain=spamtemplate=historyhistory_page=1
!DSPAM:454b34ac206028992556831!
---




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Context switch storm

2006-11-03 Thread Richard Huxton

[EMAIL PROTECTED] wrote:

Hi,

We've migrated one of our servers from pg 7.4 to 8.1 and from times
to times (4 hours) the server start doing a lot of context switching
and all transactions become very slow.

The average context switching for this server as vmstat shows is 1
but when the problem occurs it goes to 25.

CPU and memory usage are ok.

What is producing this context switching storms?



It is a box with 16GB RAM and 4 XEON processors running RedHat
Enterprise Linux AS.


It's memory bandwidth issues on the older Xeons. If you search the 
archives you'll see a lot of discussion of this. I'd have thought 8.1 
would be better than 7.4 though.


You'll tend to see it when you have multiple clients and most queries 
can use RAM rather than disk I/O. My understanding of what happens is 
that PG requests data from RAM - it's not in cache so the process gets 
suspended to wait. The next process does the same, with the same result. 
  You end up with lots of processes all fighting over what data is in 
the cache and no-one gets much work done.



Should I disable Hyperthreading?


I seem to remember that helps, but do check the mailing list archives 
for discussion on this.


If you can keep your numbers of clients down below the critical level, 
you should find the overall workload is fine. The problem is of course 
that as the context-switching increases, each query takes longer which 
means more clients connect, which increases the context-swtching, which 
means...


HTH
--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Context switch storm

2006-11-03 Thread Richard Huxton

Cosimo Streppone wrote:

Richard Huxton wrote:


[EMAIL PROTECTED] wrote:


The average context switching for this server as vmstat shows is 1
but when the problem occurs it goes to 25.


You'll tend to see it when you have multiple clients and most queries 
can use RAM rather than disk I/O. My understanding of what happens is 
that PG requests data from RAM - it's not in cache so the process gets 
suspended to wait. The next process does the same, with the same 
result.   You end up with lots of processes all fighting over what 
data is in the cache and no-one gets much work done.


Does this happen also with 8.0, or is specific to 8.1 ?


All versions suffer to a degree - they just push the old Xeon in the 
wrong way. However, more recent versions *should* be better than older 
versions. I believe some work was put in to prevent contention on 
various locks which should reduce context-switching across the board.



I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server


upgrade from 8.0.1 - the most recent is 8.0.9 iirc


when I raise `shared_buffers' from 8192 to 4.
I would expect an increase in tps/concurrent clients, but I see an average
performance below a certain threshold of users, and when concurrent users
get above that level, performance starts to drop, no matter what I do.


Are you seeing a jump in context-switching in top? You'll know when you 
do - it's a *large* jump. That's the key diagnosis. Otherwise it might 
simply be your configuration settings aren't ideal for that workload.



Server logs and io/vm statistics seem to indicate that there is little
or no disk activity but machine loads increases to 7.0/8.0.
After some minutes, the problem goes away, and performance returns
to acceptable levels.


That sounds like it. Query time increases across the board as all the 
clients fail to get any data back.



When the load increases, *random* database queries show this slowness,
even if they are perfectly planned and indexed.

Is there anything we can do?


Well, the client I saw it with just bought a dual-opteron server and 
used their quad-Xeon for something else. However, I do remember that 8.1 
seemed better than 7.4 before they switched. Part of that might just 
have been better query-planning and other efficiences though.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Setting nice values

2006-11-03 Thread Andreas Kostyrka
Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe:
 Sometimes it's the simple solutions that work best.  :)  Welcome to the
 world of pgsql, btw...

OTOH, there are also non-simple solutions to this, which might make
sense anyway: Install slony, and run your queries against a readonly
replica of your data.

Andreas



signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil


Re: [PERFORM] Context switch storm

2006-11-03 Thread Andreas Kostyrka
The solution for us has been twofold:

upgrade to the newest PG version available at the time while we waited
for our new Opteron-based DB hardware to arrive.

Andreas


Am Freitag, den 03.11.2006, 13:29 + schrieb Richard Huxton:
 Cosimo Streppone wrote:
  Richard Huxton wrote:
  
  [EMAIL PROTECTED] wrote:
 
  The average context switching for this server as vmstat shows is 1
  but when the problem occurs it goes to 25.
 
  You'll tend to see it when you have multiple clients and most queries 
  can use RAM rather than disk I/O. My understanding of what happens is 
  that PG requests data from RAM - it's not in cache so the process gets 
  suspended to wait. The next process does the same, with the same 
  result.   You end up with lots of processes all fighting over what 
  data is in the cache and no-one gets much work done.
  
  Does this happen also with 8.0, or is specific to 8.1 ?
 
 All versions suffer to a degree - they just push the old Xeon in the 
 wrong way. However, more recent versions *should* be better than older 
 versions. I believe some work was put in to prevent contention on 
 various locks which should reduce context-switching across the board.
 
  I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
 
 upgrade from 8.0.1 - the most recent is 8.0.9 iirc
 
  when I raise `shared_buffers' from 8192 to 4.
  I would expect an increase in tps/concurrent clients, but I see an average
  performance below a certain threshold of users, and when concurrent users
  get above that level, performance starts to drop, no matter what I do.
 
 Are you seeing a jump in context-switching in top? You'll know when you 
 do - it's a *large* jump. That's the key diagnosis. Otherwise it might 
 simply be your configuration settings aren't ideal for that workload.
 
  Server logs and io/vm statistics seem to indicate that there is little
  or no disk activity but machine loads increases to 7.0/8.0.
  After some minutes, the problem goes away, and performance returns
  to acceptable levels.
 
 That sounds like it. Query time increases across the board as all the 
 clients fail to get any data back.
 
  When the load increases, *random* database queries show this slowness,
  even if they are perfectly planned and indexed.
  
  Is there anything we can do?
 
 Well, the client I saw it with just bought a dual-opteron server and 
 used their quad-Xeon for something else. However, I do remember that 8.1 
 seemed better than 7.4 before they switched. Part of that might just 
 have been better query-planning and other efficiences though.
 


signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil


Re: [PERFORM] Context switch storm

2006-11-03 Thread Richard Troy

On Fri, 3 Nov 2006, Richard Huxton wrote:

 It's memory bandwidth issues on the older Xeons. If you search the
 archives you'll see a lot of discussion of this. I'd have thought 8.1
 would be better than 7.4 though.

Hmmm... I just checked; one of our production systems is a multi-cpu Xeon
based system of uncertain age (nobody remember 'zactly). While we haven't
seen this problem yet, it's scheduled to take over demo-duty shortly and
it would be an embarassment if we had this trouble during a demo... Is
there any easy way to tell if you're at risk?

Thanks,
Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Context switch storm

2006-11-03 Thread Richard Huxton

Richard Troy wrote:

On Fri, 3 Nov 2006, Richard Huxton wrote:

It's memory bandwidth issues on the older Xeons. If you search the
archives you'll see a lot of discussion of this. I'd have thought 8.1
would be better than 7.4 though.


Hmmm... I just checked; one of our production systems is a multi-cpu Xeon
based system of uncertain age (nobody remember 'zactly). While we haven't
seen this problem yet, it's scheduled to take over demo-duty shortly and
it would be an embarassment if we had this trouble during a demo... Is
there any easy way to tell if you're at risk?


Try:
- multiple clients
- query doing sorts that fit into work_mem

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Context switch storm

2006-11-03 Thread Cosimo Streppone

Richard Huxton wrote:


[EMAIL PROTECTED] wrote:

Hi,

We've migrated one of our servers from pg 7.4 to 8.1 and from times
to times (4 hours) the server start doing a lot of context switching
and all transactions become very slow.

The average context switching for this server as vmstat shows is 1
but when the problem occurs it goes to 25.


You'll tend to see it when you have multiple clients and most queries 
can use RAM rather than disk I/O. My understanding of what happens is 
that PG requests data from RAM - it's not in cache so the process gets 
suspended to wait. The next process does the same, with the same result. 
  You end up with lots of processes all fighting over what data is in 
the cache and no-one gets much work done.


Does this happen also with 8.0, or is specific to 8.1 ?
I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
when I raise `shared_buffers' from 8192 to 4.
I would expect an increase in tps/concurrent clients, but I see an average
performance below a certain threshold of users, and when concurrent users
get above that level, performance starts to drop, no matter what I do.

Server logs and io/vm statistics seem to indicate that there is little
or no disk activity but machine loads increases to 7.0/8.0.
After some minutes, the problem goes away, and performance returns
to acceptable levels.

When the load increases, *random* database queries show this slowness,
even if they are perfectly planned and indexed.

Is there anything we can do?

--
Cosimo


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Context switch storm

2006-11-03 Thread creimer
 If you can keep your numbers of clients down below the critical  level,  you should find the overall workload is fine. 


We have at about 600 connections. Is this a case to use a connection pool (pg_pool) system?

And why this happens only with 8.0 and 8.1 and not with the 7.4?




Re: [PERFORM] Context switch storm

2006-11-03 Thread Richard Huxton

[EMAIL PROTECTED] wrote:
If you can keep your numbers of clients down below the critical 
level, you should find the overall workload is fine.


We have at about 600 connections. Is this a case to use a connection
pool (pg_pool) system?


Possibly - that should help. I'm assuming that most of your queries are 
very short, so you could probably get that figure down a lot lower. 
You'll keep the same amount of queries running through the system, just 
queue them up.



And why this happens only with 8.0 and 8.1 and not with the 7.4?


Not sure. Maybe 8.x is making more intensive use of your memory, 
possibly with a change in your plans.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Query plan for heavy SELECT with lite sub-SELECTs

2006-11-03 Thread Tom Lane
Arjen van der Meijden [EMAIL PROTECTED] writes:
 ... Rewriting it to something like this made the last iteration about as 
 fast as the first:

 SELECT docid, (SELECT work to be done for each document)
 FROM documents
 WHERE docid IN (SELECT docid FROM documents
   ORDER BY docid
   LIMIT 1000
   OFFSET ?
 )

The reason for this, of course, is that the LIMIT/OFFSET filter is the
last step in a query plan --- it comes *after* computation of the SELECT
output list.  (So does ORDER BY, if an explicit sort step is needed.)
So if you have an expensive-to-compute output list, a trick like Arjen's
will help.  I don't think you can use an IN though, at least not if
you want to preserve the sort ordering in the final result.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] profiling PL/pgSQL?

2006-11-03 Thread Jonah H. Harris

On 11/3/06, Richard Huxton dev@archonet.com wrote:

There's a GUI debugger from EnterpriseDB I believe, but I've no idea how
good it is. Any users/company bods care to let us know?


If you visit:
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/edb-debugger/#dirlist

We have both a PL/pgSQL profiler and tracer available.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Context switch storm

2006-11-03 Thread Tom Lane
[EMAIL PROTECTED] writes:
 And why this happens only with 8.0 and 8.1 and not with the 7.4?

8.0 and 8.1 are vulnerable to this behavior because of conflicts for
access to pg_subtrans (which didn't exist in 7.4).  The problem occurs
when you have old open transactions, causing the window over which
pg_subtrans must be accessed to become much wider than normal.
8.2 should eliminate or at least alleviate the issue, but in the
meantime see if you can get your applications to not sit on open
transactions.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Context switch storm

2006-11-03 Thread Andreas Kostyrka
Am Freitag, den 03.11.2006, 14:38 + schrieb Richard Huxton:
 [EMAIL PROTECTED] wrote:
  If you can keep your numbers of clients down below the critical 
  level, you should find the overall workload is fine.
  
  We have at about 600 connections. Is this a case to use a connection
  pool (pg_pool) system?
 
 Possibly - that should help. I'm assuming that most of your queries are 
 very short, so you could probably get that figure down a lot lower. 
 You'll keep the same amount of queries running through the system, just 
 queue them up.
that have 
Ah, yes, now that you mention, avoid running many queries with a
similiar timing behaviour, PG8 seems to have a lock design that's very
bad for the memory architecture of the Xeons.

So running SELECT * FROM table WHERE id=1234567890; from 600 clients in
parallel can be quite bad than say a complicated 6-way join :(

Andreas

 
  And why this happens only with 8.0 and 8.1 and not with the 7.4?
 
 Not sure. Maybe 8.x is making more intensive use of your memory, 
 possibly with a change in your plans.
 


signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil


[PERFORM] EXISTS optimization

2006-11-03 Thread Kevin Grittner
To support migration of existing queries, it would be nice not to have
to rewrite EXISTS clauses as IN clauses.  Here is one example of a query
which optimizes poorly:
 
DELETE FROM CaseDispo
WHERE EXISTS
(
SELECT * FROM Consolidation C
WHERE C.caseNo = '2006CM000123'
  AND C.xrefOrConsol = 'C'
  AND C.countyNo = 30
  AND CaseDispo.caseNo = C.crossRefCase
  AND CaseDispo.countyNo = C.countyNo
  AND CaseDispo.dispoDate = DATE '2005-10-31'
);
 
 Seq Scan on CaseDispo  (cost=0.00..1227660.52 rows=176084 width=6)
(actual time=501.557..501.557 rows=0 loops=1)
   Filter: (subplan)
   SubPlan
 -  Result  (cost=0.00..3.46 rows=1 width=48) (actual
time=0.000..0.000 rows=0 loops=352167)
   One-Time Filter: (($2)::date = '2005-10-31'::date)
   -  Index Scan using Consolidation_pkey on Consolidation
C  (cost=0.00..3.46 rows=1 width=48) (actual time=0.008..0.008 rows=0
loops=84)
 Index Cond: (((caseNo)::bpchar =
'2006CM000123'::bpchar) AND (($0)::bpchar = (crossRefCase)::bpchar)
AND ((countyNo)::smallint = 30) AND (($1)::smallint =
(countyNo)::smallint))
 Filter: (xrefOrConsol = 'C'::bpchar)
 Total runtime: 501.631 ms
(9 rows)
 
To most programmers, it would be obvious that this is an exact logical
equivalent to:
 
DELETE FROM CaseDispo
WHERE countyNo = 30
  AND dispoDate = DATE '2005-10-31'
  AND caseNo IN
(
SELECT crossRefCase FROM Consolidation C
WHERE C.caseNo = '2006CM000123'
  AND C.xrefOrConsol = 'C'
  AND C.countyNo = 30
);
 
 Nested Loop  (cost=7.02..10.50 rows=1 width=6) (actual
time=0.036..0.036 rows=0 loops=1)
   -  HashAggregate  (cost=7.02..7.03 rows=1 width=18) (actual
time=0.034..0.034 rows=0 loops=1)
 -  Index Scan using Consolidation_pkey on Consolidation
C  (cost=0.00..7.02 rows=1 width=18) (actual time=0.032..0.032 rows=0
loops=1)
   Index Cond: (((caseNo)::bpchar =
'2006CM000123'::bpchar) AND ((countyNo)::smallint = 30))
   Filter: (xrefOrConsol = 'C'::bpchar)
   -  Index Scan using CaseDispo_pkey on CaseDispo 
(cost=0.00..3.46 rows=1 width=24) (never executed)
 Index Cond: (((CaseDispo.caseNo)::bpchar =
(outer.crossRefCase)::bpchar) AND ((CaseDispo.dispoDate)::date =
'2005-10-31'::date) AND ((CaseDispo.countyNo)::smallint = 30))
 Total runtime: 0.109 ms
(8 rows)
 
On this particular query, three orders of magnitude only gets you up to
half a second, but the same thing happens on longer running queries. 
And even that half a second is significant when a user has to sit there
and wait for the hourglass to clear on a regular basis.  Clearly, the
problem is not in the costing -- it recognizes the high cost of the
EXISTS form.  The problem is that it doesn't recognize that these are
logically equivalent.
 
Is there any work in progress to expand the set of plans examined for
an EXISTS clause?  If not, can we add such an enhancement to the TODO
list?  Do we need a good write-up on what optimizations are legal for
EXISTS?
 
-Kevin
 



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate