Re: [PERFORM] performance config help

2010-01-11 Thread Pierre Frédéric Caillaud


> Each of the 256 requests was being processed by a php process.  So, it
> could certainly be faster.  But, the fact that we're seeing the db
> performance degrade would seem to indicate that our application is fast
> enough to punish the db.  Isn't that true?


	Not necessarily. Your DB still has lots of idle CPU, so perhaps it's
your client that is hitting its limit. Or you have locking problems in
your DB.

Things to test:

- vmstat on the benchmark client
- iptraf on the network link
- monitor ping times between client and server during load test
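
	A minimal way to drive those checks during the load test (a sketch;
the host name is just a placeholder):

    vmstat 5               # on the benchmark client; watch cs (context switches) and id (idle CPU)
    iptraf                 # interactive; watch the NIC carrying the client<->server traffic
    ping -i 0.2 db-server  # rising latency under load points at the network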

	Some time ago, I made a benchmark simulating a forum. Postgres was  
saturating the gigabit ethernet between server and client...


	If those PHP processes run inside Apache, I'd suggest switching to  
lighttpd/fastcgi, which has higher performance, and uses a limited,  
controllable set of PHP processes (and therefore DB connections), which in  
turn uses much less memory.


PS: try these settings:

fsync = on
wal_sync_method = fdatasync
wal_buffers = 64MB
wal_writer_delay = 2ms
synchronous_commit = off   (async commit; at most ~1 s of commits at risk on a crash)



Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 4:17 PM, Greg Smith  wrote:
> Scott Marlowe wrote:
>>
>> The DDR3 Nehalem and DDR2 AMD are both actually pretty close in real
>> world use on 4 or more socket machines.  Most benchmarks on memory
>> bandwidth give no huge advantage to either one or the other.  They
>> both max out at about 25GB/s.
>>
>
> The most fair comparison I've seen so far is
> http://www.advancedclustering.com/company-blog/stream-benchmarking.html
> which puts the faster Intel solutions at 37GB/s, while the Opterons bog down
> at 20GB/s.  That matches my own tests pretty well too--Intel's got at least
> a 50% lead here in many cases.

But that's with only 2 sockets.  I'd like to see something comparing 4
or 8 socket machines.  Hmmm, off to googol.



Re: [PERFORM] performance config help

2010-01-11 Thread Greg Smith

Scott Marlowe wrote:

> The DDR3 Nehalem and DDR2 AMD are both actually pretty close in real
> world use on 4 or more socket machines.  Most benchmarks on memory
> bandwidth give no huge advantage to either one or the other.  They
> both max out at about 25GB/s.

The most fair comparison I've seen so far is 
http://www.advancedclustering.com/company-blog/stream-benchmarking.html 
which puts the faster Intel solutions at 37GB/s, while the Opterons bog 
down at 20GB/s.  That matches my own tests pretty well too--Intel's got 
at least a 50% lead here in many cases.


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 3:04 PM, Greg Smith  wrote:
> Scott Marlowe wrote:
>>
>> So, I took a break from writing and searched for some more info on the
>> 74xx series CPUs, and from reading lots of articles, including this
>> one:
>> http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3414
>> It seems apparent that the 74xx series is a great CPU, as long as
>> you're not memory bound.
>>
>
> This is why I regularly end up recommending people consider Intel's designs
> here so often, the ones that use triple-channel DDR3, instead of any of the
> AMD ones.  It's extremely easy to end up with a memory-bound workload
> nowadays, at which point all the CPU power in the world doesn't help you
> anymore.

The DDR3 Nehalem and DDR2 AMD are both actually pretty close in real
world use on 4 or more socket machines.  Most benchmarks on memory
bandwidth give no huge advantage to either one or the other.  They
both max out at about 25GB/s.

It's the older Xeon-based 74xx chipsets without integrated memory
controllers that seem to have such horrible bandwidth, because they're
not multi-channel.

For dual socket, the Nehalem is pretty much the king.  By the time you
get to 8 sockets, AMD is still ahead.  Just avoid anything older than
Nehalem or Istanbul.



Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Robert Haas
On Mon, Jan 11, 2010 at 2:41 PM, Jeremy Harris  wrote:
> On 01/11/2010 02:53 AM, Robert Haas wrote:
>>
>> On Sun, Jan 10, 2010 at 9:04 AM, Jeremy Harris  wrote:
>>>
>>> Needing to use an external (on-disk) sort method, when taking
>>> only 90MB, looks odd.
>
> [...]
>>
>> Well, you'd need to have work_mem > 90MB for that not to happen, and
>> very few people can afford to set that setting that high.  If you have
>> a query with three or four sorts using 90 MB a piece, and five or ten
>> users running them, you can quickly kill the box...
>
> Oh.  That's, um, a real pity given the cost of going external.  Any hope
> of a more dynamic allocation of memory resource in the future?
> Within a single query plan, the number of sorts is known; how about
> a sort-mem limit per query rather than per sort (as a first step)?

Unfortunately, it's not the case that the number of sorts is known -
the amount of memory available to the sort affects its cost, so if we
knew we were going to have more or less memory it would affect whether
we chose the plan involving the sort in the first place.

My previous musings on this topic are here:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php
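
A practical stopgap until then is to raise work_mem only around the
statement that needs it (a sketch; the table name is made up, and 96MB is
just "comfortably above the 90MB sort" from the example upthread):

    BEGIN;
    SET LOCAL work_mem = '96MB';  -- reverts automatically at COMMIT/ROLLBACK
    SELECT * FROM big_table ORDER BY created_at;  -- hypothetical sort-heavy query
    COMMIT;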

...Robert



Re: [PERFORM] performance config help

2010-01-11 Thread Greg Smith

Bob Dusek wrote:
> How do I learn more about the actual lock contention in my db?


There's a page with a sample query and links to more info at 
http://wiki.postgresql.org/wiki/Lock_Monitoring
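
A rough self-join sketch in the same spirit (approximate, and assuming
the 8.x catalog column names procpid/current_query):

    SELECT w.pid  AS waiting_pid,
           w.mode AS wanted_mode,
           h.pid  AS holding_pid,
           a.current_query AS holding_query
      FROM pg_locks w
      JOIN pg_locks h ON h.granted
                     AND h.locktype = w.locktype
                     AND h.relation IS NOT DISTINCT FROM w.relation
                     AND h.transactionid IS NOT DISTINCT FROM w.transactionid
      JOIN pg_stat_activity a ON a.procpid = h.pid
     WHERE NOT w.granted;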


One other little thing:  when you're running "top", try using "top -c" 
instead.  That should show you exactly what all the postmaster backends 
are actually doing, which is really handy to sort out issues in this area.


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] performance config help

2010-01-11 Thread Dusek, Bob
> > How do I learn more about the actual lock contention in my db?  
> > Lock contention makes some sense.  Each of the 256 requests are
> > relatively similar.  So, I don't doubt that lock contention could
> > be an issue.  I just don't know how to observe it or correct it. 
> > It seems like if we have processes that are contending for locks,
> > there's not much we can do about it.
>  
> I'm not sure what the best way would be to measure it, but in prior
> discussions the general mood seemed to be that if you had so many
> active sessions that you were running into the issue, the best
> solution was to use a connection pool to avoid it.

Sorry... by "not much we can do about it", I meant from a query perspective.
I mean, we can't use locking hints or anything like that in Postgres, as far
as I know.

I do understand that the connection pool will help this.  


>  
> -Kevin


Re: [PERFORM] performance config help

2010-01-11 Thread Greg Smith

Scott Marlowe wrote:
> So, I took a break from writing and searched for some more info on the
> 74xx series CPUs, and from reading lots of articles, including this
> one:
> http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3414
> It seems apparent that the 74xx series is a great CPU, as long as
> you're not memory bound.

This is why I regularly end up recommending people consider Intel's 
designs here so often, the ones that use triple-channel DDR3, instead of 
any of the AMD ones.  It's extremely easy to end up with a memory-bound 
workload nowadays, at which point all the CPU power in the world doesn't 
help you anymore.


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] performance config help

2010-01-11 Thread Kevin Grittner
Bob Dusek  wrote:
 
> How do I learn more about the actual lock contention in my db?  
> Lock contention makes some sense.  Each of the 256 requests are
> relatively similar.  So, I don't doubt that lock contention could
> be an issue.  I just don't know how to observe it or correct it. 
> It seems like if we have processes that are contending for locks,
> there's not much we can do about it.
 
I'm not sure what the best way would be to measure it, but in prior
discussions the general mood seemed to be that if you had so many
active sessions that you were running into the issue, the best
solution was to use a connection pool to avoid it.
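
For anyone not wanting to build a pool into the application, a drop-in
option (one possibility among several; the names and limits below are
purely illustrative) is pgbouncer in transaction-pooling mode:

    ; pgbouncer.ini sketch
    [databases]
    appdb = host=127.0.0.1 port=5432 dbname=appdb

    [pgbouncer]
    listen_addr = *
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction  ; server connection released at transaction end
    max_client_conn = 300    ; clients queue here...
    default_pool_size = 32   ; ...but only this many sessions hit PostgreSQL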
 
-Kevin



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Tom Lane
Jaime Casanova  writes:
> the temp files should be coming from hash operations, but AFAICS the
> files are small and every hash operation should be using up to
> work_mem memory, right?

No, when a hash spills to disk the code has to guess the partition sizes
(number of buckets per partition) in advance.  So it wouldn't be at all
surprising if the actual sizes come out substantially different from
work_mem.
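
To see where they come from as it happens, a couple of logging settings
help (a sketch; log_temp_files exists from 8.3 on):

    trace_sort = on      # logs sort begin/end and whether each sort went to disk
    log_temp_files = 0   # 0 = log every temp file, with its size, when removed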

regards, tom lane



Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
>
>
> I haven't been keeping up on the hardware, so I defer to you on
> that.  It certainly seems like it would fit with the symptoms. On
> the other hand, I haven't seen anything yet to convince me that it
> *couldn't* be a client-side or network bottleneck, or the sort of
> lock contention bottleneck that showed up in some of Sun's
> benchmarks.  If it were my problem, I'd be trying to rule out
> whichever one of those could be tested most easily, iteratively.
>
>
How do I learn more about the actual lock contention in my db?   Lock
contention makes some sense.  Each of the 256 requests are relatively
similar.  So, I don't doubt that lock contention could be an issue.  I just
don't know how to observe it or correct it.  It seems like if we have
processes that are contending for locks, there's not much we can do about
it.

> Also, as you suggested, identifying what queries are taking most of
> the time and trying to optimize them is a route that might help,
> regardless.
>

We often undertake query optimization.  And, we often learn things about our
app or make small performance gains from it.  Sometimes, we are even able to
make big changes to the application to make large gains based on how we see
queries performing.

So, I agree that it's a good thing.  However, query optimizing is tough,
since you can't necessarily predict the sizes of your tables in a real-time
system that is used differently by different users.


> -Kevin
>


Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Jaime Casanova
On Mon, Jan 11, 2010 at 3:18 PM, Tom Lane  wrote:
> Jaime Casanova  writes:
>> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
>> LOG:  switching to bounded heapsort at 641 tuples: CPU 0.08s/0.13u sec elapsed 0.25 sec
>> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.5", size 471010
[... some more temp files logged ...]
>> LOG:  internal sort ended, 118 KB used: CPU 0.10s/0.19u sec elapsed 0.33 sec
>
> Hmm.  Not clear where the temp files are coming from, but it's *not* the
> sort --- the "internal sort ended" line shows that that sort never went
> to disk.  What kind of plan is feeding the sort node?
>

I'm sure I have seen on-disk sorts even when the files are small, but
still I see a problem here...

the temp files should be coming from hash operations, but AFAICS the
files are small and every hash operation should be using up to
work_mem memory, right?

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9090.60..9090.65 rows=20 width=166) (actual time=441.619..441.674 rows=20 loops=1)
   ->  Sort  (cost=9089.85..9106.26 rows=6567 width=166) (actual time=441.062..441.362 rows=320 loops=1)
         Sort Key: tcf.fech_hora
         Sort Method:  top-N heapsort  Memory: 118kB
         ->  Hash Join  (cost=6660.18..8783.76 rows=6567 width=166) (actual time=310.561..427.393 rows=6511 loops=1)
               Hash Cond: (tcsc.id_soli_compra = tcf.id_soli_compra)
               ->  Hash Join  (cost=5600.47..7551.97 rows=6548 width=158) (actual time=281.279..378.434 rows=6544 loops=1)
                     Hash Cond: (tcsc.esta_solicitud = tgdc.seq_det_cat)
                     ->  Hash Join  (cost=5577.86..7439.33 rows=6548 width=147) (actual time=279.590..361.704 rows=6545 loops=1)
                           Hash Cond: ((tcsc.persona_id)::numeric = tgp.persona_id)
                           ->  Seq Scan on tcom_solicitud_compra tcsc  (cost=0.00..799.77 rows=6548 width=125) (actual time=0.013..18.433 rows=6545 loops=1)
                                 Filter: ((esta_solicitud <> 425) AND (esta_solicitud <> 424) AND (esta_solicitud <> 416))
                           ->  Hash  (cost=3878.94..3878.94 rows=83594 width=34) (actual time=278.837..278.837 rows=83594 loops=1)
                                 ->  Seq Scan on tgen_persona tgp  (cost=0.00..3878.94 rows=83594 width=34) (actual time=0.006..136.582 rows=83594 loops=1)
                     ->  Hash  (cost=14.49..14.49 rows=649 width=15) (actual time=1.657..1.657 rows=649 loops=1)
                           ->  Seq Scan on tgen_det_cat tgdc  (cost=0.00..14.49 rows=649 width=15) (actual time=0.008..0.803 rows=649 loops=1)
               ->  Hash  (cost=977.62..977.62 rows=6567 width=16) (actual time=29.264..29.264 rows=6556 loops=1)
                     ->  Seq Scan on tcom_fecha tcf  (cost=0.00..977.62 rows=6567 width=16) (actual time=0.016..20.532 rows=6556 loops=1)
                           Filter: (seq_det_cat = 388)
 Total runtime: 443.434 ms
(20 rows)



Re: [PERFORM] performance config help

2010-01-11 Thread Kevin Grittner
Scott Marlowe  wrote:
 
> FYI, on an 8 or 16 core machine, 10k to 30k context switches per
> second aren't that much.
 
Yeah, on our 16 core machines under heavy load we hover around 30k. 
He was around 50k, which is why I said it looked like it was
"becoming a problem."
 
> If you're climbing past 100k you might want to look out.
 
We hit that at one point; cutting our connection pool size brought
it down and improved performance dramatically.  I don't think I'd
wait for 100k to address it next time.
 
> The more I read up on the 74xx CPUs and look at the numbers here
> the more I think it's just that this machine has X bandwidth and
> it's using it all up.  You could put 1,000 cores in it, and it
> wouldn't go any faster.  My guess is that a 4x6 core AMD machine
> or even a 2x6 Nehalem would be much faster at this job.  Only way
> to tell is to run something like the stream benchmark and see how
> it scales, memory-wise, as you add cores to the benchmark.
 
I haven't been keeping up on the hardware, so I defer to you on
that.  It certainly seems like it would fit with the symptoms. On
the other hand, I haven't seen anything yet to convince me that it
*couldn't* be a client-side or network bottleneck, or the sort of
lock contention bottleneck that showed up in some of Sun's
benchmarks.  If it were my problem, I'd be trying to rule out
whichever one of those could be tested most easily, iteratively.
 
Also, as you suggested, identifying what queries are taking most of
the time and trying to optimize them is a route that might help,
regardless.
 
-Kevin



Re: [PERFORM] performance config help

2010-01-11 Thread Tom Lane
Scott Marlowe  writes:
> On Mon, Jan 11, 2010 at 1:13 PM, Dusek, Bob  wrote:
>> I haven't been involved in any benchmarking of PG8 with fsync=off, but we 
>> certainly did it with PG 7.4.  fsync=off, for our purposes, was MUCH faster.

> And many changes have been made since then to make fsyncing much
> faster.  You may be grinding the valves on a 2009 Ferrari because
> pappy used to have to do it on his 1958 pickup truck here.

Perhaps more to the point, synchronous_commit can get most of the same
speedup with much less risk to your database.  You really owe it to
yourself to redo that benchmarking with a recent PG release.
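
The distinction matters: fsync = off risks corrupting the cluster on a
crash, while synchronous_commit = off only risks losing the last moments
of committed work. A sketch of the safer variant (8.3+; the delay shown
is the default):

    fsync = on                # keep the data files crash-safe
    synchronous_commit = off  # async commit: recent commits may be lost,
                              # but the database stays consistent
    wal_writer_delay = 200ms  # how often the WAL writer flushes async commits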

regards, tom lane



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Tom Lane
Jaime Casanova  writes:
> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
> LOG:  switching to bounded heapsort at 641 tuples: CPU 0.08s/0.13u sec elapsed 0.25 sec
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.5", size 471010
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.10", size 81096
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.0", size 467373
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.9", size 110200
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.3", size 470011
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.8", size 157192
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.4", size 468681
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.12", size 101624
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.1", size 472285
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.11", size 100744
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.6", size 467173
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.7", size 141888
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.2", size 476227
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.13", size 89072
> LOG:  performsort starting: CPU 0.10s/0.19u sec elapsed 0.33 sec
> LOG:  performsort done: CPU 0.10s/0.19u sec elapsed 0.33 sec
> LOG:  internal sort ended, 118 KB used: CPU 0.10s/0.19u sec elapsed 0.33 sec

Hmm.  Not clear where the temp files are coming from, but it's *not* the
sort --- the "internal sort ended" line shows that that sort never went
to disk.  What kind of plan is feeding the sort node?

regards, tom lane



Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 1:13 PM, Dusek, Bob  wrote:
> I haven't been involved in any benchmarking of PG8 with fsync=off, but we 
> certainly did it with PG 7.4.  fsync=off, for our purposes, was MUCH faster.

And many changes have been made since then to make fsyncing much
faster.  You may be grinding the valves on a 2009 Ferrari because
pappy used to have to do it on his 1958 pickup truck here.



Re: [PERFORM] performance config help

2010-01-11 Thread Dusek, Bob

> I have slave dbs running on four 7200RPM SATA drives with fsync off.
> They only get updated from the master db so if they go boom, I just
> recreate their node.  There's times fsync off is ok, you just have to
> know that that db is now considered "disposable".
> 
> However, I'd suggest doing some benchmarking to PROVE that you're
> seeing an improvement from fsync being off.  If there's no
> improvement, then you might as well leave it on and save yourself some
> headache later on when the machine gets powered off suddenly etc.

I haven't been involved in any benchmarking of PG8 with fsync=off, but we 
certainly did it with PG 7.4.  fsync=off, for our purposes, was MUCH faster.
 


Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 12:34 PM, Bob Dusek  wrote:
> Yeah :)  We haven't run into much trouble.  But, we cut our teeth doing
> performance analysis of our app using PG 7.4.  And, people on this list seem
> to be adamantly against this config these days.  Is this safer in older
> versions of PG?  Or, are the risks the same?

It's always been unsafe.  Just that 7.4 was so slow that sometimes you
didn't really get to choose.

> We have some asynchronous communications processes that communicate
> permanent db changes to an enterprise-level data warehouse.  And, we can
> recover that data back down to the server if the server goes belly-up.  If
> something does go belly up, we really only lose the bit of data that hasn't
> been communicated yet.  It's true, that this data is important.  However,
> it's also true that it's very costly to guarantee this that very small
> amount of data isn't lost.  And, practically speaking (for our purposes) it
> seems that the data's not worth the cost.

I have slave dbs running on four 7200RPM SATA drives with fsync off.
They only get updated from the master db so if they go boom, I just
recreate their node.  There's times fsync off is ok, you just have to
know that that db is now considered "disposable".

However, I'd suggest doing some benchmarking to PROVE that you're
seeing an improvement from fsync being off.  If there's no
improvement, then you might as well leave it on and save yourself some
headache later on when the machine gets powered off suddenly etc.



Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Jeremy Harris

On 01/11/2010 02:53 AM, Robert Haas wrote:
> On Sun, Jan 10, 2010 at 9:04 AM, Jeremy Harris  wrote:
>> Needing to use an external (on-disk) sort method, when taking
>> only 90MB, looks odd.
> [...]
> Well, you'd need to have work_mem > 90MB for that not to happen, and
> very few people can afford to set that setting that high.  If you have
> a query with three or four sorts using 90 MB a piece, and five or ten
> users running them, you can quickly kill the box...


Oh.  That's, um, a real pity given the cost of going external.  Any hope
of a more dynamic allocation of memory resource in the future?
Within a single query plan, the number of sorts is known; how about
a sort-mem limit per query rather than per sort (as a first step)?

Cheers,
Jeremy





Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 12:36 PM, Scott Marlowe  wrote:
> FYI, on an 8 or 16 core machine, 10k to 30k context switches per
> second aren't that much.  If you're climbing past 100k you might want
> to look out.
>
> The more I read up on the 74xx CPUs and look at the numbers here the
> more I think it's just that this machine has X bandwidth and it's
> using it all up.  You could put 1,000 cores in it, and it wouldn't go
> any faster.  My guess is that a 4x6 core AMD machine or even a 2x6
> Nehalem would be much faster at this job.  Only way to tell is to run
> something like the stream benchmark and see how it scales,
> memory-wise, as you add cores to the benchmark.

Also, I'm guessing that query profiling may help; if we can get the
queries to request less data to trundle through, then we might be able
to get Bob the performance needed to keep up.

But at some point he's gonna have to look at partitioning his database
onto multiple machines somehow.



Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 11:20 AM, Kevin Grittner  wrote:
> Bob Dusek  wrote:
>> Kevin Grittner  wrote:
>>> Bob Dusek  wrote:
>
>>> Anyway, my benchmarks tend to show that best throughput occurs at
>>> about (CPU_count * 2) plus effective_spindle_count.  Since you
>>> seem to be fully cached, effective_spindle_count would be zero,
>>> so I would expect performance to start to degrade when you have
>>> more than about 32 sessions active.
>>>
>> That's a little disheartening for a single or dual CPU system.
>
> Not at all.  You only have so many resources to keep busy at any one
> moment.  It is generally more efficient to only context switch
> between as many processes as can keep those resources relatively
> busy; otherwise valuable resources are spent switching among the
> various processes rather than doing useful work.
>
> [Regular readers of this list might want to skip ahead while I run
> through my usual "thought experiment" on the topic.  ;-) ]
>
> Imagine this hypothetical environment -- you have one CPU running
> requests.  There are no other resources to worry about and no
> latency to the clients.  Let's say that the requests take one second
> each.  The client suddenly has 100 requests to run.  Assuming
> context switching is free, you could submit all at once, and 100
> seconds later, you get 100 responses, with an average response time
> of 100 seconds.  Let's put a (again free) connection pooler in
> there.  You submit those 100 requests, but they are fed to the
> database one at a time.  You get one response back in one second,
> the next in two seconds, the last in 100 seconds.  No request took
> any longer, and the average response time was 50.5 seconds -- almost
> a 50% reduction.
>
> Now context switching is not free, and you had tens of thousands of
> them per second.

FYI, on an 8 or 16 core machine, 10k to 30k context switches per
second aren't that much.  If you're climbing past 100k you might want
to look out.

The more I read up on the 74xx CPUs and look at the numbers here the
more I think it's just that this machine has X bandwidth and it's
using it all up.  You could put 1,000 cores in it, and it wouldn't go
any faster.  My guess is that a 4x6 core AMD machine or even a 2x6
Nehalem would be much faster at this job.  Only way to tell is to run
something like the stream benchmark and see how it scales,
memory-wise, as you add cores to the benchmark.



Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
On Mon, Jan 11, 2010 at 1:20 PM, Kevin Grittner  wrote:

> Bob Dusek  wrote:
> > Kevin Grittner  wrote:
> >> Bob Dusek  wrote:
>
> >> Anyway, my benchmarks tend to show that best throughput occurs at
> >> about (CPU_count * 2) plus effective_spindle_count.  Since you
> >> seem to be fully cached, effective_spindle_count would be zero,
> >> so I would expect performance to start to degrade when you have
> >> more than about 32 sessions active.
> >>
> > That's a little disheartening for a single or dual CPU system.
>
> Not at all.  You only have so many resources to keep busy at any one
> moment.  It is generally more efficient to only context switch
> between as many processes as can keep those resources relatively
> busy; otherwise valuable resources are spent switching among the
> various processes rather than doing useful work.
>
> [Regular readers of this list might want to skip ahead while I run
> through my usual "thought experiment" on the topic.  ;-) ]
>
> Imagine this hypothetical environment -- you have one CPU running
> requests.  There are no other resources to worry about and no
> latency to the clients.  Let's say that the requests take one second
> each.  The client suddenly has 100 requests to run.  Assuming
> context switching is free, you could submit all at once, and 100
> seconds later, you get 100 responses, with an average response time
> of 100 seconds.  Let's put a (again free) connection pooler in
> there.  You submit those 100 requests, but they are fed to the
> database one at a time.  You get one response back in one second,
> the next in two seconds, the last in 100 seconds.  No request took
> any longer, and the average response time was 50.5 seconds -- almost
> a 50% reduction.
>
> Now context switching is not free, and you had tens of thousands of
> them per second.  Besides the hit on CPU availability during each
> switch, you're reducing the value of the L1 and L2 caches.  So in
> reality, you could expect your "request storm" to perform
> significantly worse in comparison to the connection pooled
> configuration.  In reality, you have more than one resource to keep
> busy, so the pool should be sized greater than one; but it's still
> true that there is some point at which getting a request to the
> database server delays the response to that request more than
> queuing it for later execution would.  Some database products build
> in a way to manage this; in PostgreSQL it's on you to do so.
>

I appreciate the explanation.  We were thinking that since we have so much
CPU available, we weren't hitting Postgres' peak and that maybe a config
change would help.  But, thus far, it sounds like we're hardware-bound, and
an application connection pool seems inevitable.

> >> Your vmstat output suggests that context switches are becoming a
> >> problem, and I wouldn't be surprised if I heard that the network
> >> is an issue.  You might want to have someone take a look at the
> >> network side to check.
> >>
> > This is all happening on a LAN, and network throughput doesn't
> > seem to be an issue.  It may be a busy network, but I'm not sure
> > about a problem.  Can you elaborate on your suspicion, based on
> > the vmstat?  I haven't used vmstat much.
>
> It was simply this: all that CPU idle time while it was swamped with
> requests suggests that there might be a bottleneck outside the
> database server.  That could be, as another post suggests, the
> client software.  It could also be the network.  (It could also be
> contention on locks within PostgreSQL from the large number of
> requests, but that's covered by the connection pooling suggestion.)
>

I'm curious if it would be worth our effort to enable the pg_stat stuff and
try to analyze the system that way.  We don't have a lot of experience with
that, but if we could learn something critical from it, we will do it.
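
(For reference, the switches involved are few -- a sketch, using the
8.3/8.4 setting names; pg_stat_statements is a contrib module in 8.4:)

    track_activities = on   # shows each backend's current query in pg_stat_activity
    track_counts = on       # feeds the per-table / per-index usage counters
    shared_preload_libraries = 'pg_stat_statements'  # 8.4 contrib: per-query stats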


> > The problem with connection pooling is that we actually have to
> > achieve more than 40 per second, which happens to be the sweet
> > spot with our current config.
>
> Well, if you're considering a connection pool which can only submit
> one request per second, you're looking at the wrong technology.  We
> use a custom connection pool built into our software, so I'm not
> very familiar with the "drop in" packages out there, but we start
> the next queued request based on the completion of a request --
> there's no polling involved.
>

I'm thinking we'll have to roll our own.  In a way, we have already done the
connection pooling.  We're experimenting with a new architecture with much
more demanding performance requirements.  We were emboldened by the hardware
specs.


> Between the RAID 0, fsync = off, and full_page_writes = off -- you
> really had better not be staking anything important on this data.
> This configuration would make The Flying Wallendas break out in a
> sweat.  It suggests to me that you might want to look into a better
> RAID controller -- a high quality controller with battery-backup
> (BBU) cache, configured for write-back, might allow you to change
> all these to safe settings.  If you also switch to a RAID
> configuration with some redundancy, you'll be much safer.

Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 10:54 AM, Bob Dusek  wrote:
>> You want to use some connection pooling which queues requests when
>> more than some configurable number of connections is already active
>> with a request.  You probably want to run that on the server side.
>> As for the postgresql.conf, could you show what you have right now,
>> excluding all comments?
>
> The problem with connection pooling is that we actually have to achieve more
> than 40 per second, which happens to be the sweet spot with our current
> config.

Number of parallel processes doesn't equal # reqs/second.  If your
maximum throughput occurs at 40 parallel requests, you'll get more
done reducing the maximum number of concurrent processes to 40 and
letting them stack up in a queue waiting for a spot to run.



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Jaime Casanova
On Mon, Jan 11, 2010 at 2:14 PM, Jaime Casanova  wrote:
> On Mon, Jan 11, 2010 at 2:07 PM, Robert Haas  wrote:
>> On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova  wrote:
>>> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane  wrote:
>>>> It might be useful to turn on trace_sort to see if the small files
>>>> are coming from sorts.  If they're from hashes I'm afraid there's
>>>> no handy instrumentation ...
>>>
>>> yes they are, this is the log (I deleted the STATEMENT lines because
>>> they were redundant), seems like all the temp files are used to
>>> execute the same statement...
>>>
>>> BTW, this is my laptop, not the server.
>>
>> I think maybe there was supposed to be an attachment here?
>>
>
> I knew I was forgetting something ;)
> Ah! And this is in 8.5dev but it's the same in 8.3
>

Oh boy, this can't be happening!
Attaching again.

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


[Attachment: log (binary data)]



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Jaime Casanova
On Mon, Jan 11, 2010 at 2:07 PM, Robert Haas  wrote:
> On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova  wrote:
>> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane  wrote:
>>>
>>> It might be useful to turn on trace_sort to see if the small files
>>> are coming from sorts.  If they're from hashes I'm afraid there's
>>> no handy instrumentation ...
>>>
>>
>> yes they are, this is the log (I deleted the STATEMENT lines because
>> they were redundant), seems like all the temp files are used to
>> execute the same statement...
>>
>> BTW, this is my laptop, not the server.
>
> I think maybe there was supposed to be an attachment here?
>

I knew I was forgetting something ;)
Ah! And this is in 8.5dev but it's the same in 8.3

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Robert Haas
On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova  wrote:
> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane  wrote:
>>
>> It might be useful to turn on trace_sort to see if the small files
>> are coming from sorts.  If they're from hashes I'm afraid there's
>> no handy instrumentation ...
>>
>
> yes they are, this is the log (I deleted the STATEMENT lines because
> they were redundant), seems like all the temp files are used to
> execute the same statement...
>
> BTW, this is my laptop, not the server.

I think maybe there was supposed to be an attachment here?

...Robert



Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
>> RAID-0
>
> And how many drives?

Just two.

>> We have an application server that is processing requests.  Each request
>> consists of a combination of selects, inserts, and deletes.  We actually see
>> degredation when we get more than 40 concurrent requests.  The exact number
>> of queries executed by each request isn't known.  It varies per request.
>>  But, an example request would be about 16 inserts and 113 selects.  Any
>> given request can't execute more than a single query at a time.
>>
>
> So, you are concurrently trying to achieve more than around 640 writes and
> 4520 reads (worst case figures...). This should be interesting. For 40
> concurrent requests you will probably need at least 4 drives in RAID-0 to
> sustain the write rates (and I'll guess 5-6 to be sure to cover read
> requests also, together with plenty of RAM).
>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           34.29    0.00    7.09     0.03    0.00   58.58
>>
>> Device:  rrqm/s   wrqm/s    r/s      w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
>> sda        0.00   112.20   0.00   133.40     0.00  1964.80     14.73      0.42   3.17   0.04   0.48
>>
>> The iowait seems pretty low, doesn't it?
>>
>
> Yes, but you are issuing 133 write operations per second per drive(s) -
> this is nearly the limit of what you can get with 15k RPM drives (actually,
> the limit should be somewhere around 200..250 IOPS but 133 isn't that far).
>
>
As you mentioned in a separate response, we have fsync shut off.
Regardless, we shut off a lot of logging in our app and reduced that number
to approx 20 per second.  So, a lot of those writes were coming from outside
the db.  We do a lot of logging.  We should consider turning some off, it
seems.


>> top - 10:11:43 up 12 days, 20:48,  6 users,  load average: 10.48, 4.16, 2.83
>> Tasks: 798 total,   8 running, 790 sleeping,   0 stopped,   0 zombie
>> Cpu0  : 33.3%us,  7.6%sy,  0.0%ni, 59.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>
>
> There is one other possibility - since the CPUs are not very loaded, are
> you sure the client application with which you are testing is fast enough to
> issue enough requests to saturate the database?
>

Each of the 256 requests was being processed by a php process.  So, it could
certainly be faster.  But, the fact that we're seeing the db performance
degrade would seem to indicate that our application is fast enough to punish
the db.  Isn't that true?





Re: [PERFORM] performance config help

2010-01-11 Thread Kevin Grittner
Bob Dusek  wrote:
> Kevin Grittner  wrote:
>> Bob Dusek  wrote:
 
>> Anyway, my benchmarks tend to show that best throughput occurs at
>> about (CPU_count * 2) plus effective_spindle_count.  Since you
>> seem to be fully cached, effective_spindle_count would be zero,
>> so I would expect performance to start to degrade when you have
>> more than about 32 sessions active.
>>
> That's a little disheartening for a single or dual CPU system.
 
Not at all.  You only have so many resources to keep busy at any one
moment.  It is generally more efficient to only context switch
between as many processes as can keep those resources relatively
busy; otherwise valuable resources are spent switching among the
various processes rather than doing useful work.
 
[Regular readers of this list might want to skip ahead while I run
through my usual "thought experiment" on the topic.  ;-) ]
 
Imagine this hypothetical environment -- you have one CPU running
requests.  There are no other resources to worry about and no
latency to the clients.  Let's say that the requests take one second
each.  The client suddenly has 100 requests to run.  Assuming
context switching is free, you could submit all at once, and 100
seconds later, you get 100 responses, with an average response time
of 100 seconds.  Let's put a (again free) connection pooler in
there.  You submit those 100 requests, but they are fed to the
database one at a time.  You get one response back in one second,
the next in two seconds, the last in 100 seconds.  No request took
any longer, and the average response time was 50.5 seconds -- almost
a 50% reduction.
 
Now context switching is not free, and you had tens of thousands of
them per second.  Besides the hit on CPU availability during each
switch, you're reducing the value of the L1 and L2 caches.  So in
reality, you could expect your "request storm" to perform
significantly worse in comparison to the connection pooled
configuration.  In reality, you have more than one resource to keep
busy, so the pool should be sized greater than one; but it's still
true that there is some point at which getting a request to the
database server delays the response to that request more than
queuing it for later execution would.  Some database products build
in a way to manage this; in PostgreSQL it's on you to do so.
 
>> Your vmstat output suggests that context switches are becoming a
>> problem, and I wouldn't be surprised if I heard that the network
>> is an issue.  You might want to have someone take a look at the
>> network side to check.
>>
> This is all happening on a LAN, and network throughput doesn't
> seem to be an issue.  It may be a busy network, but I'm not sure
> about a problem.  Can you elaborate on your suspicion, based on
> the vmstat?  I haven't used vmstat much.
 
It was simply this: all that CPU idle time while it was swamped with
requests suggests that there might be a bottleneck outside the
database server.  That could be, as another post suggests, the
client software.  It could also be the network.  (It could also be
contention on locks within PostgreSQL from the large number of
requests, but that's covered by the connection pooling suggestion.)
 
> The problem with connection pooling is that we actually have to
> achieve more than 40 per second, which happens to be the sweet
> spot with our current config.
 
Well, if you're considering a connection pool which can only submit
one request per second, you're looking at the wrong technology.  We
use a custom connection pool built into our software, so I'm not
very familiar with the "drop in" packages out there, but we start
the next queued request based on the completion of a request --
there's no polling involved.
 
Between the RAID 0, fsync = off, and full_page_writes = off -- you
really had better not be staking anything important on this data. 
This configuration would make The Flying Wallendas break out in a
sweat.  It suggests to me that you might want to look into a better
RAID controller -- a high quality controller with battery-backup
(BBU) cache, configured for write-back, might allow you to change
all these to safe settings.  If you also switch to a RAID
configuration with some redundancy, you'll be much safer.
 
-Kevin




Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 10:49 AM, Bob Dusek  wrote:
>> Depends, is that the first iteration of output?  if so, ignore it and
>> show me the second and further on.  Same for vmstat...  In fact let
>> them run for a minute or two and attach the results...  OTOH, if that
>> is the second or later set of output, then you're definitely not IO
>> bound, and I don't see why the CPUs are not being better utilized.
>>
> I was probably not clear... the output I pasted was from the third iteration
> of output from iostat.  And, the vmstat output I pasted was from the sixth
> iteration of output

Yeah, you're definitely CPU/Memory bound it seems.

> We can take some measurements at 40 concurrent requests and see where we
> stand.

We'll probably not see much difference, if you're waiting on memory.

> So, we should probably try cranking our random_page_cost value down.  When
> we dump our db with "pg_dump --format=t", it's about 15 MB.  We should be
> able to keep the thing in memory.

Yeah, I doubt that changing it will make a huge difference given how
small your db is.

>> There are several common bottlenecks you can try to tune away from.
>> IO doesn't look like a problem for you.  Neither does CPU load.  So,
>> then we're left with context switching time and memory to CPU
>> bandwidth.  If your CPUs are basically becoming data pumps then the
>> speed of your FSB becomes VERY critical, and some older Intel mobos
>> didn't have a lot of CPU to Memory bandwidth and adding CPUs made it
>> worse, not better.  More modern Intel chipsets have much faster CPU to
>> Memory BW, since they're using the same kind of fabric switching that
>> AMD uses on highly parallel machines.
>
> Each CPU is 2.13 GHz, with 8MB Cache, and the FSB is 1066 MHz.  Does that
> bus speed seem slow?

When 16 cores are all sharing the same bus (which a lot of older
designs do) then yes.  I'm not that familiar with the chipset you're
running, but I don't think that series CPU has an integrated memory
controller.  Does it break the memory into separate chunks that
different cpus can access without stepping on each other's toes?

Later Intel and all AMDs since the Opteron have built in memory
controllers.  This meant that going from 2 to 4 cpus in an AMD server
doubled your memory bandwidth, while going from 2 to 4 cpus on older
intel designs left it the same so that each cpu got 1/2 as much
bandwidth as it had before when there were 2.

> It's hard to go to the money tree  and say "we're only using about half of
> your CPUs, but you need to get better ones."

Well, if the problem is that you've got a chipset that can't utilize
all your CPUs because of memory bw starvation, it's your only fix.
You should set up some streaming read / write to memory tests you can
run singly, then on 2, 4, 8, 16 cores and see how fast memory access is
as you add more threads.  I'm betting throughput will peak well
before 16 cores are working on the problem.
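The STREAM benchmark is the usual tool for that; a sketch of the scaling
test (stream.c is the standard STREAM source; the OpenMP build flag and
environment variable are its normal usage):

    gcc -O3 -fopenmp stream.c -o stream
    for t in 1 2 4 8 16; do
        echo "== $t threads =="
        OMP_NUM_THREADS=$t ./stream | grep Triad   # Triad line reports MB/s
    done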

>> If your limit is your hardware, then the only solution is a faster
>> machine.  It may well be that a machine with dual fast Nehalem
>> (2.4GHz+) quad core CPUs will be faster.  Or 4 or 8 AMD CPUs with
>> their faster fabric.
>
> It sounds like we could spend less money on memory and more on faster hard
> drives and faster CPUs.

I'm pretty sure you could live with slower hard drives here, and fsync
on as well possibly.  It looks like it's all cpu <-> memory bandwidth.
 But I'm just guessing.

> But, man, that's a tough sell.  This box is a giant, relative to anything
> else we've worked with.

Yeah, I understand.  We're looking at having to upgrade our dual cpu /
quad core AMD 2.1GHz machine to 4 hex core cpus this summer, possibly
dodecacore cpus even.

So, I took a break from writing and searched for some more info on the
74xx series CPUs, and from reading lots of articles, including this
one:

http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3414

It seems apparent that the 74xx series is a great CPU, as long as
you're not memory bound.



Re: [PERFORM] [PERFORMANCE] work_mem vs temp files issue

2010-01-11 Thread Jaime Casanova
On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane  wrote:
>
> It might be useful to turn on trace_sort to see if the small files
> are coming from sorts.  If they're from hashes I'm afraid there's
> no handy instrumentation ...
>

Yes they are; this is the log (I deleted the STATEMENT lines because
they were redundant). Seems like all the temp files are used to
execute the same statement...

BTW, this is my laptop, not the server.

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157



Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
On Mon, Jan 11, 2010 at 12:17 PM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:

> Bob Dusek  wrote:
> > Scott Marlowe wrote:
> >> Bob Dusek  wrote:
>
> >>> 4X E7420 Xeon, Four cores (for a total of 16 cores)
>
> >> What method of striped RAID?
> >
> > RAID-0
>
> I hope you have a plan for what to do when any one drive in this
> array fails, and the entire array is unusable.
>

Point noted.


> Anyway, my benchmarks tend to show that best throughput occurs at
> about (CPU_count * 2) plus effective_spindle_count.  Since you seem
> to be fully cached, effective_spindle_count would be zero, so I
> would expect performance to start to degrade when you have more than
> about 32 sessions active.
>

That's a little disheartening for a single or dual CPU system.


> > We actually see degredation when we get more than 40 concurrent
> > requests.
>
> Close enough.
>
> > when we run 200+ requests, we see multi-second queries.
>
> No surprise there.  Your vmstat output suggests that context
> switches are becoming a problem, and I wouldn't be surprised if I
> heard that the network is an issue.  You might want to have someone
> take a look at the network side to check.
>

This is all happening on a LAN, and network throughput doesn't seem to be an
issue.  It may be a busy network, but I'm not sure about a problem.  Can you
elaborate on your suspicion, based on the vmstat?  I haven't used vmstat
much.

>
> You want to use some connection pooling which queues requests when
> more than some configurable number of connections is already active
> with a request.  You probably want to run that on the server side.
> As for the postgresql.conf, could you show what you have right now,
> excluding all comments?
>

The problem with connection pooling is that we actually have to achieve more
than 40 per second, which happens to be the sweet spot with our current
config.

I posted our changes from the default in a reply to another message.  I
don't want to duplicate.  Can you check those out?

>
> -Kevin
>


Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
>
>
> > This is to be expected, to some extent, as we would expect some
> > performance degradation with higher utilization.  But, the hardware
> > doesn't appear to be very busy, and that's where we're hoping for
> > some help.
>
> It's likely in io wait.
>
> >> What do the following commands tell you?
> >>
> >> iostat -x 10 (first iteration doesn't count)
> >
> > Here's some iostat output (from the 3rd data point from iostat -x 10)...
> > this was taken while we were processing 256 simultaneous requests.
> >
> >  avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            34.29    0.00    7.09     0.03    0.00   58.58
> >
> > Device:  rrqm/s   wrqm/s    r/s      w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
> > sda        0.00   112.20   0.00   133.40     0.00  1964.80     14.73      0.42   3.17   0.04   0.48
> > sda1       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > sda2       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > sda3       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > sda4       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > sda5       0.00   112.20   0.00   133.40     0.00  1964.80     14.73      0.42   3.17   0.04   0.48
> > sdb        0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > dm-0       0.00     0.00   0.00     0.40     0.00     3.20      8.00      0.00   0.00   0.00   0.00
> > dm-1       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> > dm-2       0.00     0.00   0.00    99.90     0.00   799.20      8.00      0.15   1.50   0.01   0.12
> > dm-3       0.00     0.00   0.00     0.60     0.00     4.80      8.00      0.00   0.33   0.17   0.01
> > dm-4       0.00     0.00   0.00   144.70     0.00  1157.60      8.00      0.46   3.17   0.02   0.35
> > dm-5       0.00     0.00   0.00     0.00     0.00     0.00      0.00      0.00   0.00   0.00   0.00
> >
> > The iowait seems pretty low, doesn't it?
>
> Depends, is that the first iteration of output?  if so, ignore it and
> show me the second and further on.  Same for vmstat...  In fact let
> them run for a minute or two and attach the results...  OTOH, if that
> is the second or later set of output, then you're definitely not IO
> bound, and I don't see why the CPUs are not being better utilized.
I was probably not clear... the output I pasted was from the third
iteration of output from iostat.  And, the vmstat output I pasted was from
the sixth iteration of output.

> How many concurrent queries are you running when you take these
> measurements?  Can you take them with lower and higher numbers of
> concurrent users and compare the two?  normally I'd be looking for
> context switching taking more and more time in a heavily loaded
> system, but I'm not seeing it in your vmstat numbers either.

We took those measurements with 256 concurrent requests being processed.
So, at most, we have 256 concurrent queries executed by our application.
There aren't other applications using the db in our tests.

We can take some measurements at 40 concurrent requests and see where we
stand.


> What are your settings for
>
> effective_cache_size
>

 effective_cache_size = 24GB


> random_page_cost
>

Using the default...

#random_page_cost = 4.0


> work_mem
>

 work_mem = 64MB


> with your machine and the extra memory, you can probably uptune the
> work_mem to 8 Megs safely if it's at the default of 1MB.  With a
> database that fits in RAM, you can often turn down random_page_cost to
> near 1.0 (1.2 to 1.4 is common for such scenarios.)  And effective
> cache size being larger (in the 20G range) will hint the planner that
> it's likely to find everything it needs in ram somewhere and not on
> the disk.
>

So, we should probably try cranking our random_page_cost value down.  When
we dump our db with "pg_dump --format=t", it's about 15 MB.  We should be
able to keep the thing in memory.
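
Something like this, then, is the direction being suggested (values are
illustrative, not prescriptive):

    random_page_cost = 1.2       # down from the default 4.0; cached random reads are cheap
    effective_cache_size = 24GB  # unchanged; hints the planner that the data is cached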

>
> There are several common bottlenecks you can try to tune away from.
> IO doesn't look like a problem for you.  Neither does CPU load.  So,
> then we're left with context switching time and memory to CPU
> bandwidth.  If your CPUs are basically becoming data pumps then the
> speed of your FSB becomes VERY critical, and some older Intel mobos
> didn't have a lot of CPU to Memory bandwidth and adding CPUs made it
> worse, not better.  More modern Intel chipsets have much faster CPU to
> Memory BW, since they're using the same kind of fabric switching that
> AMD uses on highly parallel machines.
>

Each CPU is 2.13 GHz, with 8MB Cache, and the FSB is 1066 MHz.  Does that
bus speed seem slow?

It's hard to go to the money tree  and say "we're only using about half of
your CPUs, but you need to get better ones."


> If your limit is your hardware, then the only solution is a faster
> machine.  It may well be that a machine with dual fast Nehalem
> (2.4GHz+) quad core CPUs will be faster.  Or 4 or 8 AMD CPUs with
> their faster fabric.

Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
We may want to start looking at query plans for the slowest queries.
Use explain analyze to find them and attach them here.  I kinda have a
feeling you're running into a limit on the speed of your memory
though, and there's no real cure for that.  You can buy a little time
with some query or db tuning, but 250 or more concurrent users is a
LOT.
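
For example (a catalog query just so it runs anywhere; substitute one of
the statements from your slow-query log):

EXPLAIN ANALYZE SELECT count(*) FROM pg_class;  -- placeholder; attach the real plans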

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 9:42 AM, Bob Dusek  wrote:
>> What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?
>
> RAID-0

Just wondering how many drives?

> To be more specific about the degradation, we've set the
> "log_min_duration_statement=200", and when we run with 40 concurrent
> requests, we don't see queries showing up in there.  When we run with 60
> concurrent requests, we start seeing queries show up, and when we run 200+
> requests, we see multi-second queries.
>
> This is to be expected, to some extent, as we would expect some performance
> degradation with higher utilization.  But, the hardware doesn't appear to be
> very busy, and that's where we're hoping for some help.

It's likely in io wait.

>> What do the following commands tell you?
>>
>> iostat -x 10 (first iteration doesn't count)
>
> Here's some iostat output (from the 3rd data point from iostat -x 10)...
> this was taken while we were processing 256 simultaneous requests.
>
>  avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>   34.29    0.00    7.09    0.03    0.00   58.58
>
> Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
> avgqu-sz   await  svctm  %util
> sda   0.00   112.20  0.00 133.40 0.00  1964.80    14.73
> 0.42    3.17   0.04   0.48
> sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> sda2  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> sda3  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> sda4  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> sda5  0.00   112.20  0.00 133.40 0.00  1964.80    14.73
> 0.42    3.17   0.04   0.48
> sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> dm-0  0.00 0.00  0.00  0.40 0.00 3.20 8.00
> 0.00    0.00   0.00   0.00
> dm-1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
> dm-2  0.00 0.00  0.00 99.90 0.00   799.20 8.00
> 0.15    1.50   0.01   0.12
> dm-3  0.00 0.00  0.00  0.60 0.00 4.80 8.00
> 0.00    0.33   0.17   0.01
> dm-4  0.00 0.00  0.00 144.70 0.00  1157.60 8.00
> 0.46    3.17   0.02   0.35
> dm-5  0.00 0.00  0.00  0.00 0.00 0.00 0.00
> 0.00    0.00   0.00   0.00
>
> The iowait seems pretty low, doesn't it?

Depends, is that the first iteration of output?  if so, ignore it and
show me the second and further on.  Same for vmstat...  In fact let
them run for a minute or two and attach the results...  OTOH, if that
is the second or later set of output, then you're definitely not IO
bound, and I don't see why the CPUs are not being better utilized.

How many concurrent queries are you running when you take these
measurements?  Can you take them with lower and higher numbers of
concurrent users and compare the two?  normally I'd be looking for
context switching taking more and more time in a heavily loaded
system, but I'm not seeing it in your vmstat numbers either.

What are your settings for

effective_cache_size
random_page_cost
work_mem

with your machine and the extra memory, you can probably uptune the
work_mem to 8 Megs safely if it's at the default of 1MB.  With a
database that fits in RAM, you can often turn down random_page_cost to
near 1.0 (1.2 to 1.4 is common for such scenarios.)  And effective
cache size being larger (in the 20G range) will hint the planner that
it's likely to find everything it needs in ram somewhere and not on
the disk.

There are several common bottlenecks you can try to tune away from.
IO doesn't look like a problem for you.  Neither does CPU load.  So,
then we're left with context switching time and memory to CPU
bandwidth.  If your CPUs are basically becoming data pumps then the
speed of your FSB becomes VERY critical, and some older Intel mobos
didn't have a lot of CPU to Memory bandwidth and adding CPUs made it
worse, not better.  More modern Intel chipsets have much faster CPU to
Memory BW, since they're using the same kind of fabric switching that
AMD uses on highly parallel machines.

If your limit is your hardware, then the only solution is a faster
machine.  It may well be that a machine with dual fast Nehalem
(2.4GHz+) quad core CPUs will be faster.  Or 4 or 8 AMD CPUs with
their faster fabric.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] cache false-sharing in lwlocks

2010-01-11 Thread Rayson Ho
Hi,

LWLockPadded is either 16 or 32 bytes, so modern systems (e.g. Core2
or AMD Opteron [1]) with cacheline size of 64 bytes can get
false-sharing in lwlocks.

I changed LWLOCK_PADDED_SIZE in src/backend/storage/lmgr/lwlock.c to
64, and ran sysbench OLTP read-only benchmark, and got a slight
improvement in throughput:

Hardware: single-socket Core 2, quad-core, Q6600  @ 2.40GHz
Software: Linux 2.6.28-17, glibc 2.9, gcc 4.3.3

PostgreSQL: 8.5alpha3

sysbench parameters: sysbench --num-threads=4 --max-requests=0
--max-time=120 --oltp-read-only=on --test=oltp

original: 3227, 3243, 3243
after: 3256, 3255, 3253

So there is a speedup of about 1.005, or what other people would usually
call a 0.5% improvement.

However, it's a single-socket machine, so all the cache traffic does
not need to go off-chip. Can someone with a multi-socket machine help
me run some tests so that we can get a better idea of how this change
(patch attached) performs on bigger systems?

Thanks,
Rayson

P.S. And I just googled and found similar discussions about padding
LWLOCK_PADDED_SIZE, but the previous work was done on an IBM POWER
system, and the benchmark used was apachebench. IMO, the setup was too
complex to measure a small performance improvement in PostgreSQL.

[1] Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA
Multiprocessor Systems Application Note
Index: src/backend/storage/lmgr/lwlock.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/storage/lmgr/lwlock.c,v
retrieving revision 1.55
diff -u -4 -r1.55 lwlock.c
--- src/backend/storage/lmgr/lwlock.c   2 Jan 2010 16:57:52 -   1.55
+++ src/backend/storage/lmgr/lwlock.c   11 Jan 2010 16:40:03 -
@@ -57,9 +57,11 @@
  *
  * LWLock is between 16 and 32 bytes on all known platforms, so these two
  * cases are sufficient.
  */
-#define LWLOCK_PADDED_SIZE (sizeof(LWLock) <= 16 ? 16 : 32)
+/* #define LWLOCK_PADDED_SIZE  (sizeof(LWLock) <= 16 ? 16 : 32) */
+
+#define LWLOCK_PADDED_SIZE 64
 
 typedef union LWLockPadded
 {
LWLock  lock;

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Ivan Voras

Ivan Voras wrote:

Yes, but you are issuing 133 write operations per second per drive - 
this is nearly the limit of what you can get with 15k RPM drives 
(actually, the limit should be somewhere around 200..250 IOPS, but 133 
isn't that far off).


I saw in your other post you have fsync turned off so ignore this, it's 
not an IO problem in your case.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Kevin Grittner
Bob Dusek  wrote:
> Scott Marlowe wrote:
>> Bob Dusek  wrote:
 
>>> 4X E7420 Xeon, Four cores (for a total of 16 cores)
 
>> What method of striped RAID?
> 
> RAID-0
 
I hope you have a plan for what to do when any one drive in this
array fails, and the entire array is unusable.
 
Anyway, my benchmarks tend to show that best throughput occurs at
about (CPU_count * 2) plus effective_spindle_count.  Since you seem
to be fully cached, effective_spindle_count would be zero, so I
would expect performance to start to degrade when you have more than
about 32 sessions active.
 
> We actually see degradation when we get more than 40 concurrent
> requests.
 
Close enough.
 
> when we run 200+ requests, we see multi-second queries.
 
No surprise there.  Your vmstat output suggests that context
switches are becoming a problem, and I wouldn't be surprised if I
heard that the network is an issue.  You might want to have someone
take a look at the network side to check.
 
You want to use some connection pooling which queues requests when
more than some configurable number of connections is already active
with a request.  You probably want to run that on the server side. 
As for the postgresql.conf, could you show what you have right now,
excluding all comments?
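
pgbouncer is one way to do the queueing; a minimal sketch of a
pgbouncer.ini, with every value an assumption to adapt rather than a
recommendation:

[databases]
; placeholder connection details
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; hold extra clients in a queue
max_client_conn = 300     ; accept all of the application's connections
default_pool_size = 32    ; ~(CPU_count * 2) sessions actually hitting the DB

The application then connects to port 6432 instead of 5432.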
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Ivan Voras

Bob Dusek wrote:
On Mon, Jan 11, 2010 at 8:50 AM, Scott Marlowe wrote:


On Mon, Jan 11, 2010 at 6:44 AM, Bob Dusek  wrote:
 > Hello,
 >
 > We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty
hardware...
 >
 > 4X E7420 Xeon, Four cores (for a total of 16 cores)
 > 2.13 GHz, 8M Cache, 1066 Mhz FSB
 > 32 Gigs of RAM
 > 15 K RPM drives in striped raid

What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?


RAID-0


And how many drives?



 > Things run fine, but when we get a lot of concurrent queries
running, we see
 > a pretty good slow down.

Define "a lot".


We have an application server that is processing requests.  Each request 
consists of a combination of selects, inserts, and deletes.  We actually 
see degradation when we get more than 40 concurrent requests.  The exact 
number of queries executed by each request isn't known.  It varies per 
request.  But, an example request would be about 16 inserts and 113 
selects.  Any given request can't execute more than a single query at a 
time.  


So, you are concurrently trying to achieve more than around 640 writes 
and 4520 reads (worst case figures...). This should be interesting. For 
40 concurrent requests you will probably need at least 4 drives in 
RAID-0 to sustain the write rates (and I'll guess 5-6 to be sure to 
cover read requests also, together with plenty of RAM).



 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   34.29    0.00    7.09    0.03    0.00   58.58

Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
sda   0.00   112.20  0.00 133.40 0.00  1964.80    14.73
0.42    3.17   0.04   0.48


The iowait seems pretty low, doesn't it?


Yes, but you are issuing 133 write operations per second per drive - 
this is nearly the limit of what you can get with 15k RPM drives 
(actually, the limit should be somewhere around 200..250 IOPS, but 133 
isn't that far off).



top - 10:11:43 up 12 days, 20:48,  6 users,  load average: 10.48, 4.16, 2.83
Tasks: 798 total,   8 running, 790 sleeping,   0 stopped,   0 zombie
Cpu0  : 33.3%us,  7.6%sy,  0.0%ni, 59.1%id,  0.0%wa,  0.0%hi,  0.0%si,  
0.0%st


There is one other possibility - since the CPUs are not very loaded, are 
you sure the client application with which you are testing is fast 
enough to issue enough requests to saturate the database?



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
On Mon, Jan 11, 2010 at 9:07 AM, A. Kretschmer <
andreas.kretsch...@schollglas.com> wrote:

> In response to Bob Dusek :
> > Hello,
> >
> > We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...
> >
> > 4X E7420 Xeon, Four cores (for a total of 16 cores)
> > 2.13 GHz, 8M Cache, 1066 Mhz FSB
> > 32 Gigs of RAM
> > 15 K RPM drives in striped raid
> >
> > Things run fine, but when we get a lot of concurrent queries running, we
> see a
> > pretty good slow down.
> >
> > We don't have much experience with this sort of hardware.   Does anyone
> have an
> > example config file we could use as a good starting point for this sort
> of
> > hardware?
>
> Have you tuned your postgresql.conf? (memory-parameter)
>
>
Here's a list of what I consider to be key changes we've made to the config
file (from default)..

for comparison purposes, the diff command was "diff postgresql.conf.dist
postgresql.conf.mod"
64c64
< max_connections = 100           # (change requires restart)
---
> max_connections = 300           # (change requires restart)
78c78
< ssl = true                      # (change requires restart)
---
> #ssl = off                      # (change requires restart)
106c106,107
< shared_buffers = 32MB           # min 128kB
---
> #shared_buffers = 32MB          # min 128kB
> shared_buffers = 8GB            # min 128kB (rdk)
115a117
> work_mem = 64MB                 # min 64kB (vrk) (rdk)
117c119,121
< #max_stack_depth = 2MB          # min 100kB
---
> maintenance_work_mem = 2GB      # min 1MB (rdk)
> #max_stack_depth = 1MB          # min 100kB
> max_stack_depth = 9MB           # min 100kB (vrk)
127a132
> vacuum_cost_delay = 15ms        # 0-100 milliseconds (rdk)
150c155
< #fsync = on                     # turns forced synchronization on or off
---
> fsync = off                     # turns forced synchronization on or off (rdk)

Please note, I've been reading this list a bit lately, and I'm aware of the
kind of advice that some offer with respect to fsync.  I understand that
with 8.4 we can turn this on and shut off "synchronous_commit".  I would be
interested in more information on that.  But the bottom line is that we've
gotten in the habit of shutting this off (on production servers) using
Postgres 7.4, as the performance gain is enormous, and with fsync=on, we
couldn't get the performance we needed.
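
If I understand the 8.4 option correctly, the alternative would look
roughly like this in postgresql.conf (untested on our side; a crash could
lose the last few commits, but the data files themselves stay consistent):

fsync = on                  # keep data files crash-safe
synchronous_commit = off    # let COMMIT return before the WAL flush

And unlike fsync, synchronous_commit can also be flipped per session or
per transaction with SET.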

151a157
> synchronous_commit = off        # immediate fsync at commit
152a159
> wal_sync_method = open_sync     # the default is the first option (vrk)
159c166
< #full_page_writes = on          # recover from partial page writes
---
> full_page_writes = off          # recover from partial page writes (rdk)
160a168
> wal_buffers = 8MB               # min 32kB (rdk)
164c172
< #commit_delay = 0               # range 0-100000, in microseconds
---
> commit_delay = 100000           # range 0-100000, in microseconds (vrk)
169a178
> checkpoint_segments = 256       # in logfile segments, min 1, 16MB each (rdk)
170a180
> checkpoint_timeout = 15min      # range 30s-1h (rdk)
171a182
> checkpoint_completion_target = 0.7    # checkpoint target duration, 0.0 - 1.0 (rdk)
206a218
> effective_cache_size = 24GB     # (rdk)

I would be willing to send our entire config file to someone if that would
help... I didn't want to attach it to this email, because I'm not sure about
the etiquette of attaching files to emails on this list.


> Here are some links for you:
>
> 15:07 < akretschmer> ??performance
> 15:07 < rtfm_please> For information about performance
> 15:07 < rtfm_please> see
> http://revsys.com/writings/postgresql-performance.html
> 15:07 < rtfm_please> or
> http://wiki.postgresql.org/wiki/Performance_Optimization
> 15:07 < rtfm_please> or
> http://www.depesz.com/index.php/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/
>

We will spend more time on this and see if we can learn more.

But, we're hoping someone on this list can offer us some quick tips to help
us use up more of the 16 cpus we have available.

Thanks for pointing all of that out.


>
>
> HTH, Andreas
> --
> Andreas Kretschmer
> Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
> GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] performance config help

2010-01-11 Thread Bob Dusek
On Mon, Jan 11, 2010 at 8:50 AM, Scott Marlowe wrote:

> On Mon, Jan 11, 2010 at 6:44 AM, Bob Dusek  wrote:
> > Hello,
> >
> > We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...
> >
> > 4X E7420 Xeon, Four cores (for a total of 16 cores)
> > 2.13 GHz, 8M Cache, 1066 Mhz FSB
> > 32 Gigs of RAM
> > 15 K RPM drives in striped raid
>
> What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?
>

RAID-0


>
> > Things run fine, but when we get a lot of concurrent queries running, we
> see
> > a pretty good slow down.
>
> Define "a lot".
>
>
We have an application server that is processing requests.  Each request
consists of a combination of selects, inserts, and deletes.  We actually see
degradation when we get more than 40 concurrent requests.  The exact number
of queries executed by each request isn't known.  It varies per request.
But, an example request would be about 16 inserts and 113 selects.  Any
given request can't execute more than a single query at a time.

To be more specific about the degradation, we've set the
"log_min_duration_statement=200", and when we run with 40 concurrent
requests, we don't see queries showing up in there.  When we run with 60
concurrent requests, we start seeing queries show up, and when we run 200+
requests, we see multi-second queries.

This is to be expected, to some extent, as we would expect some performance
degradation with higher utilization.  But, the hardware doesn't appear to be
very busy, and that's where we're hoping for some help.

We want to have Postgres eating up as many resources as possible to chug
through our queries faster.  Right now, it's running slower with more
utilization, but there's still too much idle time for the CPUs.

> We don't have much experience with this sort of hardware.   Does anyone
> have
> > an example config file we could use as a good starting point for this
> sort
> > of hardware?
> >
> > We have a fair amount of row-level inserts and deletes going on (probably
> as
> > much data is inserted and deleted in a day than permanently resides in
> the
> > db).
>
> What do the following commands tell you?
>
> iostat -x 10 (first iteration doesn't count)
>

Here's some iostat output (from the 3rd data point from iostat -x 10)...
this was taken while we were processing 256 simultaneous requests.

 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   34.29    0.00    7.09    0.03    0.00   58.58

Device: rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda   0.00   112.20  0.00 133.40 0.00  1964.80    14.73
0.42    3.17   0.04   0.48
sda1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
sda2  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
sda3  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
sda4  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
sda5  0.00   112.20  0.00 133.40 0.00  1964.80    14.73
0.42    3.17   0.04   0.48
sdb   0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
dm-0  0.00 0.00  0.00  0.40 0.00 3.20 8.00
0.00    0.00   0.00   0.00
dm-1  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00
dm-2  0.00 0.00  0.00 99.90 0.00   799.20 8.00
0.15    1.50   0.01   0.12
dm-3  0.00 0.00  0.00  0.60 0.00 4.80 8.00
0.00    0.33   0.17   0.01
dm-4  0.00 0.00  0.00 144.70 0.00  1157.60 8.00
0.46    3.17   0.02   0.35
dm-5  0.00 0.00  0.00  0.00 0.00 0.00 0.00
0.00    0.00   0.00   0.00

The iowait seems pretty low, doesn't it?

vmstat 10 (again, first iteration doesn't count)
>

Here's the vmstat output, with the 6th data element clipped out, which seems
representative of the whole... (also taken during 256 simultaneous request)

[r...@ecpe1 pg_log]# vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 8  0      0 21818136 696692 3871172  0    0     0  1042 2261 48418 34  7 59  0  0

That's a lot of free mem, which is to be expected.  Our database is not very
large.

top
>

(taken during 256 simultaneous requests)

top - 10:11:43 up 12 days, 20:48,  6 users,  load average: 10.48, 4.16, 2.83
Tasks: 798 total,   8 running, 790 sleeping,   0 stopped,   0 zombie
Cpu0  : 33.3%us,  7.6%sy,  0.0%ni, 59.1%id,  0.0%wa,  0.0%hi,  0.0%si,
0.0%st
Cpu1  : 31.6%us,  5.9%sy,  0.0%ni, 62.5%id,  0.0%wa,  0.0%hi,  0.0%si,
0.0%st
Cpu2  : 33.0%us,  6.6%sy,  0.0%ni, 60.4%id,  0.0%wa,  0.0%hi,  0.0%si,
0.0%st
Cpu3  : 35.4%us,  6.2%sy,  0.0%ni, 58.0%id,  0.0%wa,  0.0%hi,  0.3%si,
0.0%st
Cpu4  : 36.3%us,  5.6%sy,  0.0%ni, 58.1%id,  0.0%wa,  

Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Kevin Grittner
Mathieu De Zutter  wrote:
 
> So if this query usually does *not* hit the cache, it will probably
> be faster if I leave it like that? While testing a query I execute
> it so much that it's always getting into the cache. However, since
> other applications run on the same server, I think that infrequently
> used data gets flushed after a while, even if the DB could fit in
> the RAM.
 
You definitely were hitting the cache almost exclusively in the
EXPLAIN ANALYZE results you sent.  If that's not typically what
happens, we'd be able to better analyze the situation with an
EXPLAIN ANALYZE of a more typical run.  That said, if you are doing
physical reads, reading backwards on the index is going to degrade
pretty quickly if you're using a normal rotating magnetic medium,
because the blocks are arranged on the disk in a format to support
fast reads in a forward direction.  Given that and other factors,
the bitmap scan will probably be much faster if you do wind up going
to the disk most of the time.
 
On the other hand, there's no reason to lie to the optimizer about
how much memory is on the machine.  You can't expect it to make sane
choices on the basis of misleading assumptions.  For starters, try
setting effective_cache_size to at least 1GB.  That doesn't reserve
any space, it just tells the optimizer what it can assume about how
much data can be cached, and a large setting will tend to encourage
more indexed access.
 
Given that when you focus on one piece of the database, the caching
effects are pretty dramatic, you really should reduce
random_page_cost to 2, even with the in-and-out-of-cache scenario
you describe.  These aren't "magic bullets" that solve all
performance problems, but you would be giving the optimizer a
fighting chance at costing plans in a way that the one with the
lowest calculated cost is actually the one which will run the
fastest.
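
You can try those numbers for a single connection first; a quick sketch
(the query is a stand-in for one of your problem queries):

SET effective_cache_size = '1GB';
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT count(*) FROM pg_class;  -- substitute the real query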
 
Also, if the pressure on RAM is that high on this machine, it would
probably be cost-effective to add another 2GB of RAM, so that you
could be a bit more generous in your allocation of RAM to the
database.  It might make your problems queries an order of magnitude
or more faster with very little effort.  If a quick google search is
any indication, you can get that much RAM for less than $50 these
days, if you've got a slot open.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread A. Kretschmer
In response to Bob Dusek :
> Hello,
> 
> We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...
> 
> 4X E7420 Xeon, Four cores (for a total of 16 cores)
> 2.13 GHz, 8M Cache, 1066 Mhz FSB
> 32 Gigs of RAM
> 15 K RPM drives in striped raid
> 
> Things run fine, but when we get a lot of concurrent queries running, we see a
> pretty good slow down.
> 
> We don't have much experience with this sort of hardware.   Does anyone have 
> an
> example config file we could use as a good starting point for this sort of
> hardware?

Have you tuned your postgresql.conf? (memory-parameter)

Here are some links for you:

15:07 < akretschmer> ??performance
15:07 < rtfm_please> For information about performance
15:07 < rtfm_please> see http://revsys.com/writings/postgresql-performance.html
15:07 < rtfm_please> or http://wiki.postgresql.org/wiki/Performance_Optimization
15:07 < rtfm_please> or 
http://www.depesz.com/index.php/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/


HTH, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-11 Thread Scott Marlowe
On Mon, Jan 11, 2010 at 6:44 AM, Bob Dusek  wrote:
> Hello,
>
> We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...
>
> 4X E7420 Xeon, Four cores (for a total of 16 cores)
> 2.13 GHz, 8M Cache, 1066 Mhz FSB
> 32 Gigs of RAM
> 15 K RPM drives in striped raid

What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?

> Things run fine, but when we get a lot of concurrent queries running, we see
> a pretty good slow down.

Define "a lot".

> We don't have much experience with this sort of hardware.   Does anyone have
> an example config file we could use as a good starting point for this sort
> of hardware?
>
> We have a fair amount of row-level inserts and deletes going on (probably as
> much data is inserted and deleted in a day than permanently resides in the
> db).

What do the following commands tell you?

iostat -x 10 (first iteration doesn't count)
vmstat 10 (again, first iteration doesn't count)
top

What you're looking for is iowait / utilization.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] performance config help

2010-01-11 Thread Bob Dusek
Hello,

We're running Postgres 8.4.2 on Red Hat 5, on pretty hefty hardware...

4X E7420 Xeon, Four cores (for a total of 16 cores)
2.13 GHz, 8M Cache, 1066 Mhz FSB
32 Gigs of RAM
15 K RPM drives in striped raid

Things run fine, but when we get a lot of concurrent queries running, we see
a pretty good slow down.

We don't have much experience with this sort of hardware.   Does anyone have
an example config file we could use as a good starting point for this sort
of hardware?

We have a fair amount of row-level inserts and deletes going on (probably as
much data is inserted and deleted in a day than permanently resides in the
db).

Bob


Re: [PERFORM] PG optimization question

2010-01-11 Thread Pierre Frédéric Caillau d
On Sun, 10 Jan 2010 19:45:32 +0100, Robert Haas   
wrote:



2010/1/10 Pierre Frédéric Caillaud :



If you transfer (delete from staging, insert into archive) in one
transaction , then it will be always visible in exactly one of them,
and exatly once in a view over both staging and archive(s).


       Does the latest version implement this:

INSERT INTO archive (...) DELETE FROM staging WHERE ... RETURNING ...


No.  There are no plans to support that, though there are proposals to  
support:


WITH x AS (DELETE FROM staging WHERE ... RETURNING ...) INSERT INTO
archive (...) SELECT ... FROM x

I'm not sure how much that will help though since, in the designs so
far discused, the tuples won't be pipelined.

...Robert



	Yeah, but it's a lot more user-friendly than SELECT FOR UPDATE, INSERT  
SELECT, DELETE...
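
	To make the proposal concrete, a sketch with hypothetical staging/archive
columns (for reference, this writable-CTE syntax eventually shipped in
PostgreSQL 9.1):

WITH moved AS (
    DELETE FROM staging
    WHERE created < now() - interval '1 day'
    RETURNING id, payload, created
)
INSERT INTO archive (id, payload, created)
SELECT id, payload, created FROM moved;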




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Pierre Frédéric Caillau d
Postgres is being conservative. The plan it uses (bitmap index scan)  
will perform much better than an index scan when the data is not in the  
cache, by maybe an order of magnitude, depending on your hardware setup.


The index scan may perform better at the moment, but the bitmap index  
scan is safer.


	Suppose you make a query that will need to retrieve 5% of the rows in a  
table...


	If the table is nicely clustered (ie you want the latest rows in a table  
where they are always appended at the end with no holes, for instance),  
bitmap index scan will mark 5% of the pages for reading, and read them  
sequentially (fast). Plain index scan will also scan the rows more or less  
sequentially, so it's going to be quite fast too.


	Now if your table is not clustered at all, or clustered on something  
which has no correlation to your current query, you may hit the worst case  
: reading a random sampling of 5% of the pages. Bitmap index scan will  
sort these prior to reading, so the HDD/OS will do smart things. Plain  
index scan won't.


	- worst case for bitmap index scan is a seq scan... slow, but if you have  
no other choice, it's OK.
	- worst case for plain index scan is a lot worse since it's a random  
seekfest.


	If everything is cached in RAM, there is not much difference (plain index  
scan can be faster if the bitmap "recheck cond" is slow).
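
	For a hypothetical table with an index on user_id, the two plan shapes
look like this (output abridged and illustrative, not from a real run):

EXPLAIN SELECT * FROM events WHERE user_id = 42;
--  Bitmap Heap Scan on events
--    Recheck Cond: (user_id = 42)
--    ->  Bitmap Index Scan on events_user_id_idx
--          Index Cond: (user_id = 42)
-- versus, when the planner expects cached data:
--  Index Scan using events_user_id_idx on events
--    Index Cond: (user_id = 42)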


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Matthew Wakeling

On Mon, 11 Jan 2010, Mathieu De Zutter wrote:

> seq_page_cost = 0.1
> random_page_cost = 0.1



So if this query usually does *not* hit the cache, it will probably be
faster if I leave it like that? While testing a query I execute it so
much that it's always getting into the cache. However, since other
applications run on the same server, I think that infrequently used
data gets flushed after a while, even if the DB could fit in the RAM.


Postgres is being conservative. The plan it uses (bitmap index scan) will 
perform much better than an index scan when the data is not in the cache, 
by maybe an order of magnitude, depending on your hardware setup.


The index scan may perform better at the moment, but the bitmap index scan 
is safer.


Matthew

--
Change is inevitable, except from vending machines.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Choice of bitmap scan over index scan

2010-01-11 Thread Mathieu De Zutter
On Mon, Jan 11, 2010 at 3:52 AM, Robert Haas  wrote:

> On Sun, Jan 10, 2010 at 10:53 AM, Kevin Grittner
>  wrote:
> > seq_page_cost = 0.1
> > random_page_cost = 0.1
>
> These might not even be low enough.  The reason why bitmap index scans
> win over plain index scans, in general, is because you make one pass
> through the heap to get all the rows you need instead of bouncing
> around doing lots of random access.  But of course if all the data is
> in RAM then this argument falls down.
>
> If these aren't enough to get the query planner to DTRT, then the OP
> might want to try lowering them further and seeing how low he has to
> go to flip the plan...
>

So if this query usually does *not* hit the cache, it will probably be
faster if I leave it like that? While testing a query I execute it so
much that it's always getting into the cache. However, since other
applications run on the same server, I think that infrequently used data
gets flushed after a while, even if the DB could fit in the RAM.

-- 
Mathieu