Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Shane Ambler
Greg Smith wrote: On Thu, 27 Dec 2007, Shane Ambler wrote: So in theory a modern RAID 1 setup can be configured to get similar read speeds as RAID 0 but would still drop to single disk speeds (or similar) when writing, but RAID 0 can get the faster write performance. The trick is, you need a

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Shane Ambler
Mark Mielke wrote: Shane Ambler wrote: So in a perfect setup (probably 1+0) 4x 300MB/s SATA drives could deliver 1200MB/s of data to RAM, which is also assuming that all 4 channels have their own data path to RAM and aren't sharing. (anyone know how segregated the on board controllers such as t

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Tom Lane
Greg Smith <[EMAIL PROTECTED]> writes: > On Wed, 26 Dec 2007, Guillaume Smet wrote: >> beta RPMs are by default compiled with --enable-debug and >> --enable-cassert which doesn't help them to fly fast... > Got that right. Last time I was going crazy after running pgbench with > those options and

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
Shane Ambler wrote: So in theory a modern RAID 1 setup can be configured to get similar read speeds as RAID 0 but would still drop to single disk speeds (or similar) when writing, but RAID 0 can get the faster write performance. Unfortunately, it's a bit more complicated than that. RAID 1 has

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Greg Smith
On Thu, 27 Dec 2007, Shane Ambler wrote: So in theory a modern RAID 1 setup can be configured to get similar read speeds as RAID 0 but would still drop to single disk speeds (or similar) when writing, but RAID 0 can get the faster write performance. The trick is, you need a perfect controller

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Shane Ambler
Fernando Hevia wrote: I'll start a little ways back first - Well, here rises another doubt. Should I go for a single RAID 1+0 storing OS + Data + WAL files or will I be better off with two RAID 1 separating data from OS + Wal files? earlier you wrote - Database will be about 30 GB in size in

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Devrim GÜNDÜZ
Hi, On Wed, 2007-12-26 at 18:35 -0500, Greg Smith wrote: > Probably need to put a disclaimer about that fact *somewhere*. We mention about that in README.rpm-dist file, but I think we should mention about that at a more visible place. Regards, -- Devrim GÜNDÜZ , RHCE PostgreSQL Replication, Co

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Greg Smith
On Wed, 26 Dec 2007, Guillaume Smet wrote: beta RPMs are by default compiled with --enable-debug and --enable-cassert which doesn't help them to fly fast... Got that right. Last time I was going crazy after running pgbench with those options and not having realized what I changed, I was gett

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 10:52 PM, Guillaume Smet <[EMAIL PROTECTED]> wrote: > Let's go with 8.2.5 on the same server (-s 100 / 16 clients / 50k > transactions per client / only read using -S option): > 64MB: 33814 tps > 512MB: 35833 tps > 1024MB: 36986 tps > It's more consistent with what I expected. I ha

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Greg Smith
On Wed, 26 Dec 2007, [EMAIL PROTECTED] wrote: yes, the two linux software implementations only read from one disk, but I have seen hardware implementations where it reads from both drives, and if they disagree it returns a read error rather than possibly invalid data (it's up to the admin to f

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
[EMAIL PROTECTED] wrote: however I was addressing the point that for reads you can't do any checking until you have read in all the blocks. if you never check the consistency, how will it ever be proven otherwise. A scheme often used is to mark the disk/slice as "clean" during clean system shut

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Mark Mielke wrote: [EMAIL PROTECTED] wrote: I could see a raid 1 array not doing consistency checking (after all, it has no way of knowing what's right if it finds an error), but since raid 5/6 can repair the data I would expect them to do the checking each time. Your mes

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
Bill Moran wrote: What do you mean "heard of"? Which raid system do you know of that reads all drives for RAID 1? I'm fairly sure that FreeBSD's GEOM does. Of course, it couldn't be doing consistency checking at that point. According to this: http://www.freebsd.org/cgi/man.cgi?quer

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Mark Mielke wrote: [EMAIL PROTECTED] wrote: On Wed, 26 Dec 2007, Mark Mielke wrote: Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't kn

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Mark Mielke wrote: [EMAIL PROTECTED] wrote: Thanks for the explanation David. It's good to know not only what but also why. Still I wonder why reads do hit all drives. Shouldn't only 2 disks be read: the one with the data and the parity disk? no, because the parity is of t

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Bill Moran
In response to Mark Mielke <[EMAIL PROTECTED]>: > Bill Moran wrote: > > In order to recalculate the parity, it has to have data from all disks. > > Thus, > > if you have 4 disks, it has to read 2 (the unknown data blocks included in > > the parity calculation) then write 2 (the new data block and

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Bill Moran
In response to Mark Mielke <[EMAIL PROTECTED]>: > [EMAIL PROTECTED] wrote: > > On Wed, 26 Dec 2007, Mark Mielke wrote: > > > >> Florian Weimer wrote: > seek/read/calculate/seek/write since the drive moves on after the > read), when you read you must read _all_ drives in the set to check

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
Bill Moran wrote: In order to recalculate the parity, it has to have data from all disks. Thus, if you have 4 disks, it has to read 2 (the unknown data blocks included in the parity calculation) then write 2 (the new data block and the new parity data) Caching can help some, but if your data end
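The "read 2, then write 2" count for a small write on a 4-disk RAID 5 set can be checked with a short sketch. This assumes XOR parity and a stripe of 3 data blocks plus 1 parity block; the function names and block values are made up for illustration and do not come from any real RAID implementation. It compares two ways of updating parity (reading the other data blocks, or reading the old data and old parity) and shows that both cost 2 reads and 2 writes here.

```python
from functools import reduce

def xor_parity(blocks):
    """Parity of a stripe: byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct_write(stripe, idx, new_block):
    """Read every *other* data block, then write new data + new parity."""
    others = [b for i, b in enumerate(stripe) if i != idx]
    parity = xor_parity(others + [new_block])
    return parity, {"reads": len(others), "writes": 2}

def read_modify_write(stripe, old_parity, idx, new_block):
    """Read old data + old parity, then write new data + new parity."""
    parity = xor_parity([old_parity, stripe[idx], new_block])
    return parity, {"reads": 2, "writes": 2}

# Hypothetical stripe: 3 data blocks on a 4-disk set (4th disk holds parity).
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
old_parity = xor_parity(stripe)
new_block = b"\xaa\x55"

p1, io1 = reconstruct_write(stripe, 0, new_block)
p2, io2 = read_modify_write(stripe, old_parity, 0, new_block)
print(p1 == p2, io1, io2)
```

With more than 4 disks the first strategy needs more reads, while the second stays at 2 reads and 2 writes per small write.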

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
[EMAIL PROTECTED] wrote: I could see a raid 1 array not doing consistency checking (after all, it has no way of knowing what's right if it finds an error), but since raid 5/6 can repair the data I would expect them to do the checking each time. Your messages are spread across the thread. :-)

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
[EMAIL PROTECTED] wrote: On Wed, 26 Dec 2007, Mark Mielke wrote: Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't know of any RAID implementation that perform

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
[EMAIL PROTECTED] wrote: Thanks for the explanation David. It's good to know not only what but also why. Still I wonder why reads do hit all drives. Shouldn't only 2 disks be read: the one with the data and the parity disk? no, because the parity is of the sort (A+B+C+P) mod X = 0 so if X=10 (
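The (A+B+C+P) mod X = 0 relation quoted above can be tried out in a few lines. This is only a sketch of the arithmetic analogy: the block values are invented, the helper names are hypothetical, and real RAID 5 implementations generally use XOR rather than a decimal sum. It shows that verifying the stripe, or recovering one missing block, needs every other block in the stripe and not just one data block plus parity.

```python
X = 10  # modulus from the quoted "X=10"; the block values below are made up

def parity(blocks, x=X):
    """Choose P so that (sum(blocks) + P) mod x == 0."""
    return (-sum(blocks)) % x

def consistent(blocks, p, x=X):
    """A consistency check has to look at every block in the stripe."""
    return (sum(blocks) + p) % x == 0

def recover(known_blocks, p, x=X):
    """Rebuild the single missing block from all remaining blocks and P."""
    return (-(sum(known_blocks) + p)) % x

a, b, c = 3, 4, 8
p = parity([a, b, c])

print(p, consistent([a, b, c], p), recover([a, c], p))
print(consistent([a, 7, c], p))  # a corrupted B only shows up once all blocks are summed
```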

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Mark Mielke wrote: Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't know of any RAID implementation that performs consistency checking on

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 7:23 PM, Greg Smith <[EMAIL PROTECTED]> wrote: > Ah, now this is really interesting, as it rules out all the write > components and should be easy to replicate even on a smaller server. As > you've already dumped a bunch of time into this the only other thing I > would suggest chec

Re: [PERFORM] pg_dump performance

2007-12-26 Thread Jared Mauch
On Wed, Dec 26, 2007 at 11:35:59PM +0200, Heikki Linnakangas wrote: > I run a quick oprofile run on my laptop, with a table like that, filled > with dummy data. It looks like indeed ~30% of the CPU time is spent in > sprintf, to convert the integers and inets to string format. I think you > coul

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't know of any RAID implementation that performs consistency checking on each read operation. 8-( Dave ha

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't know of any RAID implementation that performs consistency checking on each read operation

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Fernando Hevia wrote: David Lang Wrote: with only four drives the space difference between raid 1+0 and raid 5 isn't that much, but when you do a write you must write to two drives (the drive holding the data you are changing, and the drive that holds the parity data for th

Re: [PERFORM] pg_dump performance

2007-12-26 Thread Heikki Linnakangas
Jared Mauch wrote: On Wed, Dec 26, 2007 at 10:52:08PM +0200, Heikki Linnakangas wrote: Jared Mauch wrote: pg_dump is utilizing about 13% of the cpu and the corresponding postgres backend is at 100% cpu time. (multi-core, multi-cpu, lotsa ram, super-fast disk). ... Any tips on getting p

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Bill Moran
In response to "Fernando Hevia" <[EMAIL PROTECTED]>: > > > David Lang Wrote: > > > > with only four drives the space difference between raid 1+0 and raid 5 > > isn't that much, but when you do a write you must write to two drives (the > > drive holding the data you are changing, and the drive tha

Re: [PERFORM] pg_dump performance

2007-12-26 Thread Jared Mauch
On Wed, Dec 26, 2007 at 10:52:08PM +0200, Heikki Linnakangas wrote: > Jared Mauch wrote: >> pg_dump is utilizing about 13% of the cpu and the >> corresponding postgres backend is at 100% cpu time. >> (multi-core, multi-cpu, lotsa ram, super-fast disk). >> ... >> Any tips on getting pg_dum

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Florian Weimer
> seek/read/calculate/seek/write since the drive moves on after the > read), when you read you must read _all_ drives in the set to check > the data integrity. I don't know of any RAID implementation that performs consistency checking on each read operation. 8-(

Re: [PERFORM] pg_dump performance

2007-12-26 Thread Heikki Linnakangas
Jared Mauch wrote: pg_dump is utilizing about 13% of the cpu and the corresponding postgres backend is at 100% cpu time. (multi-core, multi-cpu, lotsa ram, super-fast disk). ... Any tips on getting pg_dump (actually the backend) to perform much closer to 500k/sec or more? This would al

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
> David Lang Wrote: > > with only four drives the space difference between raid 1+0 and raid 5 > isn't that much, but when you do a write you must write to two drives (the > drive holding the data you are changing, and the drive that holds the > parity data for that stripe, possibly needing to r

[PERFORM] pg_dump performance

2007-12-26 Thread Jared Mauch
I've been looking at the performance of pg_dump in the past week off and on trying to see if I can get it to work a bit faster and was looking for tips on this. doing a pg_dump on my 34311239 row table (1h of data btw) results in a wallclock time of 187.9 seconds or ~182k rows/sec
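The throughput figure in this message can be reproduced with simple arithmetic. The sketch below uses the row count and wall-clock time quoted above; the 500k target is taken from the "500k/sec" mentioned in a reply in this thread, and treating it as rows per second is an assumption.

```python
rows = 34_311_239      # rows in the table, from the message
elapsed_s = 187.9      # wall-clock time of the pg_dump run, in seconds

rate = rows / elapsed_s                 # observed rows per second
target_rate = 500_000                   # assumed to mean rows/sec
target_elapsed_s = rows / target_rate   # dump time needed to hit the target

print(f"observed: {rate:,.0f} rows/sec")
print(f"target:   {target_elapsed_s:.1f} s for the same table")
print(f"speedup needed: {target_rate / rate:.2f}x")
```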

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread david
On Wed, 26 Dec 2007, Fernando Hevia wrote: Mark Mielke Wrote: In my experience, software RAID 5 is horrible. Write performance can decrease below the speed of one disk on its own, and read performance will not be significantly more than RAID 1+0 as the number of stripes has only increased from

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
Mark Mielke Wrote: >In my experience, software RAID 5 is horrible. Write performance can >decrease below the speed of one disk on its own, and read performance will >not be significantly more than RAID 1+0 as the number of stripes has only >increased from 2 to 3, and if reading while writing, you

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
> Bill Moran wrote: > > RAID 10. > > I snipped the rest of your message because none of it matters. Never use > RAID 5 on a database system. Ever. There is absolutely NO reason to > ever put yourself through that much suffering. If you hate yourself > that much just commit suicide, it's les

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Greg Smith
On Wed, 26 Dec 2007, Guillaume Smet wrote: It's not checkpointing either as using pgbench-tools, I can see that tps and latency are quite stable during the entire run. Btw, thanks Greg for these nice tools. I stole the graph idea from Mark Wong's DBT2 code and one of these days I'll credit hi

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Greg Smith
On Wed, 26 Dec 2007, Mark Mielke wrote: I believe hardware RAID 5 is also horrible, but since the hardware hides it from the application, a hardware RAID 5 user might not care. Typically anything doing hardware RAID 5 also has a reasonable sized write cache on the controller, which softens th

[PERFORM] Anyone running on RHEL Cluster?

2007-12-26 Thread Chris Hoover
Is anyone running their production PostgreSQL server on the RHEL Cluster software? If so, how is it working for you? My linux admin is looking at trying to increase the robustness of our environment and wanting to try and eliminate as many single points of failure as possible. So, I am looking f

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 4:41 PM, Guillaume Smet <[EMAIL PROTECTED]> wrote: > Then I decided to perform read-only tests using -S option (pgbench -S > -s 100 -c 16 -t 3 -U postgres bench). And still the same > behaviour: > shared_buffers=64MB : 20k tps > shared_buffers=1024MB : 8k tps Some more informat

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Pavel Stehule
Hello, I tested it and it is true. In my configuration (1 GB RAM, Fedora 8), PostgreSQL is fastest with 32M shared buffers :(. The difference is about 5% compared to 256M. Regards Pavel Stehule On 26/12/2007, Guillaume Smet <[EMAIL PROTECTED]> wrote: > On Dec 26, 2007 12:21 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: >

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
Fernando Hevia wrote: Database will be about 30 GB in size initially and growing 10 GB per year. Data is inserted overnight in two big tables and during the day mostly read-only queries are run. Parallelism is rare. I have read about different raid levels with Postgres but the advice found

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 12:21 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > bgwriter_lru_maxpages = 0 > > So we can see if the bgwriter has any hand in this? It doesn't change the behaviour I have. It's not checkpointing either as using pgbench-tools, I can see that tps and latency are quite stable during

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Bill Moran
RAID 10. I snipped the rest of your message because none of it matters. Never use RAID 5 on a database system. Ever. There is absolutely NO reason to ever put yourself through that much suffering. If you hate yourself that much just commit suicide, it's less drastic. -- Bill Moran Collabor

[PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Fernando Hevia
Hi list, I am building kind of a poor man's database server: Pentium D 945 (2 x 3 Ghz cores) 4 GB RAM 4 x 160 GB SATA II 7200 rpm (Intel server motherboard has only 4 SATA ports) Database will be about 30 GB in size initially and growing 10 GB per year. Data is inserted overnight in two big tab

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 12:21 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > Can you try with > > bgwriter_lru_maxpages = 0 > > So we can see if the bgwriter has any hand in this? I will. I'm currently running tests with less concurrent clients (16) with exactly the same results: 64M 4213.314902 256M 4012.7

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Guillaume Smet
On Dec 26, 2007 12:06 PM, Cédric Villemain <[EMAIL PROTECTED]> wrote: > Which kernel do you have ? Kernel of the distro. So a RH flavoured 2.6.18. -- Guillaume

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Simon Riggs
On Wed, 2007-12-26 at 01:06 +0100, Guillaume Smet wrote: > I lowered the number of concurrent clients to 50 because 100 is quite > high and I obtain the same sort of results: > shared_buffers=32MB: 1869 tps > shared_buffers=64MB: 1844 tps > shared_buffers=512MB: 1676 tps > shared_buffers=1024MB: 1

Re: [PERFORM] More shared buffers causes lower performances

2007-12-26 Thread Cédric Villemain
Guillaume Smet a écrit : Hi all, I'm currently benchmarking the new PostgreSQL server of one of our customers with PostgreSQL 8.3 beta4. I have more or less the same configuration Stefan tested in his blog [1]: - Dell 2900 with two brand new X5365 processors (quad core 3.0 GHz), 16 GB of memory