Re: [PERFORM] Planning a new server - help needed

2008-03-30 Thread Guy Rouillier

PFC wrote:



Why do you claim that 'More platters also means slower seeks
and generally slower performance.'?


More platters -> more heads -> heavier head assembly -> slower seek 
time

But..
More platters -> higher density -> less seek distance (in mm of head 
movement) -> faster seek time


More platters means more tracks under the read heads at a time, so 
generally *better* performance.  All other things (like rotational 
speed) being equal, of course.


--
Guy Rouillier

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread James Mansion

Greg Smith wrote:
As for SCSI vs. SATA, I collected up the usual arguments on both sides 
at http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks



Why do you claim that 'More platters also means slower seeks
and generally slower performance.'?

On the face of it, it should mean that the number of track step
operations is reduced, even if the drive doesn't buffer a slice
of tracks across all platters (which would help if it did).

I'm not entirely sure why the extra platters should really count
as more moving parts since I think the platter assembly and
head assembly are both single parts in effect, albeit they will
be more massive with more platters. I'm not sure how much
extra bearing friction that will mean, but it is reasonable that
some extra energy is going to be needed.

Recent figures I've seen suggest that the increased storage
density per platter, plus the extra number of platters, means
that the streaming speed of good 7200rpm SATA drives is
very close to that of good 15000rpm SAS drives - and you
can choose which bit of the disk to use to reduce seek time and
maximise transfer rate with the oversize drives. You can
get about 100MB/s out of both technologies, streaming.

It may be worth considering an alternative approach. I suspect
that a good RAID1 or RAID1+0 is worthwhile for WAL, but
you might consider a RAID1 of a pair of SSDs for data. They
will use a lot of your budget, but the seek time is negligible so the
effective usable performance is higher than you get with
spinning disks - so you might trade a fancy controller with
battery-backed cache for straight SSD.

I haven't done this, so YMMV.  But the prices are getting
interesting for OLTP where most disks are massively
oversized. The latest Samsung and SanDisk are expensive
in the UK but the Transcend 16GB TS16GSSD25S-S  SATA
is about $300 equiv - it can do 'only' 'up to' 28MB/s write and
you wouldn't want to put WAL on one, but sustaining
15-20MB/s through random access to a real disk isn't
trivial. If average access is 10ms, and you write 100MB/s
streaming, then you have to ask yourself if you're going to do
80 or more seeks a second.
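To put numbers on that trade-off, here is a rough model (figures are illustrative, not from any particular drive, and `effective_mb_per_s` is just a name made up for this sketch):

```python
# Rough model of a spinning disk: each random write of `block_kb` KB costs
# one average access (seek + rotation) plus the streaming transfer time.
def effective_mb_per_s(block_kb, seek_ms=10.0, stream_mb_s=100.0):
    seek_s = seek_ms / 1000.0
    transfer_s = (block_kb / 1024.0) / stream_mb_s
    return (block_kb / 1024.0) / (seek_s + transfer_s)

# Small random writes are dominated by the 10ms access; large ones
# approach the streaming rate.
for kb in (8, 64, 1024):
    print(kb, round(effective_mb_per_s(kb), 2))
```

With 8KB blocks (a PostgreSQL page) that works out to well under 1MB/s, which is why sustaining even 15-20MB/s of genuinely random I/O on a real disk isn't trivial.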

James








Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread PFC



Why do you claim that 'More platters also means slower seeks
and generally slower performance.'?


More platters -> more heads -> heavier head assembly -> slower seek time
But..
	More platters -> higher density -> less seek distance (in mm of head  
movement) -> faster seek time


	As usual, no clear-cut case, a real-life test would tell more interesting  
things.
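One way to run such a real-life test is to time random reads against a large file or raw device. A minimal sketch (the path and sample count are placeholders, and the file must be far larger than the OS page cache, or you will measure RAM rather than the disk):

```python
import os
import random
import time

def mean_random_read_s(path, samples=100, block=8192):
    """Average wall-clock time of `samples` random block reads from `path`."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0.0
        for _ in range(samples):
            offset = random.randrange(0, max(1, size - block))
            start = time.perf_counter()
            os.lseek(fd, offset, os.SEEK_SET)
            os.read(fd, block)
            total += time.perf_counter() - start
    finally:
        os.close(fd)
    return total / samples
```

On a cached file this reports microseconds; on a cold spinning disk it should land near the drive's advertised average access time.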



I'm not entirely sure why the extra platters should really count
as more moving parts since I think the platter assembly and
head assembly are both single parts in effect, albeit they will
be more massive with more platters. I'm not sure how much
extra bearing friction that will mean, but it is reasonable that
some extra energy is going to be needed.


	Since the bearings are only on one side of the axle (not both), a heavier  
platter assembly would put more stress on the bearing if the disk is  
subject to vibrations (like, all those RAID disks seeking together) which  
would perhaps shorten its life. Everything with conditionals of course ;)
	I remember reading a paper on vibration from many RAID disks somewhere a  
year or so ago, vibration from other disks seeking at the exact same time  
and in the same direction would cause resonances in the housing chassis  
and disturb the heads of disks, slightly worsening seek times and  
reliability. But, on the other hand, the 7 disks raided in my home storage  
server never complained, even though the $30 computer case vibrates all  
over the place when they seek. Perhaps if they were subject to 24/7 heavy  
torture, a heavier/better damped chassis would be a good investment.



It may be worth considering an alternative approach. I suspect
that a good RAID1 or RAID1+0 is worthwhile for WAL, but


	Actually, now that 8.3 can sync to disk every second instead of at every  
commit, I wonder, did someone do some enlightening benchmarks ? I remember  
benchmarking 8.2 on a forum style load and using a separate disk for WAL  
(SATA, write cache off) made a huge difference (as expected) versus one  
disk for everything (SATA, and write cache off). Postgres beat the crap  
out of MyISAM, lol.
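The back-of-envelope case for async commit (numbers below are assumptions for illustration, not measurements): with write cache off, each synchronous commit waits for the platter, so the commit rate is capped near once per rotation, while flushing once per interval lets every commit that arrived in the meantime ride the same write.

```python
# Per-commit fsync on a 7200rpm disk with write cache off: roughly one
# commit per rotation is the ceiling.
rpm = 7200
rotation_s = 60.0 / rpm
max_sync_commits_per_s = 1.0 / rotation_s      # ~120/s

# Async commit: one WAL flush per interval batches everything that arrived.
flush_interval_s = 0.2            # assumed flush interval, for illustration
offered_commits_per_s = 5000      # assumed offered load
commits_per_flush = offered_commits_per_s * flush_interval_s

print(round(max_sync_commits_per_s), int(commits_per_flush))
```

So the flush cost is amortized over however many commits arrive per interval, at the price of possibly losing the last interval's worth of commits on a crash.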
	Seems like Postgres is one of the rare apps which gets faster and meaner  
with every release, instead of getting slower and more bloated like  
everyone else.


	Also, there is a thing called write barriers, which supposedly could be  
used to implement fsync-like behaviour without the penalty, if the disk,  
the OS, the controller, and the filesystem support it (that's a lot of  
ifs)...



I haven't done this, so YMMV.  But the prices are getting
interesting for OLTP where most disks are massively
oversized. The latest Samsung and SanDisk are expensive
in the UK but the Transcend 16GB TS16GSSD25S-S  SATA
is about $300 equiv - it can do 'only' 'up to' 28MB/s write and


	Gigabyte should revamp their i-RAM to use ECC RAM of a larger capacity...  
and longer lasting battery backup...
	I wonder, how many write cycles those Flash drives can take before  
reliability becomes a problem...







Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread James Mansion

PFC wrote:



Why do you claim that 'More platters also means slower seeks
and generally slower performance.'?


More platters -> more heads -> heavier head assembly -> slower seek time

Not sure I've seen a lot of evidence of that in drive specifications!

Gigabyte should revamp their i-RAM to use ECC RAM of a larger 
capacity... and longer lasting battery backup...
You would think a decent capacitor or rechargeable button battery would 
be enough to dump it to flash memory.

No problem with flash wear then.
I wonder, how many write cycles those Flash drives can take before 
reliability becomes a problem...
Hard to get data, isn't it? I believe it's hundreds of thousands to 
millions now. Each record in most OLTP
tables is rewritten a few times, unless it's stuff that can go into temp 
tables etc., which should be elsewhere.

Index pages clearly get rewritten often.

I suspect a mix of storage technologies will be handy for some time yet 
- WAL on disk, and temp tables on disk with no synchronous fsync 
requirement.

I think life is about to get interesting in DBMS storage.  All good for 
us users.


James




Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread Greg Smith

On Sat, 29 Mar 2008, PFC wrote:


Why do you claim that 'More platters also means slower seeks
and generally slower performance.'?
	More platters -> more heads -> heavier head assembly -> slower seek 
time


I recall seeing many designs with more platters that have slower seek 
times in benchmarks, and I always presumed this was the reason.  That's 
the basis for that comment.  I'll disclaim that section a bit.


Actually, now that 8.3 can sync to disk every second instead of at every 
commit, I wonder, did someone do some enlightening benchmarks ?


I've seen some really heavy workloads where using async commit helped 
group commits into larger batches usefully, but I personally haven't found 
it to be all that useful if you've already got a caching controller to 
accelerate writes on the kinds of hardware most people have.  It's a great 
solution for situations without a usable write cache though.


Also, there is a thing called write barriers, which supposedly could be 
used to implement fsync-like behaviour without the penalty, if the disk, 
the OS, the controller, and the filesystem support it (that's a lot of 
ifs)...


The database can't use fsync-like behavior for the things it calls fsync 
for; it needs the full semantics.  You're either doing the full operation, 
or you're cheating and it doesn't do what it's supposed to.  Write 
barriers aren't any improvement over a good direct I/O sync write setup 
for the WAL.  There may be some limited value to that approach for the 
database writes at checkpoint time, but again there's a real fsync coming 
at the end of that and it's not satisfied until everything is on disk (or 
in a good disk controller cache).


Gigabyte should revamp their i-RAM to use ECC RAM of a larger 
capacity... and longer lasting battery backup...


I saw a rumor somewhere that they were close to having a new version of 
that using DDR2 ready, which would make it pretty easy to have 8GB on 
there.


I wonder, how many write cycles those Flash drives can take before 
reliability becomes a problem...


The newer SSD drives with good write leveling should last at least as long 
as you'd expect a mechanical drive to, even in a WAL application.  Lesser 
grades of flash used as disk could be a problem though.
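A rough way to bound that worry: with good wear leveling, lifetime scales as capacity times per-cell endurance divided by write rate. All figures below are assumptions for illustration, not specs of any particular drive:

```python
# Assumed: 16GB of flash, 100k program/erase cycles per cell, perfect wear
# leveling, and a sustained 20MB/s of writes spread over the whole device.
capacity_mb = 16 * 1024
pe_cycles = 100_000
write_mb_s = 20.0

total_writable_mb = capacity_mb * pe_cycles
lifetime_s = total_writable_mb / write_mb_s
lifetime_years = lifetime_s / (3600 * 24 * 365)
print(round(lifetime_years, 1))
```

That's a couple of years of continuous worst-case writing, comparable to what you'd plan for a hard-working mechanical drive, and real workloads rarely sustain anywhere near that write rate.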


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Planning a new server - help needed

2008-03-29 Thread Craig Ringer
Greg Smith wrote:

 Gigabyte should revamp their i-RAM to use ECC RAM of a larger
 capacity... and longer lasting battery backup...
 
 I saw a rumor somewhere that they were close to having a new version of
 that using DDR2 ready, which would make it pretty easy to have 8GB on
 there.

I'm hoping - probably in vain - that they'll also include a CF/SD slot
or some onboard flash so it can dump its contents to flash using the
battery backup.

For anybody wondering what the devices in question are, see:

http://www.anandtech.com/storage/showdoc.aspx?i=2480
http://www.gigabyte.com.tw/Products/Storage/Products_Overview.aspx?ProductID=2180

--
Craig Ringer



[PERFORM] Planning a new server - help needed

2008-03-28 Thread Laszlo Nagy

Hello,

I need to install a new server for postgresql 8.3. It will run two 
databases, a web server and some background programs. We already have a 
server but it is becoming slow and we would like to have something that 
is faster. It is a cost-sensitive application, and I would like to get 
your opinion on some questions.


The database itself is an OLTP system. There are many smaller tables, 
and some bigger ones (biggest table with 1.2 million records, table size 
966MB, indexes size 790MB). In the bigger tables there are only a few 
records updated frequently, most of the other records are not changed. 
The smaller tables are updated continuously.


Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs 
say that it is better to use FreeBSD because it can alter the I/O 
priority of processes dynamically. The latest legacy release is 6.3 
which is probably more stable. However, folks say that 7.0 has superior 
performance on the same hardware. Can I use 7.0 on a production server?


Question 2. SCSI or SATA? I plan to buy a RocketRAID 3520 controller 
with 8 SATA 2 disks. The operating system would be on another disk pair, 
connected to the motherboard's controller. I wonder if I can get more 
performance with SCSI, for the same amount of money? (I can spend about 
$1500 on the controller and the disks, that would cover 10 SATA 2 disks 
and the controller.)


Question 3. FreeBSD 7.0 can use the ZFS file system. I suspect that UFS 
2 + soft updates will be better, but I'm not sure. Which is better?


Question 4. How to make the partitions? This is the hardest question. 
Here is my plan:


- the OS resides on 2 disks, RAID 1
- the databases should go on 8 disks, RAID 0 + 1

However, the transaction log file should be on a separate disk and maybe 
I could gain more performance by putting indexes on a separate drive, 
but I do not want to reduce the number of disks in the RAID 0+1 array. 
Should I put indexes and transaction log on the RAID 1 array? Or should 
I invest a bit more money, add a SATA RAID controller with 16 channels 
and add more disks? Would it pay off? Another alternative is to put 
the biggest tables on a separate array so that it will be faster when we 
join these tables with other tables.


I know that it is hard to answer without knowing the structure of the 
databases. :-( I can make tests with different configurations later, but 
I would like to know your opinion first - what should I try?


Thanks,

  Laszlo




Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Claus Guttesen
  I need to install a new server for postgresql 8.3. It will run two
  databases, web server and some background programs. We already have a
  server but it is becoming slow and we would like to have something that
  is faster. It is a cost-sensitive application, and I would like to get
  your opinion on some questions.

  The database itself is an OLTP system. There are many smaller tables,
  and some bigger ones (biggest table with 1.2 million records, table size
  966MB, indexes size 790MB). In the bigger tables there are only a few
  records updated frequently, most of the other records are not changed.
  The smaller tables are updated continuously.

  Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs
  say that it is better to use FreeBSD because it can alter the I/O
  priority of processes dynamically. The latest legacy release is 6.3
  which is probably more stable. However, folks say that 7.0 has superior
  performance on the same hardware. Can I use 7.0 on a production server?

I guess you mean postgresql 8.3.1? :-)

I use FreeBSD 7 release on a 8-way HP DL360 G5 with a ciss controller.
Works out of the box and I haven't had any issue with 7.0 at all.

  Question 2. SCSI or SATA? I plan to buy a RocketRAID 3520 controller
  with 8 SATA 2 disks. The operating system would be on another disk pair,
  connected to the motherboard's controller. I wonder if I can get more
  performance with SCSI, for the same amount of money? (I can spend about
  $1500 on the controller and the disks, that would cover 10 SATA 2 disks
  and the controller.)

SAS would probably be the way to go. I haven't tried the
rocketraid-controller. I use the built-in p400i-controller on my
servers using the ciss-driver. I've heard many positive remarks about
areca.

  Question 3. FreeBSD 7.0 can use the ZFS file system. I suspect that UFS
  2 + soft updates will be better, but I'm not sure. Which is better?

I'd stick with ufs2 atm. There are some issues with zfs which probably
have been ironed out by now but ufs2 has been deployed for a longer
time. Performance-wise they are about the same.

  Question 4. How to make the partitions? This is the hardest question.
  Here is my plan:

  - the OS resides on 2 disks, RAID 1
  - the databases should go on 8 disks, RAID 0 + 1

If you have enough disks, raid-6 should perform almost as well as raid
1+0. I've setup 11 disks in raid-6 plus one hotspare so I can get more
space out of it. "Enough disks" means approx. eight and up.
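The usable-capacity side of that trade-off is easy to quantify (equal-size disks assumed, hot spares not counted; this ignores the write-performance differences between the levels):

```python
def usable_disks(n, level):
    """Data-bearing disks left after redundancy, for equal-size drives."""
    if level == "raid10":
        return n // 2      # half the disks hold mirror copies
    if level == "raid6":
        return n - 2       # two disks' worth of parity
    if level == "raid5":
        return n - 1       # one disk's worth of parity
    raise ValueError("unknown RAID level: %s" % level)

for n in (8, 11, 12):
    print(n, usable_disks(n, "raid10"), usable_disks(n, "raid6"))
```

So 11 disks in raid-6 give 9 disks' worth of space where raid 1+0 would give 5; the more spindles, the bigger that gap.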

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare



Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Reko Turja
Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg 
docs say that it is better to use FreeBSD because it can alter the 
I/O priority of processes dynamically. The latest legacy release is 
6.3 which is probably more stable. However, folks say that 7.0 has 
superior performance on the same hardware. Can I use 7.0 on a 
production server?


FreeBSD 7.x is pretty stable, and it has the advantage of having the 
new ULE scheduler and other things that can't be MFC'd to the 6.x 
branch. And as a long-time FreeBSD enthusiast - with Cisco sponsoring 
DTrace, Nokia sponsoring scheduler development, etc. - 7.x is 
definitely in my opinion now the branch to install and start following, 
for ease of upgrading later. Of course, as always, check that your 
intended hardware is supported.


ULE, which is pretty much the key to the performance boost in the 7.x 
branch, isn't yet the default scheduler, but will be in 7.1 and 
afterwards. This means you have to roll a custom kernel if you want to 
use ULE.


Question 3. FreeBSD 7.0 can use the ZFS file system. I suspect that 
UFS 2 + soft updates will be better, but I'm not sure. Which is 
better?


For now I'd choose between UFS+gjournal or plain UFS, although with 
bigger disks journaling is a boon compared to fsck'ing a plain UFS 
partition. ZFS isn't yet ready for production I think, but it seems to 
be getting there. This opinion is based on bug reports and discussions 
in the stable and current mailing lists, not on personal testing though. 
My experiences with gjournal have been positive so far.


On the drives and controller - I'm not sure whether SCSI/SAS will give 
any noticeable boost over SATA, but based on personal experience SCSI 
is still ahead in terms of drive reliability. Whatever technology I 
chose, for a production server getting a decent battery-backed 
controller would be the start. And of course a controller that does the 
RAID in hardware.


-Reko 





Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Laszlo Nagy



I guess you mean postgresql 8.3.1? :-)
  

Yep. Sorry.

 Question 3. FreeBSD 7.0 can use the ZFS file system. I suspect that UFS
 2 + soft updates will be better, but I'm not sure. Which is better?



I'd stick with ufs2 atm. There are some issues with zfs which probably
have been ironed out by now but ufs2 has been deployed for a longer
time. Performance-wise they are about the same.
  
Thank you. I suspected the same but it was good to get positive 
confirmation.

 Question 4. How to make the partitions? This is the hardest question.
 Here is my plan:

 - the OS resides on 2 disks, RAID 1
 - the databases should go on 8 disks, RAID 0 + 1



If you have enough disks raid-6 should perform almost as good as raid
1+0. 
Hmm, I have heard that RAID 1 or RAID 1 + 0 should be used for 
databases,  never RAID 5. I know nothing about RAID 6. I guess I must 
accept your suggestion since you have more experience than I have. :-) 
Obviously, it would be easier to manage a single RAID 6 array.

I've setup 11 disks in raid-6 plus one hotspare so I can get more
space out of it. Enough disks are approx. eight and up.
  
The RAID controller that I have selected can only handle 8 disks. I 
guess I need to find a different one with 16 channels and use more 
disks. So are you saying that with all disks in a bigger RAID 6 array, I 
will get the most out of the hardware? In that case, I'll try to get a 
bit more money from the management and build RAID 6 with 12 disks.


I also feel that I need to use a separate RAID 1 array (I prefer 
gmirror) for the base system.


Thanks,

  Laszlo




Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Greg Smith

On Fri, 28 Mar 2008, Laszlo Nagy wrote:

We already have a server but it is becoming slow and we would like to 
have something that is faster.


What's it slow at?  Have you identified the bottlenecks and current 
sources of sluggish behavior?  That sort of thing is much more informative 
to look into in regards to redesigning for new hardware than trivia like 
disk layout.  For all we know you're CPU bound.


The database itself is an OLTP system. There are many smaller tables, and 
some bigger ones (biggest table with 1.2 million records, table size 966MB, 
indexes size 790MB).


The total database size is the interesting number you left out here.  And 
you didn't mention how much RAM either.  That ratio has a lot of impact on 
how hard you'll push the disks.


Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs say 
that it is better to use FreeBSD because it can alter the I/O priority of 
processes dynamically.


You shouldn't make an OS decision based on a technical detail that small. 
I won't knock FreeBSD because it's a completely reasonable choice, but 
there's no credible evidence it's a better performer for the workload you 
expect than, say, Linux or even Solaris x64.  (The benchmarks the FreeBSD 
team posted as part of their 7.0 fanfare are not representative of real 
PostgreSQL performance, and are read-only as well).


All the reasonable OS choices here are close enough to one another (as 
long as you get FreeBSD 7, earlier versions are really slow) that you 
should be thinking in terms of reliability, support, and features rather 
than viewing this from a narrow performance perspective.  There's nothing 
about what you've described that sounds like it needs bleeding-edge 
performance to achieve. For reliability, I first look at how good the disk 
controller and its matching driver in the OS used is, which brings us to:


Question 2. SCSI or SATA? I plan to buy a RocketRAID 3520 controller with 8 
SATA 2 disks. The operating system would be on another disk pair, connected 
to the motherboard's controller. I wonder if I can get more performance with 
SCSI, for the same amount of money? (I can spend about $1500 on the 
controller and the disks, that would cover 10 SATA 2 disks and the 
controller.)


Highpoint has traditionally made disk controllers that were garbage.  The 
3520 is from a relatively new series of products from them, and it seems 
like a reasonable unit.  However:  do you want to be deploying your 
system on a new card with zero track record for reliability, and from a 
company that has never done a good job before?  I can't think of any 
reason at all why you should take that risk.


The standard SATA RAID controller choices people suggest here are 3ware, 
Areca, and LSI Logic.  Again, unless you're really pushing what the 
hardware is capable of, these are all close to each other performance-wise 
(see http://femme.tweakblogs.net/blog/196/highpoint-rocketraid-3220.html 
for something that includes the Highpoint card).  You should be thinking in 
terms of known reliability and stability when you select a database 
controller card, and Highpoint isn't even on the list of vendors to 
consider yet by those standards.


As for SCSI vs. SATA, I collected up the usual arguments on both sides at 
http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks


However, the transaction log file should be on a separate disk and maybe I 
could gain more performance by putting indexes on a separate drive, but I do 
not want to reduce the number of disks in the RAID 0+1 array.


If you're looking at 8+ disks and have a caching controller with a battery 
backup, which appears to be your target configuration, there's little reason 
to expect a big performance improvement from splitting the transaction log 
out onto a separate disk.  As you note, doing that will reduce the spread 
of disks for the database, which may cost you more in performance than 
separate transaction logs gain.


It is worth considering creating a separate filesystem on the big array to 
hold the xlog data though, because that gives you more flexibility in 
terms of mount parameters there.  For example, you can always turn off 
atime updates on the transaction log filesystem, and in many cases the 
filesystem journal updates can be optimized more usefully (the xlog 
doesn't require them).


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Dan Harris

Laszlo Nagy wrote:


Question 4. How to make the partitions? This is the hardest question. 
Here is my plan:


- the OS resides on 2 disks, RAID 1
- the databases should go on 8 disks, RAID 0 + 1
Make sure you understand the difference between RAID 1+0 and RAID 0+1. 
I suspect you'll end up going with 1+0 instead, since it survives more 
combinations of multi-disk failures.


-Dan





Re: [PERFORM] Planning a new server - help needed

2008-03-28 Thread Weinzierl Stefan

Laszlo Nagy schrieb:
[...]
The RAID controller that I have selected can only handle 8 disks. I 
guess I need to find a different one with 16 channels and use more 
disks. So are you saying that with all disks in a bigger RAID 6 array, I 
will get the most out of the hardware? In that case, I'll try to get a 
bit more money from the management and build RAID 6 with 12 disks.


Here's a good line of SATA controllers for 4/8/12/16 disks:
http://www.tekram.com/product2/product_detail.asp?pid=51

Stefan
