Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Greg Smith
Scott Carey wrote: I'm also not sure how up to date RedHat's xfs version is -- there have been enhancements to xfs in the kernel mainline regularly for a long time. They seem to be following SGI's XFS repo quite carefully and cherry-picking bug fixes out of there, not sure of how that relates

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Scott Carey
On Mar 9, 2010, at 4:39 PM, Scott Carey wrote: > > On Mar 8, 2010, at 11:00 PM, Greg Smith wrote: > > * At least with CentOS 5.3 and their xfs version (non-Redhat, CentOS extras) > sparse random writes could almost hang a file system. They were VERY slow. > I have not tested since. > Ju

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Scott Carey
On Mar 8, 2010, at 11:00 PM, Greg Smith wrote: > Scott Carey wrote: >> For high sequential throughput, nothing is as optimized as XFS on Linux yet. >> It has weaknesses elsewhere however. >> > > I'm curious what you feel those weaknesses are. The recent addition of > XFS back into a more ma

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread david
On Tue, 9 Mar 2010, Pierre C wrote: On Tue, 09 Mar 2010 08:00:50 +0100, Greg Smith wrote: Scott Carey wrote: For high sequential throughput, nothing is as optimized as XFS on Linux yet. It has weaknesses elsewhere however. When files are extended one page at a time (as postgres does) fr

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Ing. Marcos Ortiz Valmaseda
Pierre C escribió: On Tue, 09 Mar 2010 08:00:50 +0100, Greg Smith wrote: Scott Carey wrote: For high sequential throughput, nothing is as optimized as XFS on Linux yet. It has weaknesses elsewhere however. When files are extended one page at a time (as postgres does) fragmentation can

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Michael Stone
Do keep the postgres xlog on a separate ext2 partition for best performance. Other than that, xfs is definitely a good performer. Mike Stone

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Kevin Grittner
"Pierre C" wrote: > Greg Smith wrote: >> I'm curious what you feel those weaknesses are. > > Handling lots of small files, especially deleting them, is really > slow on XFS. > Databases don't care about that. I know of at least one exception to that -- when we upgraded and got a newer versio

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-09 Thread Pierre C
On Tue, 09 Mar 2010 08:00:50 +0100, Greg Smith wrote: Scott Carey wrote: For high sequential throughput, nothing is as optimized as XFS on Linux yet. It has weaknesses elsewhere however. When files are extended one page at a time (as postgres does) fragmentation can be pretty high on

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-08 Thread Greg Smith
Scott Carey wrote: For high sequential throughput, nothing is as optimized as XFS on Linux yet. It has weaknesses elsewhere however. I'm curious what you feel those weaknesses are. The recent addition of XFS back into a more mainstream position in the RHEL kernel as of their 5.4 update

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-08 Thread Scott Carey
On Mar 2, 2010, at 2:10 PM, wrote: > On Tue, 2 Mar 2010, Scott Marlowe wrote: > >> On Tue, Mar 2, 2010 at 2:30 PM, Francisco Reyes >> wrote: >>> Scott Marlowe writes: >>> Then the real thing to compare is the speed of the drives for throughput not rpm. >>> >>> In a machine, simmil

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-08 Thread Scott Carey
On Mar 2, 2010, at 1:36 PM, Francisco Reyes wrote: > da...@lang.hm writes: > >> With sequential scans you may be better off with the large SATA drives as >> they fit more data per track and so give great sequential read rates. > > I lean more towards SAS because of writes. > One common thing w

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Scott Marlowe
On Wed, Mar 3, 2010 at 4:53 AM, Hannu Krosing wrote: > On Wed, 2010-03-03 at 10:41 +0100, Yeb Havinga wrote: >> Scott Marlowe wrote: >> > On Tue, Mar 2, 2010 at 1:51 PM, Yeb Havinga wrote: >> > >> >> With 24 drives it'll probably be the controller that is the limiting >> >> factor >> >> of bandw

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Yeb Havinga
Francisco Reyes wrote: Yeb Havinga writes: controllers. Also, I am not sure if it is wise to put the WAL on the same logical disk as the indexes, If I only have two controllers would it then be better to put WAL on the first along with all the data and the indexes on the external? Especially

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Greg Smith
Francisco Reyes wrote: Who are you using for SAS? One thing I like about 3ware is their management utility works under both FreeBSD and Linux well. 3ware has turned into a division within LSI now, so I have my doubts about their long-term viability as a separate product as well. LSI used to

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Francisco Reyes
Yeb Havinga writes: controllers. Also, I am not sure if it is wise to put the WAL on the same logical disk as the indexes, If I only have two controllers would it then be better to put WAL on the first along with all the data and the indexes on the external? Especially since the external encl

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Pierre C
With 24 drives it'll probably be the controller that is the limiting factor of bandwidth. Our HP SAN controller with 28 15K drives delivers 170MB/s at maximum with raid 0 and about 155MB/s with raid 1+0. I get about 150-200 MB/s on a linux software RAID of 3 cheap Samsung SATA 1TB dr

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Yeb Havinga
Francisco Reyes wrote: Going with a 3Ware SAS controller. Have some external enclosures with 16 15K rpm drives. They are older 15K rpms, but they should be good enough. Since the 15K rpms usually have better transactions per second I will put WAL and indexes in the external enclosure. It

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Yeb Havinga
Scott Marlowe wrote: On Tue, Mar 2, 2010 at 1:51 PM, Yeb Havinga wrote: With 24 drives it'll probably be the controller that is the limiting factor of bandwidth. Our HP SAN controller with 28 15K drives delivers 170MB/s at maximum with raid 0 and about 155MB/s with raid 1+0. So I'd go for th

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-03 Thread Yeb Havinga
Greg Smith wrote: Yeb Havinga wrote: With 24 drives it'll probably be the controller that is the limiting factor of bandwidth. Our HP SAN controller with 28 15K drives delivers 170MB/s at maximum with raid 0 and about 155MB/s with raid 1+0. You should be able to clear 1GB/s on sequential rea

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 7:44 PM, Francisco Reyes wrote: > Greg Smith writes: > >> http://www.3ware.com/KB/Article.aspx?id=15383  I consider them still a >> useful vendor for SATA controllers, but would never buy a SAS solution from >> them again until this is resolved. > > > Who are you using for S

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Greg Smith writes: http://www.3ware.com/KB/Article.aspx?id=15383 I consider them still a useful vendor for SATA controllers, but would never buy a SAS solution from them again until this is resolved. Who are you using for SAS? One thing I like about 3ware is their management utility works u

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Scott Marlowe wrote: Time to do the ESD shuffle I think. Nah, I keep the crazy drive around as an interesting test case. Fun to see what happens when I connect to a RAID card; very informative about how thorough the card's investigation of the drive is. Our 15k5 seagates have been grea

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 6:03 PM, Greg Smith wrote: > Scott Marlowe wrote: >> >> We've had REAL good luck with the WD green and black drives.  Out of >> about 35 or so drives we've had two failures in the last year, one of >> each black and green. > > I've been happy with almost all the WD Blue driv

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Scott Marlowe wrote: We've had REAL good luck with the WD green and black drives. Out of about 35 or so drives we've had two failures in the last year, one of each black and green. I've been happy with almost all the WD Blue drives around here (have about a dozen in service for around two yea

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 4:57 PM, Greg Smith wrote: > Scott Marlowe wrote: >> >> True, I just looked at the Hitachi 7200 RPM 2TB Ultrastar and it lists >> an average throughput of 134 Megabytes/second which is quite good. >> > > Yeah, but have you tracked the reliability of any of the 2TB drives ou

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Scott Marlowe wrote: True, I just looked at the Hitachi 7200 RPM 2TB Ultrastar and it lists an average throughput of 134 Megabytes/second which is quite good. Yeah, but have you tracked the reliability of any of the 2TB drives out there right now? They're terrible. I wouldn't deploy anyt

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 4:50 PM, Greg Smith wrote: > If you only have 2 or 3 connections, I can't imagine that the improved seek > times of the 15K drives will be a major driving factor.  As already > suggested, 10K drives tend to be larger and can be extremely fast on > sequential workloads, parti

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Francisco Reyes wrote: Does anyone have any experience doing analytics with postgres? In particular, whether 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price difference is $3,000. Rarely ever have more than 2 or 3 connections to the machine. So far from what I have seen throughput is

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Francisco Reyes wrote: Going with a 3Ware SAS controller. Already have similar machine in house. With RAID 1+0 Bonnie++ reports around 400MB/sec sequential read. Increase read-ahead and I'd bet you can add 50% to that easy--one area the 3Ware controllers need serious help, as they admit: htt

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Scott Marlowe writes: While 16x15k older drives doing 500Meg seems only a little slow, the 24x10k drives getting only 400MB/s seems way slow. I'd expect a RAID-10 of those to read at somewhere in or just past the gig per Talked to the vendor. The likely issue is the card. They used a single c

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
da...@lang.hm writes: what filesystem is being used. There is a thread on the linux-kernel mailing list right now showing that ext4 seems to top out at ~360MB/sec while XFS is able to go to 500MB/sec+ EXT3 on CentOS 5.4. Plan to try and see if I have time with the new machines to try FreeBSD+

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread david
On Tue, 2 Mar 2010, Scott Marlowe wrote: On Tue, Mar 2, 2010 at 2:30 PM, Francisco Reyes wrote: Scott Marlowe writes: Then the real thing to compare is the speed of the drives for throughput not rpm. In a machine, similar to what I plan to buy, already in house 24 x 10K rpm gives me about

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Scott Marlowe writes: Have you tried short stroking the drives to see how they compare then? Or is the reduced primary storage not a valid path here? No, have not tried it. By the time I got the machine we needed it in production so could not test anything. When the 2 new machines come I

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Greg Smith writes: in a RAID10, given proper read-ahead adjustment. I get over 200MB/s out of the 3-disk RAID0 on my home server without even trying hard. Can you Any links/suggested reading on "read-ahead adjustment"? I understand this may be OS specific, but any info would be helpful.
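As a hedged aside on the read-ahead question: on Linux, block-device read-ahead is commonly inspected and changed with `blockdev --getra` and `blockdev --setra`, which count in 512-byte sectors. The sketch below only does the unit conversion so candidate settings can be compared; the sector counts in the loop are illustrative values, not recommendations from this thread.

```python
SECTOR_BYTES = 512  # blockdev --getra/--setra count in 512-byte sectors


def readahead_kib(sectors: int) -> float:
    """Convert a read-ahead setting in 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


# Illustrative values only; the right setting depends on the controller,
# the array layout, and the workload, and should be benchmarked.
for sectors in (256, 1024, 4096, 16384):
    print(f"{sectors:>6} sectors = {readahead_kib(sectors):>8.0f} KiB")
```

Larger read-ahead tends to help long sequential scans and can hurt random access, so any change is worth checking against the real workload.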

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 2:30 PM, Francisco Reyes wrote: > Scott Marlowe writes: > >> Then the real thing to compare is the speed of the drives for >> throughput not rpm. > > In a machine, similar to what I plan to buy, already in house 24 x 10K rpm > gives me about 400MB/sec while 16 x 15K rpm (2

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Greg Smith writes: in a RAID10, given proper read-ahead adjustment. I get over 200MB/s out of the 3-disk RAID0 Any links/suggested reads on read-ahead adjustment? It will probably be OS dependent, but any info would be useful.

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
da...@lang.hm writes: With sequential scans you may be better off with the large SATA drives as they fit more data per track and so give great sequential read rates. I lean more towards SAS because of writes. One common thing we do is create temp tables.. so a typical pass may be: * sequential

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 1:51 PM, Yeb Havinga wrote: > With 24 drives it'll probably be the controller that is the limiting factor > of bandwidth. Our HP SAN controller with 28 15K drives delivers 170MB/s at > maximum with raid 0 and about 155MB/s with raid 1+0. So I'd go for the 10K > drives and pu

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Scott Marlowe writes: Then the real thing to compare is the speed of the drives for throughput not rpm. In a machine, similar to what I plan to buy, already in house 24 x 10K rpm gives me about 400MB/sec while 16 x 15K rpm (2 to 3 year old drives) gives me about 500MB/sec
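To put the two figures quoted above on a common footing, here is a rough sketch that divides each array's sequential rate by its drive count. The array figures come from the message; the helper name is hypothetical, and the simple division is an assumption that ignores RAID level, controller limits, and bus bandwidth.

```python
def per_drive_mb_s(array_mb_s: float, drives: int) -> float:
    """Naive per-drive share of an array's sequential throughput."""
    return array_mb_s / drives


# Figures reported in the thread: (label, array MB/s, number of drives).
arrays = [
    ("24 x 10K rpm", 400.0, 24),
    ("16 x 15K rpm", 500.0, 16),
]

for label, mb_s, drives in arrays:
    print(f"{label}: {per_drive_mb_s(mb_s, drives):.1f} MB/s per drive")
```

Both per-drive results are far below what a single drive of that era can stream sequentially, which fits the suggestion elsewhere in the thread that something other than the drives is the limit.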

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Yeb Havinga writes: With 24 drives it'll probably be the controller that is the limiting factor of bandwidth. Going with a 3Ware SAS controller. Our HP SAN controller with 28 15K drives delivers 170MB/s at maximum with raid 0 and about 155MB/s with raid 1+0. Already have similar machine

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 2:14 PM, wrote: > On Tue, 2 Mar 2010, Francisco Reyes wrote: > >> Does anyone have any experience doing analytics with postgres? In particular, whether >> 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price >> difference is $3,000. >> >> Rarely ever have more than 2

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread david
On Tue, 2 Mar 2010, Francisco Reyes wrote: Does anyone have any experience doing analytics with postgres? In particular, whether 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price difference is $3,000. Rarely ever have more than 2 or 3 connections to the machine. So far from what I h

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Scott Marlowe
On Tue, Mar 2, 2010 at 1:42 PM, Francisco Reyes wrote: > Does anyone have any experience doing analytics with postgres? In particular, whether > 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price > difference is $3,000. > > Rarely ever have more than 2 or 3 connections to the machine. > >

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Dave Crooke
Seconded. These days even a single 5400rpm SATA drive can muster almost 100MB/sec on a sequential read. The benefit of 15K rpm drives is seen when you have a lot of small, random accesses from a working set that is too big to cache; the extra rotational speed translates to an average reduc
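The rotational-speed point can be made concrete with the standard approximation that average rotational latency is the time for half a revolution. The sketch below is only that arithmetic; it leaves out seek time and transfer time, and the function name is hypothetical.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


for rpm in (5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Going from 10K to 15K rpm takes the average rotational wait from about 3 ms to about 2 ms per random access. That matters for many small random reads and is negligible for long sequential scans.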

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Greg Smith
Yeb Havinga wrote: With 24 drives it'll probably be the controller that is the limiting factor of bandwidth. Our HP SAN controller with 28 15K drives delivers 170MB/s at maximum with raid 0 and about 155MB/s with raid 1+0. You should be able to clear 1GB/s on sequential reads with 28 15K driv

Re: [PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Yeb Havinga
Francisco Reyes wrote: Does anyone have any experience doing analytics with postgres? In particular, whether 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price difference is $3,000. Rarely ever have more than 2 or 3 connections to the machine. So far from what I have seen throughput i

[PERFORM] 10K vs 15k rpm for analytics

2010-03-02 Thread Francisco Reyes
Does anyone have any experience doing analytics with postgres? In particular, whether 10K rpm drives are good enough vs using 15K rpm, over 24 drives. Price difference is $3,000. Rarely ever have more than 2 or 3 connections to the machine. So far from what I have seen throughput is more important than TP