On Thu, Apr 27, 2006 at 08:57:51AM -0400, Ketema Harris wrote:
> OK.  My thought process was that having non-local storage, say a big RAID
> 5 SAN (I am talking 5 TB with expansion capability up to 10),

That's two disk trays for a cheap, slow array. (Versus a more expensive solution with more spindles and better seek performance.)

> would allow
> me to have redundancy, expandability, and hopefully still retain decent
> performance from the db.  I also would hopefully then not have to do
> periodic backups from the db server to some other type of storage.

No, backups are completely unrelated to your storage type; you need them either way. On a SAN you can use a SAN backup solution to back up multiple systems with a single backup unit without involving the host CPUs, but that's fairly useless if you aren't amortizing the cost over a large environment.

> Is this not a good idea?

It really depends on what you're hoping to get. As described, it's not clear. (I don't know what you mean by "redundancy, expandability" or "decent performance".)

> How bad of a performance hit are we talking about?

Way too many factors for an easy answer. Consider the case of NAS vs. SCSI direct-attached storage. In that case you're probably comparing a single 125MB/s (peak) gigabit Ethernet channel to (potentially several) 320MB/s (peak) SCSI channels. With a high-end NAS you might get 120MB/s off that GbE; with a (more realistic) mid-range unit you're more likely to get 40-60MB/s. Getting 200MB/s off a SCSI channel isn't a stretch, and you can fairly easily stripe across multiple SCSI channels. (You can also bond multiple GbEs, but then your cost and complexity start going way up, and you're never going to scale as well.) If you have an environment where you're doing a lot of sequential scans it isn't even a contest. You can also substitute SATA for SCSI, etc.
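To put those link speeds in perspective, here's some back-of-the-envelope arithmetic (Python, purely illustrative: the throughput figures are just the ballpark peaks mentioned above, and the 100GB table size is made up):

    # Back-of-the-envelope sequential scan times; all numbers illustrative.
    table_gb = 100  # hypothetical table size

    throughput_mb_s = {
        "mid-range NAS, one GbE":    50,
        "high-end NAS, one GbE":     120,
        "single U320 SCSI channel":  200,
        "two striped SCSI channels": 400,
    }

    for path, mb_s in throughput_mb_s.items():
        minutes = table_gb * 1024 / mb_s / 60
        print(f"{path:28s} ~{minutes:4.1f} min to scan {table_gb}GB sequentially")

The same 100GB scan goes from roughly half an hour on a mid-range NAS to under five minutes on a couple of striped SCSI channels, which is the whole point about sequential workloads.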

For a FC SAN the performance numbers are a lot better, but the costs & complexity are a lot higher. An iSCSI SAN is somewhere in the middle.

> Also, in regards to the commit data integrity, as far as the db is
> concerned, once the data is sent to the SAN or NAS isn't it "written"?
> The storage may have that write in cache, but from my reading and
> understanding of how these various storage devices work, that is how
> they keep up performance.

Depends on the configuration, but yes, most should be able to report back a "write" once the data is in a non-volatile cache. You can do the same with a direct-attached array and eliminate the latency inherent in accessing the remote storage.
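As a purely illustrative aside: whether that acknowledgement comes from a non-volatile cache or from the platters is what dominates commit latency, and it's easy to measure with a crude probe like the one below (Python; run it on the filesystem that will hold the WAL, and note the file name and iteration count are arbitrary, this is a sketch rather than a proper benchmark):

    # Crude per-fsync latency probe; run on the filesystem that will
    # hold the WAL.  File name and iteration count are arbitrary.
    import os, time

    PATH = "fsync_probe.dat"
    N = 200

    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o600)
    start = time.time()
    for _ in range(N):
        os.write(fd, b"x" * 8192)   # roughly one WAL page
        os.fsync(fd)                # force to stable storage, like a commit
    elapsed = time.time() - start
    os.close(fd)
    os.remove(PATH)

    print(f"{N / elapsed:.0f} fsyncs/sec ({elapsed / N * 1000:.2f} ms each)")

A write-back cache (on the array, or on a battery-backed direct-attached controller) will typically show fsyncs in the sub-millisecond range; writes that have to reach a bare disk, or cross a congested network to remote storage, show up immediately in that number.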

> I would expect my bottleneck, if any, to be the actual Ethernet transfer
> to the storage, and I am going to try and compensate for that with a
> full gigabit backbone.

See above.

The advantages of a NAS or SAN are in things you haven't really touched on. Is the filesystem going to be accessed by several systems? Do you need the ability to do snapshots? (You may be able to do this with direct-attach also, but doing it on a remote storage device tends to be simpler.) Do you want to share one big, expensive, reliable unit between multiple systems? Will you be doing failover? (Note that failover requires software to go with the hardware, and can be done in a different way with local storage also.)

In some environments the answers to those questions are yes, and the price premium & performance implications are well worth it. For a single DB server the answer is almost certainly "no".
Mike Stone
