Thank you again, Austin. My ideal case would be high availability coupled with reliable data replication and integrity against accidental loss. I am willing to cede ground on write speed, but reads have to be as optimized as possible. So far BTRFS RAID10 on the 32TB test server has been quite good for both reads and writes, and data loss/corruption has not been an issue yet. When I introduce the network/distributed layer, I would like to keep the same properties. BTW, does Ceph provide similar functionality, reliability and performance?
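For reference, the disk layouts I am comparing look roughly like this (device names and mount point are placeholders, not the actual test-server configuration):

    # btrfs raid10: data and metadata striped across mirrored pairs (4+ disks)
    mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # btrfs raid1 alternative (one filesystem per mirrored pair), as suggested below
    mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc

    # read-heavy workload, so skip atime updates
    mount -o noatime /dev/sdb /srv/brick1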
On Tue, Apr 26, 2016 at 6:04 AM, Austin S. Hemmelgarn <ahferro...@gmail.com> wrote:
> On 2016-04-26 07:44, Juan Alberto Cirez wrote:
>>
>> Well,
>> RAID1 offers no parity, striping, or spanning of disk space across
>> multiple disks.
>>
>> RAID10 configuration, on the other hand, requires a minimum of four
>> HDD, but it stripes data across mirrored pairs. As long as one disk in
>> each mirrored pair is functional, data can be retrieved.
>>
>> With GlusterFS as a distributed volume, the files are already spread
>> among the servers causing file I/O to be spread fairly evenly among
>> them as well, thus probably providing the benefit one might expect
>> with stripe (RAID10).
>>
>> The question I have now is: Should I use a RAID10 or RAID1 underneath
>> of a GlusterFS stripped (and possibly replicated) volume?
>
> If you have enough systems and a new enough version of GlusterFS, I'd
> suggest using raid1 on the low level, and then either a distributed
> replicated volume or an erasure coded volume in GlusterFS.
> Having more individual nodes involved will improve your scalability to
> larger numbers of clients, and you can have more nodes with the same
> number of disks if you use raid1 instead of raid10 on BTRFS. Using
> erasure coding in Gluster will provide better resiliency with higher
> node counts for each individual file, at the cost of moderately higher
> CPU time being used.
> FWIW, RAID5 and RAID6 are both specific cases of (mathematically)
> optimal erasure coding (RAID5 is n,n+1 and RAID6 is n,n+2 using the
> normal notation), but the equivalent forms in Gluster are somewhat
> risky with any decent sized cluster.
>
> It is worth noting that I would not personally trust just GlusterFS or
> just BTRFS with the data replication: BTRFS is still somewhat new
> (although I haven't had a truly broken filesystem in more than a year),
> and GlusterFS has a lot more failure modes because of the networking.
>
>>
>> On Tue, Apr 26, 2016 at 5:11 AM, Austin S. Hemmelgarn
>> <ahferro...@gmail.com> wrote:
>>>
>>> On 2016-04-26 06:50, Juan Alberto Cirez wrote:
>>>>
>>>> Thank you guys so very kindly for all your help and taking the time
>>>> to answer my question. I have been reading the wiki and online use
>>>> cases and otherwise delving deeper into the btrfs architecture.
>>>>
>>>> I am managing a 520TB storage pool spread across 16 server pods and
>>>> have tried several methods of distributed storage. Last attempt was
>>>> using Zfs as a base for the physical bricks and GlusterFS as a glue
>>>> to string together the storage pool. I was not satisfied with the
>>>> results (mainly Zfs). Once I have run btrfs for a while on the test
>>>> server (32TB, 8x 4TB HDD RAID10) I will try btrfs/ceph.
>>>
>>> For what it's worth, GlusterFS works great on top of BTRFS. I don't
>>> have any claims to usage in production, but I've done _a lot_ of
>>> testing with it because we're replacing one of our critical file
>>> servers at work with a couple of systems set up with Gluster on top
>>> of BTRFS, and I've been looking at setting up a small storage cluster
>>> at home using it on a couple of laptops I have which have
>>> non-functional displays. Based on what I've seen, it appears to be
>>> rock solid with respect to the common failure modes, provided you use
>>> something like raid1 mode on the BTRFS side of things.
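If I follow the suggestion correctly, the two Gluster-side options would be created along these lines (host names and brick paths are placeholders; I have not run these exact commands yet):

    # distributed-replicated: each file mirrored across a pair of bricks,
    # files distributed across the pairs (2 x 2 = 4 bricks here)
    gluster volume create gvol0 replica 2 \
        pod01:/data/brick1 pod02:/data/brick1 pod03:/data/brick1 pod04:/data/brick1
    gluster volume start gvol0

    # dispersed (erasure coded): 3 bricks per set, any 1 may fail (the n,n+1 case above)
    gluster volume create gvol1 disperse 3 redundancy 1 \
        pod01:/data/brick2 pod02:/data/brick2 pod03:/data/brick2
    gluster volume start gvol1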