Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
On Thu, Feb 13, 2014 at 11:32:03AM -0500, Jim Salter wrote:
> That is FANTASTIC news. Thank you for wielding the LART gently. =)

No LART necessary. :) Nobody knows everything, and it's not a
particularly heavily-documented or written-about feature at the moment
(mostly because it only exists in Chris's local git repo).

> I do a fair amount of public speaking and writing about next-gen
> filesystems (example:
> http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/)
> and I will be VERY sure to talk about the upcoming divorce of stripe
> size from array size in future presentations. This makes me positively
> giddy.
>
> FWIW, after writing the above article I got contacted by a proprietary
> storage vendor who wanted to tell me all about his midmarket/enterprise
> product, and he was pretty audibly flummoxed when I explained how
> btrfs-RAID1 distributes data and redundancy - his product does
> something similar (to be fair, his product also does a lot of other
> things btrfs doesn't inherently do, like clustered storage and
> synchronous dedup), and he had no idea that anything freely available
> did anything vaguely like it.

That's quite entertaining for the bogglement factor. Although, again,
see my comment above...

> I have a feeling the storage world - even the relatively well-informed
> part of it that's aware of ZFS - has little to no inkling of how
> gigantic a splash btrfs is going to make when it truly hits the
> mainstream.

   Hugo.

>>> This could be a pretty powerful setup IMO - if you implemented
>>> something like this, you'd be able to arbitrarily define your storage
>>> efficiency (percentage of parity blocks / data blocks) and your
>>> fault-tolerance level (how many drives you can afford to lose before
>>> failure) WITHOUT tying it directly to your underlying disks, or
>>> necessarily needing to rebalance as you add more disks to the array.
>>> This would be a heck of a lot more flexible than ZFS' approach of
>>> adding more immutable vdevs.
>>>
>>> Please feel free to tell me why I'm dumb for either 1. not realizing
>>> the obvious flaw in this idea or 2. not realizing it's already being
>>> worked on in exactly this fashion. =)
>>
>> The latter. :)

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Nothing right in my left brain. Nothing left in my right brain. ---
Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
Hi Jim,

On 02/13/2014 05:13 PM, Jim Salter wrote:
> This might be a stupid question but...

There are no stupid questions, only stupid answers...

> Are there any plans to make parity RAID levels in btrfs similar to the
> current implementation of btrfs-raid1?
>
> It took me a while to realize how different and powerful btrfs-raid1
> is from traditional raid1. The ability to string together virtually
> any combination of mutt hard drives in arbitrary ways and yet maintain
> redundancy is POWERFUL, and is seriously going to be a killer feature
> advancing btrfs adoption in small environments.
>
> The one real drawback to btrfs-raid1 is that you're committed to n/2
> storage efficiency, since you're using pure redundancy rather than
> parity on the array. I was thinking about that this morning, and
> suddenly it occurred to me that you ought to be able to create a
> striped parity array in much the same way as a btrfs-raid1 array.
> Let's say you have five disks, and you arbitrarily want to define a
> stripe length of four data blocks plus one parity block per stripe.

How is this different from a raid5 setup (which is supported by btrfs)?

> Right now, what you're looking at effectively amounts to a RAID3
> array, like FreeBSD used to use. But, what if we add two more disks?
> Or three more disks? Or ten more? Is there any reason we can't keep
> our stripe length of four blocks + one parity block, and just
> distribute them relatively ad-hoc in the same way btrfs-raid1
> distributes redundant data blocks across an ad-hoc array of disks?
>
> This could be a pretty powerful setup IMO - if you implemented
> something like this, you'd be able to arbitrarily define your storage
> efficiency (percentage of parity blocks / data blocks) and your
> fault-tolerance level (how many drives you can afford to lose before
> failure) WITHOUT tying it directly to your underlying disks

It may be a good idea, but what would be the advantage of using fewer
drives than the available ones for a RAID?

Regarding the fault-tolerance level, a few weeks ago there was a posting
about a kernel library which would provide a generic RAID framework
capable of several degrees of fault tolerance (raid 5, 6, 7...) [see
"[RFC v4 2/3] fs: btrfs: Extends btrfs/raid56 to support up to six
parities", 2014/1/25]. This definitely would be a big leap forward.
BTW, the raid5/raid6 support in BTRFS is only for testing purposes.
However, Chris Mason said a few weeks ago that he will work on these
issues.

> [...] necessarily needing to rebalance as you add more disks to the
> array. This would be a heck of a lot more flexible than ZFS' approach
> of adding more immutable vdevs.

There is no need to re-balance if you add more drives. The next chunk
allocation will span all the available drives anyway. A balance is only
required when you want to spread data already written across all the
drives.

Regards
Goffredo

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it)
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
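Goffredo's point about new drives being used without a re-balance can be sketched with a toy model. This is illustrative Python, not actual btrfs code, and the function name is my own invention: the allocator below mirrors each fixed-size chunk onto the two devices with the most free space, which is roughly how btrfs-raid1 picks devices for a chunk, and why a freshly added drive participates in the very next allocation.

```python
# Toy model (NOT btrfs source): raid1-style chunk allocation.
# Each chunk is mirrored onto the two devices with the most free space.
def allocate_chunks(free_space, chunk=1):
    """free_space: dict of device -> free GiB. Returns (dev, dev) placements."""
    placements = []
    while True:
        # Pick the two devices with the most free space.
        candidates = sorted(free_space, key=free_space.get, reverse=True)[:2]
        # Stop when we can no longer place a chunk on two devices.
        if len(candidates) < 2 or free_space[candidates[1]] < chunk:
            break
        for dev in candidates:
            free_space[dev] -= chunk
        placements.append(tuple(candidates))
    return placements

# Three mismatched drives (3, 2 and 1 GiB free): all 6 GiB of raw space
# still ends up usable as 3 mirrored chunks.
drives = {"sda": 3, "sdb": 2, "sdc": 1}
print(allocate_chunks(drives))  # -> [('sda', 'sdb'), ('sda', 'sdb'), ('sda', 'sdc')]
```

Because the allocator always prefers the emptiest devices, adding a new (empty) drive means it is chosen for subsequent chunks automatically, with no explicit migration step.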
Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
On Thu, Feb 13, 2014 at 09:22:07PM +0100, Goffredo Baroncelli wrote:
> Hi Jim,
>
> On 02/13/2014 05:13 PM, Jim Salter wrote:
>> Let's say you have five disks, and you arbitrarily want to define a
>> stripe length of four data blocks plus one parity block per stripe.
>
> How is this different from a raid5 setup (which is supported by
> btrfs)?

With what's above, yes, that's the current RAID-5 code.

>> Right now, what you're looking at effectively amounts to a RAID3
>> array, like FreeBSD used to use. But, what if we add two more disks?
>> Or three more disks? Or ten more? Is there any reason we can't keep
>> our stripe length of four blocks + one parity block, and just
>> distribute them relatively ad-hoc in the same way btrfs-raid1
>> distributes redundant data blocks across an ad-hoc array of disks?
>>
>> This could be a pretty powerful setup IMO - if you implemented
>> something like this, you'd be able to arbitrarily define your storage
>> efficiency (percentage of parity blocks / data blocks) and your
>> fault-tolerance level (how many drives you can afford to lose before
>> failure) WITHOUT tying it directly to your underlying disks
>
> It may be a good idea, but what would be the advantage of using fewer
> drives than the available ones for a RAID?

Performance, plus the ability to handle different sized drives. Hmm...
maybe I should add an optimise option to the space planner...

> Regarding the fault-tolerance level, a few weeks ago there was a
> posting about a kernel library which would provide a generic RAID
> framework capable of several degrees of fault tolerance (raid 5, 6,
> 7...) [see "[RFC v4 2/3] fs: btrfs: Extends btrfs/raid56 to support
> up to six parities", 2014/1/25]. This definitely would be a big leap
> forward. BTW, the raid5/raid6 support in BTRFS is only for testing
> purposes. However, Chris Mason said a few weeks ago that he will work
> on these issues.
>
>> [...] necessarily needing to rebalance as you add more disks to the
>> array. This would be a heck of a lot more flexible than ZFS' approach
>> of adding more immutable vdevs.
>
> There is no need to re-balance if you add more drives. The next chunk
> allocation will span all the available drives anyway. A balance is
> only required when you want to spread data already written across all
> the drives.

The balance opens up more usable space, unless the new device is smaller
than (some nasty function of) the remaining free space on the other
drives. It's not necessarily about spanning the data, although that's an
effect, too.

Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- It used to take a lot of talent and a certain type of upbringing ---
to be perfectly polite and have filthy manners at the same time.
Now all it needs is a computer.
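Hugo's free-space point can be made concrete with a toy calculation. The greedy model and function name below are my own (not btrfs code): a raid1 chunk needs free space on two devices at once, so adding one large empty drive to a nearly-full array only buys as much new usable space as the other drives still have free, until a balance re-spreads existing chunks and frees space on them.

```python
# Toy model (NOT btrfs source): how much MORE raid1 data fits, given
# per-device free space? Each unit of data consumes one unit on each of
# the two devices with the most space remaining.
def usable_raid1(free):
    """Greedy estimate of additional raid1-usable GiB for a free-space list."""
    free = sorted(free, reverse=True)
    usable = 0
    # A chunk needs room on TWO devices, so stop when the second-best is empty.
    while len(free) >= 2 and free[1] > 0:
        free[0] -= 1
        free[1] -= 1
        usable += 1
        free.sort(reverse=True)
    return usable

# Two 1 TiB drives nearly full (100 GiB free each), plus a new empty
# 1 TiB drive: only 200 GiB more is usable before a balance, because the
# new drive has nothing to pair with after that.
print(usable_raid1([100, 100, 1000]))  # -> 200
```

After a full balance, existing chunks are re-paired across all three drives, freeing space on the old ones and letting more of the new drive be used, which is exactly the "balance opens up more usable space" effect.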
btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
This might be a stupid question but... Are there any plans to make
parity RAID levels in btrfs similar to the current implementation of
btrfs-raid1?

It took me a while to realize how different and powerful btrfs-raid1 is
from traditional raid1. The ability to string together virtually any
combination of mutt hard drives in arbitrary ways and yet maintain
redundancy is POWERFUL, and is seriously going to be a killer feature
advancing btrfs adoption in small environments. The one real drawback to
btrfs-raid1 is that you're committed to n/2 storage efficiency, since
you're using pure redundancy rather than parity on the array.

I was thinking about that this morning, and suddenly it occurred to me
that you ought to be able to create a striped parity array in much the
same way as a btrfs-raid1 array. Let's say you have five disks, and you
arbitrarily want to define a stripe length of four data blocks plus one
parity block per stripe. Right now, what you're looking at effectively
amounts to a RAID3 array, like FreeBSD used to use. But, what if we add
two more disks? Or three more disks? Or ten more? Is there any reason we
can't keep our stripe length of four blocks + one parity block, and just
distribute them relatively ad-hoc in the same way btrfs-raid1
distributes redundant data blocks across an ad-hoc array of disks?

This could be a pretty powerful setup IMO - if you implemented something
like this, you'd be able to arbitrarily define your storage efficiency
(percentage of parity blocks / data blocks) and your fault-tolerance
level (how many drives you can afford to lose before failure) WITHOUT
tying it directly to your underlying disks, or necessarily needing to
rebalance as you add more disks to the array. This would be a heck of a
lot more flexible than ZFS' approach of adding more immutable vdevs.

Please feel free to tell me why I'm dumb for either 1. not realizing the
obvious flaw in this idea or 2. not realizing it's already being worked
on in exactly this fashion. =)
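Jim's proposal can be sketched numerically. Nothing below is btrfs code; it is a toy model of the idea that once the stripe shape (k data + m parity blocks) is fixed independently of the pool, storage efficiency and fault tolerance stay constant no matter how many drives you add, while each stripe simply lands on some k+m of them.

```python
# Toy model (NOT btrfs source): fixed-shape parity stripes rotated
# across a drive pool larger than the stripe width, raid1-chunk style.
from itertools import cycle

def place_stripes(n_drives, k_data=4, m_parity=1, n_stripes=6):
    """Round-robin placement of (k_data + m_parity)-wide stripes."""
    drives = cycle(range(n_drives))
    stripes = []
    for _ in range(n_stripes):
        members = [next(drives) for _ in range(k_data + m_parity)]
        stripes.append(members)
    return stripes

# Efficiency and fault tolerance come from the stripe shape alone:
k, m = 4, 1
print(f"efficiency = {k / (k + m):.0%}, survives {m} drive failure(s)")
# Seven drives, but each stripe still touches exactly five of them:
for stripe in place_stripes(7):
    print(stripe)
```

A real implementation would need to guarantee that the k+m blocks of one stripe never share a device (the round-robin above happens to satisfy this whenever the pool is wider than the stripe), but the accounting point stands: adding drives changes capacity, not the parity ratio.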
Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
On Thu, Feb 13, 2014 at 11:13:58AM -0500, Jim Salter wrote:
> This might be a stupid question but... Are there any plans to make
> parity RAID levels in btrfs similar to the current implementation of
> btrfs-raid1?

Yes.

> It took me a while to realize how different and powerful btrfs-raid1
> is from traditional raid1. The ability to string together virtually
> any combination of mutt hard drives in arbitrary ways and yet maintain
> redundancy is POWERFUL, and is seriously going to be a killer feature
> advancing btrfs adoption in small environments.
>
> The one real drawback to btrfs-raid1 is that you're committed to n/2
> storage efficiency, since you're using pure redundancy rather than
> parity on the array. I was thinking about that this morning, and
> suddenly it occurred to me that you ought to be able to create a
> striped parity array in much the same way as a btrfs-raid1 array.
> Let's say you have five disks, and you arbitrarily want to define a
> stripe length of four data blocks plus one parity block per stripe.
> Right now, what you're looking at effectively amounts to a RAID3
> array, like FreeBSD used to use. But, what if we add two more disks?
> Or three more disks? Or ten more? Is there any reason we can't keep
> our stripe length of four blocks + one parity block, and just
> distribute them relatively ad-hoc in the same way btrfs-raid1
> distributes redundant data blocks across an ad-hoc array of disks?

None whatsoever.

> This could be a pretty powerful setup IMO - if you implemented
> something like this, you'd be able to arbitrarily define your storage
> efficiency (percentage of parity blocks / data blocks) and your
> fault-tolerance level (how many drives you can afford to lose before
> failure) WITHOUT tying it directly to your underlying disks, or
> necessarily needing to rebalance as you add more disks to the array.
> This would be a heck of a lot more flexible than ZFS' approach of
> adding more immutable vdevs.
>
> Please feel free to tell me why I'm dumb for either 1. not realizing
> the obvious flaw in this idea or 2. not realizing it's already being
> worked on in exactly this fashion. =)

The latter. :)

One of the (many) existing problems with the parity RAID implementation
as it stands is that with large numbers of devices, it becomes quite
inefficient to write data when you (may) need to modify dozens of
devices. It's been Chris's stated intention for a while now to allow a
bound to be placed on the maximum number of devices per stripe, which
allows a degree of control over the space-yield vs. performance knob.
It's one of the reasons that the usage tool [1] has a maximum-stripes
knob on it -- so that you can see the behaviour of the FS once that
feature's in place.

Hugo.

[1] http://carfax.org.uk/btrfs-usage/

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Nothing right in my left brain. Nothing left in my right brain. ---
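The "maximum number of devices per stripe" bound Hugo describes is a two-variable trade-off, easy to see with a small calculation. The function below is a sketch of my own (the name and shape are hypothetical, not taken from the btrfs code or the usage tool): capping stripe width reduces how many devices a single full-stripe write touches, at the cost of a larger fraction of raw space going to parity.

```python
# Toy calculation (NOT btrfs source): the space-yield vs. performance
# knob of a bounded stripe width on a parity array.
def stripe_tradeoff(n_drives, max_stripe, parity=1):
    """Return (devices touched per full stripe, fraction of space for data)."""
    width = min(n_drives, max_stripe)  # stripe can't be wider than the pool
    data = width - parity              # data blocks per stripe
    efficiency = data / width          # fraction of raw space holding data
    return width, efficiency

# Ten drives, single parity, three different caps:
for cap in (3, 5, 10):
    width, eff = stripe_tradeoff(n_drives=10, max_stripe=cap)
    print(f"cap={cap:2d}: {width} devices/stripe, {eff:.0%} efficient")
```

A cap of 3 keeps writes small (67% efficient), while letting stripes span all ten drives pushes efficiency to 90% but means a single write may need to touch every device, which is the inefficiency Hugo mentions.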
Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
That is FANTASTIC news. Thank you for wielding the LART gently. =)

I do a fair amount of public speaking and writing about next-gen
filesystems (example:
http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/)
and I will be VERY sure to talk about the upcoming divorce of stripe
size from array size in future presentations. This makes me positively
giddy.

FWIW, after writing the above article I got contacted by a proprietary
storage vendor who wanted to tell me all about his midmarket/enterprise
product, and he was pretty audibly flummoxed when I explained how
btrfs-RAID1 distributes data and redundancy - his product does something
similar (to be fair, his product also does a lot of other things btrfs
doesn't inherently do, like clustered storage and synchronous dedup),
and he had no idea that anything freely available did anything vaguely
like it.

I have a feeling the storage world - even the relatively well-informed
part of it that's aware of ZFS - has little to no inkling of how
gigantic a splash btrfs is going to make when it truly hits the
mainstream.

>> This could be a pretty powerful setup IMO - if you implemented
>> something like this, you'd be able to arbitrarily define your storage
>> efficiency (percentage of parity blocks / data blocks) and your
>> fault-tolerance level (how many drives you can afford to lose before
>> failure) WITHOUT tying it directly to your underlying disks, or
>> necessarily needing to rebalance as you add more disks to the array.
>> This would be a heck of a lot more flexible than ZFS' approach of
>> adding more immutable vdevs.
>>
>> Please feel free to tell me why I'm dumb for either 1. not realizing
>> the obvious flaw in this idea or 2. not realizing it's already being
>> worked on in exactly this fashion. =)
>
> The latter. :)