> 
> as you suggest bcache and flashcache seem to offer a way around this for
> mdadm but i've never used either of them - i was already using zfs by
> the time they became available. i don't think the SATA interface speed
> is a deal-breaker for them because the only way around that is spending
> huge amounts of money.
> 

There were some not-too-expensive battery-backed PCI ramdisks available a while 
ago. Not anymore, though.

> 
> > . Battery backed write cache. Bcache/flashcache offer this but they
> > have their shortcomings, in particular that most available cache
> > modules are still on top of the SATA channel.
> 
> this is the only real advantage of hardware raid over mdadm.  IMO, ZFS's
> ability to use an SSD or other fast block device as cache completely
> eliminates this last remaining superiority of hardware raid over
> software raid.

Yes I've now been enlightened on this subject :)

> > . Online resize/reconfigure
> 
> both btrfs and zfs offer this.
> 

Can it seamlessly continue over a reboot? Obviously it can't make progress while 
the system is rebooting, the way a hardware raid can, but I'd hope it would pick 
up where it left off automatically.

> > . BIOS boot support (see recent thread "RAID, again" by me)
> 
> this is a misfeature of a crappy BIOS rather than a fault with software
> raid.
> 
> any decent BIOS not only has the ability to choose which disk to boot
> from (rather than hard-code it to only boot from whichever disk is
> plugged into the first disk port) but will also let you specify a boot
> order so that it will try disk 1 followed by disk 2 and then disk 3 or
> whatever. they'll also typically let you press F2 or F12 or whatever at
> boot time to pop up a boot device selection menu.
> 
> even server motherboards like supermicro let you choose the boot device
> and have a boot menu option accessible over IPMI.
> 

This is where a lot of people get this wrong. Once the BIOS has succeeded in 
reading the bootsector from a boot disk it's committed. If the bootsector reads 
okay (even after a long time on a failing disk) but anything between the 
bootsector and the OS fails, your boot has failed. This 'anything between' 
includes the grub bootstrap, xen hypervisor, linux kernel, and initramfs, so 
it's a substantial amount of data to read from a disk that may be on its last 
legs. A good hardware RAID will have long since failed the disk by this point 
and booting will succeed.

My last remaining reservation about going ahead with some testing: is there an 
equivalent of clvm for zfs? Or is that even the right approach for zfs? My main 
server cluster is:

. 2 machines, each running 2 x 2TB disks with DRBD, with the primary exporting 
  the whole disk as an iSCSI volume
. 2 machines, each importing the iSCSI volume, running lvm (clvm) on top and 
  using the LVs as backing stores for xen VMs

How would this best be done using zfs?

Thanks

James

_______________________________________________
luv-main mailing list
[email protected]
http://lists.luv.asn.au/listinfo/luv-main
