I believe I have the right way of going about it. I agree with the kernel developers who've stated that ZFS is a layering violation, and we can do better.
Consider a filesystem on a set of drives; it may want some data to be in a raid5 and some to be mirrored. The correct interface for the filesystem is to have two or more opaque volumes, which it may be able to resize depending on usage. Basically, the order we want is fs -> raid -> lvm. Given a set of identical drives, we want LVM to handle them separately and divide them up into LVs identically; then corresponding LVs are raided together. We might have a raid5 volume and a raid1 volume, each of which can be resized independently. The FS then makes use of both of them, putting metadata and frequently used data on the raid1 and everything else on the raid5 (btrfs, as I understand it, will be able to do this sort of thing eventually). The trouble is it's completely impractical with current tools; we need tighter integration between md and LVM. Basically, we need a new type of VG; you'd only be able to make it out of PVs that are the same size (or close). Then, when you create an LV you decide what kind of redundancy you want; the LVM internally creates multiple LVs and raids them together to make the LV the fs sees. This could, sort of, be done in userspace, but I think it'll work better if dm and md are better integrated in the kernel. I thought i'd throw out my ideas before I get too far in. Thoughts? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/