there are many ways to do clustering, and one thing i would consider a "holy grail" would be something like pvm [1], because nothing else seems to offer similar horizontal scaling of cpu at the kernel level
i would love to know the mechanism behind dell's equallogic san, as it really is clustered lvm on steroids. GFS / orangefs / ocfs are not the easiest things to set up (ocfs is the easiest of the three) and i've not found write performance to be great. DRBD is limited to 2 devices as far as i understand, so not really scalable. i'm still not convinced by the likes of hadoop for storage -- maybe i just don't have the scale to "get" it?

the thing with clusters is that you want to be able to spin an extra node up, join it to the group, and increase cpu / storage by n+1 -- but you also want to be able to spin nodes down dynamically and go down by n-1. i guess this is where hadoop is of benefit, because that is not a happy thing for a typical file system.

network load balancing is super easy -- all the info required is in each packet. application load balancing requires more thought; this is where the likes of memcached can help, but a good design of the cluster matters more: localised data, tiered access, etc.

this is kind of why i would like to see a pvm-style solution -- one where a page fault is triggered like swap, which then fetches the relevant memory from the network: bearing in mind that a computer can typically trigger thousands of page faults a second, and that local memory access is very many times faster than gigabit networking!

[1] http://www.csm.ornl.gov/pvm/pvm_home.html