On Wed, Dec 14, 2011 at 10:00 AM, Scott Carey <sc...@richrelevance.com> wrote:
>> As of today, there is no option except to use NFS. And as you yourself
>> mention, the first HA prototype when it comes out will require NFS.
>
> How will it 'require' NFS? Won't any 'remote, high availability storage'
> work? NFS is unreliable in my experience unless: ...
>
> A solution with a brief 'stall' in service while a SAN mount switched over,
> or similar with drbd, should be possible and data safe. If this is being
> built to truly 'require' NFS, that is no better for me than the current
> situation, which we manage using OS level tools for failover that will
> temporarily break clients but resume availability quickly thereafter.
> Where I would like the most help from hadoop is in making the failover
> transparent to clients, not in solving the reliable storage problem or
> the failover scenarios that storage and OS vendors already do.
Currently our requirement is that we can have two client machines "mount" the
storage, though only one needs to have it mounted rw at a time. This is
certainly doable with DRBD in conjunction with a clustered filesystem like
GPFS2. I believe Dhruba was doing some experimentation with an approach like
this.

It's not currently provided for, but it wouldn't be very difficult to extend
the design so that the standby didn't even need read access until the failover
event. It would just cause a longer failover period, since the standby would
have more edits to "catch up" with, etc.

I don't think anyone's currently working on this, but if you wanted to
contribute I can point you in the right direction. If you happen to be at the
SF HUG tonight, grab me and I'll give you the rundown on what would be needed.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
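[A minimal sketch of the "cold standby" trade-off discussed above, for anyone
following the thread. This is NOT HDFS code; the edit-log representation and
class names are hypothetical. It just simulates a standby that reads nothing
from shared storage until failover, at which point it must replay the entire
edit backlog before it can serve, which is why failover time grows with the
backlog size.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not HDFS internals: a "cold" standby that
// ignores the shared edit log until failover, then replays the whole
// backlog to catch up before becoming active.
public class ColdStandbySketch {
    // Stand-in for the shared edit log on NFS/DRBD-backed storage:
    // each entry is {operation, path}.
    static final List<String[]> sharedEditLog = new ArrayList<>();

    // The active node appends edits as it mutates the namespace.
    static void logEdit(String op, String path) {
        sharedEditLog.add(new String[] {op, path});
    }

    // The standby's namespace view: empty until failover, because a
    // cold standby never reads the shared storage while passive.
    static final Map<String, Boolean> standbyNamespace = new HashMap<>();

    // On failover, replay every outstanding edit. Returns the number of
    // edits replayed; the larger the backlog, the longer the failover.
    static int failover() {
        int replayed = 0;
        for (String[] edit : sharedEditLog) {
            if (edit[0].equals("mkdir")) {
                standbyNamespace.put(edit[1], Boolean.TRUE);
            } else if (edit[0].equals("rmdir")) {
                standbyNamespace.remove(edit[1]);
            }
            replayed++;
        }
        return replayed;
    }

    public static void main(String[] args) {
        logEdit("mkdir", "/a");
        logEdit("mkdir", "/b");
        logEdit("rmdir", "/a");
        int n = failover();
        // prints: replayed=3 dirs=1
        System.out.println("replayed=" + n + " dirs=" + standbyNamespace.size());
    }
}
```

A hot standby, by contrast, would tail the edit log continuously while
passive, so at failover there is almost nothing left to replay.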