Re: Moving disks from one datanode to another

Charles Earl Wed, 07 Dec 2011 10:46:40 -0800

Jeff,
Interested in how you approach the virtualization on hadoop issue. 
In particular, I would like to have a VM launched as an environment which could 
in essence mount the local data node's disk (or replica).
For my application, the users in essence want the map task running in a given 
virtualized environment, but have the task run against HDFS store.
Conceptually, it would seem that you would want each VM to have separate 
physically mounted disk?
When I've used virtual disk this has shown 30% worse performance on 
write-oriented map than physical disk mount. This was with kvm with virtio, 
simple test with randomwriter.
I wonder if you had any suggestions in that regard.
I'm actually just now compiling & testing a vm based isolation module for the 
mesos (http://www.mesosproject.org/) in the hopes that this will address the 
need.
The machine-as-rack paradigm seems quite interesting.
Charles
On Dec 7, 2011, at 1:21 PM, Jeffrey Buell wrote:


>> I found a way:
>> 
>> 1) Configure second datanode with a set of fresh empty directories.
>> 2) Start second datanode, let it register with namenode.
>> 3) Shut down first and second datanode, then move blk* and subdir dirs
>> from data dirs of first node to data dirs of second datanode.
>> 4) Start first and second datanode.
>> 
>> This seems to work as intended. However, after some thinking I came to
>> worry about the replication. HDFS will now consider the two datanode
>> instances on the same host as two different hosts, which may cause
>> replication to put two copies of the same file on the same host.
>> 
>> It's probably not going to happen very often given that there's some
>> randomness involved. And in my case there's always a third copy on
>> another rack.
>> 
>> Still, it's less than optimal. Are there any ways to fool HDFS into
>> always placing all copies on different physical hosts in this rather
>> messed up configuration?
>> 
>> Thanks,
>> \EF
> 
> 
> This is the same issue as for running multiple virtual machines on each 
> physical host.  I've found (on 0.20.2) that this gives consistently better 
> performance than a single VM or a native OS instance 
> (http://www.vmware.com/resources/techresources/10222), at least for 
> I/O-intensive apps.  I'm still investigating why, but one possibility is that 
> a datanode can't efficiently handle too many disks (I have either 10 or 12 
> per physical host).  So I'm very interested in seeing if multiple datanodes 
> has a similar performance effect as multiple VMs (each with one DN).
> 
> Back to replication:  Hadoop doesn't know that the machines it's running on 
> might share a physical host, so there is a possibility that 2 copies end up 
> on the same host.  What I'm doing now is define each host as a rack, so the 
> second copy is guaranteed to go to a different host. I have a single physical 
> rack.  I'm tempted to call physical racks "super racks" to distinguish them 
> from logical racks.  A better scheme may be to divide the physical rack into 
> 2 logical racks, so that most of the time the third copy goes on a different 
> host than the second.  I think that is the best that can be done today.  
> Ideally we want to modify the block placement algorithm to recognize another 
> level in the topology hierarchy for the multiple VM/DN case.  A simpler 
> solution would be to add an option where the third copy is placed in a third 
> rack when available (and extended to n replicas on n racks instead of random 
> placement for n>3).  This would work for the single physical rack case with 
> each host defined as a rack for the topology.  Placing replicas on separate 
> racks may be desirable for some conventional configurations also (e.g., ones 
> with good inter-rack bandwidth).
> 
> Jeff

Re: Moving disks from one datanode to another

Reply via email to