> I have rack awareness configured and it seems to work fine. My default
> replication count is 2. Now I have lost one rack due to a switch failure.
> Here is what I observe:
>
> HDFS continues to write to the remaining available rack. It still keeps
> two copies of each block, but now both copies are being stored in the
> same rack.
>
> My questions:
>
> Is this the default HDFS behavior?
Below is from 'Hadoop: The Definitive Guide'. So, with a replication
factor of 2 or 3, the first and second replicas should be placed on
different racks. I am not sure why they are ending up in the same rack. It
makes sense to put a block with a replication factor of 2 on two different
racks (if available), considering availability.

****
Hadoop's default strategy is to place the first replica on the same node
as the client (for clients running outside the cluster, a node is chosen
at random, although the system tries not to pick nodes that are too full
or too busy). The second replica is placed on a different rack from the
first (off-rack), chosen at random. The third replica is placed on the
same rack as the second, but on a different node chosen at random.
****

Praveen

On Wed, Feb 8, 2012 at 2:47 PM, Mohamed Elsayed <
mohammed.elsay...@bibalex.org> wrote:

> On 02/07/2012 09:45 PM, Harsh J wrote:
>
>> Yes, the balancer may help. You'll also sometimes have to manually
>> re-enforce the block placement policy in the stable releases at
>> present; recovery from a policy violation is not automatic:
>>
>> hadoop fs -setrep -R 3 /
>> hadoop fs -setrep -R 2 /
>>
> When I execute the first command it goes well, but it halts on executing
> the second one. I don't know the reason, but the replication factor
> becomes 2 on all datanodes. Is that normal?
>
> --
> Mohamed Elsayed
>
>
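For anyone following along, the placement strategy quoted above can be
sketched roughly as follows. This is only an illustrative model, not the
actual NameNode code: the function name `place_replicas` and the
`nodes_by_rack` data structure are inventions for this sketch. The key
point it illustrates is the fallback in the second step: when only one
rack is reachable (e.g. after a switch failure), HDFS has no choice but to
place the second replica on the same rack, which matches the behavior
described in the original question.

```python
import random

def place_replicas(client_node, nodes_by_rack, replication):
    """Rough sketch of HDFS's default block placement policy.

    nodes_by_rack: dict mapping rack name -> list of datanode names.
    client_node:   the datanode the writing client runs on.
    Returns a list of chosen datanodes, one per replica.
    """
    replicas = [client_node]  # First replica: on the client's own node.
    client_rack = next(r for r, ns in nodes_by_rack.items()
                       if client_node in ns)

    # Second replica: a random node on a *different* rack, if one exists.
    other_racks = [r for r in nodes_by_rack if r != client_rack]
    if other_racks:
        second_rack = random.choice(other_racks)
    else:
        # Only one rack is reachable (e.g. the other rack's switch
        # failed): fall back to the client's rack. Both replicas then
        # land on the same rack, as observed in this thread.
        second_rack = client_rack
    candidates = [n for n in nodes_by_rack[second_rack]
                  if n not in replicas]
    replicas.append(random.choice(candidates))

    if replication >= 3:
        # Third replica: same rack as the second, different node.
        candidates = [n for n in nodes_by_rack[second_rack]
                      if n not in replicas]
        replicas.append(random.choice(candidates))
    return replicas
```

Under this model, bumping the replication factor up and back down (the
`setrep 3` / `setrep 2` trick quoted below) forces new placement
decisions, which is why it can repair a policy violation once the second
rack is back.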