RaidNode should monitor and fix blocks that violate RAID block placement 
-------------------------------------------------------------------------

                 Key: MAPREDUCE-2275
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2275
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/raid
            Reporter: Ramkumar Vadali
            Assignee: Ramkumar Vadali


When files are RAIDed, it is important to keep the blocks in each RAID stripe 
and the corresponding parity blocks on as many different machines as possible. 
This minimizes the probability of data loss when data nodes fail.

BlockPlacementPolicyRaid ensures that parity blocks are not located on the same 
machines as the source blocks. However, the placement of source blocks is not 
controlled directly in this manner. Instead, source blocks are created using 
the default policy. After a source file is RAIDed, its replication is 
increased and then decreased, and BlockPlacementPolicyRaid tries to keep the 
source blocks well placed when the excess blocks are deleted. This does not 
guarantee correct block placement for RAID.

Also, if the balancer moves blocks around, the RAID block placement could be 
violated.

We need periodic monitoring of block placement of RAIDed files and the 
corresponding parity blocks.
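
As a rough illustration of what one monitoring pass could check, here is a 
minimal sketch in plain Java. It is not code from contrib/raid and does not use 
any Hadoop API: the class StripePlacementCheck, the method findViolations, and 
the input shape (one list of host names per block in a stripe, source blocks 
followed by parity blocks) are all hypothetical. It also assumes that "violation" 
means a single host holding replicas of more than one block of the same stripe, 
which is one possible reading of the goal described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Hadoop or contrib/raid.
final class StripePlacementCheck {

    /**
     * Finds hosts that hold replicas of more than one block of a single stripe.
     *
     * @param blockLocations for each block in the stripe (source blocks plus
     *                       parity blocks), the hosts holding a replica of it
     * @return map from host name to the indices of the stripe blocks it holds,
     *         containing only hosts that hold two or more distinct blocks
     */
    static Map<String, List<Integer>> findViolations(List<List<String>> blockLocations) {
        // Group block indices by the host that stores a replica of them.
        Map<String, List<Integer>> blocksByHost = new HashMap<>();
        for (int i = 0; i < blockLocations.size(); i++) {
            for (String host : blockLocations.get(i)) {
                List<Integer> blocks =
                    blocksByHost.computeIfAbsent(host, h -> new ArrayList<>());
                // Skip a host listed twice for the same block.
                if (blocks.isEmpty() || blocks.get(blocks.size() - 1) != i) {
                    blocks.add(i);
                }
            }
        }

        // Assumed rule: more than one block of the stripe on a host is a violation.
        Map<String, List<Integer>> violations = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : blocksByHost.entrySet()) {
            if (e.getValue().size() > 1) {
                violations.put(e.getKey(), e.getValue());
            }
        }
        return violations;
    }
}
```

For a stripe whose blocks 0 and 2 both live on the same host, the returned map 
would contain that host with the indices [0, 2]. A monitor could then pick one 
of those blocks to move to a host that holds no block of the stripe. How the 
locations are obtained and how the move is done are left open here.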

