Hi Eitan,

Yes, the namenode keeps a mapping of blocks to datanodes in memory. You are also right that the DN keeps its block structure in memory.
But as you noticed, I restarted the DN after moving the block data manually. During datanode startup, the Block Scanner scans the local block files and rebuilds the block structure in memory, then sends a block report to the NN. When the NN receives the block report from this datanode, it updates its <blockID, datanode storageID> mapping. So my approach works, and I have tested it.

However, as Harsh mentioned, we should stop both the source and target datanodes, then move the block data manually, then start the two datanodes again.

On Mon, Jul 8, 2013 at 9:23 PM, Eitan Rosenfeld <eita...@gmail.com> wrote:
> Hi Azuryy, I'd also like to be able to manually move blocks.
>
> One piece that is missing in your current approach is updating any
> block mappings that the cluster relies on.
> The namenode has a mapping of blocks to datanodes, and the datanode
> has, as the comments say, a "block -> stream of bytes" mapping.
>
> As I understand it, the namenode's mappings have to be updated to
> reflect the new block locations.
> The datanode might not need intervention, I'm not sure.
>
> Can anyone else chime in on those areas?
>
> The balancer that Allan suggested likely demonstrates all of the ins
> and outs in order to successfully complete a block transfer.
> Thus, the balancer is where I'll begin my efforts to learn how to
> manually move blocks.
>
> Any other pointers would be helpful.
>
> Thank you,
> Eitan
>
> On Mon, Jul 8, 2013 at 2:15 PM, Allan <wilsoncr...@gmail.com> wrote:
> > If the imbalance is across data nodes then you need to run the balancer.
> >
> > Sent from my iPad
> >
> > On Jul 8, 2013, at 1:15 AM, Azuryy Yu <azury...@gmail.com> wrote:
> >
> >> Hi Dear all,
> >>
> >> There are some unbalanced data nodes in my cluster, some nodes reached
> >> more than 95% disk usage.
> >>
> >> So can I move some block data from one node to another node directly?
> >>
> >> such as: from n1 to n2:
> >>
> >> 1) scp /data/xxxx/blk_* n2:/data/subdir11/
> >> 2) rm -rf /data/xxxx/blk_*
> >> 3) hadoop-daemon.sh stop datanode (on n1)
> >> 4) hadoop-daemon.sh start datanode (on n1)
> >> 5) hadoop-daemon.sh stop datanode (on n2)
> >> 6) hadoop-daemon.sh start datanode (on n2)
> >>
> >> Am I right? Thanks for any inputs.
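
For reference, the ordering discussed in this thread (stop both datanodes, move the block files, then start both so that each sends a fresh block report) could be sketched roughly as below. This is only a dry-run sketch: it prints the commands rather than running them. The function name plan_block_move is hypothetical, and it assumes passwordless ssh between nodes and hadoop-daemon.sh on the PATH of each node. The hosts and directories in the usage line are the placeholders from the original message.

```shell
#!/bin/sh
# Dry-run sketch (hypothetical helper): prints the commands for moving
# block files between two datanodes in the order discussed in the thread.
# Nothing is executed; review the output before running anything by hand.
plan_block_move() {
    src_host=$1   # datanode that is nearly full
    src_dir=$2    # directory on src_host holding blk_* files
    dst_host=$3   # datanode with free space
    dst_dir=$4    # target block directory on dst_host

    # 1) Stop both datanodes first, so neither keeps serving from an
    #    in-memory block map that no longer matches its disks.
    echo "ssh $src_host hadoop-daemon.sh stop datanode"
    echo "ssh $dst_host hadoop-daemon.sh stop datanode"

    # 2) Copy the block files (blk_* also matches the .meta files) and
    #    delete the originals only if the copy succeeded.
    echo "ssh $src_host 'scp $src_dir/blk_* $dst_host:$dst_dir/ && rm -f $src_dir/blk_*'"

    # 3) Start both datanodes; on startup each one rescans its local
    #    block files and reports them to the namenode.
    echo "ssh $dst_host hadoop-daemon.sh start datanode"
    echo "ssh $src_host hadoop-daemon.sh start datanode"
}

# Usage with the placeholder values from the original message.
plan_block_move n1 /data/xxxx n2 /data/subdir11
```

Chaining scp and rm with && is a deliberate choice in this sketch: the source copies are removed only after the copy reports success, which the original numbered steps did not guarantee.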