Then, I stopped my application (the application writes to and reads from HBase). After one hour, when I came back to check the status of HDFS, some blocks had been deleted. Following is the current status.
[schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
2956
[schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
2962

node1: 464518
node2: 42495
node3: 7505
node4: 7205
node5: 7636

On each node, the datanode process is busy (per top). I want to know the reason for these phenomena. Thanks.

Schubert

On Wed, Mar 25, 2009 at 6:37 PM, schubert zhang <zson...@gmail.com> wrote:

> From another point of view, I think HBase cannot control which nodes blocks are deleted from; it just deletes files, and HDFS deletes the blocks wherever they are located.
>
> Schubert
>
> On Wed, Mar 25, 2009 at 6:28 PM, schubert zhang <zson...@gmail.com> wrote:
>
>> Thanks Ryan. The balancer may take a long time.
>>
>> The numbers of blocks are too different. But maybe it is caused by HBase not deleting garbage blocks on regionserver1 and regionserver2, and maybe others.
>>
>> We grepped the hadoop logs and found no "deleting block" entries at all on node1 and node2.
>>
>> Following is the grep (grep -c "ask 10.24.1.1?:50010 to delete") result from the hadoop logs:
>>
>> namenode:
>>
>> -----grep -c "ask 10.24.1.12:50010 to delete"-----node1
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 4754
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 1062
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 0
>>
>> -----grep -c "ask 10.24.1.14:50010 to delete"-----node2
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 1494
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 3305
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 3385
>>
>> -----grep -c "ask 10.24.1.16:50010 to delete"-----node3
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 8022
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 8238
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 4302
>>
>> -----grep -c "ask 10.24.1.18:50010 to delete"-----node4
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 8591
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 9111
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 5038
>>
>> -----grep -c "ask 10.24.1.20:50010 to delete"-----node5
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 3794
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 3946
>> [schub...@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 2989
>>
>> So, I think it may be caused by HBase.
>> I also grepped the datanode log on node1, which shows zero "delete block" requests in the current namenode log, and found:
>> [schub...@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log.2009-03-24
>> 104739
>> [schub...@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log.2009-03-23
>> 465927
>> [schub...@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log
>> 0
>>
>> On Wed, Mar 25, 2009 at 5:14 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>
>>> Try
>>> hadoop/bin/start-balancer.sh
>>>
>>> HDFS doesn't auto-balance. Balancing in HDFS requires moving data around, whereas balancing in HBase just means opening a file on a different machine.
>>>
>>> On Wed, Mar 25, 2009 at 2:12 AM, schubert zhang <zson...@gmail.com> wrote:
>>>
>>> > Hi all,
>>> > I am using hbase-0.19.1 and hadoop-0.19.
>>> > My cluster has 5+1 nodes, and there are about 512 regions in HBase (256MB per region).
>>> >
>>> > But I found the blocks in HDFS are very unbalanced. Following is the status from the HDFS web GUI.
>>> >
>>> > (Note: I don't know if this mailing list can display html!)
>>> >
>>> > HDFS blocks:
>>> > node1 509036 blocks
>>> > node2 157937 blocks
>>> > node3 15783 blocks
>>> > node4 15117 blocks
>>> > node5 20158 blocks
>>> >
>>> > But my HBase regions are very balanced:
>>> > node1 88 regions
>>> > node2 108 regions
>>> > node3 111 regions
>>> > node4 102 regions
>>> > node5 105 regions
>>> >
>>> > Node             Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
>>> > nd1-rack0-cloud  0             In Service   822.8                     578.67     43.28              200.86          70.33     24.41          509036
>>> > nd2-rack0-cloud  0             In Service   822.8                     190.02     42.96              589.82          23.09     71.68          157937
>>> > nd3-rack0-cloud  0             In Service   822.8                     51.95      42.61              728.24          6.31      88.51          15783
>>> > nd4-rack0-cloud  6             In Service   822.8                     46.19      42.84              733.77          5.61      89.18          15117
>>> > nd5-rack0-cloud  1             In Service   1215.6                    52.37      62.91              1100.32         4.31      90.52          20158
>>> >
>>> > But my HBase regions are very balanced.
>>> >
>>> > Address                Start Code     Load
>>> > nd1-rack0-cloud:60020  1237967027050  requests=383, regions=88, usedHeap=978, maxHeap=1991
>>> > nd2-rack0-cloud:60020  1237788871362  requests=422, regions=108, usedHeap=1433, maxHeap=1991
>>> > nd3-rack0-cloud:60020  1237788881667  requests=962, regions=111, usedHeap=1534, maxHeap=1991
>>> > nd4-rack0-cloud:60020  1237788859541  requests=369, regions=102, usedHeap=1059, maxHeap=1991
>>> > nd5-rack0-cloud:60020  1237788899331  requests=384, regions=105, usedHeap=1535, maxHeap=1983
>>> > Total: servers: 5, requests=2520, regions=514
>>>
>>
>
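The per-node greps run throughout this thread can be collapsed into a single loop that also sums the counts across the rotated log files. A minimal sketch, using the datanode IPs and NameNode log message from this thread; the `count_deletes` helper name is ours, not part of Hadoop:

```shell
# Count NameNode "ask <ip>:50010 to delete" requests per datanode, summed
# across however many (possibly rotated) log files are passed as arguments.
# The IP list is the one from this thread; adjust it for your cluster.
count_deletes() {
  for ip in 10.24.1.12 10.24.1.14 10.24.1.16 10.24.1.18 10.24.1.20; do
    # With multiple files, grep -c prints "file:count" per file;
    # awk sums the last colon-separated field (also works for one file).
    total=$(grep -c "ask ${ip}:50010 to delete" "$@" | awk -F: '{s += $NF} END {print s + 0}')
    echo "${ip}: ${total}"
  done
}

# Usage (file names follow the thread's layout):
# count_deletes hadoop-schubert-namenode-nd0-rack0-cloud.log \
#               hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-*
```

This avoids re-running one grep per node per file by hand, and makes it easy to spot a node whose delete-request count is stuck at zero.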
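While `hadoop/bin/start-balancer.sh` (Ryan's suggestion) runs, it helps to watch per-node usage converge without reloading the web GUI. A rough sketch that pulls each datanode's DFS Used% out of a saved `hadoop dfsadmin -report`; the `Name:` / `DFS Used%:` line format is an assumption based on 0.19-era report output, and `report_usage` is our name:

```shell
# Print "<datanode> <used%>" for each node in a saved dfsadmin report, e.g.:
#   hadoop dfsadmin -report > report.txt && report_usage report.txt
# Assumes each datanode stanza has a "Name: <ip>:<port>" line followed by a
# "DFS Used%: <pct>%" line; the cluster-wide summary (which has no Name:
# line before it) is skipped by the `name` guard.
report_usage() {
  awk '/^Name:/ { name = $2 } /^DFS Used%:/ && name { print name, $3 }' "$1"
}
```

Running this every few minutes shows whether the balancer is actually narrowing the gap between node1 (70.33% used) and the nearly empty nodes.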