Hi, I'm using HBase 0.94.12 on top of Hadoop 1.2.1. I have one node for 
ZooKeeper, one node for the NameNode/HMaster, and three DataNode/RegionServer 
nodes. All the machines are Amazon EC2 m2.xlarge instances.

I set the HDFS replication factor to two, so I expect that if I kill a 
RegionServer/DataNode (for example by killing all Java processes on it), all 
the regions on that node will be recovered on one of the other two live 
RegionServers.

But when I kill the node, I lose the regions on it and, worst of all, if 
that node hosts the .META. or -ROOT- table, the entire cluster stops working 
altogether!

In case it is helpful: I loaded 500000 rows into the 'usertable' table with 
the YCSB tool. Below is the output of status 'simple' and /hadoop fsck /hbase 
before and after killing the node:

before:

hbase(main):001:0> status 'simple'
3 live servers
    ip-10-235-11-139:60020 1385632293907
        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=57, maxHeapMB=14983
    ip-10-253-29-220:60020 1385632293955
        requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=74, maxHeapMB=14983
    ip-10-253-29-249:60020 1385632294162
        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=1935, maxHeapMB=14983
0 dead servers
Aggregate load: 0, regions: 4


FSCK started by ubuntu from /10.253.91.250 for path /hbase at Thu Nov 28 09:57:20 UTC 2013
..................................Status: HEALTHY
 Total size:    2122147158 B
 Total dirs:    31
 Total files:   34 (Files currently being written: 3)
 Total blocks (validated):      59 (avg. block size 35968595 B) (Total open file blocks (not validated): 2)
 Minimally replicated blocks:   59 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Thu Nov 28 09:57:20 UTC 2013 in 23 milliseconds


The filesystem under path '/hbase' is HEALTHY

-------------------------------------------------------------------------
-------------------------------------------------------------------------

and after (about 15 minutes later):

hbase(main):001:0> status 'simple'
2 live servers
    ip-10-235-11-139:60020 1385632293907
        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=63, maxHeapMB=14983
    ip-10-253-29-220:60020 1385632293955
        requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=117, maxHeapMB=14983
1 dead servers
    ip-10-253-29-249,60020,1385632294162
Aggregate load: 0, regions: 3


FSCK started by ubuntu from /10.253.91.250 for path /hbase at Thu Nov 28 10:13:29 UTC 2013
....................Status: HEALTHY
 Total size:    948168097 B
 Total dirs:    27
 Total files:   20 (Files currently being written: 3)
 Total blocks (validated):      29 (avg. block size 32695451 B) (Total open file blocks (not validated): 2)
 Minimally replicated blocks:   29 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          2
 Number of racks:               1
FSCK ended at Thu Nov 28 10:13:29 UTC 2013 in 7 milliseconds


The filesystem under path '/hbase' is HEALTHY
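
To make the change between the two fsck reports easier to see, here is a minimal Python sketch that parses the summary lines and prints the difference. The values are copied from the two reports above (abbreviated to the fields that changed); the helper names parse_fsck_summary and diff_summaries are hypothetical and not part of Hadoop or HBase. It only does arithmetic on the reported numbers and does not explain why they differ.

```python
import re

# Summary values copied from the two fsck reports above (abbreviated).
BEFORE = """
 Total size:    2122147158 B
 Total dirs:    31
 Total files:   34 (Files currently being written: 3)
 Total blocks (validated):      59 (avg. block size 35968595 B)
 Number of data-nodes:          3
"""

AFTER = """
 Total size:    948168097 B
 Total dirs:    27
 Total files:   20 (Files currently being written: 3)
 Total blocks (validated):      29 (avg. block size 32695451 B)
 Number of data-nodes:          2
"""


def parse_fsck_summary(text):
    """Map each summary field name to the first integer after its colon."""
    summary = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s+(\d+)", line)
        if match:
            summary[match.group(1).strip()] = int(match.group(2))
    return summary


def diff_summaries(before, after):
    """Return after - before for every field present in both reports."""
    return {key: after[key] - before[key] for key in before if key in after}


if __name__ == "__main__":
    delta = diff_summaries(parse_fsck_summary(BEFORE), parse_fsck_summary(AFTER))
    for field, change in delta.items():
        print(f"{field}: {change:+d}")
```

With the numbers above, the difference is 1173979061 B (about 1.17 GB) less data, 14 fewer files, and 30 fewer blocks under /hbase after the kill, even though fsck reports HEALTHY in both cases.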


I hope I have been clear and have provided sufficient information; if not, 
I can post the hbase-site.xml and hdfs-site.xml configuration files.

Thank you for your help!

Andrea
