Hi,

I ran into a strange problem while using HBase. The cluster has 3 machines: 1 master and
2 region servers (wlu-rs1/10.27.17.251 and wlu-rs2/10.27.16.11).
However, when I run "status 'detailed'" to check the region servers, it reports
three live servers, and one server appears twice (the two entries are identical):
3 live servers
10.27.17.251:60020 1329975187706
10.27.16.11:60020 1329975209046
10.27.17.251:60020 1329975187706
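
To make the duplicate explicit, here is a minimal Python sketch that counts repeated entries in the live-server list. The input is simply the three lines pasted from the shell output above; the helper name find_duplicate_servers is made up for illustration and is not part of HBase.

```python
from collections import Counter

# Live-server lines copied verbatim from the "status 'detailed'" output above.
status_lines = """\
10.27.17.251:60020 1329975187706
10.27.16.11:60020 1329975209046
10.27.17.251:60020 1329975187706
"""

def find_duplicate_servers(text):
    """Return (address, startcode) pairs that are listed more than once."""
    entries = [tuple(line.split()) for line in text.splitlines() if line.strip()]
    counts = Counter(entries)
    return [entry for entry, n in counts.items() if n > 1]

duplicates = find_duplicate_servers(status_lines)
print(duplicates)
```

For the output above this reports only the 10.27.17.251:60020 entry, with the same start code in both copies.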

When the balancer runs, region server 10.27.17.251 appears to move regions from and to
itself, and then a FATAL error occurs.

Log info of HMaster:

2012-02-23 00:01:00,629 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429., src=wlu-rs1,60020,1329968056162, dest=10.27.17.251,60020,1329968056162
2012-02-23 00:01:00,629 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429. (offlining)
2012-02-23 00:01:09,712 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/ad483f3806a03756f3f47cd8bd220d09 (region=usertable,user819517397,1329972500402.ad483f3806a03756f3f47cd8bd220d09., server=wlu-rs1,60020,1329968056162, state=RS_ZK_REGION_CLOSING)
2012-02-23 00:01:09,712 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSING, server=wlu-rs1,60020,1329968056162, region=ad483f3806a03756f3f47cd8bd220d09
2012-02-23 00:01:12,678 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
java.io.IOException: Call to /10.27.17.251:60020 failed on local exception: java.io.EOFException
        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:806)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.closeRegion(Unknown Source)
        at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:601)
        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1123)
        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1070)
        at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1930)
        at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:694)
        at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:585)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(Unknown Source)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster: balance
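
As a rough check of the first "balance" line in the log, the Python sketch below parses its src= and dest= fields and compares them after mapping hostnames to IPs. The KNOWN_HOSTS table is an assumption built from the cluster description at the top of this mail, and parse_server / normalize are illustrative helpers, not HBase APIs. It only shows that both fields name the same server instance; it does not explain why the master lists it twice.

```python
import re

# Hostname-to-IP mapping taken from the cluster description in this mail (assumption).
KNOWN_HOSTS = {"wlu-rs1": "10.27.17.251", "wlu-rs2": "10.27.16.11"}

# The message part of the first balance log line, joined onto one logical line.
balance_line = (
    "balance hri=usertable,user172022781,1329972455493."
    "943849e136aa6f7a343d47fed57da429., "
    "src=wlu-rs1,60020,1329968056162, dest=10.27.17.251,60020,1329968056162"
)

def parse_server(line, key):
    """Extract (host, port, startcode) for 'src' or 'dest' from a balance log line."""
    match = re.search(rf"{key}=([^,\s]+),(\d+),(\d+)", line)
    if match is None:
        raise ValueError(f"no {key}= field in line")
    host, port, startcode = match.groups()
    return host, int(port), int(startcode)

def normalize(server):
    """Replace a known hostname with its IP so both spellings compare equal."""
    host, port, startcode = server
    return KNOWN_HOSTS.get(host, host), port, startcode

src = parse_server(balance_line, "src")
dest = parse_server(balance_line, "dest")
same_server = normalize(src) == normalize(dest)
print(src, dest, same_server)
```

Here src is given by hostname and dest by IP, but port and start code match, so under the assumed mapping the source and destination are the same region server.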


I am using HBase 0.90.3 and Hadoop 0.20.2. Can anyone help me figure this out?



Regards,
Wei
