Hi,
I ran into a weird problem when using HBase. There are 3 machines: 1 master and
2 region servers (wlu-rs1/10.27.17.251 and wlu-rs2/10.27.16.11).
But when I run "status 'detailed'" to check the region servers' status, it shows
three live servers, and one server appears twice (exactly the same entry):
3 live servers
10.27.17.251:60020 1329975187706
10.27.16.11:60020 1329975209046
10.27.17.251:60020 1329975187706
When balancing begins, region server 10.27.17.251 seems to move data from and to
itself, and a FATAL error occurs.
Log info of HMaster:
2012-02-23 00:01:00,629 INFO org.apache.hadoop.hbase.master.HMaster: balance
hri=usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429.,
src=wlu-rs1,60020,1329968056162, dest=10.27.17.251,60020,1329968056162
2012-02-23 00:01:00,629 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Starting unassignment of region
usertable,user172022781,1329972455493.943849e136aa6f7a343d47fed57da429.
(offlining)
2012-02-23 00:01:09,712 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Handling new unassigned node:
/hbase/unassigned/ad483f3806a03756f3f47cd8bd220d09
(region=usertable,user819517397,1329972500402.ad483f3806a03756f3f47cd8bd220d09.,
server=wlu-rs1,60020,1329968056162, state=RS_ZK_REGION_CLOSING)
2012-02-23 00:01:09,712 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Handling transition=RS_ZK_REGION_CLOSING, server=wlu-rs1,60020,1329968056162,
region=ad483f3806a03756f3f47cd8bd220d09
2012-02-23 00:01:12,678 FATAL org.apache.hadoop.hbase.master.HMaster: Remote
unexpected exception
java.io.IOException: Call to /10.27.17.251:60020 failed on local exception: java.io.EOFException
    at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:806)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy6.closeRegion(Unknown Source)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:601)
    at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1123)
    at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1070)
    at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1930)
    at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:694)
    at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:585)
    at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2012-02-23 00:01:12,680 INFO org.apache.hadoop.hbase.master.HMaster: balance
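What stands out in the balance line above is that src and dest differ only in the host part (wlu-rs1 vs. 10.27.17.251), while the port and start code are identical. To make that check concrete, here is a minimal standalone Java sketch. It is not HBase code: the class and method names are made up for illustration, and it only assumes that server names have the "host,port,startcode" form seen in the log. Treating an equal port and start code as "same server" is a heuristic, not something HBase guarantees.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of HBase: flags servers that seem to be
// registered under more than one host string.
class DuplicateServerCheck {

    // Builds a key from port and start code, ignoring the host part.
    // Assumes the "host,port,startcode" format shown in the master log.
    static String identityKey(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Unexpected server name: " + serverName);
        }
        return parts[1] + "," + parts[2];
    }

    // Returns, for each port/startcode key, the distinct host strings seen,
    // keeping only keys that appear under two or more different hosts.
    static Map<String, Set<String>> findAliases(List<String> serverNames) {
        Map<String, Set<String>> hostsByKey = new LinkedHashMap<>();
        for (String name : serverNames) {
            String host = name.split(",")[0];
            hostsByKey.computeIfAbsent(identityKey(name), k -> new LinkedHashSet<>()).add(host);
        }
        hostsByKey.values().removeIf(hosts -> hosts.size() < 2);
        return hostsByKey;
    }

    public static void main(String[] args) {
        // src and dest copied from the balance log line above.
        List<String> names = Arrays.asList(
                "wlu-rs1,60020,1329968056162",
                "10.27.17.251,60020,1329968056162");
        System.out.println(findAliases(names));
    }
}
```

If the two names are grouped under one key, the master is most likely tracking the same region server under both its hostname and its IP address.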
I am using HBase 0.90.3 and Hadoop 0.20.2. Can anyone please help me figure this out?
Regards,
Wei