Hello J-D,

We upgraded to HBase 0.20.3; the NPE no longer appears and the regionserver started up successfully. But while our job was running, we saw exceptions like this:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 172.16.1.208:60020 for region summary,,1267086513037, row 'SITE-0000000003\x01browser\x0120100303000000\x01Firefox', but failed after 10 attempts.
Exceptions:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy to /172.16.1.208:60020 after attempts=1
(the line above repeated 10 times, once per attempt)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:417)
        at com.longtuo.cactus.summary.db.SummaryTable.add(SummaryTable.java:321)
        at com.longtuo.cactus.summary.job.GeneralReducer.flushToHBase(GeneralReducer.java:77)
        at com.longtuo.cactus.summary.job.GeneralReducer.reduce(GeneralReducer.java:58)
        at com.longtuo.cactus.summary.job.GeneralReducer.reduce(GeneralReducer.java:1)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

-------
Task attempt_201003031929_0003_r_000000_0 failed to report status for 602 seconds. Killing!
How should we deal with this? Btw, this node had been a datanode/regionserver in our cluster for a long time; for some reason we removed it from the cluster, backed up its data, reinstalled the OS, and re-added it to the cluster.

Any suggestions? Thanks a lot.

LvZheng
