I believe it is number (2) below. I'm getting "RetriesExhaustedException" for exactly the same region server in all my reduce jobs.
How did you get around this problem?

Thanks
-Yair

-----Original Message-----
From: Mat Hofschen [mailto:[email protected]]
Sent: Wednesday, March 18, 2009 11:38 AM
To: [email protected]
Subject: Re: RetriesExhaustedException for TableReduce

Hi Yair,

Check the logs of the machine that refuses the connection. I had two problems during large imports:

1. "Too many open files", see http://wiki.apache.org/hadoop/Hbase/FAQ (6)
2. Regions not distributed, heavy write access to one machine.

Hope this helps,
Matthias

On Tue, Mar 17, 2009 at 11:19 PM, Yair Even-Zohar <[email protected]> wrote:

> While loading a large amount of data to a non-empty table using
> TableReduce I get the error below.
>
> The first 1-3 reduces are usually successful, and then I get this
> message.
>
> This error occurs whether I'm using 2 or 8 servers, and regardless
> of the number of reduces (4, 16 or 160). It did not occur when loading a
> small amount of data (well, the first few reduces are successful
> anyway).
>
> I googled "org.apache.hadoop.hbase.client.RetriesExhaustedException:"
> without much help.
>
> Thanks
> -Yair
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.249.203.0:60020 for region ase,RnpdOFZn-goAAADK-uMA,1237315693597, row 'T82JYnln-goAAACeGdMA', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call to /10.249.203.0:60020 failed on local exception: java.io.EOFException
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /10.249.203.0:60020 failed on connection exception: java.net.ConnectException: Connection refused
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:841)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:932)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:1316)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:1296)
>         at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:73)
>         at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:53)
>         at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:405)
>         at com.revenuescience.audiencesearch.fba.ClogUploader$TableUploader.reduce(ClogUploader.java:223)
>         at com.revenuescience.audiencesearch.fba.ClogUploader$TableUploader.reduce(ClogUploader.java:202)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
>         at org.apache.hadoop.mapred.Child.main(Child.java:155)
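
For anyone hitting the same symptom, one way to test the "regions not distributed" theory offline is to take a sample of the row keys being written and count how many land in each region. The sketch below is only an illustration and does not call any HBase API: the class name `RegionSkewCheck`, the method `countPerRegion`, and all sample keys other than the two taken from the trace above are hypothetical. It assumes plain ASCII keys, so that Java string ordering agrees with byte-wise row key ordering, and it assumes you have obtained the list of region start keys some other way (for example from the master web UI).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RegionSkewCheck {

    // Counts how many row keys fall into each region. A region is identified
    // by its start key and covers [startKey, nextStartKey).
    // Assumption: keys are ASCII, so String order matches byte-wise order.
    public static Map<String, Integer> countPerRegion(List<String> regionStartKeys,
                                                      List<String> rowKeys) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        counts.put("", 0); // the first region starts at the empty key
        for (String start : regionStartKeys) {
            counts.put(start, 0);
        }
        for (String row : rowKeys) {
            // floorKey returns the greatest start key <= row: the owning region.
            String owner = counts.floorKey(row);
            counts.merge(owner, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The first start key and the first row key come from the trace above;
        // everything else is made-up sample data.
        List<String> starts = Arrays.asList("RnpdOFZn-goAAADK-uMA", "a");
        List<String> rows = Arrays.asList(
                "T82JYnln-goAAACeGdMA", "Sxyz", "Abc", "zzz");

        for (Map.Entry<String, Integer> e : countPerRegion(starts, rows).entrySet()) {
            System.out.println("region starting at '" + e.getKey() + "': "
                    + e.getValue() + " rows");
        }
    }
}
```

If nearly all sampled keys map to a single start key, the writes are concentrated on whichever server hosts that region, which would be consistent with one region server refusing connections under load.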
