Hi Todd,

The output is verbose but boils down to

----
Number of Tables: 11
Number of live region servers: 26
Number of dead region servers: 0
10/12/17 01:14:36 INFO ipc.HBaseRPC: Server at (b5120223) could not be reached after 1 tries, giving up.
344 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server null."
349 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120223"
338 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120233"
47 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120201"
1 line of "ERROR: Region (...) does not have a corresponding entry in HDFS."
Detected 1078 inconsistencies. This might not indicate a real problem because these regions could be in the midst of a split. Consider re-running with a larger value of -timelag.
Inconsistencies detected.
----

The "server null" regions seem to correspond to the regions which are stuck in the unassigned state. Let me know if you need more of the output; I would have to clean up some sensitive data before pasting it.
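For reference, the stuck regions can also be pulled straight out of .META. with the client API. This is a rough, untested sketch against the 0.89 client API; the info column names are the standard catalog ones, and nothing in it is specific to our cluster:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ListUnassignedRegions {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable meta = new HTable(conf, ".META.");
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("info"));
    ResultScanner scanner = meta.getScanner(scan);
    try {
      for (Result r : scanner) {
        // A row with info:regioninfo present but info:server missing is the
        // "listed in META to be on server null" case that hbck complains about.
        if (r.getValue(Bytes.toBytes("info"), Bytes.toBytes("regioninfo")) != null
            && r.getValue(Bytes.toBytes("info"), Bytes.toBytes("server")) == null) {
          System.out.println(Bytes.toStringBinary(r.getRow()));
        }
      }
    } finally {
      scanner.close();
    }
  }
}

The row keys it prints should line up with the regions hbck reports as being on server null.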
Thanks,
--Adam

> From: [email protected]
> Date: Wed, 15 Dec 2010 19:33:39 -0800
> Subject: Re: a couple of HBase issues
> To: [email protected]
>
> Hi Adam,
>
> I bet all of your issues are related.
>
> Can you run "hbase hbck" and paste the results?
>
> -Todd
>
> On Wed, Dec 15, 2010 at 12:04 PM, Adam Portley <[email protected]> wrote:
> >
> > I ran into a few problems when playing around with some tables yesterday
> > (hbase version 0.89.20100924) and was wondering if anyone has seen these
> > before.
> >
> >
> > 1) Multiple listings for a single table
> >
> > I altered the schema of an existing table through the hbase shell
> > (disabling the table, changing the compression setting, and re-enabling
> > it). All seemed to work fine but the next day I noticed that the list of
> > user tables showed two entries for the table, one with the new schema and
> > one with the old. Flushing and major_compacting META does not clear up the
> > duplicate entry. One possible clue found in the logs is that it appears
> > the region server hosting META died in the interim:
> >
> > 2010-12-14 18:41:49,107 WARN org.mortbay.log: /master.jsp:
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server (nodename) for region .META.,,1, row '', but failed after 10 attempts.
> > java.io.IOException: Call to (nodename) failed on local exception: java.io.EOFException
> > 2010-12-14 18:56:32,698 INFO org.apache.hadoop.hbase.master.ServerManager: (nodename),1291715592733 znode expired
> > 2010-12-14 18:56:32,698 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=(nodename),1291715592733 to dead servers, added shutdown processing operation
> > 2010-12-14 18:56:32,698 DEBUG org.apache.hadoop.hbase.master.RegionServerOperationQueue: Processing todo: ProcessServerShutdown of (nodename),1291715592733
> > 2010-12-14 18:56:32,699 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Process shutdown of server (nodename),1291715592733: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 0
> > 2010-12-14 18:56:32,700 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting 6 hlog(s) in (node2)/hbase/.logs/(nodename),1291715592733
> > ...
> >
> >
> > 2) Incremental bulk load failures
> >
> > When trying to prepare HFiles for incremental bulk load (using
> > HFileOutputFormat::configureIncrementalLoad) for the table above, I'm
> > running into the following exception:
> >
> > 10/12/15 19:49:51 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.HTable@36d1c778
> > 10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Configuring 602 reduce partitions to match current region count
> > 10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Writing partition information to (path)/partitions_1292442592073
> >
> > 10/12/15 19:38:29 INFO mapred.JobClient: Task Id : attempt_201011120210_5338_m_000022_8, Status : FAILED
> > java.lang.IllegalArgumentException: Can't read partitions file
> >         at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
> >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:490)
> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:575)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:159)
> > Caused by: java.io.IOException: Wrong number of partitions in keyset:600
> >         at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:84)
> >
> > The number of partitions in keyset (600) does not match the configured
> > number of reduce tasks (602).
> > This incremental load procedure used to work for my table when it contained
> > the original 500 regions but now that some regions have split it always
> > fails.
> >
> >
> > 3) Regions stuck in unassigned limbo
> >
> > I created a new table yesterday (using loadtable.rb) - table creation and
> > initial bulk load succeeded - but the table is inaccessible because some of
> > its regions are never assigned. I have tried disabling and re-enabling the
> > table but a scan of META shows that the same regions are still unassigned
> > 24 hours later. They have an info:regioninfo column but no info:server. Is
> > there a way to force assignment of these regions?
> >
> >
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
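P.S. Regarding (2) above: one way to pin down what the partitioner is seeing is to compare the table's current region count against the number of split keys that actually landed in the partitions file. This is a rough, untested sketch; the table name and the partitions path are placeholders, and it assumes the file is the ImmutableBytesWritable/NullWritable sequence file that configureIncrementalLoad writes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;

public class CheckPartitionCount {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Placeholder table name; substitute the real one.
    HTable table = new HTable(conf, "mytable");
    int numRegions = table.getStartKeys().length;

    // Placeholder path; point this at the partitions file the job wrote.
    FileSystem fs = FileSystem.get(conf);
    Path partitions = new Path("/tmp/partitions_1292442592073");
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, partitions, conf);
    ImmutableBytesWritable key = new ImmutableBytesWritable();
    NullWritable val = NullWritable.get();
    int splitKeys = 0;
    while (reader.next(key, val)) {
      splitKeys++;
    }
    reader.close();

    // configureIncrementalLoad sets the reduce count to the region count,
    // and TotalOrderPartitioner requires exactly (reduces - 1) split keys,
    // so a healthy run should show splitKeys == numRegions - 1.
    System.out.println("regions=" + numRegions + ", split keys=" + splitKeys);
  }
}

If the region count moves between writing the partitions file and the tasks launching (splits in flight, or the duplicated table entry from (1) feeding in extra boundaries), the two numbers diverge, which would match the 600 vs. 602 failure above.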
