Hi Todd,

The output is verbose but boils down to

----
Number of Tables: 11
Number of live region servers: 26
Number of dead region servers: 0
10/12/17 01:14:36 INFO ipc.HBaseRPC: Server at (b5120223) could not be reached after 1 tries, giving up.
344 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server null."
349 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120223"
338 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120233"
47 lines of "ERROR: Region (...) is not served by any region server but is listed in META to be on server b5120201"
1 line of "ERROR: Region (...) does not have a corresponding entry in HDFS."
Detected 1078 inconsistencies. This might not indicate a real problem because these regions could be in the midst of a split. Consider re-running with a larger value of -timelag.
Inconsistencies detected.
----

The "server null" regions seem to correspond to the regions which are stuck in the unassigned state. Let me know if you need more of the output; I would have to clean up some sensitive data before pasting it.
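For reference, the stuck regions can also be pulled straight out of .META. with the client API. This is a rough, untested sketch against the 0.89 client API; the info column names are the standard catalog ones, and nothing in it is specific to our cluster:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ListUnassignedRegions {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable meta = new HTable(conf, ".META.");
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("info"));
    ResultScanner scanner = meta.getScanner(scan);
    try {
      for (Result r : scanner) {
        // A row with info:regioninfo present but info:server missing is the
        // "listed in META to be on server null" case that hbck complains about.
        if (r.getValue(Bytes.toBytes("info"), Bytes.toBytes("regioninfo")) != null
            && r.getValue(Bytes.toBytes("info"), Bytes.toBytes("server")) == null) {
          System.out.println(Bytes.toStringBinary(r.getRow()));
        }
      }
    } finally {
      scanner.close();
    }
  }
}

The row keys it prints should line up with the regions hbck reports as being on server null.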
Thanks,
--Adam

> From: [email protected]
> Date: Wed, 15 Dec 2010 19:33:39 -0800
> Subject: Re: a couple of HBase issues
> To: [email protected]
>
> Hi Adam,
>
> I bet all of your issues are related.
>
> Can you run "hbase hbck" and paste the results?
>
> -Todd
>
> On Wed, Dec 15, 2010 at 12:04 PM, Adam Portley <[email protected]> wrote:
> >
> > I ran into a few problems when playing around with some tables yesterday
> > (hbase version 0.89.20100924) and was wondering if anyone has seen these
> > before.
> >
> >
> > 1) Multiple listings for a single table
> >
> > I altered the schema of an existing table through the hbase shell
> > (disabling the table, changing the compression setting, and re-enabling
> > it). All seemed to work fine but the next day I noticed that the list of
> > user tables showed two entries for the table, one with the new schema and
> > one with the old. Flushing and major_compacting META does not clear up the
> > duplicate entry. One possible clue found in the logs is that it appears
> > the region server hosting META died in the interim:
> >
> > 2010-12-14 18:41:49,107 WARN org.mortbay.log: /master.jsp:
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server (nodename) for region .META.,,1, row '', but failed after 10 attempts.
> > java.io.IOException: Call to (nodename) failed on local exception: java.io.EOFException
> > 2010-12-14 18:56:32,698 INFO org.apache.hadoop.hbase.master.ServerManager: (nodename),1291715592733 znode expired
> > 2010-12-14 18:56:32,698 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=(nodename),1291715592733 to dead servers, added shutdown processing operation
> > 2010-12-14 18:56:32,698 DEBUG org.apache.hadoop.hbase.master.RegionServerOperationQueue: Processing todo: ProcessServerShutdown of (nodename),1291715592733
> > 2010-12-14 18:56:32,699 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Process shutdown of server (nodename),1291715592733: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 0
> > 2010-12-14 18:56:32,700 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting 6 hlog(s) in (node2)/hbase/.logs/(nodename),1291715592733
> > ...
> >
> >
> > 2) Incremental bulk load failures
> >
> > When trying to prepare HFiles for incremental bulk load (using
> > HFileOutputFormat::configureIncrementalLoad) for the table above, I'm
> > running into the following exception:
> >
> > 10/12/15 19:49:51 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.HTable@36d1c778
> > 10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Configuring 602 reduce partitions to match current region count
> > 10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Writing partition information to (path)/partitions_1292442592073
> >
> > 10/12/15 19:38:29 INFO mapred.JobClient: Task Id : attempt_201011120210_5338_m_000022_8, Status : FAILED
> > java.lang.IllegalArgumentException: Can't read partitions file
> >         at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
> >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:490)
> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:575)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:159)
> > Caused by: java.io.IOException: Wrong number of partitions in keyset:600
> >         at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:84)
> >
> > The number of partitions in keyset (600) does not match the configured
> > number of reduce tasks (602).
> > This incremental load procedure used to work for my table when it contained
> > the original 500 regions but now that some regions have split it always
> > fails.
> >
> >
> > 3) Regions stuck in unassigned limbo
> >
> > I created a new table yesterday (using loadtable.rb) - table creation and
> > initial bulk load succeeded - but the table is inaccessible because some of
> > its regions are never assigned. I have tried disabling and re-enabling the
> > table but a scan of META shows that the same regions are still unassigned
> > 24 hours later. They have an info:regioninfo column but no info:server. Is
> > there a way to force assignment of these regions?
> >
> >
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
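P.S. Regarding (2) above: one way to pin down what the partitioner is seeing is to compare the table's current region count against the number of split keys that actually landed in the partitions file. This is a rough, untested sketch; the table name and the partitions path are placeholders, and it assumes the file is the ImmutableBytesWritable/NullWritable sequence file that configureIncrementalLoad writes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;

public class CheckPartitionCount {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Placeholder table name; substitute the real one.
    HTable table = new HTable(conf, "mytable");
    int numRegions = table.getStartKeys().length;

    // Placeholder path; point this at the partitions file the job wrote.
    FileSystem fs = FileSystem.get(conf);
    Path partitions = new Path("/tmp/partitions_1292442592073");
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, partitions, conf);
    ImmutableBytesWritable key = new ImmutableBytesWritable();
    NullWritable val = NullWritable.get();
    int splitKeys = 0;
    while (reader.next(key, val)) {
      splitKeys++;
    }
    reader.close();

    // configureIncrementalLoad sets the reduce count to the region count,
    // and TotalOrderPartitioner requires exactly (reduces - 1) split keys,
    // so a healthy run should show splitKeys == numRegions - 1.
    System.out.println("regions=" + numRegions + ", split keys=" + splitKeys);
  }
}

If the region count moves between writing the partitions file and the tasks launching (splits in flight, or the duplicated table entry from (1) feeding in extra boundaries), the two numbers diverge, which would match the 600 vs. 602 failure above.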
