I ran into a few problems when playing around with some tables yesterday (HBase
version 0.89.20100924) and was wondering if anyone has seen these before.
1) Multiple listings for a single table
I altered the schema of an existing table through the hbase shell (disabling
the table, changing the compression setting, and re-enabling it). All seemed
to work fine but the next day I noticed that the list of user tables showed two
entries for the table, one with the new schema and one with the old. Flushing
and major_compacting META does not clear up the duplicate entry. One possible
clue found in the logs is that it appears the region server hosting META died
in the interim:
2010-12-14 18:41:49,107 WARN org.mortbay.log: /master.jsp:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server (nodename) for region .META.,,1, row '', but failed after 10
attempts.
java.io.IOException: Call to (nodename) failed on local exception:
java.io.EOFException
2010-12-14 18:56:32,698 INFO org.apache.hadoop.hbase.master.ServerManager:
(nodename),1291715592733 znode expired
2010-12-14 18:56:32,698 DEBUG org.apache.hadoop.hbase.master.ServerManager:
Added=(nodename),1291715592733 to dead servers, added shutdown processing
operation
2010-12-14 18:56:32,698 DEBUG
org.apache.hadoop.hbase.master.RegionServerOperationQueue: Processing todo:
ProcessServerShutdown of (nodename),1291715592733
2010-12-14 18:56:32,699 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: Process shutdown of
server (nodename),1291715592733: logSplit: false, rootRescanned: false,
numberOfMetaRegions: 1, onlineMetaRegions.size(): 0
2010-12-14 18:56:32,700 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Splitting 6 hlog(s) in (node2)/hbase/.logs/(nodename),1291715592733
...
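For reference, the alter sequence I ran was essentially the following (table,
family, and codec names are placeholders; reconstructing from memory):

```
hbase> disable 'mytable'
hbase> alter 'mytable', {NAME => 'myfamily', COMPRESSION => 'lzo'}
hbase> enable 'mytable'
```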
2) Incremental bulk load failures.
When trying to prepare HFiles for incremental bulk load (using
HFileOutputFormat.configureIncrementalLoad) for the table above, I'm running
into the following exception:
10/12/15 19:49:51 INFO mapreduce.HFileOutputFormat: Looking up current regions
for table org.apache.hadoop.hbase.client.hta...@36d1c778
10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Configuring 602 reduce
partitions to match current region count
10/12/15 19:49:52 INFO mapreduce.HFileOutputFormat: Writing partition
information to (path)/partitions_1292442592073
10/12/15 19:38:29 INFO mapred.JobClient: Task Id :
attempt_201011120210_5338_m_000022_8, Status : FAILED
java.lang.IllegalArgumentException: Can't read partitions file
at
org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at
org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:490)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:575)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:159)
Caused by: java.io.IOException: Wrong number of partitions in keyset:600
at
org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:84)
The number of partitions in keyset (600) does not match the configured number
of reduce tasks (602).
This incremental load procedure used to work for my table when it contained the
original 500 regions, but now that some regions have split it always fails.
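If I'm reading the backported TotalOrderPartitioner right, a partitions file
holding K split keys divides the key space into K + 1 ranges, so the job must
be configured with exactly K + 1 reduce tasks - which would explain 600 keys
failing against 602 reducers if regions split between writing the partitions
file and configuring the job. A toy illustration of that invariant (plain
Java, not the actual HBase/Hadoop implementation):

```java
import java.util.Arrays;

public class TotalOrderSketch {
    // Minimal total-order partitioning over sorted split keys: a key is
    // routed to the range it falls into, so splitKeys.length split keys
    // yield splitKeys.length + 1 partitions (one reducer per partition).
    static int partition(int[] splitKeys, int key) {
        int pos = Arrays.binarySearch(splitKeys, key);
        // Found: key equals a split, goes to the range above it.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        int[] splits = {10, 20, 30};               // 3 split keys -> 4 partitions
        System.out.println(partition(splits, 5));  // prints 0
        System.out.println(partition(splits, 25)); // prints 2
        System.out.println(partition(splits, 99)); // prints 3
    }
}
```

With 600 split keys the partitioner can only feed 601 reducers, so a job
configured for 602 (the current region count) trips the keyset check.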
3) Regions stuck in unassigned limbo
I created a new table yesterday (using loadtable.rb) - table creation and
initial bulk load succeeded - but the table is inaccessible because some of its
regions are never assigned. I have tried disabling and re-enabling the table
but a scan of META shows that the same regions are still unassigned 24 hours
later. They have an info:regioninfo column but no info:server. Is there a way
to force assignment of these regions?
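For what it's worth, the scan I'm using to check assignment looks roughly like
this (syntax from memory):

```
hbase> scan '.META.', {COLUMNS => ['info:regioninfo', 'info:server']}
```

The stuck regions show up with an info:regioninfo cell but no info:server cell.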