Hi all,
I recently upgraded to HBase 0.20.4. I am now trying to add additional data to my
system, and I am getting the following error on my client:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9,
numtries=10, i=1, listsize=2,
region=doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679 for region
doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, row
'7e0b8ec68d795612df55144b67e207bdf805d36f', but failed after 10 attempts.
Exceptions:
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1167)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1248)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
    at trinidad.hbase.mapreduce.ingest.ImportWoS$WoSParserMapper.cleanup(ImportWoS.java:192)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
When I look at the region server log, I see errors like:
2010-05-17 17:47:11,685 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at
index=0 because:Requested row out of range for HRegion
doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679,
startKey='7d6442c7951b178a6adc9c149ff13d6ea87feccd',
getEndKey()='7ddf19f548f2a75c53a638a4bdc88084f806be4e',
row='7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45'
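To make that message concrete, here is a minimal sketch of the range check as I understand it, assuming row keys are compared as plain byte strings with an inclusive start key and an exclusive end key. `row_in_region` is a hypothetical helper written for illustration, not an HBase API.

```python
def row_in_region(row: bytes, start_key: bytes, end_key: bytes) -> bool:
    """Return True if row falls in [start_key, end_key).

    Assumption: an empty end_key means "no upper bound" (the last region).
    """
    if row < start_key:
        return False
    return end_key == b"" or row < end_key


# Values copied from the region server log line above.
start = b"7d6442c7951b178a6adc9c149ff13d6ea87feccd"
end = b"7ddf19f548f2a75c53a638a4bdc88084f806be4e"
row = b"7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45"

print(row_in_region(row, start, end))
```

Under that assumption the row sorts after the region's end key, so the put is being sent to a region that cannot hold it.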
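To me, it looks like the table has gaps between the end of one region and the
beginning of the next region. E.g., from the list of regions for the doc
table (each entry shows the region name, the server and encoded region name,
the start key, and the end key):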
doc,005bccc8dcd6ae360b359f42438fd1a651c02048,1274141748324
node-03:60030 51561009
005bccc8dcd6ae360b359f42438fd1a651c02048
00d79413bba4fbd869b0b58c3b23ad2b6fc960b4
doc,00d79413bba4fbd869b0b58c3b23ad2b6fc960b4,1274141747257
node-02:60030 494463444
00d79413bba4fbd869b0b58c3b23ad2b6fc960b4
013485105e0d328d465b2607057f92cb5f920011
...
doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679
node-03:60030 1541672177
7d6442c7951b178a6adc9c149ff13d6ea87feccd
7ddf19f548f2a75c53a638a4bdc88084f806be4e
doc,7e7b8dbcec790d28f4154e012226f6d6902a5ac9,1274142333168
node-03:60030 1688440578
7e7b8dbcec790d28f4154e012226f6d6902a5ac9
7ee05fd423269986ceb0dd88b1e4f73de42c5c5e
...
It looks like the first couple of regions are fine, but later regions have gaps.
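The check I am doing by eye can be sketched roughly as below. This is only an illustration under the assumption that each region's end key should equal the next region's start key; `find_gaps` is a hypothetical helper, and the key pairs are the ones from the listing above. The two excerpts are checked separately because the regions between them are elided.

```python
def find_gaps(regions):
    """Return (end_key, next_start_key) for each adjacent pair that does not line up.

    regions: list of (start_key, end_key) tuples for consecutive regions.
    A mismatch may be a gap or an overlap; this sketch does not distinguish them.
    """
    ordered = sorted(regions)
    mismatches = []
    for (_, end), (next_start, _) in zip(ordered, ordered[1:]):
        if end != next_start:
            mismatches.append((end, next_start))
    return mismatches


# First two regions from the listing.
head = [
    ("005bccc8dcd6ae360b359f42438fd1a651c02048",
     "00d79413bba4fbd869b0b58c3b23ad2b6fc960b4"),
    ("00d79413bba4fbd869b0b58c3b23ad2b6fc960b4",
     "013485105e0d328d465b2607057f92cb5f920011"),
]

# The two later regions from the listing.
tail = [
    ("7d6442c7951b178a6adc9c149ff13d6ea87feccd",
     "7ddf19f548f2a75c53a638a4bdc88084f806be4e"),
    ("7e7b8dbcec790d28f4154e012226f6d6902a5ac9",
     "7ee05fd423269986ceb0dd88b1e4f73de42c5c5e"),
]

print(find_gaps(head))
print(find_gaps(tail))
```

The first pair lines up, while the second pair leaves the range from 7ddf19f5... to 7e7b8dbc... uncovered, which is where the failing rows sort.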
I tried restarting HBase, running a major compaction, and splitting the regions,
none of which fixed the problem. I was thinking of trying to copy the table
and seeing if that helped, but I can't seem to run the copy_table.rb script
either:
[had...@nz bin]$ /opt/hbase/bin/hbase org.jruby.Main copy_table.rb
file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/object.rb:33:in
`get_proxy_or_package_under_package': cannot load Java class
org.apache.hadoop.hbase.regionserver.HLogEdit (NameError)
from
file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/java.rb:51:in
`method_missing'
from copy_table.rb:40
Any suggestions?
Thanks,
Dave