The region servers are killed during the write operation in the reduce phase.
(64,000 rows with 10,000 columns, 3 nodes)
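
Each reducer writes one row at a time with BatchUpdate, roughly like this (a
simplified, self-contained sketch against the 0.19-era client API; the
"column:" family name and random values are made up for illustration):

----
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class DenseMatrixWriteSketch {
  public static void main(String[] args) throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), "DenseMatrix_randgnegu");
    // One BatchUpdate per matrix row, carrying all 10,000 column values.
    // ("column:" is a hypothetical family name for this sketch.)
    BatchUpdate update = new BatchUpdate("000000000000287");
    for (int j = 0; j < 10000; j++) {
      update.put("column:" + j, Double.toString(Math.random()).getBytes());
    }
    table.commit(update);  // handled by HRegionServer.batchUpdates() on the server
  }
}
----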

----
09/01/14 13:07:59 INFO mapred.JobClient:  map 100% reduce 36%
09/01/14 13:11:38 INFO mapred.JobClient:  map 100% reduce 33%
09/01/14 13:11:38 INFO mapred.JobClient: Task Id : attempt_200901140952_0010_r_000017_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server 61.247.201.163:60020 for region
DenseMatrix_randgnegu,,1231905480938, row '000000000000287', but
failed after 10 attempts.
Exceptions:
java.io.IOException: java.io.IOException: Server not running, aborting
        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2103)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1611)
----

Also, I can't stop HBase.

[d8g053:/root]# hbase-trunk/bin/stop-hbase.sh
stopping master...................................................................
(the dots just keep printing; the master never stops)

Can it be recovered?
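
(If stop-hbase.sh never returns, I assume the only way out is to find and
kill the master process by hand, something like:

[d8g053:/root]# jps | grep HMaster
[d8g053:/root]# kill <HMaster pid>

but I'd like to know whether the data survives that.)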

----
Region server log:

2009-01-14 13:03:56,591 WARN org.apache.hadoop.hdfs.DFSClient:
DataStreamer Exception: java.io.IOException: Unable to create new
block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
2009-01-14 13:03:56,591 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block blk_-4005955194083205373_14543 bad datanode[0]
nodes == null
2009-01-14 13:03:56,591 WARN org.apache.hadoop.hdfs.DFSClient: Could
not get block locations. Aborting...
2009-01-14 13:03:56,629 ERROR
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Compaction/Split failed for region
DenseMatrix_randllnma,000000000000,18,7-29116,1231898419257
java.io.IOException: Could not read from stream
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:119)
        at java.io.DataInputStream.readByte(DataInputStream.java:248)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2779)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2704)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
2009-01-14 13:03:56,631 INFO
org.apache.hadoop.hbase.regionserver.HRegion: starting  compaction on
region DenseMatrix_randllnma,00000000000,16,19-26373,1231898311583
2009-01-14 13:03:56,692 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2009-01-14 13:03:56,692 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2009-01-14 13:03:56,693 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2009-01-14 13:03:56,693 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2009-01-14 13:03:57,521 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new compressor
2009-01-14 13:03:57,810 INFO org.apache.hadoop.hdfs.DFSClient:
Exception in createBlockOutputStream java.io.IOException: Could not
read from stream
2009-01-14 13:03:57,810 INFO org.apache.hadoop.hdfs.DFSClient:
Abandoning block blk_-2612702056484946948_14554
2009-01-14 13:03:59,343 WARN org.apache.hadoop.hdfs.DFSClient:
DataStreamer Exception: java.io.IOException: Unable to create new
block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)

2009-01-14 13:03:59,344 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block blk_-5255885897790790367_14543 bad datanode[0]
nodes == null
2009-01-14 13:03:59,344 WARN org.apache.hadoop.hdfs.DFSClient: Could
not get block locations. Aborting...
2009-01-14 13:03:59,344 FATAL
org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog
required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region:
DenseMatrix_randgnegu,,1231905480938
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:896)
        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:789)
        at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.flushRegion(MemcacheFlusher.java:227)
        at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.run(MemcacheFlusher.java:137)
Caused by: java.io.IOException: Could not read from stream
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:119)
        at java.io.DataInputStream.readByte(DataInputStream.java:248)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2779)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2704)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
2009-01-14 13:03:59,359 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
request=15, regions=48, stores=192, storefiles=756,
storefileIndexSize=6, memcacheSize=338, usedHeap=395, maxHeap=971
2009-01-14 13:03:59,359 INFO
org.apache.hadoop.hbase.regionserver.MemcacheFlusher:
regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher exiting
2009-01-14 13:03:59,368 INFO
org.apache.hadoop.hbase.regionserver.HLog: Closed
hdfs://dev3.nm2.naver.com:9000/hbase/log_61.247.201.165_1231894400437_60020/hlog.dat.1231905813472,
entries=896500. New log writer:
/hbase/log_61.247.201.165_1231894400437_60020/hlog.dat.1231905839367

2009-01-14 13:03:59,368 INFO
org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
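----

All of the failures above look like HDFS write errors ("Unable to create new
block", "Could not get block locations"), so I guess the datanodes should be
checked first, e.g.:

[d8g053:/root]# hadoop dfsadmin -report
[d8g053:/root]# hadoop fsck /hbase

Does that sound right, or is something else going on?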



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardy...@apache.org
http://blog.udanax.org
