Hi, mailing list:
           I use Scribe to receive data from an app and write it to Hadoop HDFS. When the system is under high connection concurrency, HDFS errors like the following occur, incoming connections get blocked, and Tomcat dies.

In the directory user/hive/warehouse/dsp.db/request, the file data_00000 is
rotated each hour, but Scribe (we modified the Scribe code) switches back to
the same file when rotation happens, so data_00000 is closed and then
reopened for append. When the load is high, I can observe corrupt replicas of
data_00000. How can I handle this? Thanks.
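Since the reopen-for-append happens right after the close, a first append
attempt can hit a datanode that is still tearing down the old write pipeline
(the "Bad connect ack" below). One common mitigation is to retry the reopen
with backoff instead of failing the writer immediately. This is only a sketch
of that retry pattern, not the real Scribe/libhdfs API: `open_fn` stands in
for whatever function your Scribe build uses to open an HDFS file for append.

```python
import time

def reopen_for_append(open_fn, path, retries=3, base_delay=1.0):
    """Call open_fn(path), retrying with exponential backoff.

    On a busy cluster, the first append after a rotation can fail
    transiently while the previous pipeline is still being closed,
    so a few retries often succeed where a single attempt would not.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return open_fn(path)
        except IOError as err:
            last_err = err
            # back off 1s, 2s, 4s, ... before the next attempt
            time.sleep(base_delay * (2 ** attempt))
    # all attempts failed; surface the last error to the caller
    raise last_err
```

Whether this helps depends on why the replicas go corrupt in the first place;
it only papers over transient pipeline-recovery failures, not a datanode that
is genuinely overloaded.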

[Thu Feb 13 23:59:59 2014] "[hdfs] disconnected fileSys for
/user/hive/warehouse/dsp.db/request"
[Thu Feb 13 23:59:59 2014] "[hdfs] closing
/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000"
[Thu Feb 13 23:59:59 2014] "[hdfs] disconnecting fileSys for
/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000"
[Thu Feb 13 23:59:59 2014] "[hdfs] disconnected fileSys for
/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000"
[Thu Feb 13 23:59:59 2014] "[hdfs] Connecting to HDFS for
/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000"
[Thu Feb 13 23:59:59 2014] "[hdfs] opened for append
/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000"
[Thu Feb 13 23:59:59 2014] "[dsp_request] Opened file
</user/hive/warehouse/dsp.db/request/2014-02-13/data_00000> for writing"
[Thu Feb 13 23:59:59 2014] "[dsp_request] 23:59 rotating file
<2014-02-13/data> old size <10027577955> max size <10000000000>"
[Thu Feb 13 23:59:59 2014] "[hdfs] Connecting to HDFS for
/user/hive/warehouse/dsp.db/request"
[Thu Feb 13 23:59:59 2014] "[hdfs] disconnecting fileSys for
/user/hive/warehouse/dsp.db/request"
14/02/13 23:59:59 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as
192.168.11.13:50010
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1117)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:992)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:494)
14/02/13 23:59:59 WARN hdfs.DFSClient: Error Recovery for block
BP-1043055049-192.168.11.11-1382442676609:blk_433572108425800355_3411489 in
pipeline 192.168.11.12:50010, 192.168.11.13:50010, 192.168.11.14:50010,
192.168.11.10:50010, 192.168.11.15:50010: bad datanode 192.168.11.13:50010
14/02/13 23:59:59 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as
192.168.11.10:50010
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1117)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:992)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:494)
14/02/13 23:59:59 WARN hdfs.DFSClient: Error Recovery for block
BP-1043055049-192.168.11.11-1382442676609:blk_433572108425800355_3411489 in
pipeline 192.168.11.12:50010, 192.168.11.14:50010, 192.168.11.10:50010,
192.168.11.15:50010: bad datanode 192.168.11.10:50010
14/02/13 23:59:59 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as
192.168.11.15:50010
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1117)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:992)
        at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:494)
14/02/13 23:59:59 WARN hdfs.DFSClient: Error Recovery for block
BP-1043055049-192.168.11.11-1382442676609:blk_433572108425800355_3411489 in
pipeline 192.168.11.12:50010, 192.168.11.14:50010, 192.168.11.15:50010: bad
datanode 192.168.11.15:50010


/user/hive/warehouse/dsp.db/request/2014-02-13/data_00000:
blk_433572108425800355_3411509 (replicas: l: 1 d: 0 c: 4 e: 0)
192.168.11.12:50010 :  192.168.11.13:50010(corrupt) :
192.168.11.14:50010(corrupt)
:  192.168.11.10:50010(corrupt) :  192.168.11.15:50010(corrupt) :
