Lin Yiqun created HDFS-10181:
--------------------------------

             Summary: TestHFlush frequently fails due to thread interrupt
                 Key: HDFS-10181
                 URL: https://issues.apache.org/jira/browse/HDFS-10181
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: test
            Reporter: Lin Yiqun
            Assignee: Lin Yiqun


The test {{TestHFlush}} has been failing frequently in recent patch builds. 
Looking through the failure logs, I found two distinct errors that cause 
{{TestHFlush#testHFlushInterrupted}} to fail, and this method is the main 
reason {{TestHFlush}} does not pass. The two failures are shown below:
{code}
Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 30.325 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestHFlush
testHFlushInterrupted(org.apache.hadoop.hdfs.TestHFlush)  Time elapsed: 0.864 sec  <<< ERROR!
java.io.IOException: The stream is closed
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:118)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
        at org.apache.hadoop.hdfs.DataStreamer.closeStream(DataStreamer.java:877)
        at org.apache.hadoop.hdfs.DataStreamer.closeInternal(DataStreamer.java:726)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:721)
{code}

{code}
testHFlushInterrupted(org.apache.hadoop.hdfs.TestHFlush)  Time elapsed: 0.862 sec  <<< ERROR!
java.nio.channels.ClosedByInterruptException: null
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:501)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:653)
{code}

I analysed both failures. In brief:

* The {{IOException}} happens when the stream has already been closed but a 
write to the stream is still attempted.
* The {{ClosedByInterruptException}} happens when the thread is interrupted 
while the stream is performing an hflush operation.
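
Both failure modes can be reproduced outside HDFS with plain JDK channels. The following is only a minimal sketch under that assumption: it uses {{java.nio.channels.Pipe}} as a stand-in for the DataNode socket, and the class and method names are hypothetical, not taken from {{TestHFlush}} or {{DataStreamer}}.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.Pipe;

// Hypothetical demo class; a Pipe stands in for the socket to the DataNode.
class InterruptedWriteDemo {

    /** Case 1: the channel is already closed, but a write is still attempted. */
    static boolean writeAfterCloseFails() {
        try {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().close();
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
                return false; // not expected: the write should be rejected
            } catch (ClosedChannelException expected) {
                return true; // ClosedChannelException is an IOException
            } finally {
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Case 2: the thread is interrupted when it enters a channel write. */
    static boolean writeWhileInterruptedFails() {
        try {
            Pipe pipe = Pipe.open();
            // Simulate the interrupt arriving just before the write.
            Thread.currentThread().interrupt();
            try {
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
                return false; // not expected: the write should be aborted
            } catch (ClosedByInterruptException expected) {
                // The channel was closed as a side effect and the
                // interrupt status stays set on the thread.
                return Thread.currentThread().isInterrupted()
                        && !pipe.sink().isOpen();
            } finally {
                Thread.interrupted(); // clear the flag before cleanup
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that in the second case the interrupt also closes the channel, so any later write on the same channel fails in the way shown in the first case.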

So we should catch these exceptions in stream {{hflush}} and {{write}} 
operations.
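
One possible shape for that handling is sketched below. This is an illustration only, not the actual patch: {{IoAction}}, {{Outcome}} and {{runAfterInterrupt}} are hypothetical names, and the action passed in stands for a call such as {{hflush}} or {{write}} on the test's output stream.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.nio.channels.ClosedByInterruptException;

// Hypothetical helper; none of these names come from the HDFS code base.
class InterruptTolerantFlush {

    /** Stand-in for a stream call such as hflush() or write(...). */
    interface IoAction {
        void run() throws IOException;
    }

    enum Outcome { COMPLETED, INTERRUPTED, CLOSED_AFTER_INTERRUPT }

    /** Interrupts the current thread, runs the action, classifies the result. */
    static Outcome runAfterInterrupt(IoAction action) {
        Thread.currentThread().interrupt();
        try {
            action.run();
            return Outcome.COMPLETED;
        } catch (ClosedByInterruptException | InterruptedIOException e) {
            // The interrupt landed during the I/O call itself.
            return Outcome.INTERRUPTED;
        } catch (IOException e) {
            // For example "The stream is closed": since the interrupt was
            // sent by this method, treat it as a consequence of the interrupt.
            return Outcome.CLOSED_AFTER_INTERRUPT;
        } finally {
            // Clear the flag so later steps in the test are not affected.
            Thread.interrupted();
        }
    }
}
```

A test could then accept any of the three outcomes from the interrupted call instead of failing on the two exceptions above. Treating every {{IOException}} as expected is a deliberately broad assumption here; a real fix may want to match more narrowly.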



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
