[ 
https://issues.apache.org/jira/browse/HDFS-11142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11142:
-----------------------------
    Description: 
The test {{TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit}} 
fails in trunk. I looked into this, and it seems that a long GC pause caused 
the datanode to be shut down unexpectedly while it was sending the large block 
report, after which an NPE was thrown in the test. The related log output:
{code}
2016-11-15 11:31:18,889 [DataNode: [[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1, [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]] heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode (BPServiceActor.java:blockReport(415)) - Successfully sent block report 0x2ae5dd91bec02273,  containing 2 storage report(s), of which we sent 2. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 49 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-11-15 11:31:18,890 [DataNode: [[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1, [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]] heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode (BPOfferService.java:processCommandFromActive(696)) - Got finalize command for block pool BP-814229154-172.17.0.3-1479209475497
2016-11-15 11:31:24,026 [org.apache.hadoop.util.JvmPauseMonitor$Monitor@97e93f1] INFO  util.JvmPauseMonitor (JvmPauseMonitor.java:run(205)) - Detected pause in JVM or host machine (eg GC): pause of approximately 4936ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,026 [org.apache.hadoop.util.JvmPauseMonitor$Monitor@5a4bef8] INFO  util.JvmPauseMonitor (JvmPauseMonitor.java:run(205)) - Detected pause in JVM or host machine (eg GC): pause of approximately 4898ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1943)) - Shutting down the Mini HDFS Cluster
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNodes(1983)) - Shutting down DataNode 0
{code}
The stack trace:
{code}
java.lang.NullPointerException: null
        at org.apache.hadoop.hdfs.server.datanode.TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit(TestLargeBlockReport.java:97)
{code}
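
As a rough cross-check of the GC explanation, the JvmPauseMonitor lines in the log above can be parsed to compare the reported pause with the time spent in the listed GC pools. This is only a sketch for reading the log: the helper name {{summarize_pause}} and the regular expressions are illustrative assumptions based on the message text shown here, and are not part of Hadoop.

```python
import re

# Excerpt copied from the first JvmPauseMonitor report in the log above.
LOG = """\
Detected pause in JVM or host machine (eg GC): pause of approximately 4936ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
"""

# Patterns written against the message text above (assumed, not a Hadoop API).
PAUSE_RE = re.compile(r"pause of approximately (\d+)ms")
GC_RE = re.compile(r"GC pool '([^']+)' had collection\(s\): count=(\d+) time=(\d+)ms")


def summarize_pause(text):
    """Return (pause_ms, {pool_name: gc_ms}) for a single pause report."""
    pause = PAUSE_RE.search(text)
    if pause is None:
        raise ValueError("no JvmPauseMonitor pause line found")
    pools = {name: int(ms) for name, _count, ms in GC_RE.findall(text)}
    return int(pause.group(1)), pools


pause_ms, pools = summarize_pause(LOG)
gc_total_ms = sum(pools.values())
print(f"pause={pause_ms}ms gc_total={gc_total_ms}ms pools={pools}")
```

For this report the two GC pools add up to 4959ms against a detected pause of approximately 4936ms, so GC time accounts for essentially the whole pause, which is consistent with the explanation above.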


  was:
The test {{TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit}} 
fails in trunk. I looked into this, it seemed the long-time gc caused the 
datanode to be shutdown unexpectedly when did the large block reporting. And 
then the NPE thew in the test. The related output log:
{code}
2016-11-15 11:31:18,889 [DataNode: 
[[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1,
 
[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]]
  heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode 
(BPServiceActor.java:blockReport(415)) - Successfully sent block report 
0x2ae5dd91bec02273,  containing 2 storage report(s), of which we sent 2. The 
reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate and 
49 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-11-15 11:31:18,890 [DataNode: 
[[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1,
 
[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]]
  heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode 
(BPOfferService.java:processCommandFromActive(696)) - Got finalize command for 
block pool BP-814229154-172.17.0.3-1479209475497
2016-11-15 11:31:24,026 
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@97e93f1] INFO  
util.JvmPauseMonitor (JvmPauseMonitor.java:run(205)) - Detected pause in JVM or 
host machine (eg GC): pause of approximately 4936ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,026 
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@5a4bef8] INFO  
util.JvmPauseMonitor (JvmPauseMonitor.java:run(205)) - Detected pause in JVM or 
host machine (eg GC): pause of approximately 4898ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster 
(MiniDFSCluster.java:shutdown(1943)) - Shutting down the Mini HDFS Cluster
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster 
(MiniDFSCluster.java:shutdownDataNodes(1983)) - Shutting down DataNode 0
{code}
The stack infos:
{code}
java.lang.NullPointerException: null
        at 
org.apache.hadoop.hdfs.server.datanode.TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit(TestLargeBlockReport.java:97)
{code}



> TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit fails in 
> trunk
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11142
>                 URL: https://issues.apache.org/jira/browse/HDFS-11142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: test-fails-log.txt
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)