[ 
https://issues.apache.org/jira/browse/HADOOP-15711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751846#comment-16751846
 ] 

Jonathan Hung edited comment on HADOOP-15711 at 1/25/19 3:24 AM:
-----------------------------------------------------------------

The qbt runs show fatal errors in the logs, such as:
{noformat}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (safepoint.cpp:325), pid=30102, tid=140265819887360
#  guarantee(PageArmed == 0) failed: invariant
#
# JRE version: OpenJDK Runtime Environment (7.0_181-b01) (build 1.7.0_181-b01)
# Java VM: OpenJDK 64-Bit Server VM (24.181-b01 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.14
# Distribution: Ubuntu 14.04 LTS, package 7u181-2.6.14-0ubuntu0.3
# Core dump written. Default location: /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/core or core.30102
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#

---------------  T H R E A D  ---------------

Current thread (0x00007f923c31d800):  VMThread [stack: 0x00007f922e4e5000,0x00007f922e5e6000] [id=30122]


Stack: [0x00007f922e4e5000,0x00007f922e5e6000],  sp=0x00007f922e5e4b10,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x966c25]
V  [libjvm.so+0x49b96e]
V  [libjvm.so+0x872b51]
V  [libjvm.so+0x96b69a]
V  [libjvm.so+0x96baf2]
V  [libjvm.so+0x7da992]

VM_Operation (0x00007f9210b2b920): RevokeBias, mode: safepoint, requested by thread 0x00007f923dd0f800

{noformat}
I suspected this might be related to 
[https://bugs.openjdk.java.net/browse/JDK-6869327], so I tried adding 
{{-XX:+UseCountedLoopSafepoints}} to one of the runs, but it didn't seem to 
have any effect.
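
When a JVM flag appears to make no difference, one thing worth ruling out is that the flag never reached the forked test JVM. The following is a minimal sketch, not part of the Hadoop build: {{JvmFlagCheck}} is a hypothetical helper name, and it only reports whether the flag was passed on the command line, not whether this particular JVM build honors it.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reports whether a given option
// was passed to the JVM that is running this code.
public class JvmFlagCheck {

    // Pure check against an explicit argument list, so it can be unit tested.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    // Check against the input arguments of the currently running JVM.
    static boolean runningJvmHasFlag(String flag) {
        return hasFlag(ManagementFactory.getRuntimeMXBean().getInputArguments(), flag);
    }

    public static void main(String[] args) {
        String flag = args.length > 0 ? args[0] : "-XX:+UseCountedLoopSafepoints";
        System.out.println(flag + " present: " + runningJvmHasFlag(flag));
    }
}
```

Calling {{runningJvmHasFlag}} from inside a test would show what the test JVM itself received, which can differ from what was configured for the build that launched it.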

I then tried porting HADOOP-14816 (and HADOOP-15610) to a test branch forked 
off branch-2 and got results similar to those reported in HDFS-12711. Here's a 
test run (with openjdk8): 
[https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/39/]
So at least the unit tests appear to run to completion with openjdk8.



> Fix branch-2 builds
> -------------------
>
>                 Key: HADOOP-15711
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15711
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Priority: Critical
>         Attachments: HADOOP-15711.001.branch-2.patch
>
>
> Branch-2 builds have been disabled for a while: 
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> A test run here causes hdfs tests to hang: 
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86-jhung/4/
> Running hadoop-hdfs tests locally reveals errors such as:
> {noformat}
> [ERROR] testComplexAppend2(org.apache.hadoop.hdfs.TestFileAppend2)  Time elapsed: 0.059 s  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1164)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1128)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:174)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1172)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:403)
>         at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:234)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1080)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:883)
>         at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:514)
>         at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:473)
>         at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:489)
>         at org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend2(TestFileAppend2.java:543)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
> I was able to get more tests passing locally by increasing the max user 
> process count on my machine, but the error suggests that there's an issue in 
> the tests themselves. I'm not sure whether the error seen locally is also the 
> reason the jenkins builds are failing; I wasn't able to confirm this because 
> the jenkins builds produce so little output.
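
Since "unable to create new native thread" points at thread exhaustion, one way to look for a leak in the tests themselves is to compare the JVM's live thread count before and after a piece of test code. The following is a minimal sketch under that assumption: {{ThreadLeakProbe}} and {{threadDelta}} are hypothetical names, not Hadoop APIs, and the count covers only threads in the current JVM, not the per-user process limit enforced by the OS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper (not part of Hadoop): measures how many live threads
// a block of code leaves behind in the current JVM.
public class ThreadLeakProbe {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    // Number of live threads (daemon and non-daemon) in this JVM.
    static int liveThreads() {
        return THREADS.getThreadCount();
    }

    // Runs body and returns the change in live thread count it caused.
    static int threadDelta(Runnable body) {
        int before = liveThreads();
        body.run();
        return liveThreads() - before;
    }

    public static void main(String[] args) {
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Simulate a test that starts a worker thread and never stops it.
        int delta = threadDelta(new Runnable() {
            @Override
            public void run() {
                Thread worker = new Thread(new Runnable() {
                    @Override
                    public void run() {
                        started.countDown();
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }, "leaked-worker");
                worker.setDaemon(true);
                worker.start();
                try {
                    started.await(); // make sure the worker is live before measuring
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        System.out.println("live=" + liveThreads()
            + " peak=" + THREADS.getPeakThreadCount()
            + " delta=" + delta);
        release.countDown(); // let the simulated leak finish
    }
}
```

A delta that keeps growing across test methods would suggest threads are not being shut down between tests. Background JVM threads can shift the count slightly, so small differences are not conclusive.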



