[ 
https://issues.apache.org/jira/browse/MESOS-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063895#comment-16063895
 ] 

James Peach edited comment on MESOS-7160 at 6/26/17 10:20 PM:
--------------------------------------------------------------

This morning my VM no longer reproduces this, but it definitely happened :)

The normal code path is that the {{exec}} failure causes an abort. The 
supervisor then gets SIGTERM (need to read more code to see why), and the 
signal handler it has installed issues SIGKILL. If the SIGTERM delivery is 
delayed, then the second abort in the supervisor could trigger.

{noformat}
[pid  2738] execve("/bin/perf", ["perf", "--version"], 0x4bb6fc0 /* 21 vars */ <unfinished ...>
[pid  2737] wait4(2738,  <unfinished ...>
[pid  2738] <... execve resumed> )      = -1 ENOENT (No such file or directory)
[pid  2738] execve("/usr/sbin/perf", ["perf", "--version"], 0x4bb6fc0 /* 21 vars */) = -1 ENOENT (No such file or directory)
[pid  2738] execve("/usr/bin/perf", ["perf", "--version"], 0x4bb6fc0 /* 21 vars */) = -1 ENOENT (No such file or directory)
[pid  2738] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=2738, si_uid=0} ---
...
[pid  2737] <... wait4 resumed> 0x7f27e8901f44, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid  2737] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=2708, si_uid=0} ---
[pid  2738] +++ killed by SIGKILL +++
[pid  2737] +++ killed by SIGKILL +++
{noformat}
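
For anyone who wants to poke at the first half of that sequence outside of Mesos, here is a minimal sketch of it in Python. It is not the Mesos code: the function name {{run_and_report}} and the {{no-such-perf}} paths are made up for illustration, and it only covers "every exec fails with ENOENT, the child aborts, the parent sees SIGABRT from the wait". The SIGTERM/SIGKILL handling in the supervisor is deliberately left out, since I still need to read that code.

```python
import os
import signal

# Hypothetical paths that should not exist, standing in for the three
# execve() attempts in the trace.
CANDIDATES = [
    "/bin/no-such-perf",
    "/usr/sbin/no-such-perf",
    "/usr/bin/no-such-perf",
]


def run_and_report(candidates):
    """Fork a child that tries each path and aborts if every exec fails.

    Returns the number of the signal that terminated the child, or None
    if the child exited normally.
    """
    pid = os.fork()
    if pid == 0:
        # Child: try each candidate in turn, like a PATH search would.
        for path in candidates:
            try:
                os.execv(path, ["perf", "--version"])
            except OSError:
                # exec failed (ENOENT for a missing file); try the next one.
                continue
        # Every exec failed: abort(), which raises SIGABRT in the child.
        os.abort()

    # Parent: reap the child and decode how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    sig = run_and_report(CANDIDATES)
    if sig is not None:
        print("terminated with signal", signal.Signals(sig).name)
```

The parent decoding the wait status this way corresponds to the "terminated with signal Aborted" text in the error from the issue description.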



> Parsing of perf version segfaults
> ---------------------------------
>
>                 Key: MESOS-7160
>                 URL: https://issues.apache.org/jira/browse/MESOS-7160
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>            Reporter: Benjamin Bannier
>            Assignee: Andrei Budnik
>
> Parsing the perf version [fails with a segfault in ASF 
> CI|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/3294/],
> {noformat}
> E0222 20:54:03.033464   805 perf.cpp:237] Failed to get perf version: Failed to execute perf: terminated with signal Aborted (core dumped)
> {noformat}



