Aaron Wood created MESOS-6909:
---------------------------------

             Summary: ABORT execvpe() crash when binaries from launcher_dir 
cannot be found
                 Key: MESOS-6909
                 URL: https://issues.apache.org/jira/browse/MESOS-6909
             Project: Mesos
          Issue Type: Bug
          Components: agent
    Affects Versions: 1.1.0
            Reporter: Aaron Wood


When running the Mesos agent either without --launcher_dir or with a 
--launcher_dir that does not point to the right place, you'll get a crash 
when tasks are launched:

E0111 10:50:56.665149 20924 slave.cpp:4423] Container 
'6cdd0c9b-cb29-42b0-b6cf-51f410df0f31' for executor 
'99D50FCB-ADB0-6B2A-3FC3-8A47FF178C10' of framework 
d3bc8031-29b6-4c2f-9fe3-a73c1b8b6360-0007 failed to start: Collect failed: 
Failed to setup hostname and network files: ABORT: 
(../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:214): Failed 
to os::execvpe on path '/usr/local/libexec/mesos/mesos-containerizer': No such 
file or directory
*** Aborted at 1484149856 (unix time) try "date -d @1484149856" if you are 
using GNU date ***
PC: @     0x7fc3bd418428 (unknown)
*** SIGABRT (@0x51d8) received by PID 20952 (TID 0x7fc3b6007700) from PID 
20952; stack trace: ***
    @     0x7fc3bd7bd390 (unknown)
    @     0x7fc3bd418428 (unknown)
    @     0x7fc3bd41a02a (unknown)
    @           0x47fafc _Abort()
    @           0x47fb2a _Abort()
    @     0x7fc3c385f092 process::internal::childMain()
    @     0x7fc3c3864227 
_ZNSt5_BindIFPFiRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPPcS9_RKN7process10Subprocess2IO20InputFileDescriptorsERKNSC_21OutputFileDescriptorsESI_bPiRKSt6vectorINSB_9ChildHookESaISL_EEES5_S9_S9_SD_SG_SG_bSJ_SN_EE6__callIiJEJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @     0x7fc3c38635d3 std::_Bind<>::operator()<>()
    @     0x7fc3c3862682 std::_Function_handler<>::_M_invoke()
    @           0x48a4b8 std::function<>::operator()()
    @     0x7fc3c247de67 process::defaultClone()
    @     0x7fc3c3861c40 std::_Function_handler<>::_M_invoke()
    @     0x7fc3c3861411 std::function<>::operator()()
    @     0x7fc3c385f8f5 process::internal::cloneChild()
    @     0x7fc3c385d50e process::subprocess()
    @     0x7fc3c30d318f 
mesos::internal::slave::NetworkCniIsolatorProcess::__isolate()
    @     0x7fc3c30cf909 
mesos::internal::slave::NetworkCniIsolatorProcess::isolate()
    @     0x7fc3c2d4db56 
_ZZN7process8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS2_11ContainerIDEiS6_iEENS_6FutureIT_EERKNS_3PIDIT0_EEMSD_FSB_T1_T2_ET3_T4_ENKUlPNS_11ProcessBaseEE_clESO_
    @     0x7fc3c2d50eb8 
_ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataOS2_
    @     0x7fc3c380a1dd std::function<>::operator()()
    @     0x7fc3c37eb094 process::ProcessBase::visit()
    @     0x7fc3c37f3b26 process::DispatchEvent::visit()
    @     0x7fc3c2244a08 process::ProcessBase::serve()
    @     0x7fc3c37e6f50 process::ProcessManager::resume()
    @     0x7fc3c37e3a78 _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
    @     0x7fc3c37f3148 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
    @     0x7fc3c37f309e 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
    @     0x7fc3c37f302e 
_ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @     0x7fc3bdc97c80 (unknown)
    @     0x7fc3bd7b36ba start_thread
    @     0x7fc3bd4e982d (unknown)

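Note that this does not crash the agent itself: the abort occurs in the 
child process (process::internal::childMain() in the trace above), so the 
agent stays running.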
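
To illustrate why a failed exec can produce a SIGABRT stack trace without 
taking down the parent, here is a minimal Python sketch of the fork/exec 
pattern. It is not Mesos code: launch_or_abort is a hypothetical name, and 
the only assumption carried over from the report is that the child aborts 
when execvpe fails.

```python
import os
import signal


def launch_or_abort(path, argv, env):
    """Fork, exec `path` in the child, and abort the child if exec fails.

    Returns the child's raw wait status to the parent. This only mimics
    the pattern visible in the trace; it is not how libprocess is written.
    """
    pid = os.fork()
    if pid == 0:
        # Child: on success execvpe never returns, because the process
        # image is replaced. If it raises (e.g. the binary is missing),
        # abort so the child dies with SIGABRT.
        try:
            os.execvpe(path, argv, env)
        finally:
            os.abort()
    # Parent: unaffected by the child's abort; just collect its status.
    _, status = os.waitpid(pid, 0)
    return status


def describe(status):
    """Render a wait status as a short human-readable string."""
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown status %d" % status
```

Calling launch_or_abort() with a path that does not exist, such as a 
mesos-containerizer under a wrong --launcher_dir, leaves the parent running 
and returns a status for which os.WTERMSIG(status) == signal.SIGABRT.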



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
