[ https://issues.apache.org/jira/browse/MESOS-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Wood updated MESOS-6909:
------------------------------
    Description:
When running the Mesos agent either without --launcher_dir or with a --launcher_dir that does not point to the right place, you'll get a crash when tasks are launched:

E0111 10:50:56.665149 20924 slave.cpp:4423] Container '6cdd0c9b-cb29-42b0-b6cf-51f410df0f31' for executor '99D50FCB-ADB0-6B2A-3FC3-8A47FF178C10' of framework d3bc8031-29b6-4c2f-9fe3-a73c1b8b6360-0007 failed to start: Collect failed: Failed to setup hostname and network files: ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:214): Failed to os::execvpe on path '/usr/local/libexec/mesos/mesos-containerizer': No such file or directory
Aborted at 1484149856 (unix time) try "date -d @1484149856" if you are using GNU date ***
PC: @ 0x7fc3bd418428 (unknown)
SIGABRT (@0x51d8) received by PID 20952 (TID 0x7fc3b6007700) from PID 20952; stack trace: ***
    @ 0x7fc3bd7bd390 (unknown)
    @ 0x7fc3bd418428 (unknown)
    @ 0x7fc3bd41a02a (unknown)
    @ 0x47fafc _Abort()
    @ 0x47fb2a _Abort()
    @ 0x7fc3c385f092 process::internal::childMain()
    @ 0x7fc3c3864227 _ZNSt5_BindIFPFiRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPPcS9_RKN7process10Subprocess2IO20InputFileDescriptorsERKNSC_21OutputFileDescriptorsESI_bPiRKSt6vectorINSB_9ChildHookESaISL_EEES5_S9_S9_SD_SG_SG_bSJ_SN_EE6__callIiJEJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @ 0x7fc3c38635d3 std::_Bind<>::operator()<>()
    @ 0x7fc3c3862682 std::_Function_handler<>::_M_invoke()
    @ 0x48a4b8 std::function<>::operator()()
    @ 0x7fc3c247de67 process::defaultClone()
    @ 0x7fc3c3861c40 std::_Function_handler<>::_M_invoke()
    @ 0x7fc3c3861411 std::function<>::operator()()
    @ 0x7fc3c385f8f5 process::internal::cloneChild()
    @ 0x7fc3c385d50e process::subprocess()
    @ 0x7fc3c30d318f mesos::internal::slave::NetworkCniIsolatorProcess::__isolate()
    @ 0x7fc3c30cf909 mesos::internal::slave::NetworkCniIsolatorProcess::isolate()
    @ 0x7fc3c2d4db56 _ZZN7process8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS2_11ContainerIDEiS6_iEENS_6FutureIT_EERKNS_3PIDIT0_EEMSD_FSB_T1_T2_ET3_T4_ENKUlPNS_11ProcessBaseEE_clESO_
    @ 0x7fc3c2d50eb8 _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataOS2_
    @ 0x7fc3c380a1dd std::function<>::operator()()
    @ 0x7fc3c37eb094 process::ProcessBase::visit()
    @ 0x7fc3c37f3b26 process::DispatchEvent::visit()
    @ 0x7fc3c2244a08 process::ProcessBase::serve()
    @ 0x7fc3c37e6f50 process::ProcessManager::resume()
    @ 0x7fc3c37e3a78 _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
    @ 0x7fc3c37f3148 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
    @ 0x7fc3c37f309e _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
    @ 0x7fc3c37f302e _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @ 0x7fc3bdc97c80 (unknown)
    @ 0x7fc3bd7b36ba start_thread
    @ 0x7fc3bd4e982d (unknown)

Note that this does not crash hard, so the agent stays running.
  was:
When running the Mesos agent either without --launcher_dir or with a --launcher_dir that does not point to the right place, you'll get a crash when tasks are launched:

E0111 10:50:56.665149 20924 slave.cpp:4423] Container '6cdd0c9b-cb29-42b0-b6cf-51f410df0f31' for executor '99D50FCB-ADB0-6B2A-3FC3-8A47FF178C10' of framework d3bc8031-29b6-4c2f-9fe3-a73c1b8b6360-0007 failed to start: Collect failed: Failed to setup hostname and network files: ABORT: (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:214): Failed to os::execvpe on path '/usr/local/libexec/mesos/mesos-containerizer': No such file or directory
*** Aborted at 1484149856 (unix time) try "date -d @1484149856" if you are using GNU date ***
PC: @ 0x7fc3bd418428 (unknown)
*** SIGABRT (@0x51d8) received by PID 20952 (TID 0x7fc3b6007700) from PID 20952; stack trace: ***
    @ 0x7fc3bd7bd390 (unknown)
    @ 0x7fc3bd418428 (unknown)
    @ 0x7fc3bd41a02a (unknown)
    @ 0x47fafc _Abort()
    @ 0x47fb2a _Abort()
    @ 0x7fc3c385f092 process::internal::childMain()
    @ 0x7fc3c3864227 _ZNSt5_BindIFPFiRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPPcS9_RKN7process10Subprocess2IO20InputFileDescriptorsERKNSC_21OutputFileDescriptorsESI_bPiRKSt6vectorINSB_9ChildHookESaISL_EEES5_S9_S9_SD_SG_SG_bSJ_SN_EE6__callIiJEJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @ 0x7fc3c38635d3 std::_Bind<>::operator()<>()
    @ 0x7fc3c3862682 std::_Function_handler<>::_M_invoke()
    @ 0x48a4b8 std::function<>::operator()()
    @ 0x7fc3c247de67 process::defaultClone()
    @ 0x7fc3c3861c40 std::_Function_handler<>::_M_invoke()
    @ 0x7fc3c3861411 std::function<>::operator()()
    @ 0x7fc3c385f8f5 process::internal::cloneChild()
    @ 0x7fc3c385d50e process::subprocess()
    @ 0x7fc3c30d318f mesos::internal::slave::NetworkCniIsolatorProcess::__isolate()
    @ 0x7fc3c30cf909 mesos::internal::slave::NetworkCniIsolatorProcess::isolate()
    @ 0x7fc3c2d4db56 _ZZN7process8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS2_11ContainerIDEiS6_iEENS_6FutureIT_EERKNS_3PIDIT0_EEMSD_FSB_T1_T2_ET3_T4_ENKUlPNS_11ProcessBaseEE_clESO_
    @ 0x7fc3c2d50eb8 _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataOS2_
    @ 0x7fc3c380a1dd std::function<>::operator()()
    @ 0x7fc3c37eb094 process::ProcessBase::visit()
    @ 0x7fc3c37f3b26 process::DispatchEvent::visit()
    @ 0x7fc3c2244a08 process::ProcessBase::serve()
    @ 0x7fc3c37e6f50 process::ProcessManager::resume()
    @ 0x7fc3c37e3a78 _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
    @ 0x7fc3c37f3148 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
    @ 0x7fc3c37f309e _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
    @ 0x7fc3c37f302e _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @ 0x7fc3bdc97c80 (unknown)
    @ 0x7fc3bd7b36ba start_thread
    @ 0x7fc3bd4e982d (unknown)

Note that this does not crash hard, so the agent stays running.
> ABORT execvpe() crash when binaries from launcher_dir cannot be found
> ---------------------------------------------------------------------
>
>                 Key: MESOS-6909
>                 URL: https://issues.apache.org/jira/browse/MESOS-6909
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.1.0
>            Reporter: Aaron Wood
>
> When running the Mesos agent either without --launcher_dir or with a
> --launcher_dir that does not point to the right place, you'll get a crash
> when tasks are launched:
>
> E0111 10:50:56.665149 20924 slave.cpp:4423] Container
> '6cdd0c9b-cb29-42b0-b6cf-51f410df0f31' for executor
> '99D50FCB-ADB0-6B2A-3FC3-8A47FF178C10' of framework
> d3bc8031-29b6-4c2f-9fe3-a73c1b8b6360-0007 failed to start: Collect failed:
> Failed to setup hostname and network files: ABORT:
> (../../../3rdparty/libprocess/include/process/posix/subprocess.hpp:214):
> Failed to os::execvpe on path '/usr/local/libexec/mesos/mesos-containerizer':
> No such file or directory
> Aborted at 1484149856 (unix time) try "date -d @1484149856" if you are using
> GNU date ***
> PC: @ 0x7fc3bd418428 (unknown)
> SIGABRT (@0x51d8) received by PID 20952 (TID 0x7fc3b6007700) from PID 20952;
> stack trace: ***
>     @ 0x7fc3bd7bd390 (unknown)
>     @ 0x7fc3bd418428 (unknown)
>     @ 0x7fc3bd41a02a (unknown)
>     @ 0x47fafc _Abort()
>     @ 0x47fb2a _Abort()
>     @ 0x7fc3c385f092 process::internal::childMain()
>     @ 0x7fc3c3864227 _ZNSt5_BindIFPFiRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPPcS9_RKN7process10Subprocess2IO20InputFileDescriptorsERKNSC_21OutputFileDescriptorsESI_bPiRKSt6vectorINSB_9ChildHookESaISL_EEES5_S9_S9_SD_SG_SG_bSJ_SN_EE6__callIiJEJLm0ELm1ELm2ELm3ELm4ELm5ELm6ELm7ELm8EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
>     @ 0x7fc3c38635d3 std::_Bind<>::operator()<>()
>     @ 0x7fc3c3862682 std::_Function_handler<>::_M_invoke()
>     @ 0x48a4b8 std::function<>::operator()()
>     @ 0x7fc3c247de67 process::defaultClone()
>     @ 0x7fc3c3861c40 std::_Function_handler<>::_M_invoke()
>     @ 0x7fc3c3861411 std::function<>::operator()()
>     @ 0x7fc3c385f8f5 process::internal::cloneChild()
>     @ 0x7fc3c385d50e process::subprocess()
>     @ 0x7fc3c30d318f mesos::internal::slave::NetworkCniIsolatorProcess::__isolate()
>     @ 0x7fc3c30cf909 mesos::internal::slave::NetworkCniIsolatorProcess::isolate()
>     @ 0x7fc3c2d4db56 _ZZN7process8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS2_11ContainerIDEiS6_iEENS_6FutureIT_EERKNS_3PIDIT0_EEMSD_FSB_T1_T2_ET3_T4_ENKUlPNS_11ProcessBaseEE_clESO_
>     @ 0x7fc3c2d50eb8 _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataOS2_
>     @ 0x7fc3c380a1dd std::function<>::operator()()
>     @ 0x7fc3c37eb094 process::ProcessBase::visit()
>     @ 0x7fc3c37f3b26 process::DispatchEvent::visit()
>     @ 0x7fc3c2244a08 process::ProcessBase::serve()
>     @ 0x7fc3c37e6f50 process::ProcessManager::resume()
>     @ 0x7fc3c37e3a78 _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
>     @ 0x7fc3c37f3148 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
>     @ 0x7fc3c37f309e _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
>     @ 0x7fc3c37f302e _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>     @ 0x7fc3bdc97c80 (unknown)
>     @ 0x7fc3bd7b36ba start_thread
>     @ 0x7fc3bd4e982d (unknown)
>
> Note that this does not crash hard, so the agent stays running.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
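
A pre-flight check along the following lines might surface the problem before any task is launched. This is only a sketch in Python and is not part of Mesos: the function name `missing_launcher_binaries` is hypothetical, the default directory and the `mesos-containerizer` binary name are taken from the error message in the report, and any other helper binaries an agent may need are not covered.

```python
import os

# Assumptions taken from the error message in the report: the directory the
# agent tried to use and the one binary it failed to exec. Other binaries
# that may live in launcher_dir are deliberately not listed here.
DEFAULT_LAUNCHER_DIR = "/usr/local/libexec/mesos"
REQUIRED_BINARIES = ("mesos-containerizer",)


def missing_launcher_binaries(launcher_dir=DEFAULT_LAUNCHER_DIR,
                              required=REQUIRED_BINARIES):
    """Return full paths of required binaries that are absent or not executable.

    Hypothetical helper: an empty list means every listed binary was found.
    """
    missing = []
    for name in required:
        path = os.path.join(launcher_dir, name)
        # This is a best-effort pre-check; exec can still fail later for
        # other reasons (for example, the file is removed in between).
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(path)
    return missing
```

Running such a check against the value passed to --launcher_dir before starting the agent would report the missing path up front, rather than the ABORT shown in the trace.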