[ https://issues.apache.org/jira/browse/MESOS-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jojy Varghese updated MESOS-3588:
---------------------------------
    Assignee:     (was: Jojy Varghese)

> Port mapping isolator check failed: createQdisc.get()
> -----------------------------------------------------
>
>                 Key: MESOS-3588
>                 URL: https://issues.apache.org/jira/browse/MESOS-3588
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Paul Brett
>
> Container creation occasionally fails because the required interface name
> already exists, e.g.:
> {code}
> F1005 13:25:04.331053 48582 port_mapping.cpp:2245] Check failed: createQdisc.get()
> *** Check failure stack trace: ***
>     @     0x7f3b5c3b668d  google::LogMessage::Fail()
>     @     0x7f3b5c3b84d4  google::LogMessage::SendToLog()
>     @     0x7f3b5c3b627c  google::LogMessage::Flush()
>     @     0x7f3b5c3b8dc9  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7f3b5c0bdc8c  mesos::internal::slave::PortMappingIsolatorProcess::isolate()
>     @     0x7f3b5bf28fd6  _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
>     @     0x7f3b5c3690b1  process::ProcessManager::resume()
>     @     0x7f3b5c3693af  process::internal::schedule()
>     @     0x7f3b5c478cd0  execute_native_thread_routine
>     @     0x7f3b5b14283d  start_thread
>     @     0x7f3b5abb7fdd  clone
> /usr/local/bin/mesos-slave.sh: line 102: 48575 Aborted                 (core dumped) $debug /usr/local/sbin/mesos-slave "${MESOS_FLAGS[@]}"
> Slave Exit Status: 134
> {code}
>
> It appears that there are valid circumstances under which the kernel can
> reallocate the namespace PID before the container's external interface
> (mesos_nnnnn) has been destroyed.
> {code}
> 2236   // Prepare the ingress queueing disciplines on veth.
> 2237   Try<bool> createQdisc = ingress::create(veth(pid));
> 2238   if (createQdisc.isError()) {
> 2239     return Failure(
> 2240         "Failed to create the ingress qdisc on " + veth(pid) +
> 2241         ": " + createQdisc.error());
> 2242   }
> 2243
> 2244   // Veth device should exist since we just created it.
> 2245   CHECK(createQdisc.get());
> {code}
> We should test for "link already exists" errors in port mapping (e.g.
> link::create returns false) and fail the container creation rather than
> killing the slave.
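
The proposed change can be sketched in isolation. The following is a minimal illustration only, not the Mesos patch: `Try` here is a simplified stand-in for the type used in the snippet above, and `describeFailure` is a hypothetical helper. It assumes a non-error `false` result means the step could not be applied (for example, because a stale device with the same name is still present) and reports that as a failure message instead of aborting the process.

```cpp
#include <string>
#include <utility>

// Simplified stand-in for the Try<T> used in the snippet above (assumption:
// only isError(), get() and error() are needed for this illustration).
template <typename T>
class Try {
public:
  static Try ok(T value) { return Try(false, std::move(value), ""); }
  static Try fail(std::string message) {
    return Try(true, T(), std::move(message));
  }

  bool isError() const { return failed_; }
  const T& get() const { return value_; }
  const std::string& error() const { return error_; }

private:
  Try(bool failed, T value, std::string error)
    : failed_(failed), value_(std::move(value)), error_(std::move(error)) {}

  bool failed_;
  T value_;
  std::string error_;
};

// Hypothetical helper: returns an empty string when the step succeeded,
// otherwise a message the caller can turn into a failed container launch.
// The `false` branch replaces the CHECK, so the process keeps running.
std::string describeFailure(const Try<bool>& created, const std::string& what) {
  if (created.isError()) {
    return "Failed to create " + what + ": " + created.error();
  }
  if (!created.get()) {
    return "Failed to create " + what +
           ": unexpected existing state (possibly a stale link)";
  }
  return "";
}
```

In the isolator, a non-empty message would presumably be returned as a `Failure(...)` for that container, in the same way the `isError()` branch already does, leaving the slave and its other containers untouched.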