[ 
https://issues.apache.org/jira/browse/MESOS-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jojy Varghese updated MESOS-3588:
---------------------------------
    Assignee:     (was: Jojy Varghese)

> Port mapping isolator check failed: createQdisc.get()
> -----------------------------------------------------
>
>                 Key: MESOS-3588
>                 URL: https://issues.apache.org/jira/browse/MESOS-3588
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Paul Brett
>
> Container creation occasionally fails because the required link name
> already exists, e.g.:
> {code}
> F1005 13:25:04.331053 48582 port_mapping.cpp:2245] Check failed: createQdisc.get()
> *** Check failure stack trace: ***
>     @     0x7f3b5c3b668d  google::LogMessage::Fail()
>     @     0x7f3b5c3b84d4  google::LogMessage::SendToLog()
>     @     0x7f3b5c3b627c  google::LogMessage::Flush()
>     @     0x7f3b5c3b8dc9  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7f3b5c0bdc8c  mesos::internal::slave::PortMappingIsolatorProcess::isolate()
>     @     0x7f3b5bf28fd6  _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave20MesosIsolatorProcessERKNS6_11ContainerIDEiSA_iEENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSH_FSF_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
>     @     0x7f3b5c3690b1  process::ProcessManager::resume()
>     @     0x7f3b5c3693af  process::internal::schedule()
>     @     0x7f3b5c478cd0  execute_native_thread_routine
>     @     0x7f3b5b14283d  start_thread
>     @     0x7f3b5abb7fdd  clone
> /usr/local/bin/mesos-slave.sh: line 102: 48575 Aborted                 (core dumped) $debug /usr/local/sbin/mesos-slave "${MESOS_FLAGS[@]}"
> Slave Exit Status: 134
> {code}
> It appears that there are valid circumstances under which the kernel can
> reallocate the namespace PID before the container's external interface
> (mesos_nnnnn) has been destroyed.
> {code}
>   2236   // Prepare the ingress queueing disciplines on veth.
>   2237   Try<bool> createQdisc = ingress::create(veth(pid));
>   2238   if (createQdisc.isError()) {
>   2239     return Failure(
>   2240         "Failed to create the ingress qdisc on " + veth(pid) +
>   2241         ": " + createQdisc.error());
>   2242   }
>   2243
>   2244   // Veth device should exist since we just created it.
>   2245   CHECK(createQdisc.get());
> {code}
> We should check for "link already exists" errors in port mapping (e.g.
> link::create returns false) and fail the container creation rather than
> killing the slave.
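
The suggestion above can be sketched in isolation. This is not the Mesos code: CreateResult is a hypothetical stand-in for a Try<bool>-style result, and vethName, createLink and setUpContainer are illustrative names that simulate link creation against an in-memory set of names. The "mesos_" prefix follows the mesos_nnnnn pattern mentioned in the description and is an assumption. The point is only that an "already exists" result becomes a failure message for the one container instead of a fatal CHECK.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a Try<bool>-style result: either an error,
// or a flag saying whether the object was newly created.
struct CreateResult {
  bool isError;
  bool created;
  std::string error;
};

// Illustrative naming scheme (assumption): the host-side link name is
// derived from the container PID, so a reused PID yields the same name.
std::string vethName(int pid) {
  return "mesos_" + std::to_string(pid);
}

// Simulated link creation against a set of existing link names.
// Returns created == false (not an error) if the name is already taken.
CreateResult createLink(std::set<std::string>& links, const std::string& name) {
  bool inserted = links.insert(name).second;
  return CreateResult{false, inserted, ""};
}

// Sets up one container. An empty return value means success; otherwise
// the string is a failure message that affects this container only.
std::string setUpContainer(std::set<std::string>& links, int pid) {
  const std::string name = vethName(pid);
  CreateResult result = createLink(links, name);

  if (result.isError) {
    return "Failed to create " + name + ": " + result.error;
  }

  if (!result.created) {
    // A CHECK(result.created) here would abort the whole process.
    // Reporting a failure lets the caller reject just this container.
    return "Link " + name + " already exists";
  }

  return "";
}
```

Calling setUpContainer twice with the same PID and the same set of links returns an empty string the first time and a "Link ... already exists" message the second time, and the process keeps running in both cases.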



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
