Vinod Kone created MESOS-239:
--------------------------------

             Summary: Allocator doesn't handle framework failover correctly
                 Key: MESOS-239
                 URL: https://issues.apache.org/jira/browse/MESOS-239
             Project: Mesos
          Issue Type: Bug
            Reporter: Vinod Kone


This cropped up during one of the AB tests.

The scenario: a framework fails over. The allocator aborts on a failed CHECK 
when it tries to add the framework again (see the log below). This is because 
the framework has been deactivated, but allocated[frameworkId] is never erased.
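
The failure mode described above can be reproduced in isolation. The sketch below is not the Mesos allocator: SketchAllocator, its int-valued map, the eraseEntry flag and failoverSucceeds() are hypothetical names invented for illustration, and an exception stands in for the fatal CHECK. It only shows why a stale allocated[frameworkId] entry makes the second add fail, and that erasing the entry on deactivation is one possible way to avoid it. The actual fix may differ.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the allocator's per-framework bookkeeping.
class SketchAllocator {
public:
  // Mirrors the failing check: adding a framework that still has an
  // entry in `allocated` is treated as an error.
  void frameworkAdded(const std::string& frameworkId) {
    if (allocated.count(frameworkId) > 0) {
      throw std::logic_error(
          "Check failed: !allocated.contains(frameworkId)");
    }
    allocated[frameworkId] = 0;  // Nothing allocated yet.
  }

  // `eraseEntry` is an assumption: false models the reported behaviour
  // (entry left behind), true models one possible fix.
  void frameworkDeactivated(const std::string& frameworkId, bool eraseEntry) {
    if (eraseEntry) {
      allocated.erase(frameworkId);
    }
  }

private:
  std::map<std::string, int> allocated;
};

// Replays add -> deactivate -> add (the failover path) and reports
// whether the second add succeeded.
bool failoverSucceeds(bool eraseOnDeactivate) {
  SketchAllocator allocator;
  const std::string id = "201207210040-2081170186-58055-43387-0000";

  allocator.frameworkAdded(id);
  allocator.frameworkDeactivated(id, eraseOnDeactivate);

  try {
    allocator.frameworkAdded(id);  // Re-registration after failover.
  } catch (const std::logic_error&) {
    return false;
  }
  return true;
}
```

Calling failoverSucceeds(false) replays the sequence in the log (deactivate, then re-register) and hits the check. failoverSucceeds(true) shows the same sequence passing once the entry is removed on deactivation.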


I0721 00:41:13.154080 43396 dominant_share_allocator.cpp:167] Deactivated framework 201207210040-2081170186-58055-43387-0000
W0721 00:41:14.272461 43392 master.cpp:77] No whitelist given. Advertising offers for all slaves
2012-07-21 00:41:14,538:43387(0x4ba6d940):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-07-21 00:41:17,875:43387(0x4ba6d940):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
.......
.......
.......
I0721 00:42:09.721727 43396 master.cpp:614] Re-registering framework 201207210040-2081170186-58055-43387-0000 at scheduler(1)@10.35.12.124:57793
I0721 00:42:09.721822 43396 master.cpp:633] Framework 201207210040-2081170186-58055-43387-0000 failed over
F0721 00:42:09.722185 43397 dominant_share_allocator.cpp:143] Check failed: !allocated.contains(frameworkId)
*** Check failure stack trace: ***
    @     0x7f5874ef7fdd  google::LogMessage::Fail()
    @     0x7f5874efdc47  google::LogMessage::SendToLog()
    @     0x7f5874ef988c  google::LogMessage::Flush()
    @     0x7f5874ef9af6  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f5874c75c1d  mesos::internal::master::DominantShareAllocator::frameworkAdded()
    @     0x7f5874bd16be  std::tr1::_Mem_fn<>::operator()()
    @     0x7f5874bd55b2  std::tr1::_Bind<>::operator()<>()
    @     0x7f5874bd55e3  std::tr1::_Function_handler<>::_M_invoke()
    @     0x7f5874be139f  std::tr1::function<>::operator()()
    @     0x7f5874bf7560  process::internal::vdispatcher<>()
    @     0x7f5874bf8310  std::tr1::_Bind<>::operator()<>()
    @     0x7f5874bf8365  std::tr1::_Function_handler<>::_M_invoke()
    @     0x7f5874e3bf4f  std::tr1::function<>::operator()()
    @     0x7f5874e0c5db  process::ProcessBase::visit()
    @     0x7f5874e1dc50  process::DispatchEvent::visit()
    @     0x7f5874b71ffc  process::ProcessBase::serve()
    @     0x7f5874e1656f  process::ProcessManager::resume()
    @     0x7f5874e16dba  process::schedule()
    @       0x316120673d  (unknown)


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
