[ https://issues.apache.org/jira/browse/MESOS-7478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003380#comment-16003380 ]
Benjamin Mahler commented on MESOS-7478:
----------------------------------------

[~anandmazumdar] aside from my manual testing, I ran the upgrade script. It turns out it doesn't catch it because it upgrades the master first, then agents. Filed MESOS-7483.

> Pre-1.2.x master does not work with 1.2.x agent.
> ------------------------------------------------
>
>                 Key: MESOS-7478
>                 URL: https://issues.apache.org/jira/browse/MESOS-7478
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>
> [~evilezh] reported the following crash in the agent upon running a 1.1.0 master against a 1.2.0 agent:
> {noformat}
> F0509 00:19:07.045413 3469 slave.cpp:4609] Check failed: resource.has_allocation_info()
> *** Check failure stack trace: ***
>     @ 0x7f4c4a4fa3cd google::LogMessage::Fail()
>     @ 0x7f4c4a4fc180 google::LogMessage::SendToLog()
>     @ 0x7f4c4a4f9fb3 google::LogMessage::Flush()
>     @ 0x7f4c4a4fcba9 google::LogMessageFatal::~LogMessageFatal()
>     @ 0x7f4c49b3bcf5 mesos::internal::slave::Slave::getExecutorInfo()
>     @ 0x7f4c49b3cf76 mesos::internal::slave::Slave::runTask()
>     @ 0x7f4c49b8832c ProtobufProcess<>::handler4<>()
>     @ 0x7f4c49b4dc06 std::_Function_handler<>::_M_invoke()
>     @ 0x7f4c49b6975a ProtobufProcess<>::visit()
>     @ 0x7f4c4a46c933 process::ProcessManager::resume()
>     @ 0x7f4c4a477537 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>     @ 0x7f4c486b8c80 (unknown)
>     @ 0x7f4c481d46ba start_thread
>     @ 0x7f4c47f0a82d (unknown)
> Aborted (core dumped)
> {noformat}
> This appears to have been due to a lack of manual upgrade testing (we also don't have any automated upgrade testing in place).
> The check in {{getExecutorInfo(...)}} [here|https://github.com/apache/mesos/blob/1.2.0/src/slave/slave.cpp#L4609] crashes with an old master because it occurs before our injection in {{run(...)}}.
> See the {{runTask(...)}} call into {{getExecutorInfo(...)}} [here|https://github.com/apache/mesos/blob/1.2.0/src/slave/slave.cpp#L1556].
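
The ordering problem described in the issue (a check that runs before the missing allocation info has been injected) can be illustrated with a small, self-contained sketch. This is not the Mesos code: Resource, Task, injectAllocationInfo, requireAllocationInfo, checkThenInject and injectThenCheck are hypothetical stand-ins, and the check throws instead of aborting so the failure can be observed. It only shows why a resource sent by an older master, which carries no allocation info, trips a check placed ahead of the injection step.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource; NOT the real Mesos protobuf type.
struct Resource {
  std::string name;
  bool hasAllocationInfo = false;  // Unset when sent by an older master.
  std::string allocationRole;
};

// Hypothetical stand-in for a task carrying resources.
struct Task {
  std::vector<Resource> resources;
};

// Fills in allocation info on any resource that lacks it.
void injectAllocationInfo(Task& task, const std::string& role) {
  for (Resource& resource : task.resources) {
    if (!resource.hasAllocationInfo) {
      resource.hasAllocationInfo = true;
      resource.allocationRole = role;
    }
  }
}

// Requires allocation info on every resource. Throws rather than
// aborting the process, so callers can observe the failure.
void requireAllocationInfo(const Task& task) {
  for (const Resource& resource : task.resources) {
    if (!resource.hasAllocationInfo) {
      throw std::logic_error("Check failed: has allocation info");
    }
  }
}

// Ordering described in the issue: the check runs before the injection.
// Returns false if the check fails.
bool checkThenInject(Task task, const std::string& role) {
  try {
    requireAllocationInfo(task);
    injectAllocationInfo(task, role);
    return true;
  } catch (const std::logic_error&) {
    return false;
  }
}

// Alternative ordering: inject first, then check.
bool injectThenCheck(Task task, const std::string& role) {
  try {
    injectAllocationInfo(task, role);
    requireAllocationInfo(task);
    return true;
  } catch (const std::logic_error&) {
    return false;
  }
}
```

With a task whose resources lack allocation info, checkThenInject fails while injectThenCheck succeeds. Whether the actual fix reorders the calls or relaxes the check is not stated in this message.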