[ https://issues.apache.org/jira/browse/MESOS-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244803#comment-15244803 ]
Vinod Kone commented on MESOS-5224:
-----------------------------------

Interesting. Looks like the buffer overflow happened inside Slave::statusUpdate() when logging the update message?

{code}
Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531cc617a mesos::internal::operator<<()
Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d71837 mesos::internal::slave::Slave::statusUpdate()
{code}

The code for the output stream operator for status update looks like so

{code}
ostream& operator<<(ostream& stream, const StatusUpdate& update)
{
  stream << update.status().state();

  if (update.has_uuid()) {
    stream << " (UUID: " << stringify(UUID::fromBytes(update.uuid())) << ")";
  }

  stream << " for task " << update.status().task_id();

  if (update.status().has_healthy()) {
    stream << " in health state "
           << (update.status().healthy() ? "healthy" : "unhealthy");
  }

  return stream << " of framework " << update.framework_id();
}
{code}

The one thing that could cause an issue is `UUID::fromBytes()`. How is the UUID being set by the HTTP executor?

> buffer overflow error in slave upon processing status update from executor v1 http API
> --------------------------------------------------------------------------------------
>
> Key: MESOS-5224
> URL: https://issues.apache.org/jira/browse/MESOS-5224
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 0.28.0
> Environment: {code}
> $ dpkg -l|grep -e mesos
> ii mesos 0.28.0-2.0.16.ubuntu1404 amd64 Cluster resource manager with efficient resource isolation
> $ uname -a
> Linux node-3 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> {code}
> Reporter: James DeFelice
> Labels: mesosphere
>
> implementing support for executor HTTP v1 API in mesos-go:next and my executor can't send status updates because the slave dies upon receiving them.
> protobufs generated from 0.28.1
> from syslog:
> {code}
> Apr 17 17:53:53 node-1 mesos-slave[4462]: I0417 17:53:53.121467 4489 http.cpp:190] HTTP POST for /slave(1)/api/v1/executor from 10.2.0.5:51800 with User-Agent='Go-http-client/1.1'
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** buffer overflow detected ***: /usr/sbin/mesos-slave terminated
> Apr 17 17:53:53 node-1 mesos-slave[4462]: ======= Backtrace: =========
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7fc53064e38f]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fc5306e5c9c]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7fc5306e4b60]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(_ZN5mesos8internallsERSoRKNS0_12StatusUpdateE+0x16a)[0x7fc531cc617a]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(_ZN5mesos8internal5slave5Slave12statusUpdateENS0_12StatusUpdateERK6OptionIN7process4UPIDEE+0xe7)[0x7fc531d71837]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(_ZNK5mesos8internal5slave5Slave4Http8executorERKN7process4http7RequestE+0xb52)[0x7fc531d302a2]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(+0xc754a3)[0x7fc531d4d4a3]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(+0x1295aa8)[0x7fc53236daa8]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(_ZN7process14ProcessManager6resumeEPNS_11ProcessBaseE+0x2d1)[0x7fc532375a71]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/local/lib/libmesos-0.28.0.so(+0x129dd77)[0x7fc532375d77]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb1bf0)[0x7fc530e85bf0]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fc5309a8182]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc5306d547d]
> ...
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** Aborted at 1460915633 (unix time) try "date -d @1460915633" if you are using GNU date ***
> Apr 17 17:53:53 node-1 mesos-slave[4462]: PC: @ 0x7fc530611cc9 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** SIGABRT (@0x116e) received by PID 4462 (TID 0x7fc5275f5700) from PID 4462; stack trace: ***
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5309b0340 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc530611cc9 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306150d8 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc53064e394 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306e5c9c (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306e4b60 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531cc617a mesos::internal::operator<<()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d71837 mesos::internal::slave::Slave::statusUpdate()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d302a2 mesos::internal::slave::Slave::Http::executor()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d4d4a3 _ZNSt17_Function_handlerIFN7process6FutureINS0_4http8ResponseEEERKNS2_7RequestEEZN5mesos8internal5slave5Slave10initializeEvEUlS7_E19_E9_M_invokeERKSt9_Any_dataS7_
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc53236daa8 _ZZN7process11ProcessBase5visitERKNS_9HttpEventEENKUlRKNS_6FutureI6OptionINS_4http14authentication20AuthenticationResultEEEEE0_clESC_
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc532375a71 process::ProcessManager::resume()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc532375d77 _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc530e85bf0 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5309a8182 start_thread
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306d547d (unknown)
> {code}
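
To make the `UUID::fromBytes()` suspicion from the comment concrete, here is a minimal sketch of a length-checked conversion. It rests on assumptions the trace does not confirm: that the conversion copies the input into a fixed 16-byte buffer without checking its length, and that the executor may have sent something other than 16 raw bytes (for example a 36-character textual UUID). `RawUuid`, `kUuidSize` and `uuidFromBytes` are hypothetical names for illustration, not the stout or Mesos API.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a 16-byte UUID value; not the stout UUID class.
constexpr std::size_t kUuidSize = 16;
using RawUuid = std::array<unsigned char, kUuidSize>;

// Copies `bytes` into `*out` only if it holds exactly 16 raw bytes.
// Any other length (such as a 36-character textual UUID) is rejected
// rather than copied past the end of the fixed-size buffer.
bool uuidFromBytes(const std::string& bytes, RawUuid* out)
{
  if (out == nullptr || bytes.size() != kUuidSize) {
    return false;
  }

  std::memcpy(out->data(), bytes.data(), kUuidSize);
  return true;
}
```

With a check like this, a malformed `uuid` field would show up as a rejected update (or a placeholder in the log line) instead of aborting the slave. Whether that matches the actual root cause here depends on what mesos-go puts into the field.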