[ https://issues.apache.org/jira/browse/MESOS-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niklas Quarfot Nielsen resolved MESOS-979.
------------------------------------------

    Resolution: Fixed

> Master segfault when tasks.json endpoint is hit
> -----------------------------------------------
>
>                 Key: MESOS-979
>                 URL: https://issues.apache.org/jira/browse/MESOS-979
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.18.0
>            Reporter: Jie Yu
>            Assignee: Niklas Quarfot Nielsen
>            Priority: Critical
>             Fix For: 0.19.0
>
>
> curl 'http://<host>:5050/master/tasks.json'
> {noformat}
> I0210 17:45:30.496232 10818 master.hpp:421] Removing task 
> system-gc-ea18e4d4-3549-4159-b9a4-7587f87af2f3 with resources cpus(*):0.01; 
> disk(*):1; mem(*):1 on slave 201309231905-2215191306-5050-53933-22 (<host>)
> I0210 17:45:30.496374 10818 hierarchical_allocator_process.hpp:637] Recovered 
> cpus(*):0.01; disk(*):1; mem(*):1 (total allocatable: cpus(*):0.01; 
> disk(*):1; mem(*):1) on slave 201309231905-2215191306-5050-53933-22 from 
> framework 201108030032-0000000011-0000
> I0210 17:45:30.562484 10818 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142078 ] on slave 
> 201309231905-2215191306-5050-53933-27 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:30.562623 10818 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231905-2215191306-5050-53933-27 for 5secs
> I0210 17:45:30.624539 10828 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142075 ] on slave 
> 201309231905-2215191306-5050-53933-28 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:30.624811 10830 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231905-2215191306-5050-53933-28 for 5secs
> I0210 17:45:30.986958 10828 http.cpp:319] HTTP request for 
> '/master/stats.json'
> I0210 17:45:35.099058 10831 master.cpp:1998] Status update TASK_FINISHED 
> (UUID: dca364d9-4b93-4a46-b5cc-3237beaf2ef6) for task 
> system-gc-8024ec43-07e8-4d78-96b4-5c491ecea710 of framework 
> 201108030032-0000000011-0000 from slave(1)@10.35.12.106:5051
> I0210 17:45:35.099158 10831 master.hpp:421] Removing task 
> system-gc-8024ec43-07e8-4d78-96b4-5c491ecea710 with resources cpus(*):0.01; 
> disk(*):1; mem(*):1 on slave 201309231905-2215191306-5050-53933-15 (<host>)
> I0210 17:45:35.099278 10831 hierarchical_allocator_process.hpp:637] Recovered 
> cpus(*):0.01; disk(*):1; mem(*):1 (total allocatable: cpus(*):0.01; 
> disk(*):1; mem(*):1) on slave 201309231905-2215191306-5050-53933-15 from 
> framework 201108030032-0000000011-0000
> I0210 17:45:35.966225 10825 master.cpp:2250] Sending 2 offers to framework 
> 201108030032-0000000011-0000
> I0210 17:45:36.858616 10819 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142082 ] on slave 
> 201309131923-1829643018-5050-22101-4 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:36.858881 10819 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309131923-1829643018-5050-22101-4 for 5secs
> I0210 17:45:41.004396 10821 http.cpp:319] HTTP request for 
> '/master/stats.json'
> I0210 17:45:41.914156 10823 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142112 ] on slave 
> 201309231905-2215191306-5050-53933-26 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:41.914438 10823 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231905-2215191306-5050-53933-26 for 5secs
> I0210 17:45:41.972837 10818 master.cpp:2250] Sending 1 offers to framework 
> 201108030032-0000000011-0000
> I0210 17:45:41.976341 10819 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142264 ] on slave 
> 201309131923-1829643018-5050-22101-4 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:41.976824 10819 master.hpp:403] Adding task 
> system-gc-6e21f223-e6e8-4790-b9fc-8df7b4195a70 with resources cpus(*):0.01; 
> disk(*):1; mem(*):1 on slave 201309131923-1829643018-5050-22101-4 (<host>)
> I0210 17:45:41.976881 10819 master.cpp:2419] Launching task 
> system-gc-6e21f223-e6e8-4790-b9fc-8df7b4195a70 of framework 
> 201108030032-0000000011-0000 with resources cpus(*):0.01; disk(*):1; mem(*):1 
> on slave 201309131923-1829643018-5050-22101-4 (<host>)
> I0210 17:45:41.977452 10819 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309131923-1829643018-5050-22101-4 for 5secs
> I0210 17:45:43.807765 10827 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142084 ] on slave 
> 201309231901-1762272010-5050-5544-17 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:43.807962 10827 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231901-1762272010-5050-5544-17 for 5secs
> I0210 17:45:43.974618 10823 master.cpp:2250] Sending 1 offers to framework 
> 201108030032-0000000011-0000
> I0210 17:45:46.977696 10825 master.cpp:2250] Sending 1 offers to framework 
> 201108030032-0000000011-0000
> I0210 17:45:47.978766 10817 master.cpp:2250] Sending 1 offers to framework 
> 201108030032-0000000011-0000
> I0210 17:45:49.204412 10820 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142101 ] on slave 
> 201309231905-2215191306-5050-53933-32 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:49.204604 10820 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231905-2215191306-5050-53933-32 for 5secs
> I0210 17:45:49.538697 10830 master.cpp:1546] Processing reply for offers: [ 
> 201402061850-2215191306-5050-10816-142106 ] on slave 
> 201309231905-2215191306-5050-53933-34 (<host>) for framework 
> 201108030032-0000000011-0000
> I0210 17:45:49.538883 10830 hierarchical_allocator_process.hpp:590] Framework 
> 201108030032-0000000011-0000 filtered slave 
> 201309231905-2215191306-5050-53933-34 for 5secs
> I0210 17:45:51.022019 10829 http.cpp:319] HTTP request for 
> '/master/stats.json'
> I0210 17:45:51.773195 10830 http.cpp:540] HTTP request for 
> '/master/tasks.json'
> *** Aborted at 1392054351 (unix time) try "date -d @1392054351" if you are 
> using GNU date ***
> PC: @     0x7f3fb0a0d9ca mesos::internal::master::TaskComparator::descending()
> *** SIGSEGV (@0x4069) received by PID 10816 (TID 0x4781a940) from PID 16489; 
> stack trace: ***
>     @       0x3bdac0eb10 (unknown)
>     @     0x7f3fb0a0d9ca mesos::internal::master::TaskComparator::descending()
>     @     0x7f3fb0a083e2 std::__unguarded_partition<>()
>     @     0x7f3fb0a0a8e8 std::__introsort_loop<>()
>     @     0x7f3fb09ff0dc mesos::internal::master::Master::Http::tasks()
>     @     0x7f3fb0a3fc2e std::tr1::_Function_handler<>::_M_invoke()
>     @     0x7f3fb0e40dd1 std::tr1::function<>::operator()()
>     @     0x7f3fb0e3b521 process::ProcessBase::visit()
>     @     0x7f3fb0e2d1ab process::ProcessManager::resume()
>     @     0x7f3fb0e2db7f process::schedule()
>     @       0x3bdac0673d (unknown)
>     @       0x3bd9cd3f6d (unknown)
> /usr/local/bin/mesos-master.sh: line 83: 10816 Segmentation fault      (core 
> dumped) $debug /usr/local/sbin/mesos-master --root_submissions 
> --zk=${master_zoo_url} --log_dir=${log_dir} "$@"
> Master Exit Status: 139
> Abnormal exit detected, sending mail.
> BACKTRACE:
> [New Thread 10832]
> [New Thread 10833]
> [New Thread 10831]
> [New Thread 10837]
> [New Thread 10816]
> [New Thread 10834]
> [New Thread 10836]
> [New Thread 10823]
> [New Thread 10818]
> [New Thread 10825]
> [New Thread 10824]
> [New Thread 10835]
> [New Thread 10826]
> [New Thread 10821]
> [New Thread 10819]
> [New Thread 10829]
> [New Thread 10817]
> [New Thread 10822]
> [New Thread 10820]
> [New Thread 10827]
> [New Thread 10828]
> Core was generated by `/usr/local/sbin/mesos-master --root_submissions 
> --zk=zk://mesos:mesos@zookeeper'.
> Program terminated with signal 11, Segmentation fault.
> #0  ascending (lhs=0xc559c0, rhs=0x4011) at master/http.cpp:524
>         in master/http.cpp
> #0  ascending (lhs=0xc559c0, rhs=0x4011) at master/http.cpp:524
> #1  mesos::internal::master::TaskComparator::descending (lhs=0xc559c0,
>     rhs=0x4011) at master/http.cpp:533
> #2  0x00007f3fb0a083e2 in 
> std::__unguarded_partition<__gnu_cxx::__normal_iterator<mesos::internal::Task 
> const**, std::vector<mesos::internal::Task const*, 
> std::allocator<mesos::internal::Task const*> > >, mesos::internal::Task 
> const*, bool (*)(mesos::internal::Task const*, mesos::internal::Task const*)> 
> (
>     __first=<value optimized out>, __last=<value optimized out>,
>     __pivot=0xc559c0,
>     __comp=0x7f3fb0a0d9a0 
> <mesos::internal::master::TaskComparator::descending(mesos::internal::Task 
> const*, mesos::internal::Task const*)>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_algo.h:2202
> #3  0x00007f3fb0a0a8e8 in 
> std::__introsort_loop<__gnu_cxx::__normal_iterator<mesos::internal::Task 
> const**, std::vector<mesos::internal::Task const*, 
> std::allocator<mesos::internal::Task const*> > >, long, bool 
> (*)(mesos::internal::Task const*, mesos::internal::Task const*)> (__first=...,
>     __last=<value optimized out>, __depth_limit=19,
>     __comp=0x7f3fb0a0d9a0 
> <mesos::internal::master::TaskComparator::descending(mesos::internal::Task 
> const*, mesos::internal::Task const*)>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_algo.h:2679
> #4  0x00007f3fb09ff0dc in 
> sort<__gnu_cxx::__normal_iterator<mesos::internal::Task const**, 
> std::vector<mesos::internal::Task const*, 
> std::allocator<mesos::internal::Task const*> > >, bool 
> (*)(mesos::internal::Task const*, mesos::internal::Task const*)> 
> (this=0x47819700, request=<value optimized out>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_algo.h:2749
> #5  mesos::internal::master::Master::Http::tasks (this=0x47819700,
>     request=<value optimized out>) at master/http.cpp:580
> #6  0x00007f3fb0a3fc2e in operator() (__functor=<value optimized out>,
>     __a1=<value optimized out>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:208
> #7  operator()<process::http::Request const> (
>     __functor=<value optimized out>, __a1=<value optimized out>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/bind_iterate.h:45
> #8  std::tr1::_Function_handler<process::Future<process::http::Response> 
> ()(const 
> process::http::Request&),std::tr1::_Bind<std::tr1::_Mem_fn<process::Future<process::http::Response>
>  (mesos::internal::master::Master::Http::*)(const process::http::Request&)> 
> ()(mesos::internal::master::Master::Http, std::tr1::_Placeholder<1>)> 
> >::_M_invoke(const std::tr1::_Any_data &, const process::http::Request &) 
> (__functor=<value optimized out>, __a1=<value optimized out>)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:488
> #9  0x00007f3fb0e40dd1 in 
> std::tr1::function<process::Future<process::http::Response> ()(const 
> process::http::Request&)>::operator()(const process::http::Request &) const 
> (this=0x1, __a1=...)
>     at 
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:868
> #10 0x00007f3fb0e3b521 in process::ProcessBase::visit (
>     this=<value optimized out>, event=<value optimized out>)
>     at src/process.cpp:3194
> #11 0x00007f3fb0e2d1ab in process::ProcessManager::resume (this=0xb1dec0,
>     process=0x7f3fa8010d10) at src/process.cpp:2594
> #12 0x00007f3fb0e2db7f in process::schedule (arg=<value optimized out>)
>     at src/process.cpp:1290
> #13 0x0000003bdac0673d in start_thread () from /lib64/libpthread.so.0
> #14 0x0000003bd9cd3f6d in clone () from /lib64/libc.so.6
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
