[ https://issues.apache.org/jira/browse/MESOS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044831#comment-15044831 ]

Benjamin Bannier commented on MESOS-4072:
-----------------------------------------

Note that this is an intentional hard exit: you specified a {{work_dir}} which is not writable, so there is no way we can continue after emitting an error message (which we did).

However, we do not need to show a stack trace or dump core here (i.e. replace the use of {{CHECK}} with something like {{EXIT}}).

> The lt-mesos-master will coredump in some situation.
> ----------------------------------------------------
>
>                 Key: MESOS-4072
>                 URL: https://issues.apache.org/jira/browse/MESOS-4072
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.25.0
>            Reporter: Nan Xiao
>
> I find lt-mesos-master will coredump when following conditions are met:
> (1) The user doesn't have write permission of /var/lib/mesos directory:
> nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/
> total 176
> dr-xr-xr-x 2 root root 4096 Dec 7 03:08 mesos
> ......
> (2) the /var/lib/mesos is an empty folder:
> nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/mesos/
> total 0
> Executing following command will core dump:
> nan@ubuntu:~/mesos-0.25.0/build$ ./bin/mesos-master.sh --ip=16.187.250.141 --work_dir=/var/lib/mesos
> I1207 03:18:36.431015 22951 main.cpp:229] Build: 2015-12-07 00:11:18 by nan
> I1207 03:18:36.431154 22951 main.cpp:231] Version: 0.25.0
> I1207 03:18:36.431388 22951 main.cpp:252] Using 'HierarchicalDRF' allocator
> F1207 03:18:36.431807 22951 replica.cpp:724] CHECK_SOME(state): IO error: /var/lib/mesos/replicated_log/LOCK: No such file or directory Failed to recover the log
> *** Check failure stack trace: ***
>     @ 0x7f076bc208ca google::LogMessage::Fail()
>     @ 0x7f076bc20816 google::LogMessage::SendToLog()
>     @ 0x7f076bc20218 google::LogMessage::Flush()
>     @ 0x7f076bc2312c google::LogMessageFatal::~LogMessageFatal()
>     @ 0x7f076adf8f30 _CheckFatal::~_CheckFatal()
>     @ 0x7f076baa4939 mesos::internal::log::ReplicaProcess::restore()
>     @ 0x7f076baa0f8c mesos::internal::log::ReplicaProcess::ReplicaProcess()
>     @ 0x7f076baa4c95 mesos::internal::log::Replica::Replica()
>     @ 0x7f076b9cf819 mesos::internal::log::LogProcess::LogProcess()
>     @ 0x7f076b9d576c mesos::internal::log::Log::Log()
>     @ 0x46d21f main
>     @ 0x7f0766f69ec5 (unknown)
>     @ 0x46b979 (unknown)
> Aborted (core dumped)
> Use gdb to analyze it:
> nan@ubuntu:~/mesos-0.25.0/build$ gdb /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master core
> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
> Copyright (C) 2014 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master...done.
> [New LWP 22065]
> [New LWP 22087]
> [New LWP 22085]
> [New LWP 22089]
> [New LWP 22084]
> [New LWP 22086]
> [New LWP 22091]
> [New LWP 22088]
> [New LWP 22092]
> [New LWP 22090]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master --ip=127.0.0.1 --work_di'.
> Program terminated with signal SIGABRT, Aborted.
> #0 0x00007fe917810cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> Traceback (most recent call last):
>   File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
>     from libstdcxx.v6.printers import register_libstdcxx_printers
> ImportError: No module named 'libstdcxx'
> (gdb) bt
> #0 0x00007fe917810cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1 0x00007fe9178140d8 in __GI_abort () at abort.c:89
> #2 0x00007fe91c4b8c1b in DumpStackTraceAndExit () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #3 0x00007fe91c4b28ca in google::LogMessage::Fail () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #4 0x00007fe91c4b2816 in google::LogMessage::SendToLog () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #5 0x00007fe91c4b2218 in google::LogMessage::Flush () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #6 0x00007fe91c4b512c in google::LogMessageFatal::~LogMessageFatal () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #7 0x00007fe91b68af30 in _CheckFatal::~_CheckFatal (this=0x7ffe704ec3f0, __in_chrg=<optimized out>) at ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:165
> #8 0x00007fe91c336939 in mesos::internal::log::ReplicaProcess::restore (this=0x16f25d0, path=...) at ../../src/log/replica.cpp:724
> #9 0x00007fe91c332f8c in mesos::internal::log::ReplicaProcess::ReplicaProcess (this=0x16f25d0, path=..., __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at ../../src/log/replica.cpp:160
> #10 0x00007fe91c336c95 in mesos::internal::log::Replica::Replica (this=0x16e82a0, path=...) at ../../src/log/replica.cpp:753
> #11 0x00007fe91c261819 in mesos::internal::log::LogProcess::LogProcess () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #12 0x00007fe91c26776c in mesos::internal::log::Log::Log () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
> #13 0x000000000046d21f in main (argc=3, argv=0x7ffe704ef028) at ../../src/master/main.cpp:307
> (gdb)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
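
For illustration, the distinction drawn in the comment (a hard exit with an error message, but without a stack trace or core dump) might look roughly like the sketch below. This is not the Mesos implementation: {{validateWorkDir}} and {{exitWithError}} are hypothetical names, and {{std::exit}} merely stands in for the {{EXIT}}-style exit the comment proposes. The check uses POSIX {{access}}, which is an assumption about how one could detect an unwritable {{work_dir}} up front.

```cpp
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>

#include <unistd.h>

// Hypothetical helper, not part of Mesos: returns an empty string when the
// calling user may create entries in `dir`, otherwise a description of the
// error reported by access(2).
std::string validateWorkDir(const std::string& dir)
{
  if (::access(dir.c_str(), W_OK | X_OK) != 0) {
    return "Cannot use work_dir '" + dir + "': " + std::strerror(errno);
  }
  return "";
}

// Hypothetical stand-in for an EXIT-style macro: print the message and
// terminate with a nonzero status. Unlike abort(), which a failed CHECK
// ends up calling, std::exit() raises no SIGABRT, so there is no stack
// trace and no core file.
[[noreturn]] void exitWithError(const std::string& message)
{
  std::cerr << message << std::endl;
  std::exit(EXIT_FAILURE);
}
```

A caller would pass the {{work_dir}} value to {{validateWorkDir}} before opening the replicated log and call {{exitWithError}} when the result is non-empty. The process still stops immediately, which matches the intent of the hard exit described above.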