----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72239/#review219968 -----------------------------------------------------------
src/logging/logging.cpp Lines 115 (patched) <https://reviews.apache.org/r/72239/#comment308311> Do I understand correctly that all the exception filter code tries to avoid dynamic allocation at all costs? Maybe leave a NOTE? Or is it obvious? src/logging/logging.cpp Lines 117 (patched) <https://reviews.apache.org/r/72239/#comment308202> Wrapped link (here and below) can be hard to copy into browser. Not sure what our styleguide says; checked other parts of Mesos codebase - links like this are typically kept on a single line with `// NOLINT` at the end. src/logging/logging.cpp Lines 234 (patched) <https://reviews.apache.org/r/72239/#comment308309> Do I understand correctly that each thread has its own unhandled exception filter (which we reset for all threads with one call of `SetUnhandledExceptionFilter`) https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter but the functions of `dbghelp.h` that we use are not thread-safe? https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetlinefromaddr64 Doesn't this mean that we need to synchronize access to `dbghelp` or, at least, to make sure that concurrent run is dropped? Given the nature of Mesos, I would not be surprised if we will eventually run into a pair of simultaneous exceptions (probably triggered by a common cause) in two threads. Having only one stack of two is probably tolerable, but we would want to avoid getting garbadge/triggering UB at all costs. src/logging/logging.cpp Lines 308 (patched) <https://reviews.apache.org/r/72239/#comment308308> In the not very likely case of a very deep stack, can we somehow explicitly inform the reader that the trace was clipped due to this limitation? src/logging/logging.cpp Lines 449 (patched) <https://reviews.apache.org/r/72239/#comment308310> This function returns the previously installed exception filter, if there was one; do you have any idea whether we need to care of this or not, and why? I would naively expect that there should be no handler initially. Probably, if the returned value is not NULL, this means that we don't know what is going on (i.e. what 3rd-party library, after we updated it, started to install its own filter before us) and should just crash here? - Andrei Sekretenko On March 17, 2020, 12:09 a.m., Benjamin Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72239/ > ----------------------------------------------------------- > > (Updated March 17, 2020, 12:09 a.m.) > > > Review request for mesos, Andrei Sekretenko, Greg Mann, and Joseph Wu. > > > Bugs: MESOS-7245 > https://issues.apache.org/jira/browse/MESOS-7245 > > > Repository: mesos > > > Description > ------- > > For now, this starts with adding the necessary code to logging.cpp up > in mesos. This gives us stack traces for any mesos binaries or > libraries that initialize logging with `installFailureSignalHandler` > set to true. > > Ideally, we would either (1) move this into stout, or (2) upstream > some changes to glog. Both approaches would allow us to also get > stack traces from stout tests and libprocess tests. > > For now however, this patch is just aiming to fix the urgent issue > of a lack of stack traces for mesos crashes. > > > Diffs > ----- > > src/logging/logging.cpp baca51b3956dbaa282c03e73a2c261ff103f4821 > > > Diff: https://reviews.apache.org/r/72239/diff/3/ > > > Testing > ------- > > Ran through CI and locally, for the recent reservation update crash this > produces: > > ``` > E0000 00:00:00.000000 18216 logging.cpp:295] RAW: Received fatal exception > 0xc0000005 EXCEPTION_ACCESS_VIOLATION > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > google::protobuf::internal::RepeatedPtrIterator<mesos::Resource_ReservationInfo > const >::operator* [00007FF75279306E+14] > (c:\users\bmahler\git\mesos\build\3rdparty\protobuf-3.5.0\src\protobuf-3.5.0\src\google\protobuf\repeated_field.h:2266) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::authorization::ActionObject::reserve [00007FF753C94880+720] > (c:\users\bmahler\git\mesos\src\master\authorization.cpp:236) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::_reserve [00007FF753D007CF+1423] > (c:\users\bmahler\git\mesos\src\master\http.cpp:1960) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::reserveResources > [00007FF753D07FFC+380] (c:\users\bmahler\git\mesos\src\master\http.cpp:1984) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::api [00007FF753CED99B+6571] > (c:\users\bmahler\git\mesos\src\master\http.cpp:352) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_339a9fa0c2c868bcf197f8c97b63146d>::operator() [00007FF7527AA2E4+100] > (c:\users\bmahler\git\mesos\src\master\master.cpp:909) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF75262D61A+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF7526E55AA+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_ret<process::Future<process::http::Response>,0>::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF75262D55A+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Func_impl_no_alloc<<lambda_339a9fa0c2c868bcf197f8c97b63146d>,process::Future<process::http::Response>,process::http::Request > const &,Option<process::http::authentication::Principal> const &>::_Do_call > [00007FF7527EDF2F+111] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Func_class<process::Future<process::http::Response>,process::http::Request > const &,Option<process::http::authentication::Principal> const > &>::operator() [00007FF75472BAAD+157] (c:\program files (x86)\microsoft > visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_23c733dc7debde2255779e987bee5763>::operator() [00007FF75471911F+335] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3874) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_23c733dc7debde2255779e987bee5763>,bool> > [00007FF754638F1F+95] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_23c733dc7debde2255779e987bee5763>,bool> > [00007FF75469AB2E+94] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::invoke_expand<<lambda_23c733dc7debde2255779e987bee5763>,std::tuple<bool>,std::tuple<>,0> > [00007FF7546A2B9A+138] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::operator()<> > [00007FF754620DCF+127] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75463D082+66] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75469EBF2+66] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Invoke<process::Future<process::http::Response> > >::operator()<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75462A637+71] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:387) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>::CallableFn<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > >::operator() [00007FF7547261CF+95] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:463) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>::operator() [00007FF752521D5B+267] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:443) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_1161592621caf1e03e3f58e967ac7c4e>::operator() [00007FF75251C5CD+77] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\dispatch.hpp:119) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,process::ProcessBase *> [00007FF7524DA748+152] (c:\program > files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,process::ProcessBase *> [00007FF7524F2AC4+132] (c:\program > files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> > >::invoke_expand<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::tuple<std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,std::tuple<process::ProcessBase * &&>,0,1,2> > [00007FF7524F5EE9+249] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >::operator()<process::ProcessBase *> > [00007FF7524D413A+122] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524DC40C+76] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524F46BC+76] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Invoke<void>::operator()<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524D6031+81] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:399) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<void __cdecl(process::ProcessBase > *)>::CallableFn<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> > >::operator() [00007FF752520296+102] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:464) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<void __cdecl(process::ProcessBase *)>::operator() > [00007FF75472B83D+285] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:444) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessBase::consume [00007FF7545D8439+73] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3666) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::DispatchEvent::consume [00007FF7547890EA+74] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\event.hpp:200) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessBase::serve [00007FF75064AB77+71] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\process.hpp:88) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessManager::resume [00007FF7545E6260+1696] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3055) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_124422ac022fa041208b80c1460630d7>::operator() [00007FF754718541+145] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:2545) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_124422ac022fa041208b80c1460630d7> > > [00007FF754638A20+48] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_124422ac022fa041208b80c1460630d7> > [00007FF75469A650+48] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Execute<0> [00007FF75464CE2C+60] (c:\program files (x86)\microsoft > visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:239) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Run [00007FF75476E6AA+106] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:245) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Go [00007FF754759698+40] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:231) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: std::_Pad::_Call_func > [00007FF75473EA3D+45] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:209) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > register_onexit_function [00007FFA6B384FB8+1192] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > register_onexit_function [00007FFA6B384BF1+225] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: BaseThreadInitThunk > [00007FFA7F687974+20] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: RtlUserThreadStart > [00007FFA81F3A261+33] > ``` > > > Thanks, > > Benjamin Mahler > >