[ https://issues.apache.org/jira/browse/MESOS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Greg Mann updated MESOS-6938:
-----------------------------
    Sprint: Mesosphere Sprint 50

> Libprocess reinitialization is flaky, can segfault
> --------------------------------------------------
>
> Key: MESOS-6938
> URL: https://issues.apache.org/jira/browse/MESOS-6938
> Project: Mesos
> Issue Type: Bug
> Components: libprocess, tests
> Environment: ASF CI, CentOS 7, libevent and SSL enabled
> Reporter: Greg Mann
> Assignee: Greg Mann
> Labels: libprocess, tests
>
> This was observed on ASF CI. Based on the placement of the stacktrace, the
> segfault seems to occur during libprocess reinitialization, when
> {{process::initialize}} is called:
> {code}
> [----------] 4 tests from Encryption/NetSocketTest
> [ RUN      ] Encryption/NetSocketTest.EOFBeforeRecv/0
> I0117 15:18:35.320691 27596 openssl.cpp:419] CA file path is unspecified! NOTE: Set CA file path with LIBPROCESS_SSL_CA_FILE=<filepath>
> I0117 15:18:35.320714 27596 openssl.cpp:424] CA directory path unspecified! NOTE: Set CA directory path with LIBPROCESS_SSL_CA_DIR=<dirpath>
> I0117 15:18:35.320719 27596 openssl.cpp:429] Will not verify peer certificate! NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
> I0117 15:18:35.320726 27596 openssl.cpp:435] Will only verify peer certificate if presented! NOTE: Set LIBPROCESS_SSL_REQUIRE_CERT=1 to require peer certificate verification
> I0117 15:18:35.335141 27596 process.cpp:1234] libprocess is initialized on 172.17.0.3:46415 with 16 worker threads
> [       OK ] Encryption/NetSocketTest.EOFBeforeRecv/0 (422 ms)
> [ RUN      ] Encryption/NetSocketTest.EOFBeforeRecv/1
> I0117 15:18:35.390697 27596 process.cpp:1234] libprocess is initialized on 172.17.0.3:39822 with 16 worker threads
> [       OK ] Encryption/NetSocketTest.EOFBeforeRecv/1 (6 ms)
> [ RUN      ] Encryption/NetSocketTest.EOFAfterRecv/0
> I0117 15:18:35.998528 27596 openssl.cpp:419] CA file path is unspecified! NOTE: Set CA file path with LIBPROCESS_SSL_CA_FILE=<filepath>
> I0117 15:18:35.998559 27596 openssl.cpp:424] CA directory path unspecified! NOTE: Set CA directory path with LIBPROCESS_SSL_CA_DIR=<dirpath>
> I0117 15:18:35.998566 27596 openssl.cpp:429] Will not verify peer certificate! NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
> I0117 15:18:35.998572 27596 openssl.cpp:435] Will only verify peer certificate if presented! NOTE: Set LIBPROCESS_SSL_REQUIRE_CERT=1 to require peer certificate verification
> I0117 15:18:36.010643 27596 process.cpp:1234] libprocess is initialized on 172.17.0.3:47429 with 16 worker threads
> [       OK ] Encryption/NetSocketTest.EOFAfterRecv/0 (664 ms)
> [ RUN      ] Encryption/NetSocketTest.EOFAfterRecv/1
> I0117 15:18:36.079453 27596 process.cpp:1234] libprocess is initialized on 172.17.0.3:38149 with 16 worker threads
> [       OK ] Encryption/NetSocketTest.EOFAfterRecv/1 (19 ms)
> *** Aborted at 1484666316 (unix time) try "date -d @1484666316" if you are using GNU date ***
> PC: @ 0x7f7643ad7c56 __memcpy_ssse3_back
> *** SIGSEGV (@0x57c10f8) received by PID 27596 (TID 0x7f76393c2700) from PID 92016888; stack trace: ***
>     @ 0x7f7644ba0370 (unknown)
>     @ 0x7f7643ad7c56 __memcpy_ssse3_back
>     @ 0x7f76443248e0 (unknown)
>     @ 0x7f7644324f8c (unknown)
>     @ 0x422a4d process::UPID::UPID()
> I0117 15:18:36.090376 27596 process.cpp:1234] libprocess is initialized on 172.17.0.3:43835 with 16 worker threads
> [----------] 4 tests from Encryption/NetSocketTest (1116 ms total)
> [----------] 6 tests from SSLVerifyIPAdd/SSLTest
> [ RUN      ] SSLVerifyIPAdd/SSLTest.BasicSameProcess/0
>     @ 0x8ae4a8 process::DispatchEvent::DispatchEvent()
>     @ 0x8a6a5e process::internal::dispatch()
>     @ 0x8c0b44 process::dispatch<>()
>     @ 0x8a598a process::ProcessBase::route()
>     @ 0x98be53 process::ProcessBase::route<>()
>     @ 0x988096 process::Help::initialize()
>     @ 0x89ef2a process::ProcessManager::resume()
>     @ 0x89b976 _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
>     @ 0x8adb3c _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
>     @ 0x8ada80 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
>     @ 0x8ada0a _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>     @ 0x7f764431b230 (unknown)
>     @ 0x7f7644b98dc5 start_thread
>     @ 0x7f7643a8473d __clone
> make[7]: *** [check-local] Segmentation fault
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)