[ https://issues.apache.org/jira/browse/MESOS-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064891#comment-16064891 ]
Dmitry Zhuk commented on MESOS-6345:
------------------------------------

A similar crash occurs on CentOS 7 (in ExamplesTest.PersistentVolumeFramework and ExamplesTest.DynamicReservationFramework), presumably due to a race condition on {{signaledWrapper}} in {{configureSignal}}.

{noformat}
[ RUN ] ExamplesTest.DynamicReservationFramework
*** Error in `mesos/build/src/.libs/lt-dynamic-reservation-framework': double free or corruption (fasttop): 0x00007fdfa0002e60 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7c503)[0x7fdfc6da7503]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt14_Function_base13_Base_managerIZN7process5deferIN5mesos8internal5slave5SlaveEiiSt12_PlaceholderILi1EES7_ILi2EEEENS1_9_DeferredIDTcl4bindadsrSt8functionIFvT0_T1_EEclcvSF__Efp1_fp2_EEEERKNS1_3PIDIT_EEMSJ_FvSC_SD_ET2_T3_EUliiE_E10_M_destroyERSt9_Any_dataSt17integral_constantIbLb0EE+0x31)[0x7fdfcca9165c]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt14_Function_base13_Base_managerIZN7process5deferIN5mesos8internal5slave5SlaveEiiSt12_PlaceholderILi1EES7_ILi2EEEENS1_9_DeferredIDTcl4bindadsrSt8functionIFvT0_T1_EEclcvSF__Efp1_fp2_EEEERKNS1_3PIDIT_EEMSJ_FvSC_SD_ET2_T3_EUliiE_E10_M_managerERSt9_Any_dataRKST_St18_Manager_operation+0xa2)[0x7fdfcca79857]
mesos/build/src/.libs/lt-dynamic-reservation-framework(_ZNSt14_Function_baseD1Ev+0x33)[0x560e50f40ae7]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt8functionIFviiEED1Ev+0x18)[0x7fdfcca2ec98]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt10_Head_baseILm0ESt8functionIFviiEELb0EED1Ev+0x18)[0x7fdfcca300ce]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt11_Tuple_implILm0EISt8functionIFviiEESt12_PlaceholderILi1EES3_ILi2EEEED1Ev+0x18)[0x7fdfcca300e8]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt5tupleIISt8functionIFviiEESt12_PlaceholderILi1EES3_ILi2EEEED1Ev+0x18)[0x7fdfcca30102]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt5_BindIFSt7_Mem_fnIMSt8functionIFviiEEKFviiEES3_St12_PlaceholderILi1EES7_ILi2EEEED1Ev+0x1c)[0x7fdfcca30120]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt14_Function_base13_Base_managerISt5_BindIFSt7_Mem_fnIMSt8functionIFviiEEKFviiEES5_St12_PlaceholderILi1EES9_ILi2EEEEE10_M_destroyERSt9_Any_dataSt17integral_constantIbLb0EE+0x29)[0x7fdfcca91873]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt14_Function_base13_Base_managerISt5_BindIFSt7_Mem_fnIMSt8functionIFviiEEKFviiEES5_St12_PlaceholderILi1EES9_ILi2EEEEE10_M_managerERSt9_Any_dataRKSF_St18_Manager_operation+0xa2)[0x7fdfcca79ba3]
mesos/build/src/.libs/lt-dynamic-reservation-framework(_ZNSt14_Function_baseD1Ev+0x33)[0x560e50f40ae7]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZNSt8functionIFviiEED1Ev+0x18)[0x7fdfcca2ec98]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZN2os8internal15configureSignalEPKSt8functionIFviiEE+0x4a)[0x7fdfcc9db47d]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZN5mesos8internal5slave5Slave10initializeEv+0x3d5e)[0x7fdfcc9e0a78]
mesos/build/src/.libs/libmesos-1.4.0.so(_ZN7process14ProcessManager6resumeEPNS_11ProcessBaseE+0x284)[0x7fdfcd93fedc]
mesos/build/src/.libs/libmesos-1.4.0.so(+0x61152da)[0x7fdfcd93c2da]
mesos/build/src/.libs/libmesos-1.4.0.so(+0x6127bce)[0x7fdfcd94ebce]
mesos/build/src/.libs/libmesos-1.4.0.so(+0x6127b12)[0x7fdfcd94eb12]
mesos/build/src/.libs/libmesos-1.4.0.so(+0x6127a9c)[0x7fdfcd94ea9c]
/lib64/libstdc++.so.6(+0xb5230)[0x7fdfc73b7230]
/lib64/libpthread.so.0(+0x7dc5)[0x7fdfc7612dc5]
/lib64/libc.so.6(clone+0x6d)[0x7fdfc6e2276d]
{noformat}

> ExamplesTest.PersistentVolumeFramework failing due to double free corruption
> on Ubuntu 14.04
> --------------------------------------------------------------------------------------------
>
> Key: MESOS-6345
> URL: https://issues.apache.org/jira/browse/MESOS-6345
> Project: Mesos
> Issue Type: Bug
> Components: framework
> Reporter: Avinash Sridharan
> Labels: mesosphere
>
> PersistentVolumeFramework test is failing on Ubuntu 14
> {code}
> [Step 10/10] *** Error in
> `/mnt/teamcity/work/4240ba9ddd0997c3/build/src/.libs/lt-persistent-volume-framework':
> double free or corruption (fasttop): 0x00007f1ae0006a20 ***
> [04:56:48]W: [Step 10/10] *** Aborted at 1475902608 (unix time) try "date
> -d @1475902608" if you are using GNU date ***
> [04:56:48]W: [Step 10/10] I1008 04:56:48.592744 25425 state.cpp:57]
> Recovering state from '/mnt/teamcity/temp/buildTmp/mesos-8KiPML/2/meta'
> [04:56:48]W: [Step 10/10] I1008 04:56:48.592808 25423 state.cpp:57]
> Recovering state from '/mnt/teamcity/temp/buildTmp/mesos-8KiPML/1/meta'
> [04:56:48]W: [Step 10/10] I1008 04:56:48.592952 25425
> status_update_manager.cpp:203] Recovering status update manager
> [04:56:48]W: [Step 10/10] I1008 04:56:48.592957 25423
> status_update_manager.cpp:203] Recovering status update manager
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593010 25424
> containerizer.cpp:557] Recovering containerizer
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593143 25396 sched.cpp:226]
> Version: 1.1.0
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593158 25425 master.cpp:2013]
> Elected as the leading master!
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593173 25425 master.cpp:1560]
> Recovering from registrar
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593211 25424 registrar.cpp:329]
> Recovering registrar
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593250 25425 sched.cpp:330] New
> master detected at master@172.30.2.21:45167
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593282 25425 sched.cpp:341] No
> credentials provided. Attempting to register without authentication
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593293 25425 sched.cpp:820]
> Sending SUBSCRIBE call to master@172.30.2.21:45167
> [04:56:48]W: [Step 10/10] PC: @ 0x7f1b0bbaccc9 (unknown)
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593339 25425 sched.cpp:853] Will
> retry registration in 32.354951ms if necessary
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593364 25421 master.cpp:1387]
> Dropping 'mesos.scheduler.Call' message since not recovered yet
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593413 25428 provisioner.cpp:253]
> Provisioner recovery complete
> [04:56:48]W: [Step 10/10] *** SIGABRT (@0x6334) received by PID 25396 (TID
> 0x7f1b02ed6700) from PID 25396; stack trace: ***
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593520 25421
> containerizer.cpp:557] Recovering containerizer
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593529 25425 slave.cpp:5276]
> Finished recovery
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593627 25422 leveldb.cpp:304]
> Persisting metadata (8 bytes) to leveldb took 4.546422ms
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593695 25428 provisioner.cpp:253]
> Provisioner recovery complete
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593701 25422 replica.cpp:320]
> Persisted replica status to VOTING
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593760 25424 slave.cpp:5276]
> Finished recovery
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593864 25427 recover.cpp:582]
> Successfully joined the Paxos group
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593896 25425 slave.cpp:5448]
> Querying resource estimator for oversubscribable resources
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593922 25427 recover.cpp:466]
> Recover process terminated
> [04:56:48]W: [Step 10/10] I1008 04:56:48.593976 25427 slave.cpp:5462]
> Received oversubscribable resources {} from the resource estimator
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594002 25424 slave.cpp:5448]
> Querying resource estimator for oversubscribable resources
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594017 25422 log.cpp:553]
> Attempting to start the writer
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594030 25428
> status_update_manager.cpp:177] Pausing sending status updates
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594032 25427 slave.cpp:915] New
> master detected at master@172.30.2.21:45167
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594055 25423 slave.cpp:915] New
> master detected at master@172.30.2.21:45167
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594048 25428
> status_update_manager.cpp:177] Pausing sending status updates
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594061 25427 slave.cpp:936] No
> credentials provided. Attempting to register without authentication
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594106 25427 slave.cpp:947]
> Detecting new master
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594071 25423 slave.cpp:936] No
> credentials provided. Attempting to register without authentication
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bf4b340 (unknown)
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594125 25423 slave.cpp:947]
> Detecting new master
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594194 25423 slave.cpp:5462]
> Received oversubscribable resources {} from the resource estimator
> [04:56:48]W: [Step 10/10] I1008 04:56:48.594378 25422 replica.cpp:493]
> Replica received implicit promise request from
> __req_res__(3)@172.30.2.21:45167 with proposal 1
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bbaccc9 (unknown)
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bbb00d8 (unknown)
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bbe9394 (unknown)
> [04:56:48]W: [Step 10/10] I1008 04:56:48.595368 25422 leveldb.cpp:304]
> Persisting metadata (8 bytes) to leveldb took 972334ns
> [04:56:48]W: [Step 10/10] I1008 04:56:48.595381 25422 replica.cpp:342]
> Persisted promised to 1
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bbf566e (unknown)
> [04:56:48]W: [Step 10/10] @ 0x7f1b0d930925
> _ZNSt14_Function_base13_Base_managerIZN7process5deferIN5mesos8internal5slave5SlaveEiiSt12_PlaceholderILi1EES7_ILi2EEEENS1_9_DeferredIDTcl4bindadsrSt8functionIFvT0_T1_EEclcvSF__Efp1_fp2_EEEERKNS1_3PIDIT_EEMSJ_FvSC_SD_ET2_T3_EUliiE_E10_M_managerERSt9_Any_dataRKST_St18_Manager_operation
> [04:56:48]W: [Step 10/10] I1008 04:56:48.597909 25421 coordinator.cpp:238]
> Coordinator attempting to fill missing positions
> [04:56:48]W: [Step 10/10] I1008 04:56:48.598273 25423 replica.cpp:388]
> Replica received explicit promise request from
> __req_res__(4)@172.30.2.21:45167 for position 0 with proposal 2
> [04:56:48]W: [Step 10/10] @ 0x7f1b0d935b0b
> std::_Function_base::_Base_manager<>::_M_manager()
> [04:56:48]W: [Step 10/10] @ 0x7f1b0d8f2516
> os::internal::configureSignal()
> [04:56:48]W: [Step 10/10] I1008 04:56:48.599318 25423 leveldb.cpp:341]
> Persisting action (8 bytes) to leveldb took 1.024957ms
> [04:56:48]W: [Step 10/10] I1008 04:56:48.599333 25423 replica.cpp:708]
> Persisted action NOP at position 0
> [04:56:48]W: [Step 10/10] I1008 04:56:48.599630 25428 replica.cpp:537]
> Replica received write request for position 0 from
> __req_res__(5)@172.30.2.21:45167
> [04:56:48]W: [Step 10/10] I1008 04:56:48.599660 25428 leveldb.cpp:436]
> Reading position from leveldb took 16893ns
> [04:56:48]W: [Step 10/10] @ 0x7f1b0d904022
> mesos::internal::slave::Slave::initialize()
> [04:56:48]W: [Step 10/10] @ 0x7f1b0e0c6ed1
> process::ProcessManager::resume()
> [04:56:48]W: [Step 10/10] @ 0x7f1b0e0c7187
> _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
> [04:56:48]W: [Step 10/10] @ 0x7f1b0c726a60 (unknown)
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bf43182 start_thread
> [04:56:48]W: [Step 10/10] @ 0x7f1b0bc7047d (unknown)
> {code}
> This is seen specifically on Ubuntu 14.04.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)