[ https://issues.apache.org/jira/browse/MESOS-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler updated MESOS-2118: ----------------------------------- Environment: CentOS 6 Occurred on a CentOS 6 CI, quite possibly it only fails when run along with other tests, running in isolation might not be enough to reproduce. > ShutdownTest.ShutdownEndpointGoodACLs is flaky, runs forever. > ------------------------------------------------------------- > > Key: MESOS-2118 > URL: https://issues.apache.org/jira/browse/MESOS-2118 > Project: Mesos > Issue Type: Bug > Components: technical debt, test > Environment: CentOS 6 > Reporter: Benjamin Mahler > > Looks like this test never finished cleanly: > {noformat} > DEBUG: [ RUN ] ShutdownTest.ShutdownEndpointGoodACLs > DEBUG: Using temporary directory > '/tmp/ShutdownTest_ShutdownEndpointGoodACLs_PNU8y4' > DEBUG: I1116 08:33:03.597002 12611 leveldb.cpp:176] Opened db in 1.679773ms > DEBUG: I1116 08:33:03.597597 12611 leveldb.cpp:183] Compacted db in 571410ns > DEBUG: I1116 08:33:03.597615 12611 leveldb.cpp:198] Created db iterator in > 1957ns > DEBUG: I1116 08:33:03.597620 12611 leveldb.cpp:204] Seeked to beginning of db > in 370ns > DEBUG: I1116 08:33:03.597626 12611 leveldb.cpp:273] Iterated through 0 keys > in the db in 121ns > DEBUG: I1116 08:33:03.597636 12611 replica.cpp:741] Replica recovered with > log positions 0 -> 0 with 1 holes and 0 unlearned > DEBUG: I1116 08:33:03.597851 12638 recover.cpp:437] Starting replica recovery > DEBUG: I1116 08:33:03.597950 12638 recover.cpp:463] Replica is in EMPTY status > DEBUG: I1116 08:33:03.598697 12634 replica.cpp:638] Replica in EMPTY status > received a broadcasted recover request > DEBUG: I1116 08:33:03.599009 12648 recover.cpp:188] Received a recover > response from a replica in EMPTY status > DEBUG: I1116 08:33:03.599309 12626 recover.cpp:554] Updating replica status > to STARTING > DEBUG: I1116 08:33:03.599762 12627 leveldb.cpp:306] Persisting metadata (8 > bytes) to leveldb took 399340ns > DEBUG: I1116 08:33:03.599786 12627 replica.cpp:320] Persisted replica status > to STARTING > DEBUG: I1116 08:33:03.599922 12650 recover.cpp:463] Replica is in STARTING > status > DEBUG: I1116 08:33:03.600464 12649 replica.cpp:638] Replica in STARTING > status received a broadcasted recover request > DEBUG: I1116 08:33:03.600662 12630 recover.cpp:188] Received a recover > response from a replica in STARTING status > DEBUG: I1116 08:33:03.600997 12651 recover.cpp:554] Updating replica status > to VOTING > DEBUG: I1116 08:33:03.601369 12639 leveldb.cpp:306] Persisting metadata (8 > bytes) to leveldb took 239227ns > DEBUG: I1116 08:33:03.601387 12639 replica.cpp:320] Persisted replica status > to VOTING > DEBUG: I1116 08:33:03.601625 12651 recover.cpp:568] Successfully joined the > Paxos group > DEBUG: I1116 08:33:03.601764 12651 recover.cpp:452] Recover process terminated > DEBUG: I1116 08:33:03.609264 12648 master.cpp:318] Master > 20141116-083303-1711542956-42834-12611 (atlc-bev-05-sr1.corpdc.twttr.net) > started on 172.18.4.102:42834 > DEBUG: I1116 08:33:03.609294 12648 master.cpp:364] Master only allowing > authenticated frameworks to register > DEBUG: I1116 08:33:03.609305 12648 master.cpp:369] Master only allowing > authenticated slaves to register > DEBUG: I1116 08:33:03.609316 12648 credentials.hpp:36] Loading credentials > for authentication from > '/tmp/ShutdownTest_ShutdownEndpointGoodACLs_PNU8y4/credentials' > DEBUG: I1116 08:33:03.609416 12648 master.cpp:413] Authorization enabled > DEBUG: I1116 08:33:03.609822 12639 master.cpp:1263] The newly elected leader > is master@172.18.4.102:42834 with id 20141116-083303-1711542956-42834-12611 > DEBUG: I1116 08:33:03.609835 12639 master.cpp:1276] Elected as the leading > master! > DEBUG: I1116 08:33:03.609841 12639 master.cpp:1094] Recovering from registrar > DEBUG: I1116 08:33:03.609895 12627 registrar.cpp:313] Recovering registrar > DEBUG: I1116 08:33:03.610188 12626 log.cpp:656] Attempting to start the writer > DEBUG: I1116 08:33:03.610788 12632 replica.cpp:474] Replica received implicit > promise request with proposal 1 > DEBUG: I1116 08:33:03.611081 12632 leveldb.cpp:306] Persisting metadata (8 > bytes) to leveldb took 273337ns > DEBUG: I1116 08:33:03.611096 12632 replica.cpp:342] Persisted promised to 1 > DEBUG: I1116 08:33:03.611469 12641 coordinator.cpp:230] Coordinator attemping > to fill missing position > DEBUG: I1116 08:33:03.612118 12634 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > DEBUG: I1116 08:33:03.612368 12634 leveldb.cpp:343] Persisting action (8 > bytes) to leveldb took 232013ns > DEBUG: I1116 08:33:03.612390 12634 replica.cpp:676] Persisted action at 0 > DEBUG: I1116 08:33:03.612936 12640 replica.cpp:508] Replica received write > request for position 0 > DEBUG: I1116 08:33:03.612962 12640 leveldb.cpp:438] Reading position from > leveldb took 8951ns > DEBUG: I1116 08:33:03.613075 12640 leveldb.cpp:343] Persisting action (14 > bytes) to leveldb took 97733ns > DEBUG: I1116 08:33:03.613088 12640 replica.cpp:676] Persisted action at 0 > DEBUG: I1116 08:33:03.613435 12648 replica.cpp:655] Replica received learned > notice for position 0 > DEBUG: I1116 08:33:03.613695 12648 leveldb.cpp:343] Persisting action (16 > bytes) to leveldb took 241135ns > DEBUG: I1116 08:33:03.613709 12648 replica.cpp:676] Persisted action at 0 > DEBUG: I1116 08:33:03.613716 12648 replica.cpp:661] Replica learned NOP > action at position 0 > DEBUG: I1116 08:33:03.614004 12637 log.cpp:672] Writer started with ending > position 0 > DEBUG: I1116 08:33:03.614518 12638 leveldb.cpp:438] Reading position from > leveldb took 11321ns > DEBUG: I1116 08:33:03.616251 12634 registrar.cpp:346] Successfully fetched > the registry (0B) in 6336us > DEBUG: I1116 08:33:03.616283 12634 registrar.cpp:445] Applied 1 operations in > 1565ns; attempting to update the 'registry' > DEBUG: I1116 08:33:03.618041 12631 log.cpp:680] Attempting to append 153 > bytes to the log > DEBUG: I1116 08:33:03.618234 12627 coordinator.cpp:340] Coordinator > attempting to write APPEND action at position 1 > DEBUG: I1116 08:33:03.618708 12627 replica.cpp:508] Replica received write > request for position 1 > DEBUG: I1116 08:33:03.619055 12627 leveldb.cpp:343] Persisting action (172 > bytes) to leveldb took 326461ns > DEBUG: I1116 08:33:03.619071 12627 replica.cpp:676] Persisted action at 1 > DEBUG: I1116 08:33:03.619359 12638 replica.cpp:655] Replica received learned > notice for position 1 > DEBUG: I1116 08:33:03.619635 12638 leveldb.cpp:343] Persisting action (174 > bytes) to leveldb took 255575ns > DEBUG: I1116 08:33:03.619657 12638 replica.cpp:676] Persisted action at 1 > DEBUG: I1116 08:33:03.619667 12638 replica.cpp:661] Replica learned APPEND > action at position 1 > DEBUG: I1116 08:33:03.619976 12639 registrar.cpp:490] Successfully updated > the 'registry' in 3.675904ms > DEBUG: I1116 08:33:03.620025 12639 registrar.cpp:376] Successfully recovered > registrar > DEBUG: I1116 08:33:03.620079 12629 log.cpp:699] Attempting to truncate the > log to 1 > DEBUG: I1116 08:33:03.620496 12649 master.cpp:1121] Recovered 0 slaves from > the Registry (115B) ; allowing 10mins for slaves to re-register > DEBUG: I1116 08:33:03.620504 12626 coordinator.cpp:340] Coordinator > attempting to write TRUNCATE action at position 2 > DEBUG: I1116 08:33:03.620971 12654 replica.cpp:508] Replica received write > request for position 2 > DEBUG: I1116 08:33:03.621232 12654 leveldb.cpp:343] Persisting action (16 > bytes) to leveldb took 245704ns > DEBUG: I1116 08:33:03.621247 12654 replica.cpp:676] Persisted action at 2 > DEBUG: I1116 08:33:03.621649 12638 replica.cpp:655] Replica received learned > notice for position 2 > DEBUG: I1116 08:33:03.621922 12638 leveldb.cpp:343] Persisting action (18 > bytes) to leveldb took 252335ns > DEBUG: I1116 08:33:03.621950 12638 leveldb.cpp:401] Deleting ~1 keys from > leveldb took 13419ns > DEBUG: I1116 08:33:03.621959 12638 replica.cpp:676] Persisted action at 2 > DEBUG: I1116 08:33:03.621966 12638 replica.cpp:661] Replica learned TRUNCATE > action at position 2 > DEBUG: I1116 08:33:03.635592 12611 sched.cpp:148] Version: 0.22.0-rc0 > DEBUG: I1116 08:33:03.636092 12652 sched.cpp:245] New master detected at > master@172.18.4.102:42834 > DEBUG: I1116 08:33:03.636126 12652 sched.cpp:301] Authenticating with master > master@172.18.4.102:42834 > DEBUG: I1116 08:33:03.636133 12652 sched.cpp:308] Using default CRAM-MD5 > authenticatee > DEBUG: I1116 08:33:03.636348 12654 authenticatee.hpp:138] Creating new client > SASL connection > DEBUG: I1116 08:33:03.636590 12631 master.cpp:3896] Authenticating > scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834 > DEBUG: I1116 08:33:03.636613 12631 master.cpp:3907] Using default CRAM-MD5 > authenticator > DEBUG: I1116 08:33:03.636693 12626 authenticator.hpp:170] Creating new server > SASL connection > DEBUG: I1116 08:33:03.636988 12652 authenticatee.hpp:229] Received SASL > authentication mechanisms: CRAM-MD5 > DEBUG: I1116 08:33:03.637004 12652 authenticatee.hpp:255] Attempting to > authenticate with mechanism 'CRAM-MD5' > DEBUG: I1116 08:33:03.637030 12652 authenticator.hpp:276] Received SASL > authentication start > DEBUG: I1116 08:33:03.637126 12652 authenticator.hpp:398] Authentication > requires more steps > DEBUG: I1116 08:33:03.637266 12627 authenticatee.hpp:275] Received SASL > authentication step > DEBUG: I1116 08:33:03.637338 12654 authenticator.hpp:304] Received SASL > authentication step > DEBUG: I1116 08:33:03.637367 12654 authenticator.hpp:390] Authentication > success > DEBUG: I1116 08:33:03.637419 12641 authenticatee.hpp:315] Authentication > success > DEBUG: I1116 08:33:03.637423 12654 master.cpp:3954] Successfully > authenticated principal 'test-principal' at > scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834 > DEBUG: I1116 08:33:03.637684 12635 sched.cpp:389] Successfully authenticated > with master master@172.18.4.102:42834 > DEBUG: I1116 08:33:03.637755 12630 master.cpp:1383] Received registration > request for framework 'default' at > scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834 > DEBUG: I1116 08:33:03.637780 12630 master.cpp:1342] Authorizing framework > principal 'test-principal' to receive offers for role '*' > DEBUG: I1116 08:33:03.637987 12633 master.cpp:1447] Registering framework > 20141116-083303-1711542956-42834-12611-0000 (default) at > scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834 > DEBUG: I1116 08:33:03.638067 12635 hierarchical_allocator_process.hpp:329] > Added framework 20141116-083303-1711542956-42834-12611-0000 > DEBUG: I1116 08:33:03.638491 12654 sched.cpp:439] Framework registered with > 20141116-083303-1711542956-42834-12611-0000 > DEBUG: I1116 08:33:03.639173 12638 master.cpp:4266] Removing framework > 20141116-083303-1711542956-42834-12611-0000 (default) at > scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834 > DEBUG: I1116 08:33:03.639245 12636 hierarchical_allocator_process.hpp:405] > Deactivated framework 20141116-083303-1711542956-42834-12611-0000 > DEBUG: I1116 08:33:03.639276 12636 hierarchical_allocator_process.hpp:360] > Removed framework 20141116-083303-1711542956-42834-12611-0000 > DEBUG: 2014-11-16 > 08:33:06,261:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:09,597:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:12,932:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:16,265:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:19,602:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:22,936:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:26,273:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:29,608:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:32,943:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:36,280:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:39,616:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:42,952:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:46,286:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:49,620:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:52,954:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > DEBUG: 2014-11-16 > 08:33:56,288:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)