[ 
https://issues.apache.org/jira/browse/MESOS-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-2118:
-----------------------------------
    Environment: CentOS 6

Occurred on a CentOS 6 CI, quite possibly it only fails when run along with 
other tests, running in isolation might not be enough to reproduce.

> ShutdownTest.ShutdownEndpointGoodACLs is flaky, runs forever.
> -------------------------------------------------------------
>
>                 Key: MESOS-2118
>                 URL: https://issues.apache.org/jira/browse/MESOS-2118
>             Project: Mesos
>          Issue Type: Bug
>          Components: technical debt, test
>         Environment: CentOS 6
>            Reporter: Benjamin Mahler
>
> Looks like this test never finished cleanly:
> {noformat}
> DEBUG: [ RUN      ] ShutdownTest.ShutdownEndpointGoodACLs
> DEBUG: Using temporary directory 
> '/tmp/ShutdownTest_ShutdownEndpointGoodACLs_PNU8y4'
> DEBUG: I1116 08:33:03.597002 12611 leveldb.cpp:176] Opened db in 1.679773ms
> DEBUG: I1116 08:33:03.597597 12611 leveldb.cpp:183] Compacted db in 571410ns
> DEBUG: I1116 08:33:03.597615 12611 leveldb.cpp:198] Created db iterator in 
> 1957ns
> DEBUG: I1116 08:33:03.597620 12611 leveldb.cpp:204] Seeked to beginning of db 
> in 370ns
> DEBUG: I1116 08:33:03.597626 12611 leveldb.cpp:273] Iterated through 0 keys 
> in the db in 121ns
> DEBUG: I1116 08:33:03.597636 12611 replica.cpp:741] Replica recovered with 
> log positions 0 -> 0 with 1 holes and 0 unlearned
> DEBUG: I1116 08:33:03.597851 12638 recover.cpp:437] Starting replica recovery
> DEBUG: I1116 08:33:03.597950 12638 recover.cpp:463] Replica is in EMPTY status
> DEBUG: I1116 08:33:03.598697 12634 replica.cpp:638] Replica in EMPTY status 
> received a broadcasted recover request
> DEBUG: I1116 08:33:03.599009 12648 recover.cpp:188] Received a recover 
> response from a replica in EMPTY status
> DEBUG: I1116 08:33:03.599309 12626 recover.cpp:554] Updating replica status 
> to STARTING
> DEBUG: I1116 08:33:03.599762 12627 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 399340ns
> DEBUG: I1116 08:33:03.599786 12627 replica.cpp:320] Persisted replica status 
> to STARTING
> DEBUG: I1116 08:33:03.599922 12650 recover.cpp:463] Replica is in STARTING 
> status
> DEBUG: I1116 08:33:03.600464 12649 replica.cpp:638] Replica in STARTING 
> status received a broadcasted recover request
> DEBUG: I1116 08:33:03.600662 12630 recover.cpp:188] Received a recover 
> response from a replica in STARTING status
> DEBUG: I1116 08:33:03.600997 12651 recover.cpp:554] Updating replica status 
> to VOTING
> DEBUG: I1116 08:33:03.601369 12639 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 239227ns
> DEBUG: I1116 08:33:03.601387 12639 replica.cpp:320] Persisted replica status 
> to VOTING
> DEBUG: I1116 08:33:03.601625 12651 recover.cpp:568] Successfully joined the 
> Paxos group
> DEBUG: I1116 08:33:03.601764 12651 recover.cpp:452] Recover process terminated
> DEBUG: I1116 08:33:03.609264 12648 master.cpp:318] Master 
> 20141116-083303-1711542956-42834-12611 (atlc-bev-05-sr1.corpdc.twttr.net) 
> started on 172.18.4.102:42834
> DEBUG: I1116 08:33:03.609294 12648 master.cpp:364] Master only allowing 
> authenticated frameworks to register
> DEBUG: I1116 08:33:03.609305 12648 master.cpp:369] Master only allowing 
> authenticated slaves to register
> DEBUG: I1116 08:33:03.609316 12648 credentials.hpp:36] Loading credentials 
> for authentication from 
> '/tmp/ShutdownTest_ShutdownEndpointGoodACLs_PNU8y4/credentials'
> DEBUG: I1116 08:33:03.609416 12648 master.cpp:413] Authorization enabled
> DEBUG: I1116 08:33:03.609822 12639 master.cpp:1263] The newly elected leader 
> is master@172.18.4.102:42834 with id 20141116-083303-1711542956-42834-12611
> DEBUG: I1116 08:33:03.609835 12639 master.cpp:1276] Elected as the leading 
> master!
> DEBUG: I1116 08:33:03.609841 12639 master.cpp:1094] Recovering from registrar
> DEBUG: I1116 08:33:03.609895 12627 registrar.cpp:313] Recovering registrar
> DEBUG: I1116 08:33:03.610188 12626 log.cpp:656] Attempting to start the writer
> DEBUG: I1116 08:33:03.610788 12632 replica.cpp:474] Replica received implicit 
> promise request with proposal 1
> DEBUG: I1116 08:33:03.611081 12632 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 273337ns
> DEBUG: I1116 08:33:03.611096 12632 replica.cpp:342] Persisted promised to 1
> DEBUG: I1116 08:33:03.611469 12641 coordinator.cpp:230] Coordinator attemping 
> to fill missing position
> DEBUG: I1116 08:33:03.612118 12634 replica.cpp:375] Replica received explicit 
> promise request for position 0 with proposal 2
> DEBUG: I1116 08:33:03.612368 12634 leveldb.cpp:343] Persisting action (8 
> bytes) to leveldb took 232013ns
> DEBUG: I1116 08:33:03.612390 12634 replica.cpp:676] Persisted action at 0
> DEBUG: I1116 08:33:03.612936 12640 replica.cpp:508] Replica received write 
> request for position 0
> DEBUG: I1116 08:33:03.612962 12640 leveldb.cpp:438] Reading position from 
> leveldb took 8951ns
> DEBUG: I1116 08:33:03.613075 12640 leveldb.cpp:343] Persisting action (14 
> bytes) to leveldb took 97733ns
> DEBUG: I1116 08:33:03.613088 12640 replica.cpp:676] Persisted action at 0
> DEBUG: I1116 08:33:03.613435 12648 replica.cpp:655] Replica received learned 
> notice for position 0
> DEBUG: I1116 08:33:03.613695 12648 leveldb.cpp:343] Persisting action (16 
> bytes) to leveldb took 241135ns
> DEBUG: I1116 08:33:03.613709 12648 replica.cpp:676] Persisted action at 0
> DEBUG: I1116 08:33:03.613716 12648 replica.cpp:661] Replica learned NOP 
> action at position 0
> DEBUG: I1116 08:33:03.614004 12637 log.cpp:672] Writer started with ending 
> position 0
> DEBUG: I1116 08:33:03.614518 12638 leveldb.cpp:438] Reading position from 
> leveldb took 11321ns
> DEBUG: I1116 08:33:03.616251 12634 registrar.cpp:346] Successfully fetched 
> the registry (0B) in 6336us
> DEBUG: I1116 08:33:03.616283 12634 registrar.cpp:445] Applied 1 operations in 
> 1565ns; attempting to update the 'registry'
> DEBUG: I1116 08:33:03.618041 12631 log.cpp:680] Attempting to append 153 
> bytes to the log
> DEBUG: I1116 08:33:03.618234 12627 coordinator.cpp:340] Coordinator 
> attempting to write APPEND action at position 1
> DEBUG: I1116 08:33:03.618708 12627 replica.cpp:508] Replica received write 
> request for position 1
> DEBUG: I1116 08:33:03.619055 12627 leveldb.cpp:343] Persisting action (172 
> bytes) to leveldb took 326461ns
> DEBUG: I1116 08:33:03.619071 12627 replica.cpp:676] Persisted action at 1
> DEBUG: I1116 08:33:03.619359 12638 replica.cpp:655] Replica received learned 
> notice for position 1
> DEBUG: I1116 08:33:03.619635 12638 leveldb.cpp:343] Persisting action (174 
> bytes) to leveldb took 255575ns
> DEBUG: I1116 08:33:03.619657 12638 replica.cpp:676] Persisted action at 1
> DEBUG: I1116 08:33:03.619667 12638 replica.cpp:661] Replica learned APPEND 
> action at position 1
> DEBUG: I1116 08:33:03.619976 12639 registrar.cpp:490] Successfully updated 
> the 'registry' in 3.675904ms
> DEBUG: I1116 08:33:03.620025 12639 registrar.cpp:376] Successfully recovered 
> registrar
> DEBUG: I1116 08:33:03.620079 12629 log.cpp:699] Attempting to truncate the 
> log to 1
> DEBUG: I1116 08:33:03.620496 12649 master.cpp:1121] Recovered 0 slaves from 
> the Registry (115B) ; allowing 10mins for slaves to re-register
> DEBUG: I1116 08:33:03.620504 12626 coordinator.cpp:340] Coordinator 
> attempting to write TRUNCATE action at position 2
> DEBUG: I1116 08:33:03.620971 12654 replica.cpp:508] Replica received write 
> request for position 2
> DEBUG: I1116 08:33:03.621232 12654 leveldb.cpp:343] Persisting action (16 
> bytes) to leveldb took 245704ns
> DEBUG: I1116 08:33:03.621247 12654 replica.cpp:676] Persisted action at 2
> DEBUG: I1116 08:33:03.621649 12638 replica.cpp:655] Replica received learned 
> notice for position 2
> DEBUG: I1116 08:33:03.621922 12638 leveldb.cpp:343] Persisting action (18 
> bytes) to leveldb took 252335ns
> DEBUG: I1116 08:33:03.621950 12638 leveldb.cpp:401] Deleting ~1 keys from 
> leveldb took 13419ns
> DEBUG: I1116 08:33:03.621959 12638 replica.cpp:676] Persisted action at 2
> DEBUG: I1116 08:33:03.621966 12638 replica.cpp:661] Replica learned TRUNCATE 
> action at position 2
> DEBUG: I1116 08:33:03.635592 12611 sched.cpp:148] Version: 0.22.0-rc0
> DEBUG: I1116 08:33:03.636092 12652 sched.cpp:245] New master detected at 
> master@172.18.4.102:42834
> DEBUG: I1116 08:33:03.636126 12652 sched.cpp:301] Authenticating with master 
> master@172.18.4.102:42834
> DEBUG: I1116 08:33:03.636133 12652 sched.cpp:308] Using default CRAM-MD5 
> authenticatee
> DEBUG: I1116 08:33:03.636348 12654 authenticatee.hpp:138] Creating new client 
> SASL connection
> DEBUG: I1116 08:33:03.636590 12631 master.cpp:3896] Authenticating 
> scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834
> DEBUG: I1116 08:33:03.636613 12631 master.cpp:3907] Using default CRAM-MD5 
> authenticator
> DEBUG: I1116 08:33:03.636693 12626 authenticator.hpp:170] Creating new server 
> SASL connection
> DEBUG: I1116 08:33:03.636988 12652 authenticatee.hpp:229] Received SASL 
> authentication mechanisms: CRAM-MD5
> DEBUG: I1116 08:33:03.637004 12652 authenticatee.hpp:255] Attempting to 
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1116 08:33:03.637030 12652 authenticator.hpp:276] Received SASL 
> authentication start
> DEBUG: I1116 08:33:03.637126 12652 authenticator.hpp:398] Authentication 
> requires more steps
> DEBUG: I1116 08:33:03.637266 12627 authenticatee.hpp:275] Received SASL 
> authentication step
> DEBUG: I1116 08:33:03.637338 12654 authenticator.hpp:304] Received SASL 
> authentication step
> DEBUG: I1116 08:33:03.637367 12654 authenticator.hpp:390] Authentication 
> success
> DEBUG: I1116 08:33:03.637419 12641 authenticatee.hpp:315] Authentication 
> success
> DEBUG: I1116 08:33:03.637423 12654 master.cpp:3954] Successfully 
> authenticated principal 'test-principal' at 
> scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834
> DEBUG: I1116 08:33:03.637684 12635 sched.cpp:389] Successfully authenticated 
> with master master@172.18.4.102:42834
> DEBUG: I1116 08:33:03.637755 12630 master.cpp:1383] Received registration 
> request for framework 'default' at 
> scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834
> DEBUG: I1116 08:33:03.637780 12630 master.cpp:1342] Authorizing framework 
> principal 'test-principal' to receive offers for role '*'
> DEBUG: I1116 08:33:03.637987 12633 master.cpp:1447] Registering framework 
> 20141116-083303-1711542956-42834-12611-0000 (default) at 
> scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834
> DEBUG: I1116 08:33:03.638067 12635 hierarchical_allocator_process.hpp:329] 
> Added framework 20141116-083303-1711542956-42834-12611-0000
> DEBUG: I1116 08:33:03.638491 12654 sched.cpp:439] Framework registered with 
> 20141116-083303-1711542956-42834-12611-0000
> DEBUG: I1116 08:33:03.639173 12638 master.cpp:4266] Removing framework 
> 20141116-083303-1711542956-42834-12611-0000 (default) at 
> scheduler-38732cf0-d6db-44a1-a95b-b3763e10555d@172.18.4.102:42834
> DEBUG: I1116 08:33:03.639245 12636 hierarchical_allocator_process.hpp:405] 
> Deactivated framework 20141116-083303-1711542956-42834-12611-0000
> DEBUG: I1116 08:33:03.639276 12636 hierarchical_allocator_process.hpp:360] 
> Removed framework 20141116-083303-1711542956-42834-12611-0000
> DEBUG: 2014-11-16 
> 08:33:06,261:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:09,597:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:12,932:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:16,265:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:19,602:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:22,936:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:26,273:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:29,608:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:32,943:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:36,280:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:39,616:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:42,952:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:46,286:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:49,620:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:52,954:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> DEBUG: 2014-11-16 
> 08:33:56,288:12611(0x7f5c34a30700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:43782] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to