Vinod Kone created MESOS-1685: --------------------------------- Summary: AuthenticationTest.LeaderElectionDuringSlaveAuthentication is flaky Key: MESOS-1685 URL: https://issues.apache.org/jira/browse/MESOS-1685 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.20.0 Reporter: Vinod Kone
{code} [ RUN ] AuthenticationTest.LeaderElectionDuringSlaveAuthentication Using temporary directory '/tmp/AuthenticationTest_LeaderElectionDuringSlaveAuthentication_ToAVSz' I0807 19:41:39.846611 32388 leveldb.cpp:176] Opened db in 43.95906ms I0807 19:41:39.862596 32388 leveldb.cpp:183] Compacted db in 15.962654ms I0807 19:41:39.862617 32388 leveldb.cpp:198] Created db iterator in 2081ns I0807 19:41:39.862623 32388 leveldb.cpp:204] Seeked to beginning of db in 328ns I0807 19:41:39.862630 32388 leveldb.cpp:273] Iterated through 0 keys in the db in 142ns I0807 19:41:39.862640 32388 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0807 19:41:39.866587 1160 master.cpp:286] Master 20140807-194139-453759884-45707-32388 (hemera.apache.org) started on 140.211.11.27:45707 I0807 19:41:39.866605 1160 master.cpp:323] Master only allowing authenticated frameworks to register I0807 19:41:39.866611 1160 master.cpp:328] Master only allowing authenticated slaves to register I0807 19:41:39.866617 1160 credentials.hpp:36] Loading credentials for authentication from '/tmp/AuthenticationTest_LeaderElectionDuringSlaveAuthentication_ToAVSz/credentials' I0807 19:41:39.866710 1160 master.cpp:357] Authorization enabled I0807 19:41:39.867102 1160 recover.cpp:425] Starting replica recovery I0807 19:41:39.867368 1160 hierarchical_allocator_process.hpp:296] Initializing hierarchical allocator process with master : master@140.211.11.27:45707 I0807 19:41:39.867401 1160 master.cpp:120] No whitelist given. Advertising offers for all slaves I0807 19:41:39.867470 1160 master.cpp:1196] The newly elected leader is master@140.211.11.27:45707 with id 20140807-194139-453759884-45707-32388 I0807 19:41:39.867480 1160 master.cpp:1209] Elected as the leading master! I0807 19:41:39.867486 1160 master.cpp:1027] Recovering from registrar I0807 19:41:39.867535 1160 recover.cpp:451] Replica is in EMPTY status I0807 19:41:39.867583 1160 registrar.cpp:313] Recovering registrar I0807 19:41:39.868026 1160 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request I0807 19:41:39.868083 1160 recover.cpp:188] Received a recover response from a replica in EMPTY status I0807 19:41:39.868185 1160 recover.cpp:542] Updating replica status to STARTING I0807 19:41:39.889654 1160 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 21.412285ms I0807 19:41:39.889731 1160 replica.cpp:320] Persisted replica status to STARTING I0807 19:41:39.889857 1160 recover.cpp:451] Replica is in STARTING status I0807 19:41:39.890214 1160 replica.cpp:638] Replica in STARTING status received a broadcasted recover request I0807 19:41:39.890328 1160 recover.cpp:188] Received a recover response from a replica in STARTING status I0807 19:41:39.890488 1160 recover.cpp:542] Updating replica status to VOTING I0807 19:41:39.909653 1160 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 19.057628ms I0807 19:41:39.909729 1160 replica.cpp:320] Persisted replica status to VOTING I0807 19:41:39.909821 1160 recover.cpp:556] Successfully joined the Paxos group I0807 19:41:39.909926 1160 recover.cpp:440] Recover process terminated I0807 19:41:39.910084 1160 log.cpp:656] Attempting to start the writer I0807 19:41:39.910423 1160 replica.cpp:474] Replica received implicit promise request with proposal 1 I0807 19:41:39.929661 1160 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 19.170603ms I0807 19:41:39.929735 1160 replica.cpp:342] Persisted promised to 1 I0807 19:41:39.929951 1160 coordinator.cpp:230] Coordinator attemping to fill missing position I0807 19:41:39.930344 1160 replica.cpp:375] Replica received explicit promise request for position 0 with proposal 2 I0807 19:41:39.949653 1160 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 19.241344ms I0807 19:41:39.949728 1160 replica.cpp:676] Persisted action at 0 I0807 19:41:39.950052 1160 replica.cpp:508] Replica received write request for position 0 I0807 19:41:39.950129 1160 leveldb.cpp:438] Reading position from leveldb took 11101ns I0807 19:41:39.965492 1160 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 15.301932ms I0807 19:41:39.965565 1160 replica.cpp:676] Persisted action at 0 I0807 19:41:39.966223 1148 replica.cpp:655] Replica received learned notice for position 0 I0807 19:41:39.981616 1148 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 15.357209ms I0807 19:41:39.981663 1148 replica.cpp:676] Persisted action at 0 I0807 19:41:39.981673 1148 replica.cpp:661] Replica learned NOP action at position 0 I0807 19:41:39.981917 1148 log.cpp:672] Writer started with ending position 0 I0807 19:41:39.982179 1148 leveldb.cpp:438] Reading position from leveldb took 15562ns I0807 19:41:39.983762 1148 registrar.cpp:346] Successfully fetched the registry (0B) I0807 19:41:39.983783 1148 registrar.cpp:422] Attempting to update the 'registry' I0807 19:41:39.985309 1148 log.cpp:680] Attempting to append 138 bytes to the log I0807 19:41:39.985353 1148 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1 I0807 19:41:39.985576 1148 replica.cpp:508] Replica received write request for position 1 I0807 19:41:40.002367 1148 leveldb.cpp:343] Persisting action (157 bytes) to leveldb took 16.770862ms I0807 19:41:40.002405 1148 replica.cpp:676] Persisted action at 1 I0807 19:41:40.002647 1148 replica.cpp:655] Replica received learned notice for position 1 I0807 19:41:40.018611 1148 leveldb.cpp:343] Persisting action (159 bytes) to leveldb took 15.944725ms I0807 19:41:40.018648 1148 replica.cpp:676] Persisted action at 1 I0807 19:41:40.018658 1148 replica.cpp:661] Replica learned APPEND action at position 1 I0807 19:41:40.019000 1148 registrar.cpp:479] Successfully updated 'registry' I0807 19:41:40.019037 1148 registrar.cpp:372] Successfully recovered registrar I0807 19:41:40.019076 1148 log.cpp:699] Attempting to truncate the log to 1 I0807 19:41:40.019148 1148 master.cpp:1054] Recovered 0 slaves from the Registry (100B) ; allowing 10mins for slaves to re-register I0807 19:41:40.019187 1148 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 2 I0807 19:41:40.019448 1148 replica.cpp:508] Replica received write request for position 2 I0807 19:41:40.034870 1148 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 15.405034ms I0807 19:41:40.034905 1148 replica.cpp:676] Persisted action at 2 I0807 19:41:40.035141 1148 replica.cpp:655] Replica received learned notice for position 2 I0807 19:41:40.051055 1148 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 15.89786ms I0807 19:41:40.051105 1148 leveldb.cpp:401] Deleting ~1 keys from leveldb took 19442ns I0807 19:41:40.051116 1148 replica.cpp:676] Persisted action at 2 I0807 19:41:40.051126 1148 replica.cpp:661] Replica learned TRUNCATE action at position 2 I0807 19:41:40.053736 32388 containerizer.cpp:124] Using isolation: posix/cpu,posix/mem I0807 19:41:40.061689 1133 slave.cpp:167] Slave started on 176)@140.211.11.27:45707 I0807 19:41:40.061714 1133 credentials.hpp:84] Loading credential for authentication from '/tmp/AuthenticationTest_LeaderElectionDuringSlaveAuthentication_L7Tgoz/credential' I0807 19:41:40.061820 1133 slave.cpp:265] Slave using credential for: test-principal I0807 19:41:40.061957 1133 slave.cpp:278] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0807 19:41:40.062029 1133 slave.cpp:306] Slave hostname: hemera.apache.org I0807 19:41:40.062041 1133 slave.cpp:307] Slave checkpoint: false I0807 19:41:40.062556 1133 state.cpp:33] Recovering state from '/tmp/AuthenticationTest_LeaderElectionDuringSlaveAuthentication_L7Tgoz/meta' I0807 19:41:40.062659 1133 status_update_manager.cpp:193] Recovering status update manager I0807 19:41:40.062722 1133 containerizer.cpp:287] Recovering containerizer I0807 19:41:40.063004 1133 slave.cpp:3175] Finished recovery I0807 19:41:40.063238 1133 slave.cpp:589] New master detected at master@140.211.11.27:45707 I0807 19:41:40.063262 1133 slave.cpp:663] Authenticating with master master@140.211.11.27:45707 I0807 19:41:40.063307 1133 slave.cpp:636] Detecting new master I0807 19:41:40.063344 1133 status_update_manager.cpp:167] New master detected at master@140.211.11.27:45707 I0807 19:41:40.063387 1133 authenticatee.hpp:128] Creating new client SASL connection I0807 19:41:40.063520 1133 master.cpp:3615] Authenticating slave(176)@140.211.11.27:45707 I0807 19:41:40.063597 1133 authenticator.hpp:156] Creating new server SASL connection I0807 19:41:40.063686 1133 authenticatee.hpp:219] Received SASL authentication mechanisms: CRAM-MD5 I0807 19:41:40.063702 1133 authenticatee.hpp:245] Attempting to authenticate with mechanism 'CRAM-MD5' I0807 19:41:40.063730 1133 authenticator.hpp:262] Received SASL authentication start I0807 19:41:40.063777 1133 authenticator.hpp:384] Authentication requires more steps I0807 19:41:40.065558 1140 slave.cpp:589] New master detected at master@140.211.11.27:45707 I0807 19:41:40.065598 1140 slave.cpp:636] Detecting new master I0807 19:41:40.065630 1140 status_update_manager.cpp:167] New master detected at master@140.211.11.27:45707 W0807 19:41:40.069753 1137 slave.cpp:701] Failed to authenticate with master master@140.211.11.27:45707: master changed I0807 19:41:40.069787 1137 slave.cpp:663] Authenticating with master master@140.211.11.27:45707 I0807 19:41:40.069850 1137 authenticatee.hpp:128] Creating new client SASL connection I0807 19:41:40.069954 1137 master.cpp:3601] Queuing up authentication request from slave(176)@140.211.11.27:45707 because authentication is still in progress I0807 19:41:40.869726 1129 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:40.869753 1129 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 60025ns I0807 19:41:41.873703 1117 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:41.873729 1117 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 32883ns I0807 19:41:42.881687 1117 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:42.881716 1117 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 33444ns I0807 19:41:43.885706 1133 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:43.885733 1133 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 43589ns I0807 19:41:44.873805 1107 master.cpp:120] No whitelist given. Advertising offers for all slaves I0807 19:41:44.887039 1107 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:44.887054 1107 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 17800ns W0807 19:41:45.069783 1130 master.cpp:3673] Authentication timed out W0807 19:41:45.073734 1133 slave.cpp:737] Authentication timed out W0807 19:41:45.073828 1133 slave.cpp:701] Failed to authenticate with master master@140.211.11.27:45707: Authentication discarded I0807 19:41:45.073859 1133 slave.cpp:663] Authenticating with master master@140.211.11.27:45707 I0807 19:41:45.073931 1133 authenticatee.hpp:128] Creating new client SASL connection I0807 19:41:45.074075 1133 master.cpp:3601] Queuing up authentication request from slave(176)@140.211.11.27:45707 because authentication is still in progress I0807 19:41:45.889716 1136 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:45.889744 1136 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 48526ns I0807 19:41:46.897703 1136 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:46.897734 1136 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 37717ns I0807 19:41:47.905705 1136 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:47.905733 1136 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 36077ns I0807 19:41:48.913715 1136 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:48.913744 1136 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 37773ns I0807 19:41:49.881702 1136 master.cpp:120] No whitelist given. Advertising offers for all slaves I0807 19:41:49.921392 1136 hierarchical_allocator_process.hpp:688] No resources available to allocate! I0807 19:41:49.921411 1136 hierarchical_allocator_process.hpp:650] Performed allocation for 0 slaves in 24001ns ../../src/tests/authentication_tests.cpp:689: Failure Failed to wait 10secs for slaveRegisteredMessage W0807 19:41:50.081794 1158 slave.cpp:737] Authentication timed out W0807 19:41:50.081881 1158 slave.cpp:701] Failed to authenticate with master master@140.211.11.27:45707: Authentication discarded I0807 19:41:50.081915 1158 slave.cpp:663] Authenticating with master master@140.211.11.27:45707 I0807 19:41:50.082000 1158 authenticatee.hpp:128] Creating new client SASL connection I0807 19:41:50.082160 1158 master.cpp:3601] Queuing up authentication request from slave(176)@140.211.11.27:45707 because authentication is still in progress I0807 19:41:50.082362 1119 slave.cpp:576] No pings from master received within 75secs I0807 19:41:50.082394 1119 slave.cpp:636] Detecting new master I0807 19:41:50.101763 32388 master.cpp:634] Master terminating I0807 19:41:50.105679 1133 slave.cpp:2356] master@140.211.11.27:45707 exited W0807 19:41:50.105697 1133 slave.cpp:2359] Master disconnected! Waiting for a new master to be elected I0807 19:41:50.107046 32388 slave.cpp:466] Slave terminating ../../3rdparty/libprocess/include/process/gmock.hpp:282: Failure Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(testing::A<const MessageEvent&>()))... Expected args: message matcher (8-byte object <A8-9B 02-AC CA-2A 00-00>, 1-byte object <00>, 1-byte object <01>) Expected: to be called once Actual: never called - unsatisfied and active [ FAILED ] AuthenticationTest.LeaderElectionDuringSlaveAuthentication (10305 ms) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)