[jira] [Commented] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky
[ https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958448#comment-14958448 ]

Yong Qiao Wang commented on MESOS-2255:
---

[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on OS X (10.10.4), and it passed:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh --gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from SlaveRecoveryTest/0, where TypeParam = mesos::internal::slave::MesosContainerizer
[ RUN ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[ OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[--] 1 test from SlaveRecoveryTest/0 (1397 ms total)
[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (1406 ms total)
[ PASSED ] 1 test.
{noformat:title=}

Could you let me know which OS and version you ran this test on?
[jira] [Commented] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky
[ https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956331#comment-14956331 ]

Yong Qiao Wang commented on MESOS-2255:
---

I will re-run this test case and fix it if it is still a problem.

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.22.0
> Reporter: Yan Xu
> Assignee: Yong Qiao Wang
> Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for authentication from '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to VOTING
> I0123 07:45:49.860327 17658 recover.cpp:580] Successfully joined the Paxos group
> I0123 07:45:49.860703 17654 recover.cpp:464] Recover process terminated
> I0123 07:45:49.859591 17655 master.cpp:1219] The newly elected leader is master@127.0.1.1:44955 with id 20150123-074549-16842879-44955-17634
> I0123 07:45:49.864702 17655 master.cpp:1232] Elected as the leading master!
> I0123 07:45:49.864904 17655 master.cpp:1050] Recovering from registrar
> I0123 07:45:49.865406 17660 registrar.cpp:313] Recovering registrar
> I0123 07:45:49.866576 17660 log.cpp:660] Attempting to start the writer
> I0123 07:45:49.868638 17658 replica.cpp:477] Replica received implicit promise request with proposal 1
> I0123 07:45:49.872521 17658 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 3.848859ms
> I0123 07:45:49.872555 17658 replica.cpp:345] Persisted promised to 1
> I0123 07:45:49.873769 17661 coordinator.cpp:230] Coordinator attemping to fill missing position
> I0123 07:45:49.875474 17658 replica.cpp:378] Replica received explicit promise request for position 0 with proposal 2
> I0123 07:45:49.880878 17658 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 5.364021ms
> I0123 07:45:49.880913 17658 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.882619 17657 replica.cpp:511] Replica received write request for position 0
> I0123 07:45:49.882998 17657 leveldb.cpp:438] Reading position from leveldb took 150092ns
> I0123 07:45:49.886488 17657 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 3.269189ms
> I0123 07:45:49.886536 17657 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.887181 17657 replica.cpp:658] Replica received learned notice for position 0
> I0123 07:45:49.892900 17657 leveldb.c