[jira] [Commented] (MESOS-1352) Uninitialized scalar field in usage/main.cpp
[ https://issues.apache.org/jira/browse/MESOS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129633#comment-14129633 ] Kamil Domański commented on MESOS-1352: --- False positive. The bool is initialized from a command line parameter with a default false value. > Uninitialized scalar field in usage/main.cpp > > > Key: MESOS-1352 > URL: https://issues.apache.org/jira/browse/MESOS-1352 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen > Labels: coverity > > > *** CID 1213899: Uninitialized scalar field (UNINIT_CTOR) > /src/usage/main.cpp: 56 in Flags::Flags()() > 50 "Whether or not to output ResourceStatistics protobuf\n" > 51 "using the \"recordio\" format, i.e., the size as a \n" > 52 "4 byte unsigned integer followed by the serialized\n" > 53 "protobuf itself. By default the ResourceStatistics\n" > 54 "will be output as JSON", > 55 false); > >>> CID 1213899: Uninitialized scalar field (UNINIT_CTOR) > >>> Non-static class member "recordio" is not initialized in this > >>> constructor nor in any functions that it calls. > 56 } > 57 > 58 Option pid; > 59 bool recordio; > 60 }; > 61 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
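The false positive is easier to see with the pattern spelled out. The sketch below is not the real stout flags API; it is a minimal, self-contained analogue of the registration idiom used in usage/main.cpp, where the constructor hands a pointer to the member and a default value to an add() helper instead of assigning the field in an initializer list. Static analyzers often cannot track a write that happens behind that indirection, which is how a field that always ends up with a value still gets reported as UNINIT_CTOR.

{code}
// Minimal, self-contained analogue (NOT the real stout flags API) of the
// registration pattern Coverity trips over in usage/main.cpp.
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

class FlagsBase
{
public:
  // Registers a boolean flag: the default is written through the pointer
  // here, and an override may be applied later by load().
  void add(bool* member, const std::string& name, bool defaultValue)
  {
    *member = defaultValue;
    parsers[name] = [member](const std::string& value) {
      *member = (value == "true" || value == "1");
    };
  }

  // Applies "--name=value" overrides.
  void load(const std::vector<std::string>& args)
  {
    for (const std::string& arg : args) {
      std::string::size_type eq = arg.find('=');
      if (arg.compare(0, 2, "--") == 0 && eq != std::string::npos) {
        auto it = parsers.find(arg.substr(2, eq - 2));
        if (it != parsers.end()) {
          it->second(arg.substr(eq + 1));
        }
      }
    }
  }

private:
  std::map<std::string, std::function<void(const std::string&)>> parsers;
};

class Flags : public FlagsBase
{
public:
  Flags()
  {
    // 'recordio' is only ever written through this call; the assignment is
    // hidden behind a pointer handed to a helper, which is typically enough
    // for a checker to lose track of it and report UNINIT_CTOR.
    add(&recordio, "recordio", false);
  }

  bool recordio;
};

int main()
{
  Flags flags;
  flags.load({"--recordio=true"});
  std::cout << std::boolalpha << flags.recordio << std::endl; // prints "true"
  return 0;
}
{code}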
[jira] [Updated] (MESOS-1786) FaultToleranceTest.ReconcilePendingTasks is flaky.
[ https://issues.apache.org/jira/browse/MESOS-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1786: --- Sprint: Mesos Q3 Sprint 5 > FaultToleranceTest.ReconcilePendingTasks is flaky. > -- > > Key: MESOS-1786 > URL: https://issues.apache.org/jira/browse/MESOS-1786 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler > > {noformat} > [ RUN ] FaultToleranceTest.ReconcilePendingTasks > Using temporary directory > '/tmp/FaultToleranceTest_ReconcilePendingTasks_TwmFlm' > I0910 20:18:02.308562 21634 leveldb.cpp:176] Opened db in 28.520372ms > I0910 20:18:02.315268 21634 leveldb.cpp:183] Compacted db in 6.37495ms > I0910 20:18:02.315588 21634 leveldb.cpp:198] Created db iterator in 6338ns > I0910 20:18:02.315745 21634 leveldb.cpp:204] Seeked to beginning of db in > 1781ns > I0910 20:18:02.315901 21634 leveldb.cpp:273] Iterated through 0 keys in the > db in 537ns > I0910 20:18:02.316076 21634 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0910 20:18:02.316524 21654 recover.cpp:425] Starting replica recovery > I0910 20:18:02.316800 21654 recover.cpp:451] Replica is in EMPTY status > I0910 20:18:02.317245 21654 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0910 20:18:02.317445 21654 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0910 20:18:02.317672 21654 recover.cpp:542] Updating replica status to > STARTING > I0910 20:18:02.321723 21652 master.cpp:286] Master > 20140910-201802-16842879-60361-21634 (precise) started on 127.0.1.1:60361 > I0910 20:18:02.322041 21652 master.cpp:332] Master only allowing > authenticated frameworks to register > I0910 20:18:02.322320 21652 master.cpp:337] Master only allowing > authenticated slaves to register > I0910 20:18:02.322568 21652 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/FaultToleranceTest_ReconcilePendingTasks_TwmFlm/credentials' > I0910 20:18:02.323031 21652 master.cpp:366] Authorization enabled > I0910 20:18:02.323663 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 5.781277ms > I0910 20:18:02.324074 21654 replica.cpp:320] Persisted replica status to > STARTING > I0910 20:18:02.324443 21654 recover.cpp:451] Replica is in STARTING status > I0910 20:18:02.325106 21654 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0910 20:18:02.325454 21654 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0910 20:18:02.326408 21654 recover.cpp:542] Updating replica status to VOTING > I0910 20:18:02.323892 21649 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@127.0.1.1:60361 > I0910 20:18:02.326120 21652 master.cpp:1212] The newly elected leader is > master@127.0.1.1:60361 with id 20140910-201802-16842879-60361-21634 > I0910 20:18:02.323938 21651 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0910 20:18:04.209081 21655 hierarchical_allocator_process.hpp:697] No > resources available to allocate! > I0910 20:18:04.209183 21655 hierarchical_allocator_process.hpp:659] Performed > allocation for 0 slaves in 118308ns > I0910 20:18:04.209230 21652 master.cpp:1225] Elected as the leading master! 
> I0910 20:18:04.209246 21652 master.cpp:1043] Recovering from registrar > I0910 20:18:04.209360 21650 registrar.cpp:313] Recovering registrar > I0910 20:18:04.214040 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.887284299secs > I0910 20:18:04.214094 21654 replica.cpp:320] Persisted replica status to > VOTING > I0910 20:18:04.214190 21654 recover.cpp:556] Successfully joined the Paxos > group > I0910 20:18:04.214258 21654 recover.cpp:440] Recover process terminated > I0910 20:18:04.214437 21654 log.cpp:656] Attempting to start the writer > I0910 20:18:04.214756 21654 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0910 20:18:04.223865 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 9.044596ms > I0910 20:18:04.223944 21654 replica.cpp:342] Persisted promised to 1 > I0910 20:18:04.229053 21652 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0910 20:18:04.229552 21652 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0910 20:18:04.248437 2
[jira] [Created] (MESOS-1786) FaultToleranceTest.ReconcilePendingTasks is flaky.
Benjamin Mahler created MESOS-1786: -- Summary: FaultToleranceTest.ReconcilePendingTasks is flaky. Key: MESOS-1786 URL: https://issues.apache.org/jira/browse/MESOS-1786 Project: Mesos Issue Type: Bug Components: test Reporter: Benjamin Mahler Assignee: Benjamin Mahler {noformat} [ RUN ] FaultToleranceTest.ReconcilePendingTasks Using temporary directory '/tmp/FaultToleranceTest_ReconcilePendingTasks_TwmFlm' I0910 20:18:02.308562 21634 leveldb.cpp:176] Opened db in 28.520372ms I0910 20:18:02.315268 21634 leveldb.cpp:183] Compacted db in 6.37495ms I0910 20:18:02.315588 21634 leveldb.cpp:198] Created db iterator in 6338ns I0910 20:18:02.315745 21634 leveldb.cpp:204] Seeked to beginning of db in 1781ns I0910 20:18:02.315901 21634 leveldb.cpp:273] Iterated through 0 keys in the db in 537ns I0910 20:18:02.316076 21634 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0910 20:18:02.316524 21654 recover.cpp:425] Starting replica recovery I0910 20:18:02.316800 21654 recover.cpp:451] Replica is in EMPTY status I0910 20:18:02.317245 21654 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request I0910 20:18:02.317445 21654 recover.cpp:188] Received a recover response from a replica in EMPTY status I0910 20:18:02.317672 21654 recover.cpp:542] Updating replica status to STARTING I0910 20:18:02.321723 21652 master.cpp:286] Master 20140910-201802-16842879-60361-21634 (precise) started on 127.0.1.1:60361 I0910 20:18:02.322041 21652 master.cpp:332] Master only allowing authenticated frameworks to register I0910 20:18:02.322320 21652 master.cpp:337] Master only allowing authenticated slaves to register I0910 20:18:02.322568 21652 credentials.hpp:36] Loading credentials for authentication from '/tmp/FaultToleranceTest_ReconcilePendingTasks_TwmFlm/credentials' I0910 20:18:02.323031 21652 master.cpp:366] Authorization enabled I0910 20:18:02.323663 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 5.781277ms I0910 20:18:02.324074 21654 replica.cpp:320] Persisted replica status to STARTING I0910 20:18:02.324443 21654 recover.cpp:451] Replica is in STARTING status I0910 20:18:02.325106 21654 replica.cpp:638] Replica in STARTING status received a broadcasted recover request I0910 20:18:02.325454 21654 recover.cpp:188] Received a recover response from a replica in STARTING status I0910 20:18:02.326408 21654 recover.cpp:542] Updating replica status to VOTING I0910 20:18:02.323892 21649 hierarchical_allocator_process.hpp:299] Initializing hierarchical allocator process with master : master@127.0.1.1:60361 I0910 20:18:02.326120 21652 master.cpp:1212] The newly elected leader is master@127.0.1.1:60361 with id 20140910-201802-16842879-60361-21634 I0910 20:18:02.323938 21651 master.cpp:120] No whitelist given. Advertising offers for all slaves I0910 20:18:04.209081 21655 hierarchical_allocator_process.hpp:697] No resources available to allocate! I0910 20:18:04.209183 21655 hierarchical_allocator_process.hpp:659] Performed allocation for 0 slaves in 118308ns I0910 20:18:04.209230 21652 master.cpp:1225] Elected as the leading master! 
I0910 20:18:04.209246 21652 master.cpp:1043] Recovering from registrar I0910 20:18:04.209360 21650 registrar.cpp:313] Recovering registrar I0910 20:18:04.214040 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 1.887284299secs I0910 20:18:04.214094 21654 replica.cpp:320] Persisted replica status to VOTING I0910 20:18:04.214190 21654 recover.cpp:556] Successfully joined the Paxos group I0910 20:18:04.214258 21654 recover.cpp:440] Recover process terminated I0910 20:18:04.214437 21654 log.cpp:656] Attempting to start the writer I0910 20:18:04.214756 21654 replica.cpp:474] Replica received implicit promise request with proposal 1 I0910 20:18:04.223865 21654 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 9.044596ms I0910 20:18:04.223944 21654 replica.cpp:342] Persisted promised to 1 I0910 20:18:04.229053 21652 coordinator.cpp:230] Coordinator attemping to fill missing position I0910 20:18:04.229552 21652 replica.cpp:375] Replica received explicit promise request for position 0 with proposal 2 I0910 20:18:04.248437 21652 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 18.839475ms I0910 20:18:04.248525 21652 replica.cpp:676] Persisted action at 0 I0910 20:18:04.251194 21650 replica.cpp:508] Replica received write request for position 0 I0910 20:18:04.251260 21650 leveldb.cpp:438] Reading position from leveldb took 43213ns I0910 20:18:04.262251 21650 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 10.949353ms I0910 20:18:04.262346 21650 replica.cpp:676] Persisted action at 0 I0910 20:18:04.262717 21650 replica.cpp:655] Replica received learned notice for position 0 I091
[jira] [Resolved] (MESOS-1779) Mesos style checker should catch trailing white space
[ https://issues.apache.org/jira/browse/MESOS-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone resolved MESOS-1779. --- Resolution: Fixed Assignee: Kamil Domański commit 60dd10cfb8aef1ae661d17bd3d28f0596b731ff3 Author: Kamil Domanski Date: Wed Sep 10 21:20:03 2014 -0700 Added support for catching trailing white spaces in the style checker. Review: https://reviews.apache.org/r/25526 > Mesos style checker should catch trailing white space > - > > Key: MESOS-1779 > URL: https://issues.apache.org/jira/browse/MESOS-1779 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone >Assignee: Kamil Domański > Labels: newbie > > Trailing white space errors are currently not caught by the style checker. It > should! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
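For reference, the rule itself is simple; the actual change in https://reviews.apache.org/r/25526 lives in the project's style-checker tooling, so the standalone C++ sketch below is only a language-agnostic illustration of what "catch trailing white space" means: reject any line whose last character is a space or a tab.

{code}
// Standalone illustration of the rule added in r/25526: flag every line
// that ends in a space or a tab.
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv)
{
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " <file>" << std::endl;
    return 1;
  }

  std::ifstream file(argv[1]);
  std::string line;
  int lineno = 0;
  int errors = 0;

  while (std::getline(file, line)) {
    ++lineno;
    if (!line.empty() && (line.back() == ' ' || line.back() == '\t')) {
      std::cout << argv[1] << ":" << lineno << ": trailing whitespace"
                << std::endl;
      ++errors;
    }
  }

  return errors == 0 ? 0 : 1;
}
{code}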
[jira] [Updated] (MESOS-1779) Mesos style checker should catch trailing white space
[ https://issues.apache.org/jira/browse/MESOS-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-1779: -- Fix Version/s: 0.21.0 > Mesos style checker should catch trailing white space > - > > Key: MESOS-1779 > URL: https://issues.apache.org/jira/browse/MESOS-1779 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone >Assignee: Kamil Domański > Labels: newbie > Fix For: 0.21.0 > > > Trailing white space errors are currently not caught by the style checker. It > should! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1779) Mesos style checker should catch trailing white space
[ https://issues.apache.org/jira/browse/MESOS-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129579#comment-14129579 ] Kamil Domański commented on MESOS-1779: --- https://reviews.apache.org/r/25526/ > Mesos style checker should catch trailing white space > - > > Key: MESOS-1779 > URL: https://issues.apache.org/jira/browse/MESOS-1779 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone > Labels: newbie > > Trailing white space errors are currently not caught by the style checker. It > should! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129546#comment-14129546 ] Cody Maloney commented on MESOS-1739: - [~vinodkone]: New review request (https://reviews.apache.org/r/25525/). Updated the bug title. Tests now pass. All functionality is there. All comments are incorporated except the patch still allows both resources and attributes to be set to supersets of what they currently are. If someone's setup relies on a critical negative attribute check, they should be aware of that and simply never add that attribute at runtime (they can always fully kill the slave and then restart it). Changing the recover behavior in this case doesn't break their setups, and there are a number of cases where we would like increasing attribute sets. The check is in one place, and would be one line to change / remove / make identical if that is a hard requirement. > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Improvement >Reporter: Patrick Reilly >Assignee: Cody Maloney > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
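A hedged sketch of the superset check described in the comment above (the names and types are hypothetical, not the actual Mesos code): on recovery, the slave's new attribute set is accepted only if it contains every checkpointed attribute with the same value, so new attributes may be added but nothing already relied on can disappear or change.

{code}
// Hypothetical names -- a sketch of the "superset" compatibility check, not
// the actual Mesos implementation.
#include <iostream>
#include <map>
#include <string>

using Attributes = std::map<std::string, std::string>;

// Accepts 'current' only if it keeps every attribute from 'previous' with an
// unchanged value; brand-new attributes are allowed.
bool isCompatibleSuperset(const Attributes& previous, const Attributes& current)
{
  for (const auto& attribute : previous) {
    auto it = current.find(attribute.first);
    if (it == current.end() || it->second != attribute.second) {
      return false;
    }
  }
  return true;
}

int main()
{
  Attributes checkpointed = {{"os", "linux"}};
  Attributes restarted = {{"os", "linux"}, {"rack", "r1"}};

  std::cout << std::boolalpha
            << isCompatibleSuperset(checkpointed, restarted) << std::endl; // true
  std::cout << isCompatibleSuperset(restarted, checkpointed) << std::endl;  // false
  return 0;
}
{code}

As the comment notes, a setup that depends on an attribute being absent is not protected by such a check; the safeguard there is simply to never add that attribute at runtime.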
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129485#comment-14129485 ] Patrick Reilly commented on MESOS-1739: --- [~vinodkone] I've gone ahead and closed https://reviews.apache.org/r/25111/ I'll have [~cmaloney] submit a new review board shortly. > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Improvement >Reporter: Patrick Reilly >Assignee: Cody Maloney > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Maloney updated MESOS-1739: Summary: Allow slave reconfiguration on restart (was: Add Dynamic Slave Attributes) > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Improvement >Reporter: Patrick Reilly >Assignee: Cody Maloney > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-1739) Add Dynamic Slave Attributes
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Maloney reassigned MESOS-1739: --- Assignee: Cody Maloney (was: Patrick Reilly) > Add Dynamic Slave Attributes > > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Improvement >Reporter: Patrick Reilly >Assignee: Cody Maloney > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-1785) ExampleTest.LowLevelSchedulerLibprocess is flaky
Yan Xu created MESOS-1785: - Summary: ExampleTest.LowLevelSchedulerLibprocess is flaky Key: MESOS-1785 URL: https://issues.apache.org/jira/browse/MESOS-1785 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.20.0 Reporter: Yan Xu The test lasted forever because task 1 is not completed. {noformat:title=log} [ RUN ] ExamplesTest.LowLevelSchedulerLibprocess Using temporary directory '/tmp/ExamplesTest_LowLevelSchedulerLibprocess_s2iS1n' WARNING: Logging before InitGoogleLogging() is written to STDERR I0910 05:57:28.191807 17625 process.cpp:1771] libprocess is initialized on 127.0.1.1:47878 for 8 cpus Enabling authentication for the scheduler I0910 05:57:28.193083 17625 logging.cpp:177] Logging to STDERR I0910 05:57:28.193274 17625 scheduler.cpp:145] Version: 0.21.0 I0910 05:57:28.224969 17625 leveldb.cpp:176] Opened db in 29.007559ms I0910 05:57:28.234238 17625 leveldb.cpp:183] Compacted db in 9.042296ms I0910 05:57:28.234468 17625 leveldb.cpp:198] Created db iterator in 32144ns I0910 05:57:28.234742 17625 leveldb.cpp:204] Seeked to beginning of db in 1548ns I0910 05:57:28.234879 17625 leveldb.cpp:273] Iterated through 0 keys in the db in 5502ns I0910 05:57:28.235086 17625 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0910 05:57:28.237017 17654 master.cpp:286] Master 20140910-055728-16842879-47878-17625 (trusty) started on 127.0.1.1:47878 I0910 05:57:28.237479 17654 master.cpp:332] Master only allowing authenticated frameworks to register I0910 05:57:28.237592 17654 master.cpp:339] Master allowing unauthenticated slaves to register I0910 05:57:28.237733 17654 credentials.hpp:36] Loading credentials for authentication from '/tmp/ExamplesTest_LowLevelSchedulerLibprocess_s2iS1n/credentials' W0910 05:57:28.237985 17654 credentials.hpp:51] Permissions on credentials file '/tmp/ExamplesTest_LowLevelSchedulerLibprocess_s2iS1n/credentials' are too open. It is recommended that your credentials file is NOT accessible by others. I0910 05:57:28.238147 17654 master.cpp:366] Authorization enabled I0910 05:57:28.237361 17652 recover.cpp:425] Starting replica recovery I0910 05:57:28.238878 17651 recover.cpp:451] Replica is in EMPTY status I0910 05:57:28.239704 17651 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request I0910 05:57:28.240255 17651 recover.cpp:188] Received a recover response from a replica in EMPTY status I0910 05:57:28.240582 17651 recover.cpp:542] Updating replica status to STARTING I0910 05:57:28.240134 17650 master.cpp:120] No whitelist given. Advertising offers for all slaves I0910 05:57:28.241950 17652 hierarchical_allocator_process.hpp:299] Initializing hierarchical allocator process with master : master@127.0.1.1:47878 I0910 05:57:28.243247 17651 master.cpp:1212] The newly elected leader is master@127.0.1.1:47878 with id 20140910-055728-16842879-47878-17625 I0910 05:57:28.243402 17651 master.cpp:1225] Elected as the leading master! 
I0910 05:57:28.243620 17651 master.cpp:1043] Recovering from registrar I0910 05:57:28.243831 17651 registrar.cpp:313] Recovering registrar I0910 05:57:28.244851 17625 containerizer.cpp:89] Using isolation: posix/cpu,posix/mem I0910 05:57:28.246381 17652 slave.cpp:167] Slave started on 1)@127.0.1.1:47878 I0910 05:57:28.246824 17652 slave.cpp:287] Slave resources: cpus(*):1; mem(*):1986; disk(*):24988; ports(*):[31000-32000] I0910 05:57:28.247046 17652 slave.cpp:315] Slave hostname: trusty I0910 05:57:28.247133 17652 slave.cpp:316] Slave checkpoint: true I0910 05:57:28.247632 17652 state.cpp:33] Recovering state from '/tmp/mesos-SUO0qf/0/meta' I0910 05:57:28.247834 17650 status_update_manager.cpp:193] Recovering status update manager I0910 05:57:28.247987 17650 containerizer.cpp:252] Recovering containerizer I0910 05:57:28.248399 17652 slave.cpp:3202] Finished recovery I0910 05:57:28.248921 17652 slave.cpp:598] New master detected at master@127.0.1.1:47878 I0910 05:57:28.249073 17656 status_update_manager.cpp:167] New master detected at master@127.0.1.1:47878 I0910 05:57:28.249191 17652 slave.cpp:634] No credentials provided. Attempting to register without authentication I0910 05:57:28.249337 17652 slave.cpp:645] Detecting new master I0910 05:57:28.249740 17625 containerizer.cpp:89] Using isolation: posix/cpu,posix/mem I0910 05:57:28.251518 17650 slave.cpp:167] Slave started on 2)@127.0.1.1:47878 I0910 05:57:28.251782 17650 slave.cpp:287] Slave resources: cpus(*):1; mem(*):1986; disk(*):24988; ports(*):[31000-32000] I0910 05:57:28.251960 17650 slave.cpp:315] Slave hostname: trusty I0910 05:57:28.252079 17650 slave.cpp:316] Slave checkpoint: true I0910 05:57:28.252481 17650 state.cpp:33] Recovering state from '/tmp/mesos-SUO0qf/1/meta' I0910 05:57:28.252691 17650 status_update_man
[jira] [Commented] (MESOS-1783) MasterTest.LaunchDuplicateOfferTest is flaky
[ https://issues.apache.org/jira/browse/MESOS-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129269#comment-14129269 ] Niklas Quarfot Nielsen commented on MESOS-1783: --- Will do - sorry for the tardy reply > MasterTest.LaunchDuplicateOfferTest is flaky > > > Key: MESOS-1783 > URL: https://issues.apache.org/jira/browse/MESOS-1783 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.20.0 > Environment: ubuntu-14.04-gcc Jenkins VM >Reporter: Yan Xu > > {noformat:title=} > [ RUN ] MasterTest.LaunchDuplicateOfferTest > Using temporary directory '/tmp/MasterTest_LaunchDuplicateOfferTest_3ifzmg' > I0909 22:46:59.212977 21883 leveldb.cpp:176] Opened db in 20.307533ms > I0909 22:46:59.219717 21883 leveldb.cpp:183] Compacted db in 6.470397ms > I0909 22:46:59.219925 21883 leveldb.cpp:198] Created db iterator in 5571ns > I0909 22:46:59.220100 21883 leveldb.cpp:204] Seeked to beginning of db in > 1365ns > I0909 22:46:59.220268 21883 leveldb.cpp:273] Iterated through 0 keys in the > db in 658ns > I0909 22:46:59.220448 21883 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0909 22:46:59.220855 21903 recover.cpp:425] Starting replica recovery > I0909 22:46:59.221103 21903 recover.cpp:451] Replica is in EMPTY status > I0909 22:46:59.221626 21903 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0909 22:46:59.221914 21903 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0909 22:46:59.04 21903 recover.cpp:542] Updating replica status to > STARTING > I0909 22:46:59.232590 21900 master.cpp:286] Master > 20140909-224659-16842879-44263-21883 (trusty) started on 127.0.1.1:44263 > I0909 22:46:59.233278 21900 master.cpp:332] Master only allowing > authenticated frameworks to register > I0909 22:46:59.233543 21900 master.cpp:337] Master only allowing > authenticated slaves to register > I0909 22:46:59.233934 21900 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/MasterTest_LaunchDuplicateOfferTest_3ifzmg/credentials' > I0909 22:46:59.236431 21900 master.cpp:366] Authorization enabled > I0909 22:46:59.237522 21898 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@127.0.1.1:44263 > I0909 22:46:59.237877 21904 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0909 22:46:59.238723 21903 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 16.245391ms > I0909 22:46:59.238916 21903 replica.cpp:320] Persisted replica status to > STARTING > I0909 22:46:59.239203 21903 recover.cpp:451] Replica is in STARTING status > I0909 22:46:59.239724 21903 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0909 22:46:59.239967 21903 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0909 22:46:59.240304 21903 recover.cpp:542] Updating replica status to VOTING > I0909 22:46:59.240684 21900 master.cpp:1212] The newly elected leader is > master@127.0.1.1:44263 with id 20140909-224659-16842879-44263-21883 > I0909 22:46:59.240846 21900 master.cpp:1225] Elected as the leading master! 
> I0909 22:46:59.241149 21900 master.cpp:1043] Recovering from registrar > I0909 22:46:59.241509 21898 registrar.cpp:313] Recovering registrar > I0909 22:46:59.248440 21903 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 7.864221ms > I0909 22:46:59.248644 21903 replica.cpp:320] Persisted replica status to > VOTING > I0909 22:46:59.248846 21903 recover.cpp:556] Successfully joined the Paxos > group > I0909 22:46:59.249330 21897 log.cpp:656] Attempting to start the writer > I0909 22:46:59.249809 21897 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0909 22:46:59.250075 21903 recover.cpp:440] Recover process terminated > I0909 22:46:59.258286 21897 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 8.292514ms > I0909 22:46:59.258489 21897 replica.cpp:342] Persisted promised to 1 > I0909 22:46:59.258848 21897 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0909 22:46:59.259454 21897 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0909 22:46:59.267755 21897 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 8.109338ms > I0909 22:46:59.267916 21897 replica.cpp:676] Persisted action at 0 > I0909 22:46:59.270128 21902 replica.cpp:508] Replica received write request > for position 0 > I0909 22:46:59.270294 21902 leveldb.cpp:438] Reading position from leveldb > took 27443ns > I0909 22:46:59.277220 21902 leveldb.cpp:343] Persisting action (14 bytes) to > leve
[jira] [Commented] (MESOS-1739) Add Dynamic Slave Attributes
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129212#comment-14129212 ] Vinod Kone commented on MESOS-1739: --- [~preillyme] Do you want to rename the title/description of the ticket and the review per the revised semantics we discussed? Also, let me know if/when it's ready to review. Also, you asked about the design doc for the framework update earlier. You can find it here: MESOS-1784. > Add Dynamic Slave Attributes > > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Improvement >Reporter: Patrick Reilly >Assignee: Patrick Reilly > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (MESOS-1728) Libprocess: report bind parameters on failure
[ https://issues.apache.org/jira/browse/MESOS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone resolved MESOS-1728. --- Resolution: Fixed Fix Version/s: 0.21.0 Target Version/s: 0.20.1 commit 70784a9f234b2902d6fee11298365d9b08756313 Author: Nikita Vetoshkin Date: Thu Aug 21 11:40:55 2014 -0700 Report bind parameters on failure. Review: https://reviews.apache.org/r/24939 > Libprocess: report bind parameters on failure > - > > Key: MESOS-1728 > URL: https://issues.apache.org/jira/browse/MESOS-1728 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Nikita Vetoshkin >Assignee: Nikita Vetoshkin >Priority: Trivial > Fix For: 0.21.0 > > > When you attempt to start slave or master and there's another one already > running there, it is nice to report what are the actual parameters to > {{bind}} call that failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
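The committed change is in libprocess; the sketch below is not that code, just an illustration of the idea with plain POSIX sockets: when bind() fails, put the requested address and port (and the errno text) into the error message instead of a bare "bind failed", so "address already in use" immediately tells the operator which endpoint is contended.

{code}
// Plain POSIX illustration (not the libprocess change itself): include the
// requested address, port and errno text in the bind() failure message.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <string>

int bindOrExplain(int fd, const std::string& ip, uint16_t port)
{
  sockaddr_in addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  inet_pton(AF_INET, ip.c_str(), &addr.sin_addr);

  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    // The parameters are part of the message, not just "bind failed".
    std::cerr << "Failed to bind on " << ip << ":" << port << ": "
              << std::strerror(errno) << std::endl;
    return -1;
  }
  return 0;
}

int main()
{
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  bindOrExplain(fd, "127.0.0.1", 5050); // port chosen only for illustration
  close(fd);
  return 0;
}
{code}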
[jira] [Commented] (MESOS-1784) Design the semantics for updating FrameworkInfo
[ https://issues.apache.org/jira/browse/MESOS-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129179#comment-14129179 ] Vinod Kone commented on MESOS-1784: --- https://docs.google.com/document/d/1vEBuFN9mm3HkrNCmkAuwX-4kNYv0MwjUecLE3Jp_BqE/edit?usp=sharing > Design the semantics for updating FrameworkInfo > --- > > Key: MESOS-1784 > URL: https://issues.apache.org/jira/browse/MESOS-1784 > Project: Mesos > Issue Type: Bug >Reporter: Vinod Kone >Assignee: Vinod Kone > > Currently, there is no easy way for frameworks to update their > FrameworkInfo, resulting in issues like MESOS-703 and MESOS-1218. > This ticket captures the design for doing a FrameworkInfo update without having > to roll masters/slaves/tasks/executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1709) ExamplesTest.NoExecutorFramework is flaky
[ https://issues.apache.org/jira/browse/MESOS-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129015#comment-14129015 ] Yan Xu commented on MESOS-1709: --- This can be prevented if ExecutorDriver::join() waits for all status update acks. See MESOS-243 > ExamplesTest.NoExecutorFramework is flaky > - > > Key: MESOS-1709 > URL: https://issues.apache.org/jira/browse/MESOS-1709 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone > > Seen this happen couple of times on Twitter CI machines. Looks like the slave > sends TASK_FAILED for one of the executors because it got a > executorTerminated() signal before it got a TASK_FINISHED signal. > {code} > [ RUN ] ExamplesTest.NoExecutorFramework > Using temporary directory '/tmp/ExamplesTest_NoExecutorFramework_dZZzd6' > Enabling authentication for the framework > WARNING: Logging before InitGoogleLogging() is written to STDERR > I0815 18:39:25.623885 5879 process.cpp:1770] libprocess is initialized on > 192.168.122.164:53897 for 8 cpus > I0815 18:39:25.624589 5879 logging.cpp:172] Logging to STDERR > I0815 18:39:25.627943 5879 leveldb.cpp:176] Opened db in 897745ns > I0815 18:39:25.628557 5879 leveldb.cpp:183] Compacted db in 467234ns > I0815 18:39:25.628706 5879 leveldb.cpp:198] Created db iterator in 11396ns > I0815 18:39:25.628939 5879 leveldb.cpp:204] Seeked to beginning of db in > 1391ns > I0815 18:39:25.629060 5879 leveldb.cpp:273] Iterated through 0 keys in the > db in 678ns > I0815 18:39:25.629261 5879 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0815 18:39:25.630502 5909 recover.cpp:425] Starting replica recovery > I0815 18:39:25.630935 5909 recover.cpp:451] Replica is in EMPTY status > I0815 18:39:25.631501 5909 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0815 18:39:25.631804 5909 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0815 18:39:25.632524 5909 recover.cpp:542] Updating replica status to > STARTING > I0815 18:39:25.632935 5909 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 48908ns > I0815 18:39:25.633219 5909 replica.cpp:320] Persisted replica status to > STARTING > I0815 18:39:25.633545 5909 recover.cpp:451] Replica is in STARTING status > I0815 18:39:25.634224 5905 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0815 18:39:25.634405 5905 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0815 18:39:25.634724 5909 recover.cpp:542] Updating replica status to VOTING > I0815 18:39:25.634948 5908 master.cpp:286] Master > 20140815-183925-2759502016-53897-5879 (fedora-20) started on > 192.168.122.164:53897 > I0815 18:39:25.635123 5908 master.cpp:323] Master only allowing > authenticated frameworks to register > I0815 18:39:25.635311 5908 master.cpp:330] Master allowing unauthenticated > slaves to register > I0815 18:39:25.635455 5908 credentials.hpp:36] Loading credentials for > authentication from '/tmp/ExamplesTest_NoExecutorFramework_dZZzd6/credentials' > W0815 18:39:25.635658 5908 credentials.hpp:51] Permissions on credentials > file '/tmp/ExamplesTest_NoExecutorFramework_dZZzd6/credentials' are too open. > It is recommended that your credentials file is NOT accessible by others. 
> I0815 18:39:25.635861 5908 master.cpp:357] Authorization enabled > I0815 18:39:25.636286 5910 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@192.168.122.164:53897 > I0815 18:39:25.636443 5907 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0815 18:39:25.637657 5908 master.cpp:1196] The newly elected leader is > master@192.168.122.164:53897 with id 20140815-183925-2759502016-53897-5879 > I0815 18:39:25.638296 5908 master.cpp:1209] Elected as the leading master! > I0815 18:39:25.638254 5906 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 42875ns > I0815 18:39:25.638552 5906 replica.cpp:320] Persisted replica status to > VOTING > I0815 18:39:25.638737 5906 recover.cpp:556] Successfully joined the Paxos > group > I0815 18:39:25.638926 5906 recover.cpp:440] Recover process terminated > I0815 18:39:25.639225 5908 master.cpp:1027] Recovering from registrar > I0815 18:39:25.639457 5907 registrar.cpp:313] Recovering registrar > I0815 18:39:25.639850 5907 log.cpp:656] Attempting to start the writer > I0815 18:39:25.640336 5907 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0815 18:39:25.640530 5907 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 37820ns > I0815 18:39:25.640714 5907 replica.cpp:342] Persisted promised to 1 > I08
[jira] [Commented] (MESOS-243) driver stop() should block until outstanding requests have been persisted
[ https://issues.apache.org/jira/browse/MESOS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129014#comment-14129014 ] Yan Xu commented on MESOS-243: -- Making sure messages are flushed still isn't sufficient to ensure they are received. We can have ExecutorDriver.join() wait for all status updates to be acked. > driver stop() should block until outstanding requests have been persisted > - > > Key: MESOS-243 > URL: https://issues.apache.org/jira/browse/MESOS-243 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 0.14.1, > 0.14.2, 0.15.0 >Reporter: brian wickman > > in our executor, we send a terminal status update message and immediately > call driver.stop(). it turns out that the status update is dispatched > asynchronously and races with driver shutdown, causing tasks to instead > periodically go into LOST state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
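A sketch of the mechanism suggested in this comment and in the MESOS-1709 comment above (hypothetical class and method names, not the real ExecutorDriver API): count status updates as they are sent, decrement on acknowledgement, and have join() block on a condition variable until the count reaches zero, so a terminal update sent immediately before stop()/join() cannot race with driver shutdown.

{code}
// Hypothetical names -- not the real ExecutorDriver API, just the waiting
// mechanism suggested in the comment.
#include <condition_variable>
#include <mutex>

class AckAwareDriver
{
public:
  // Called when a status update is handed to the driver for sending.
  void sendStatusUpdate()
  {
    std::lock_guard<std::mutex> lock(mutex);
    ++pending;
  }

  // Called when the acknowledgement for an update arrives.
  void acknowledged()
  {
    std::lock_guard<std::mutex> lock(mutex);
    if (pending > 0 && --pending == 0) {
      allAcked.notify_all();
    }
  }

  // Blocks until every sent update has been acknowledged.
  void join()
  {
    std::unique_lock<std::mutex> lock(mutex);
    allAcked.wait(lock, [this] { return pending == 0; });
  }

private:
  std::mutex mutex;
  std::condition_variable allAcked;
  int pending = 0;
};
{code}

A real implementation would presumably bound the wait (a lost acknowledgement should not hang join() forever), but the ordering guarantee is the point: join() returns only after the terminal update has been acked.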
[jira] [Commented] (MESOS-1776) --without-PACKAGE will set incorrect dependency prefix
[ https://issues.apache.org/jira/browse/MESOS-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129008#comment-14129008 ] Kamil Domański commented on MESOS-1776: --- [~tstclair], I'd like some feedback on whether the ability to provide a prefix for unbundled dependencies is necessary. The current method for providing the prefix is a bit unusual and conflicts with canonical usage of *\-\-with-X* and *\-\-without-X* flags. A workaround is possible by checking for the variables in question being equal to "*no*" and changing their values to "*/usr*" in such cases. However, I see removal of prefixing altogether as a preferred solution. Either way, I'd like to take care of this as soon as I'm pointed in the right direction. > --without-PACKAGE will set incorrect dependency prefix > -- > > Key: MESOS-1776 > URL: https://issues.apache.org/jira/browse/MESOS-1776 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.20.0 >Reporter: Kamil Domański > Labels: build > > When disabling a particular bundled dependency with *--without-PACKAGE*, the > build scripts of both Mesos and libprocess will set a corresponding variable > to "no". This is later treated as a prefix under which to search for the > package. > For example, with *--without-protobuf*, the script will search for *protoc* > under *no/bin* and obviously fail. I would propose to get rid of these > prefixes entirely and instead search in default locations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1766) MasterAuthorizationTest.DuplicateRegistration test is flaky
[ https://issues.apache.org/jira/browse/MESOS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129003#comment-14129003 ] Vinod Kone commented on MESOS-1766: --- https://reviews.apache.org/r/25516/ > MasterAuthorizationTest.DuplicateRegistration test is flaky > --- > > Key: MESOS-1766 > URL: https://issues.apache.org/jira/browse/MESOS-1766 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone >Assignee: Vinod Kone > > {code} > [ RUN ] MasterAuthorizationTest.DuplicateRegistration > Using temporary directory > '/tmp/MasterAuthorizationTest_DuplicateRegistration_pVJg7m' > I0905 15:53:16.398993 25769 leveldb.cpp:176] Opened db in 2.601036ms > I0905 15:53:16.399566 25769 leveldb.cpp:183] Compacted db in 546216ns > I0905 15:53:16.399590 25769 leveldb.cpp:198] Created db iterator in 2787ns > I0905 15:53:16.399605 25769 leveldb.cpp:204] Seeked to beginning of db in > 500ns > I0905 15:53:16.399617 25769 leveldb.cpp:273] Iterated through 0 keys in the > db in 185ns > I0905 15:53:16.399633 25769 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0905 15:53:16.399817 25786 recover.cpp:425] Starting replica recovery > I0905 15:53:16.399952 25793 recover.cpp:451] Replica is in EMPTY status > I0905 15:53:16.400683 25795 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0905 15:53:16.400795 25787 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0905 15:53:16.401005 25783 recover.cpp:542] Updating replica status to > STARTING > I0905 15:53:16.401470 25786 master.cpp:286] Master > 20140905-155316-3125920579-49188-25769 (penates.apache.org) started on > 67.195.81.186:49188 > I0905 15:53:16.401521 25786 master.cpp:332] Master only allowing > authenticated frameworks to register > I0905 15:53:16.401533 25786 master.cpp:337] Master only allowing > authenticated slaves to register > I0905 15:53:16.401543 25786 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/MasterAuthorizationTest_DuplicateRegistration_pVJg7m/credentials' > I0905 15:53:16.401558 25793 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 474683ns > I0905 15:53:16.401582 25793 replica.cpp:320] Persisted replica status to > STARTING > I0905 15:53:16.401667 25793 recover.cpp:451] Replica is in STARTING status > I0905 15:53:16.401669 25786 master.cpp:366] Authorization enabled > I0905 15:53:16.401898 25795 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0905 15:53:16.401936 25796 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@67.195.81.186:49188 > I0905 15:53:16.402160 25784 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0905 15:53:16.402333 25790 master.cpp:1205] The newly elected leader is > master@67.195.81.186:49188 with id 20140905-155316-3125920579-49188-25769 > I0905 15:53:16.402359 25790 master.cpp:1218] Elected as the leading master! 
> I0905 15:53:16.402371 25790 master.cpp:1036] Recovering from registrar > I0905 15:53:16.402472 25798 registrar.cpp:313] Recovering registrar > I0905 15:53:16.402529 25791 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0905 15:53:16.402782 25788 recover.cpp:542] Updating replica status to VOTING > I0905 15:53:16.403002 25795 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 116403ns > I0905 15:53:16.403020 25795 replica.cpp:320] Persisted replica status to > VOTING > I0905 15:53:16.403081 25791 recover.cpp:556] Successfully joined the Paxos > group > I0905 15:53:16.403197 25791 recover.cpp:440] Recover process terminated > I0905 15:53:16.403388 25796 log.cpp:656] Attempting to start the writer > I0905 15:53:16.403993 25784 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0905 15:53:16.404147 25784 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 132156ns > I0905 15:53:16.404167 25784 replica.cpp:342] Persisted promised to 1 > I0905 15:53:16.404542 25795 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0905 15:53:16.405498 25787 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0905 15:53:16.405868 25787 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 347231ns > I0905 15:53:16.405886 25787 replica.cpp:676] Persisted action at 0 > I0905 15:53:16.406553 25788 replica.cpp:508] Replica received write request > for position 0 > I0905 15:53:16.406582 25788 leveldb.cpp:438] Reading position from leveldb > took 11402ns > I0905 15:53:16.529067 25788 leveldb.cpp:343] Persisting action (14 bytes) to > l
[jira] [Commented] (MESOS-1760) MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky
[ https://issues.apache.org/jira/browse/MESOS-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129004#comment-14129004 ] Vinod Kone commented on MESOS-1760: --- https://reviews.apache.org/r/25516/ > MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky > - > > Key: MESOS-1760 > URL: https://issues.apache.org/jira/browse/MESOS-1760 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone >Assignee: Vinod Kone > > Observed this on Apache CI: > https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2355/changes > {code} > [ RUN] MasterAuthorizationTest.FrameworkRemovedBeforeReregistration > Using temporary directory > '/tmp/MasterAuthorizationTest_FrameworkRemovedBeforeReregistration_0tw16Z' > I0903 22:04:33.520237 25565 leveldb.cpp:176] Opened db in 49.073821ms > I0903 22:04:33.538331 25565 leveldb.cpp:183] Compacted db in 18.065051ms > I0903 22:04:33.538363 25565 leveldb.cpp:198] Created db iterator in 4826ns > I0903 22:04:33.538377 25565 leveldb.cpp:204] Seeked to beginning of db in > 682ns > I0903 22:04:33.538385 25565 leveldb.cpp:273] Iterated through 0 keys in the > db in 312ns > I0903 22:04:33.538399 25565 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0903 22:04:33.538624 25593 recover.cpp:425] Starting replica recovery > I0903 22:04:33.538707 25598 recover.cpp:451] Replica is in EMPTY status > I0903 22:04:33.540909 25590 master.cpp:286] Master > 20140903-220433-453759884-44122-25565 (hemera.apache.org) started on > 140.211.11.27:44122 > I0903 22:04:33.540932 25590 master.cpp:332] Master only allowing > authenticated frameworks to register > I0903 22:04:33.540936 25590 master.cpp:337] Master only allowing > authenticated slaves to register > I0903 22:04:33.540941 25590 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/MasterAuthorizationTest_FrameworkRemovedBeforeReregistration_0tw16Z/credentials' > I0903 22:04:33.541337 25590 master.cpp:366] Authorization enabled > I0903 22:04:33.541508 25597 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0903 22:04:33.542343 25582 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@140.211.11.27:44122 > I0903 22:04:33.542445 25592 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0903 22:04:33.543175 25602 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0903 22:04:33.543637 25587 recover.cpp:542] Updating replica status to > STARTING > I0903 22:04:33.544256 25579 master.cpp:1205] The newly elected leader is > master@140.211.11.27:44122 with id 20140903-220433-453759884-44122-25565 > I0903 22:04:33.544275 25579 master.cpp:1218] Elected as the leading master! 
> I0903 22:04:33.544282 25579 master.cpp:1036] Recovering from registrar > I0903 22:04:33.544401 25579 registrar.cpp:313] Recovering registrar > I0903 22:04:33.558487 25593 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 14.678563ms > I0903 22:04:33.558531 25593 replica.cpp:320] Persisted replica status to > STARTING > I0903 22:04:33.558653 25593 recover.cpp:451] Replica is in STARTING status > I0903 22:04:33.559867 25588 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0903 22:04:33.560057 25602 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0903 22:04:33.561280 25584 recover.cpp:542] Updating replica status to VOTING > I0903 22:04:33.576900 25581 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 14.712427ms > I0903 22:04:33.576942 25581 replica.cpp:320] Persisted replica status to > VOTING > I0903 22:04:33.577018 25581 recover.cpp:556] Successfully joined the Paxos > group > I0903 22:04:33.577108 25581 recover.cpp:440] Recover process terminated > I0903 22:04:33.577401 25581 log.cpp:656] Attempting to start the writer > I0903 22:04:33.578559 25589 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0903 22:04:33.594611 25589 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 16.029152ms > I0903 22:04:33.594640 25589 replica.cpp:342] Persisted promised to 1 > I0903 22:04:33.595391 25584 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0903 22:04:33.597512 25588 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0903 22:04:33.613037 25588 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 15.502568ms > I0903 22:04:33.613065 25588 replica.cpp:676] Persisted action at 0 > I0903 22:04:33.615435 25585 replica.cpp:508]
[jira] [Updated] (MESOS-703) master fails to respect updated FrameworkInfo when the framework scheduler restarts
[ https://issues.apache.org/jira/browse/MESOS-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-703: - Shepherd: Vinod Kone > master fails to respect updated FrameworkInfo when the framework scheduler > restarts > --- > > Key: MESOS-703 > URL: https://issues.apache.org/jira/browse/MESOS-703 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.14.0 > Environment: ubuntu 13.04, mesos 0.14.0-rc3 >Reporter: Jordan Curzon > > When I first ran marathon it was running as a personal user and registered > with mesos-master as such due to putting an empty string in the user field. > When I restarted marathon as "nobody", tasks were still being run as the > personal user which didn't exist on the slaves. I know marathon was trying to > send a FrameworkInfo with nobody listed as the user because I hard coded it > in. The tasks wouldn't run as "nobody" until I restarted the mesos-master. > Each time I restarted the marathon framework, it reregistered with > mesos-master and mesos-master wrote to the logs that it detected a failover > because the scheduler went away and then came back. > I understand the scheduler failover, but shouldn't mesos-master respect an > updated FrameworkInfo when the scheduler re-registers? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-703) master fails to respect updated FrameworkInfo when the framework scheduler restarts
[ https://issues.apache.org/jira/browse/MESOS-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-703: - Assignee: (was: Vinod Kone) > master fails to respect updated FrameworkInfo when the framework scheduler > restarts > --- > > Key: MESOS-703 > URL: https://issues.apache.org/jira/browse/MESOS-703 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.14.0 > Environment: ubuntu 13.04, mesos 0.14.0-rc3 >Reporter: Jordan Curzon > > When I first ran marathon it was running as a personal user and registered > with mesos-master as such due to putting an empty string in the user field. > When I restarted marathon as "nobody", tasks were still being run as the > personal user which didn't exist on the slaves. I know marathon was trying to > send a FrameworkInfo with nobody listed as the user because I hard coded it > in. The tasks wouldn't run as "nobody" until I restarted the mesos-master. > Each time I restarted the marathon framework, it reregistered with > mesos-master and mesos-master wrote to the logs that it detected a failover > because the scheduler went away and then came back. > I understand the scheduler failover, but shouldn't mesos-master respect an > updated FrameworkInfo when the scheduler re-registers? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-703) master fails to respect updated FrameworkInfo when the framework scheduler restarts
[ https://issues.apache.org/jira/browse/MESOS-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128992#comment-14128992 ] Vinod Kone commented on MESOS-703: -- Linked the ticket for the design doc to do this properly. > master fails to respect updated FrameworkInfo when the framework scheduler > restarts > --- > > Key: MESOS-703 > URL: https://issues.apache.org/jira/browse/MESOS-703 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.14.0 > Environment: ubuntu 13.04, mesos 0.14.0-rc3 >Reporter: Jordan Curzon >Assignee: Vinod Kone > > When I first ran marathon it was running as a personal user and registered > with mesos-master as such due to putting an empty string in the user field. > When I restarted marathon as "nobody", tasks were still being run as the > personal user which didn't exist on the slaves. I know marathon was trying to > send a FrameworkInfo with nobody listed as the user because I hard coded it > in. The tasks wouldn't run as "nobody" until I restarted the mesos-master. > Each time I restarted the marathon framework, it reregistered with > mesos-master and mesos-master wrote to the logs that it detected a failover > because the scheduler went away and then came back. > I understand the scheduler failover, but shouldn't mesos-master respect an > updated FrameworkInfo when the scheduler re-registers? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-703) master fails to respect updated FrameworkInfo when the framework scheduler restarts
[ https://issues.apache.org/jira/browse/MESOS-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-703: - Sprint: (was: Mesos Q3 Sprint 5) > master fails to respect updated FrameworkInfo when the framework scheduler > restarts > --- > > Key: MESOS-703 > URL: https://issues.apache.org/jira/browse/MESOS-703 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.14.0 > Environment: ubuntu 13.04, mesos 0.14.0-rc3 >Reporter: Jordan Curzon >Assignee: Vinod Kone > > When I first ran marathon it was running as a personal user and registered > with mesos-master as such due to putting an empty string in the user field. > When I restarted marathon as "nobody", tasks were still being run as the > personal user which didn't exist on the slaves. I know marathon was trying to > send a FrameworkInfo with nobody listed as the user because I hard coded it > in. The tasks wouldn't run as "nobody" until I restarted the mesos-master. > Each time I restarted the marathon framework, it reregistered with > mesos-master and mesos-master wrote to the logs that it detected a failover > because the scheduler went away and then came back. > I understand the scheduler failover, but shouldn't mesos-master respect an > updated FrameworkInfo when the scheduler re-registers? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-1784) Design the semantics for updating FrameworkInfo
Vinod Kone created MESOS-1784: - Summary: Design the semantics for updating FrameworkInfo Key: MESOS-1784 URL: https://issues.apache.org/jira/browse/MESOS-1784 Project: Mesos Issue Type: Bug Reporter: Vinod Kone Assignee: Vinod Kone Currently, there is no easy way for frameworks to update their FrameworkInfo, resulting in issues like MESOS-703 and MESOS-1218. This ticket captures the design for doing a FrameworkInfo update without having to roll masters/slaves/tasks/executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1766) MasterAuthorizationTest.DuplicateRegistration test is flaky
[ https://issues.apache.org/jira/browse/MESOS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128986#comment-14128986 ] Vinod Kone commented on MESOS-1766: --- The bug here is that the authorizer might get more than the expected registration authorization requests because of registration retries by the scheduler driver. The fix is simple, authorizer should allow all subsequent authorization requests. > MasterAuthorizationTest.DuplicateRegistration test is flaky > --- > > Key: MESOS-1766 > URL: https://issues.apache.org/jira/browse/MESOS-1766 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone >Assignee: Vinod Kone > > {code} > [ RUN ] MasterAuthorizationTest.DuplicateRegistration > Using temporary directory > '/tmp/MasterAuthorizationTest_DuplicateRegistration_pVJg7m' > I0905 15:53:16.398993 25769 leveldb.cpp:176] Opened db in 2.601036ms > I0905 15:53:16.399566 25769 leveldb.cpp:183] Compacted db in 546216ns > I0905 15:53:16.399590 25769 leveldb.cpp:198] Created db iterator in 2787ns > I0905 15:53:16.399605 25769 leveldb.cpp:204] Seeked to beginning of db in > 500ns > I0905 15:53:16.399617 25769 leveldb.cpp:273] Iterated through 0 keys in the > db in 185ns > I0905 15:53:16.399633 25769 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0905 15:53:16.399817 25786 recover.cpp:425] Starting replica recovery > I0905 15:53:16.399952 25793 recover.cpp:451] Replica is in EMPTY status > I0905 15:53:16.400683 25795 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0905 15:53:16.400795 25787 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0905 15:53:16.401005 25783 recover.cpp:542] Updating replica status to > STARTING > I0905 15:53:16.401470 25786 master.cpp:286] Master > 20140905-155316-3125920579-49188-25769 (penates.apache.org) started on > 67.195.81.186:49188 > I0905 15:53:16.401521 25786 master.cpp:332] Master only allowing > authenticated frameworks to register > I0905 15:53:16.401533 25786 master.cpp:337] Master only allowing > authenticated slaves to register > I0905 15:53:16.401543 25786 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/MasterAuthorizationTest_DuplicateRegistration_pVJg7m/credentials' > I0905 15:53:16.401558 25793 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 474683ns > I0905 15:53:16.401582 25793 replica.cpp:320] Persisted replica status to > STARTING > I0905 15:53:16.401667 25793 recover.cpp:451] Replica is in STARTING status > I0905 15:53:16.401669 25786 master.cpp:366] Authorization enabled > I0905 15:53:16.401898 25795 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0905 15:53:16.401936 25796 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@67.195.81.186:49188 > I0905 15:53:16.402160 25784 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0905 15:53:16.402333 25790 master.cpp:1205] The newly elected leader is > master@67.195.81.186:49188 with id 20140905-155316-3125920579-49188-25769 > I0905 15:53:16.402359 25790 master.cpp:1218] Elected as the leading master! 
> I0905 15:53:16.402371 25790 master.cpp:1036] Recovering from registrar > I0905 15:53:16.402472 25798 registrar.cpp:313] Recovering registrar > I0905 15:53:16.402529 25791 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0905 15:53:16.402782 25788 recover.cpp:542] Updating replica status to VOTING > I0905 15:53:16.403002 25795 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 116403ns > I0905 15:53:16.403020 25795 replica.cpp:320] Persisted replica status to > VOTING > I0905 15:53:16.403081 25791 recover.cpp:556] Successfully joined the Paxos > group > I0905 15:53:16.403197 25791 recover.cpp:440] Recover process terminated > I0905 15:53:16.403388 25796 log.cpp:656] Attempting to start the writer > I0905 15:53:16.403993 25784 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0905 15:53:16.404147 25784 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 132156ns > I0905 15:53:16.404167 25784 replica.cpp:342] Persisted promised to 1 > I0905 15:53:16.404542 25795 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0905 15:53:16.405498 25787 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0905 15:53:16.405868 25787 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 347231ns > I0905 15:53:16.405886 25787 replica.cpp:676] Persisted action at 0 > I0905 15:53:16.406553 25788 replica.cpp:508] Replica recei
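A sketch of the fix described in the comment on this ticket, written with Google Mock; the Authorizer interface and method signature here are assumptions for illustration, not the real Mesos authorizer API. The point is the expectation pattern: instead of expecting exactly one authorization request, let every retried registration request be authorized as well.

{code}
// The Authorizer interface below is an assumption for illustration; only the
// expectation pattern at the bottom reflects the described fix.
#include <gmock/gmock.h>

#include <string>

class Authorizer
{
public:
  virtual ~Authorizer() {}
  virtual bool authorize(const std::string& principal) = 0;
};

class MockAuthorizer : public Authorizer
{
public:
  MOCK_METHOD1(authorize, bool(const std::string&));
};

// Called from the test body once the mock is created.
void expectRegistration(MockAuthorizer* authorizer)
{
  // Flaky version: EXPECT_CALL(...).Times(1) breaks when the scheduler
  // driver retries registration. Robust version: let every request succeed.
  EXPECT_CALL(*authorizer, authorize(testing::_))
    .WillRepeatedly(testing::Return(true));
}
{code}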
[jira] [Updated] (MESOS-1760) MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky
[ https://issues.apache.org/jira/browse/MESOS-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-1760: -- Sprint: Mesos Q3 Sprint 5 Assignee: Vinod Kone Story Points: 1 This is due to the same issue as seen in MESOS-1766, duplicate registration retries. > MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky > - > > Key: MESOS-1760 > URL: https://issues.apache.org/jira/browse/MESOS-1760 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone >Assignee: Vinod Kone > > Observed this on Apache CI: > https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2355/changes > {code} > [ RUN] MasterAuthorizationTest.FrameworkRemovedBeforeReregistration > Using temporary directory > '/tmp/MasterAuthorizationTest_FrameworkRemovedBeforeReregistration_0tw16Z' > I0903 22:04:33.520237 25565 leveldb.cpp:176] Opened db in 49.073821ms > I0903 22:04:33.538331 25565 leveldb.cpp:183] Compacted db in 18.065051ms > I0903 22:04:33.538363 25565 leveldb.cpp:198] Created db iterator in 4826ns > I0903 22:04:33.538377 25565 leveldb.cpp:204] Seeked to beginning of db in > 682ns > I0903 22:04:33.538385 25565 leveldb.cpp:273] Iterated through 0 keys in the > db in 312ns > I0903 22:04:33.538399 25565 replica.cpp:741] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0903 22:04:33.538624 25593 recover.cpp:425] Starting replica recovery > I0903 22:04:33.538707 25598 recover.cpp:451] Replica is in EMPTY status > I0903 22:04:33.540909 25590 master.cpp:286] Master > 20140903-220433-453759884-44122-25565 (hemera.apache.org) started on > 140.211.11.27:44122 > I0903 22:04:33.540932 25590 master.cpp:332] Master only allowing > authenticated frameworks to register > I0903 22:04:33.540936 25590 master.cpp:337] Master only allowing > authenticated slaves to register > I0903 22:04:33.540941 25590 credentials.hpp:36] Loading credentials for > authentication from > '/tmp/MasterAuthorizationTest_FrameworkRemovedBeforeReregistration_0tw16Z/credentials' > I0903 22:04:33.541337 25590 master.cpp:366] Authorization enabled > I0903 22:04:33.541508 25597 replica.cpp:638] Replica in EMPTY status received > a broadcasted recover request > I0903 22:04:33.542343 25582 hierarchical_allocator_process.hpp:299] > Initializing hierarchical allocator process with master : > master@140.211.11.27:44122 > I0903 22:04:33.542445 25592 master.cpp:120] No whitelist given. Advertising > offers for all slaves > I0903 22:04:33.543175 25602 recover.cpp:188] Received a recover response from > a replica in EMPTY status > I0903 22:04:33.543637 25587 recover.cpp:542] Updating replica status to > STARTING > I0903 22:04:33.544256 25579 master.cpp:1205] The newly elected leader is > master@140.211.11.27:44122 with id 20140903-220433-453759884-44122-25565 > I0903 22:04:33.544275 25579 master.cpp:1218] Elected as the leading master! 
> I0903 22:04:33.544282 25579 master.cpp:1036] Recovering from registrar > I0903 22:04:33.544401 25579 registrar.cpp:313] Recovering registrar > I0903 22:04:33.558487 25593 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 14.678563ms > I0903 22:04:33.558531 25593 replica.cpp:320] Persisted replica status to > STARTING > I0903 22:04:33.558653 25593 recover.cpp:451] Replica is in STARTING status > I0903 22:04:33.559867 25588 replica.cpp:638] Replica in STARTING status > received a broadcasted recover request > I0903 22:04:33.560057 25602 recover.cpp:188] Received a recover response from > a replica in STARTING status > I0903 22:04:33.561280 25584 recover.cpp:542] Updating replica status to VOTING > I0903 22:04:33.576900 25581 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 14.712427ms > I0903 22:04:33.576942 25581 replica.cpp:320] Persisted replica status to > VOTING > I0903 22:04:33.577018 25581 recover.cpp:556] Successfully joined the Paxos > group > I0903 22:04:33.577108 25581 recover.cpp:440] Recover process terminated > I0903 22:04:33.577401 25581 log.cpp:656] Attempting to start the writer > I0903 22:04:33.578559 25589 replica.cpp:474] Replica received implicit > promise request with proposal 1 > I0903 22:04:33.594611 25589 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 16.029152ms > I0903 22:04:33.594640 25589 replica.cpp:342] Persisted promised to 1 > I0903 22:04:33.595391 25584 coordinator.cpp:230] Coordinator attemping to > fill missing position > I0903 22:04:33.597512 25588 replica.cpp:375] Replica received explicit > promise request for position 0 with proposal 2 > I0903 22:04:33.613037 25588 leveldb.cpp:343] Persisting action (8 bytes) to > leveldb took 15.502568ms > I0903 22:04:33.613065 25588 replica
[jira] [Commented] (MESOS-1764) Build Fixes from 0.20 release
[ https://issues.apache.org/jira/browse/MESOS-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128940#comment-14128940 ] Timothy St. Clair commented on MESOS-1764: -- Fix 'git clean -xdf' on the leveldb folder: https://reviews.apache.org/r/25508 > Build Fixes from 0.20 release > - > > Key: MESOS-1764 > URL: https://issues.apache.org/jira/browse/MESOS-1764 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.20.0 >Reporter: Timothy St. Clair >Assignee: Timothy St. Clair > > This ticket is a catch-all for minor issues caught during a rebase and > testing. > + Add package configuration file to deployment > + Update deploy_dir from localstatedir to sysconfdir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1766) MasterAuthorizationTest.DuplicateRegistration test is flaky
[ https://issues.apache.org/jira/browse/MESOS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-1766: -- Sprint: Mesos Q3 Sprint 5 Assignee: Vinod Kone Story Points: 2 > MasterAuthorizationTest.DuplicateRegistration test is flaky > --- > > Key: MESOS-1766 > URL: https://issues.apache.org/jira/browse/MESOS-1766 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Vinod Kone >Assignee: Vinod Kone
[jira] [Updated] (MESOS-1676) ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky
[ https://issues.apache.org/jira/browse/MESOS-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-1676: -- Sprint: Mesos Q3 Sprint 5 Story Points: 1 > ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky > --- > > Key: MESOS-1676 > URL: https://issues.apache.org/jira/browse/MESOS-1676 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.20.0 >Reporter: Yan Xu >Assignee: Yan Xu > > {noformat:title=} > [ RUN ] > ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession > I0806 01:18:37.648684 17458 zookeeper_test_server.cpp:158] Started > ZooKeeperTestServer on port 42069 > 2014-08-06 01:18:37,650:17458(0x2b4679ca5700):ZOO_INFO@log_env@712: Client > environment:zookeeper.version=zookeeper C client 3.4.5 > 2014-08-06 01:18:37,650:17458(0x2b4679ca5700):ZOO_INFO@log_env@716: Client > environment:host.name=lucid > 2014-08-06 01:18:37,650:17458(0x2b4679ca5700):ZOO_INFO@log_env@723: Client > environment:os.name=Linux > 2014-08-06 01:18:37,650:17458(0x2b4679ca5700):ZOO_INFO@log_env@724: Client > environment:os.arch=2.6.32-64-generic > 2014-08-06 01:18:37,650:17458(0x2b4679ca5700):ZOO_INFO@log_env@725: Client > environment:os.version=#128-Ubuntu SMP Tue Jul 15 08:32:40 UTC 2014 > 2014-08-06 01:18:37,651:17458(0x2b4679ca5700):ZOO_INFO@log_env@733: Client > environment:user.name=(null) > 2014-08-06 01:18:37,651:17458(0x2b4679ca5700):ZOO_INFO@log_env@741: Client > environment:user.home=/home/jenkins > 2014-08-06 01:18:37,651:17458(0x2b4679ca5700):ZOO_INFO@log_env@753: Client > environment:user.dir=/var/jenkins/workspace/mesos-ubuntu-10.04-gcc/src > 2014-08-06 01:18:37,651:17458(0x2b4679ca5700):ZOO_INFO@zookeeper_init@786: > Initiating client connection, host=127.0.0.1:42069 sessionTimeout=5000 > watcher=0x2b467450bc00 sessionId=0 sessionPasswd= context=0x1682db0 > flags=0 > 2014-08-06 01:18:37,656:17458(0x2b468638b700):ZOO_INFO@check_events@1703: > initiated connection to server [127.0.0.1:42069] > 2014-08-06 01:18:37,669:17458(0x2b468638b700):ZOO_INFO@check_events@1750: > session establishment complete on server [127.0.0.1:42069], > sessionId=0x147aa6601cf, negotiated timeout=6000 > I0806 01:18:37.671725 17486 group.cpp:313] Group process > (group(37)@127.0.1.1:55561) connected to ZooKeeper > I0806 01:18:37.671758 17486 group.cpp:787] Syncing group operations: queue > size (joins, cancels, datas) = (0, 0, 0) > I0806 01:18:37.671771 17486 group.cpp:385] Trying to create path '/mesos' in > ZooKeeper > 2014-08-06 > 01:18:39,101:17458(0x2b4687394700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:36197] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2014-08-06 > 01:18:42,441:17458(0x2b4687394700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:36197] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > I0806 01:18:42.656673 17481 contender.cpp:131] Joining the ZK group > I0806 01:18:42.662484 17484 contender.cpp:247] New candidate (id='0') has > entered the contest for leadership > I0806 01:18:42.663754 17481 detector.cpp:138] Detected a new leader: (id='0') > I0806 01:18:42.663884 17481 group.cpp:658] Trying to get > '/mesos/info_00' in ZooKeeper > I0806 01:18:42.664788 17483 detector.cpp:426] A new leading master > (UPID=@128.150.152.0:1) is detected > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@712: Client > environment:zookeeper.version=zookeeper C client 3.4.5 > 2014-08-06 
01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@716: Client > environment:host.name=lucid > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@723: Client > environment:os.name=Linux > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@724: Client > environment:os.arch=2.6.32-64-generic > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@725: Client > environment:os.version=#128-Ubuntu SMP Tue Jul 15 08:32:40 UTC 2014 > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@733: Client > environment:user.name=(null) > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@741: Client > environment:user.home=/home/jenkins > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@log_env@753: Client > environment:user.dir=/var/jenkins/workspace/mesos-ubuntu-10.04-gcc/src > 2014-08-06 01:18:42,666:17458(0x2b4679ea6700):ZOO_INFO@zookeeper_init@786: > Initiating client connection, host=127.0.0.1:42069 sessionTimeout=5000 > watcher=0x2b467450bc00 sessionId=0 sessionPasswd= context=0x15c00f0 > flags=0 > 2014-08-06 01:18:42,668:17458(0x2b4686d91700):ZOO_INFO@check_events@1703: > initiated connection to server [127.0.0.1:42069]
[jira] [Commented] (MESOS-1676) ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky
[ https://issues.apache.org/jira/browse/MESOS-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128831#comment-14128831 ] Yan Xu commented on MESOS-1676: --- https://reviews.apache.org/r/25487/ https://reviews.apache.org/r/25511/ > ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky > --- > > Key: MESOS-1676 > URL: https://issues.apache.org/jira/browse/MESOS-1676 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.20.0 >Reporter: Yan Xu >Assignee: Yan Xu
[jira] [Resolved] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy St. Clair resolved MESOS-1774. -- Resolution: Fixed Fix Version/s: 0.21.0 Target Version/s: 0.21.0 (was: 1.0.0, 0.20.1) > Fix protobuf detection on systems with Python 3 as default > -- > > Key: MESOS-1774 > URL: https://issues.apache.org/jira/browse/MESOS-1774 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.20.0 > Environment: Gentoo Linux > ./configure --disable-bundled >Reporter: Kamil Domański >Assignee: Timothy St. Clair > Labels: build > Fix For: 0.21.0 > > > When configuring without bundled dependencies, use of the *python* symbolic > link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* > module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128794#comment-14128794 ] Timothy St. Clair commented on MESOS-1774: -- commit ab1cf84e7beaa979cabbc2876623c7b30ad5e48b Author: Kamil Domanski Date: Wed Sep 10 12:43:58 2014 -0500 Fix protobuf detection on systems with Python 3 as default (part2) MESOS-1774 Review: https://reviews.apache.org/r/25439 > Fix protobuf detection on systems with Python 3 as default > -- > > Key: MESOS-1774 > URL: https://issues.apache.org/jira/browse/MESOS-1774 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.20.0 > Environment: Gentoo Linux > ./configure --disable-bundled >Reporter: Kamil Domański >Assignee: Timothy St. Clair > Labels: build > > When configuring without bundled dependencies, use of the *python* symbolic > link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* > module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)