[jira] [Created] (MESOS-6599) The disordered status update message from executor may cause agent exit
Jian Qiu created MESOS-6599:
---
Summary: The disordered status update message from executor may cause agent exit
Key: MESOS-6599
URL: https://issues.apache.org/jira/browse/MESOS-6599
Project: Mesos
Issue Type: Bug
Components: slave
Environment: CentOS 7.2/Ubuntu 16.04
Reporter: Jian Qiu

The framework enables checkpointing, and the executor sends TASK_KILLED to the agent. After the agent acknowledges that status update, the executor sends TASK_LOST, which causes the agent to exit. The exit is due to the CHECK_READY(future) in Slave::___statusUpdate. I am not sure why we need a CHECK here.

The test code is below:
{code}
Try<Owned<cluster::Master>> master = StartMaster();
ASSERT_SOME(master);

MockExecutor exec(DEFAULT_EXECUTOR_ID);
TestContainerizer containerizer(&exec);

Owned<MasterDetector> detector = master.get()->createDetector();
Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), &containerizer);
ASSERT_SOME(slave);

FrameworkInfo frameworkInfo = DEFAULT_FRAMEWORK_INFO;
frameworkInfo.set_checkpoint(true); // Enable checkpointing.

MockScheduler sched;
MesosSchedulerDriver driver(
    &sched, frameworkInfo, master.get()->pid, DEFAULT_CREDENTIAL);

FrameworkID frameworkId;
EXPECT_CALL(sched, registered(_, _, _))
  .WillOnce(SaveArg<1>(&frameworkId));

Future<vector<Offer>> offers;
EXPECT_CALL(sched, resourceOffers(_, _))
  .WillOnce(FutureArg<1>(&offers))
  .WillRepeatedly(Return()); // Ignore subsequent offers.

Future<TaskStatus> status;
EXPECT_CALL(sched, statusUpdate(_, _))
  .WillOnce(FutureArg<1>(&status));

driver.start();

AWAIT_READY(offers);
EXPECT_NE(0u, offers.get().size());

ExecutorDriver* execDriver;
EXPECT_CALL(exec, registered(_, _, _, _))
  .WillOnce(SaveArg<0>(&execDriver));

EXPECT_CALL(exec, launchTask(_, _))
  .WillOnce(SendStatusUpdateFromTask(TASK_RUNNING));

Future<StatusUpdateMessage> statusUpdateMessage =
  FUTURE_PROTOBUF(StatusUpdateMessage(), master.get()->pid, _);

Future<Nothing> _statusUpdateAcknowledgement =
  FUTURE_DISPATCH(slave.get()->pid, &Slave::_statusUpdateAcknowledgement);

vector<TaskInfo> tasks = createTasks(offers.get()[0]);
driver.launchTasks(offers.get()[0].id(), tasks);

AWAIT_READY(statusUpdateMessage);
StatusUpdate update = statusUpdateMessage.get().update();

AWAIT_READY(status);
EXPECT_EQ(TASK_RUNNING, status.get().state());

AWAIT_READY(_statusUpdateAcknowledgement);

// driver.killTask(tasks[0].task_id());

Future<Nothing> _statusUpdateAcknowledgement2 =
  FUTURE_DISPATCH(slave.get()->pid, &Slave::_statusUpdateAcknowledgement);

TaskStatus status3 = status.get();
status3.set_state(TASK_KILLED);
execDriver->sendStatusUpdate(status3);

AWAIT_READY(_statusUpdateAcknowledgement2);

Future<Nothing> _statusUpdate =
  FUTURE_DISPATCH(slave.get()->pid, &Slave::___statusUpdate);

TaskStatus status2 = status.get();
status2.set_state(TASK_LOST);
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
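The failure mode in this report, a status update arriving after a terminal update has already been acknowledged, could in principle be handled without terminating the process. The following is a minimal, self-contained sketch of that idea and is not Mesos code: `TaskState`, `StatusTracker`, and `demoDisorderedUpdates` are hypothetical names, and the real logic in Slave::___statusUpdate is considerably more involved. It only illustrates reporting a rejection to the caller where a CHECK would abort.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical task states, named after the states in the report.
enum class TaskState { RUNNING, KILLED, LOST };

// Assumption for this sketch: every state except RUNNING is terminal.
inline bool isTerminal(TaskState state) {
  return state != TaskState::RUNNING;
}

// Hypothetical tracker that remembers the latest accepted state per task.
class StatusTracker {
public:
  // Returns true if the update was accepted. An update for a task that is
  // already terminal is rejected (returns false) instead of aborting.
  bool update(const std::string& taskId, TaskState state) {
    std::map<std::string, TaskState>::const_iterator it = latest_.find(taskId);
    if (it != latest_.end() && isTerminal(it->second)) {
      return false;
    }
    latest_[taskId] = state;
    return true;
  }

private:
  std::map<std::string, TaskState> latest_;
};

// Replays the sequence from the report: RUNNING, KILLED, then a late LOST.
inline void demoDisorderedUpdates() {
  StatusTracker tracker;
  assert(tracker.update("task-1", TaskState::RUNNING));
  assert(tracker.update("task-1", TaskState::KILLED));
  assert(!tracker.update("task-1", TaskState::LOST));  // Rejected, no abort.
}
```

Calling `demoDisorderedUpdates()` from a `main` function replays the three updates; the late TASK_LOST is simply refused, and the caller decides whether to log or drop it.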
[jira] [Commented] (MESOS-4679) slave dies unexpectedly: Mismatched checkpoint value for status update TASK_LOST
[ https://issues.apache.org/jira/browse/MESOS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15604648#comment-15604648 ]

Jian Qiu commented on MESOS-4679:
-

It still seems to be an issue in 1.0.0 when using k8s on Mesos.

> slave dies unexpectedly: Mismatched checkpoint value for status update TASK_LOST
>
> Key: MESOS-4679
> URL: https://issues.apache.org/jira/browse/MESOS-4679
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 0.26.0
> Reporter: James DeFelice
> Labels: mesosphere
>
> It looks like the custom executor is sending out multiple terminal status updates for a specific task and that's crashing the slave (as well as possibly mishandling status-update UUID's?). In any event, I think that the slave should handle this case with a bit more aplomb.
> Custom executor logs:
> {code}
> I0215 20:43:59.551657 11068 executor.go:426] Executor driver killTask
> I0215 20:43:59.551719 11068 executor.go:436] Executor driver is asked to kill task '&TaskID{Value:*pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f,XXX_unrecognized:[],}'
> I0215 20:43:59.552189 11068 executor.go:687] Executor sending status update &StatusUpdate{FrameworkId:&FrameworkID{Value:*df95a79b-d6d4-4b96-853e-55686628e898-0006,XXX_unrecognized:[],},ExecutorId:&ExecutorID{Value:*31df9d040f057abd_k8sm-executor,XXX_unrecognized:[],},SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},Status:&TaskStatus{TaskId:&TaskID{Value:*pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f,XXX_unrecognized:[],},State:*TASK_LOST,Data:nil,Message:*kill-pod-task,SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},Timestamp:*1.455569039e+09,ExecutorId:nil,Healthy:nil,Source:nil,Reason:nil,Uuid:nil,Labels:nil,ContainerStatus:nil,XXX_unrecognized:[],},Timestamp:*1.455569039e+09,Uuid:*[214 253 145 223 212 36 17 229 158 224 82 84 0 231 66 70],LatestState:nil,XXX_unrecognized:[],}
> I0215 20:43:59.552599 11068 executor.go:687] Executor sending status update &StatusUpdate{FrameworkId:&FrameworkID{Value:*df95a79b-d6d4-4b96-853e-55686628e898-0006,XXX_unrecognized:[],},ExecutorId:&ExecutorID{Value:*31df9d040f057abd_k8sm-executor,XXX_unrecognized:[],},SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},Status:&TaskStatus{TaskId:&TaskID{Value:*pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f,XXX_unrecognized:[],},State:*TASK_KILLED,Data:nil,Message:*pod-deleted,SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},Timestamp:*1.455569039e+09,ExecutorId:nil,Healthy:nil,Source:nil,Reason:nil,Uuid:nil,Labels:nil,ContainerStatus:nil,XXX_unrecognized:[],},Timestamp:*1.455569039e+09,Uuid:*[214 253 162 110 212 36 17 229 158 224 82 84 0 231 66 70],LatestState:nil,XXX_unrecognized:[],}
> I0215 20:43:59.557376 11068 suicide.go:51] stopping suicide watch
> I0215 20:43:59.559077 11068 executor.go:445] Executor statusUpdateAcknowledgement
> I0215 20:43:59.559129 11068 executor.go:448] Receiving status update acknowledgement &StatusUpdateAcknowledgementMessage{SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},FrameworkId:&FrameworkID{Value:*df95a79b-d6d4-4b96-853e-55686628e898-0006,XXX_unrecognized:[],},TaskId:&TaskID{Value:*pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f,XXX_unrecognized:[],},Uuid:*[214 253 145 223 212 36 17 229 158 224 82 84 0 231 66 70],XXX_unrecognized:[],}
> I0215 20:43:59.562016 11068 executor.go:470] Executor driver received frameworkMessage
> I0215 20:43:59.562073 11068 executor.go:480] Executor driver receives framework message
> I0215 20:43:59.562100 11068 executor.go:445] Executor statusUpdateAcknowledgement
> I0215 20:43:59.562112 11068 executor.go:448] Receiving status update acknowledgement &StatusUpdateAcknowledgementMessage{SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},FrameworkId:&FrameworkID{Value:*df95a79b-d6d4-4b96-853e-55686628e898-0006,XXX_unrecognized:[],},TaskId:&TaskID{Value:*pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f,XXX_unrecognized:[],},Uuid:*[214 253 162 110 212 36 17 229 158 224 82 84 0 231 66 70],XXX_unrecognized:[],}
> I0215 20:43:59.562173 11068 executor.go:579] Receives message from framework task-lost:pod.1e4f9fbe-d1db-11e5-8a9a-525400309a8f
> I0215 20:43:59.562292 11068 executor.go:687] Executor sending status update &StatusUpdate{FrameworkId:&FrameworkID{Value:*df95a79b-d6d4-4b96-853e-55686628e898-0006,XXX_unrecognized:[],},ExecutorId:&ExecutorID{Value:*31df9d040f057abd_k8sm-executor,XXX_unrecognized:[],},SlaveId:&SlaveID{Value:*20150628-154106-117441034-5050-1315-S2,XXX_unrecognized:[],},Status:&
[jira] [Commented] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238504#comment-15238504 ]

Jian Qiu commented on MESOS-5184:
-

[~vi...@twitter.com] I think the issue here is that we disallow some special characters in a role, such as a slash; however, the role is not validated when a framework registers.

> Mesos does not validate role info when framework registered with specified role
> ---
>
> Key: MESOS-5184
> URL: https://issues.apache.org/jira/browse/MESOS-5184
> Project: Mesos
> Issue Type: Bug
> Components: general
> Affects Versions: 0.28.0
> Reporter: Liqiang Lin
> Fix For: 0.29.0
>
> When framework registered with specified role, Mesos does not validate the role info. It will accept the subscription and send unreserved resources as offer to the framework.
> {code}
> # cat register.json
> {
>   "framework_id": {"value" : "test1"},
>   "type":"SUBSCRIBE",
>   "subscribe":{
>     "framework_info":{
>       "user":"root",
>       "name":"test1",
>       "failover_timeout":60,
>       "role":"/test/test1",
>       "id":{"value":"test1"},
>       "principal":"test1",
>       "capabilities":[{"type":"REVOCABLE_RESOURCES"}]
>     },
>     "force":true
>   }
> }
> # curl -v http://192.168.56.110:5050/api/v1/scheduler -H "Content-type: application/json" -X POST -d @register.json
> * Hostname was NOT found in DNS cache
> * Trying 192.168.56.110...
> * Connected to 192.168.56.110 (192.168.56.110) port 5050 (#0)
> > POST /api/v1/scheduler HTTP/1.1
> > User-Agent: curl/7.35.0
> > Host: 192.168.56.110:5050
> > Accept: */*
> > Content-type: application/json
> > Content-Length: 265
> >
> * upload completely sent off: 265 out of 265 bytes
> < HTTP/1.1 200 OK
> < Date: Wed, 06 Apr 2016 21:34:18 GMT
> < Transfer-Encoding: chunked
> < Mesos-Stream-Id: 8b2c6740-b619-49c3-825a-e6ae780f4edc
> < Content-Type: application/json
> <
> 69
> {"subscribed":{"framework_id":{"value":"test1"}},"type":"SUBSCRIBED"}20
> {"type":"HEARTBEAT"}1531
> {"offers":{"offers":[{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S0"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos2"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos2","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O0"},"resources":[{"name":"disk","role":"*","scalar":{"value":20576.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos2","ip":"192.168.56.110","port":5051},"path":"\/slave(1)","scheme":"http"}},{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S1"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos1"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos1","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O1"},"resources":[{"name":"disk","role":"*","scalar":{"value":21468.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos1","ip":"192.168.56.111","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20
> {"type":"HEARTBEAT"}20
> {code}
> As you see, the role under which framework register is "/test/test1", which is an invalid role according to [#MESOS-2210|https://issues.apache.org/jira/browse/MESOS-2210]
> And Mesos master log
> {code}
> I0407 05:34:18.132333 20672 master.cpp:2107] Received subscription request for HTTP framework 'test1'
> I0407 05:34:18.133515 20672 master.cpp:2198] Subscribing framework 'test1' with checkpointing disabled and capabilities [ REVOCABLE_RESOURCES ]
> I0407 05:34:18.135027 20674 hierarchical.cpp:264] Added framework test1
> I0407 05:34:18.138746 20672 master.cpp:5659] Sending 2 offers to framework test1 (test1)
> {code}
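The check this comment asks for, rejecting a role such as "/test/test1" at registration time, might look roughly like the sketch below. It is an illustration under stated assumptions, not the Mesos implementation: `validateRole` and `demoValidateRole` are hypothetical names, and only the slash rule comes from the discussion above. The empty-role and whitespace rules are extra assumptions added to make the example complete.

```cpp
#include <cassert>
#include <string>

// Hypothetical validator, not the Mesos implementation. Returns an empty
// string for an acceptable role, otherwise a reason for rejecting it.
inline std::string validateRole(const std::string& role) {
  // Assumption: an empty role is never acceptable.
  if (role.empty()) {
    return "role must not be empty";
  }

  // The ticket names the slash; the whitespace characters are an assumption.
  const std::string invalid = "/ \t\n\r";
  if (role.find_first_of(invalid) != std::string::npos) {
    return "role '" + role + "' contains an invalid character";
  }

  return "";
}

// Exercises the validator with the role from the report.
inline void demoValidateRole() {
  assert(validateRole("test1").empty());
  assert(!validateRole("/test/test1").empty());
}
```

A subscription handler could call such a function before accepting the framework and return the reason string in an error response when it is non-empty.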
[jira] [Issue Comment Deleted] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-5184:
Comment: was deleted

(was: We also need to validate role when update weight and quota.)
[jira] [Commented] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236654#comment-15236654 ]

Jian Qiu commented on MESOS-5184:
-

We also need to validate the role when updating weights and quota.
[jira] [Commented] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234451#comment-15234451 ]

Jian Qiu commented on MESOS-5048:
-

Yes, that is what I ran on my local machine, configured simply with ../configure. The failure happens almost every time I run
./bin/mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure
I also saw it once on RB.

> MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
> ---
>
> Key: MESOS-5048
> URL: https://issues.apache.org/jira/browse/MESOS-5048
> Project: Mesos
> Issue Type: Bug
> Components: tests
> Affects Versions: 0.28.0
> Environment: Ubuntu 15.04
> Reporter: Jian Qiu
> Labels: flaky-test
>
> ./mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure
> This is found in rb, and reproduced in my local machine. There are two types of failures. However, the failure does not appear when enabling verbose...
> {code}
> ../../src/tests/environment.cpp:790: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests
>  \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor
>    \--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor
> {code}
> And
> {code}
> I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0
> I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0
> Registered executor on mesos
> ../../src/tests/slave_recovery_tests.cpp:3506: Failure
> Value of: containers.get().size()
>   Actual: 0
> Expected: 1u
> Which is: 1
> {code}
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233419#comment-15233419 ]

Jian Qiu commented on MESOS-4744:
-

Rebased. Sorry for the delay, just back from the vacation...

> mesos-execute should allow setting role
> ---
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute
[jira] [Commented] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215438#comment-15215438 ]

Jian Qiu commented on MESOS-5048:
-

[~anandmazumdar] Unfortunately, the failure only happens when verbose logging is not enabled.
[jira] [Created] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
Jian Qiu created MESOS-5048:
---
Summary: MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
Key: MESOS-5048
URL: https://issues.apache.org/jira/browse/MESOS-5048
Project: Mesos
Issue Type: Bug
Components: tests
Affects Versions: 0.28.0
Environment: Ubuntu 15.04
Reporter: Jian Qiu

./mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure

This was found on RB and reproduced on my local machine. There are two types of failures. However, neither failure appears when verbose logging is enabled.
{code}
../../src/tests/environment.cpp:790: Failure
Failed
Tests completed with child processes remaining:
-+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests
 \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor
   \--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor
{code}
And
{code}
I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0
I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0
Registered executor on mesos
../../src/tests/slave_recovery_tests.cpp:3506: Failure
Value of: containers.get().size()
  Actual: 0
Expected: 1u
Which is: 1
{code}
[jira] [Created] (MESOS-4974) mesos-execute should allow setting command_uris
Jian Qiu created MESOS-4974:
---
Summary: mesos-execute should allow setting command_uris
Key: MESOS-4974
URL: https://issues.apache.org/jira/browse/MESOS-4974
Project: Mesos
Issue Type: Bug
Components: cli
Reporter: Jian Qiu
Priority: Minor

Based on discussion in MESOS-4744, it will be helpful to let mesos-execute support setting uris in command info.
We can add a flag:
{code}
--uris=uri1,uri2..
{code}
and set other values in CommandInfo::URIS as default.
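The proposed flag takes a comma-separated list and leaves every other per-URI setting at its default. A minimal sketch of that parsing step follows. It is an illustration only: `CommandUri`, `parseUris`, and `demoParseUris` are hypothetical names rather than the mesos-execute implementation, and the two default values in the struct are assumptions made for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one URI entry of a command; this is not the
// Mesos protobuf type. The default values below are assumptions.
struct CommandUri {
  std::string value;
  bool executable = false;
  bool extract = true;
};

// Splits the flag value on commas. Empty tokens (for example from a
// trailing comma) are skipped; all other fields keep their defaults.
inline std::vector<CommandUri> parseUris(const std::string& flag) {
  std::vector<CommandUri> uris;
  std::stringstream stream(flag);
  std::string token;
  while (std::getline(stream, token, ',')) {
    if (token.empty()) {
      continue;
    }
    CommandUri uri;
    uri.value = token;
    uris.push_back(uri);
  }
  return uris;
}

// Parses the example value from the ticket.
inline void demoParseUris() {
  std::vector<CommandUri> uris = parseUris("uri1,uri2");
  assert(uris.size() == 2);
  assert(uris[0].value == "uri1");
  assert(uris[1].value == "uri2");
}
```

Keeping the flag a plain list means per-URI options cannot be set from the command line, which matches the proposal to leave them at their defaults.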
[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4744:
Description: It will be quite useful if we can set role when running mesos-execute (was: It will be quite useful if we can set role and command uris when running mesos-execute)
[jira] [Commented] (MESOS-4976) Reject RESERVE on revocable resources
[ https://issues.apache.org/jira/browse/MESOS-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201188#comment-15201188 ]

Jian Qiu commented on MESOS-4976:
-

This is already validated in the master:
https://github.com/apache/mesos/blob/master/src/master/validation.cpp#L151
I am not sure whether it still needs to be checked in the allocator.

> Reject RESERVE on revocable resources
> -
>
> Key: MESOS-4976
> URL: https://issues.apache.org/jira/browse/MESOS-4976
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Klaus Ma
>
> In {{Resources::apply}}, we did not check whether the resources is revocable or not. It does not make sense to reserve a revocable resources.
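The rule under discussion, that a RESERVE operation must not include revocable resources, can be sketched as a small standalone check. This is an assumption-laden illustration and not the code in validation.cpp or Resources::apply: the `Resource` struct, `validateReserve`, and `demoValidateReserve` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified resource; not the Mesos Resource protobuf.
struct Resource {
  std::string name;
  double amount;
  bool revocable;
};

// Returns an empty string if every resource may be reserved, otherwise a
// reason naming the first revocable resource found.
inline std::string validateReserve(const std::vector<Resource>& resources) {
  for (const Resource& resource : resources) {
    if (resource.revocable) {
      return "cannot reserve revocable resource '" + resource.name + "'";
    }
  }
  return "";
}

// One acceptable request and one that mixes in a revocable resource.
inline void demoValidateReserve() {
  std::vector<Resource> plain;
  plain.push_back(Resource{"cpus", 2.0, false});
  assert(validateReserve(plain).empty());

  std::vector<Resource> mixed = plain;
  mixed.push_back(Resource{"mem", 512.0, true});
  assert(!validateReserve(mixed).empty());
}
```

Whether a check like this belongs only at the validation layer or also at the point where the operation is applied is exactly the open question in the comment above.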
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200811#comment-15200811 ]

Jian Qiu commented on MESOS-4744:
-

Opened another ticket https://issues.apache.org/jira/browse/MESOS-4974 for command_uris
[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4744:
Summary: mesos-execute should allow setting role (was: mesos-execute should allow setting role and command uris)

> mesos-execute should allow setting role
> ---
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role and command uris when running mesos-execute
[jira] [Updated] (MESOS-4974) mesos-execute should allow setting command_uris
[ https://issues.apache.org/jira/browse/MESOS-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4974:
Description:
Based on discussion in MESOS-4744, it will be helpful to let mesos-execute support setting uris in command info.
We can add a flag:
{code}
--uris=uri1,uri2..
{code}
and set other values in CommandInfo::URI as default.

was:
Based on discussion in MESOS-4744, it will be helpful to let mesos-execute support setting uris in command info.
We can add a flag:
{code}
--uris=uri1,uri2..
{code}
and set other values in CommandInfo::URIS as default.
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role and command uris
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195151#comment-15195151 ]

Jian Qiu commented on MESOS-4744:
-

Review request for setting the role:
https://reviews.apache.org/r/43935/

> mesos-execute should allow setting role and command uris
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role and command uris when running mesos-execute
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role and command uris
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194881#comment-15194881 ]

Jian Qiu commented on MESOS-4744:
-

I have updated the description of the ticket. Do you think adding a flag
{code}
--command_uris=uri1,uri2...
{code}
is sufficient?
[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role and command uris
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4744:
----------------------------
    Description: It will be quite useful if we can set role and command uris when running mesos-execute  (was: It will be quite useful if we can set role when running mesos-execute)

> mesos-execute should allow setting role and command uris
> --------------------------------------------------------
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role and command uris when running
> mesos-execute
[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role and command uris
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4744:
----------------------------
    Summary: mesos-execute should allow setting role and command uris  (was: mesos-execute should allow setting role)

> mesos-execute should allow setting role and command uris
> --------------------------------------------------------
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194552#comment-15194552 ]

Jian Qiu commented on MESOS-4744:
---------------------------------

Thanks [~lins05], maybe we should create a separate ticket for adding uris in CommandInfo for mesos-execute?

> mesos-execute should allow setting role
> ---------------------------------------
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute
[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4744:
----------------------------
    Shepherd: Michael Park

> mesos-execute should allow setting role
> ---------------------------------------
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute
[jira] [Assigned] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu reassigned MESOS-4744:
-------------------------------
    Assignee: Jian Qiu

> mesos-execute should allow setting role
> ---------------------------------------
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
> Issue Type: Bug
> Components: cli
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute
[jira] [Created] (MESOS-4744) mesos-execute should allow setting role
Jian Qiu created MESOS-4744:
----------------------------

Summary: mesos-execute should allow setting role
Key: MESOS-4744
URL: https://issues.apache.org/jira/browse/MESOS-4744
Project: Mesos
Issue Type: Bug
Components: cli
Reporter: Jian Qiu
Priority: Minor

It will be quite useful if we can set role when running mesos-execute
[jira] [Commented] (MESOS-4174) HookTest.VerifySlaveLaunchExecutorHook is slow
[ https://issues.apache.org/jira/browse/MESOS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153551#comment-15153551 ]

Jian Qiu commented on MESOS-4174:
---------------------------------

[~tnachen] [~alexr] Could you please review this patch again? Thanks.

> HookTest.VerifySlaveLaunchExecutorHook is slow
> ----------------------------------------------
>
> Key: MESOS-4174
> URL: https://issues.apache.org/jira/browse/MESOS-4174
> Project: Mesos
> Issue Type: Improvement
> Components: technical debt, test
> Reporter: Alexander Rukletsov
> Assignee: Jian Qiu
> Priority: Minor
> Labels: mesosphere, newbie++, tech-debt
>
> The {{HookTest.VerifySlaveLaunchExecutorHook}} test takes more than {{5s}} to
> finish on my Mac OS 10.10.4:
> {code}
> HookTest.VerifySlaveLaunchExecutorHook (5061 ms)
> {code}
[jira] [Commented] (MESOS-4682) ExamplesTest.PythonFramework fails on OSX
[ https://issues.apache.org/jira/browse/MESOS-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151629#comment-15151629 ]

Jian Qiu commented on MESOS-4682:
---------------------------------

Thanks [~haosd...@gmail.com], it seems my environment was not clean. It is fine after cleaning my environment. I will close this ticket.

> ExamplesTest.PythonFramework fails on OSX
> -----------------------------------------
>
> Key: MESOS-4682
> URL: https://issues.apache.org/jira/browse/MESOS-4682
> Project: Mesos
> Issue Type: Bug
> Environment: OSX 10.10.05
> Reporter: Jian Qiu
> Labels: test
[jira] [Commented] (MESOS-4682) ExamplesTest.PythonFramework fails on OSX
[ https://issues.apache.org/jira/browse/MESOS-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148169#comment-15148169 ]

Jian Qiu commented on MESOS-4682:
---------------------------------

seems to be due to this error:
{code}
Failed to parse the flags: Failed to load unknown flag 'directory'
Failed to parse the flags: Failed to load unknown flag 'directory'
Failed to parse the flags: Failed to load unknown flag 'directory'
{code}

> ExamplesTest.PythonFramework fails on OSX
> -----------------------------------------
>
> Key: MESOS-4682
> URL: https://issues.apache.org/jira/browse/MESOS-4682
> Project: Mesos
> Issue Type: Bug
> Environment: OSX 10.10.05
> Reporter: Jian Qiu
> Labels: test
[jira] [Created] (MESOS-4682) ExamplesTest.PythonFramework fails on OSX
Jian Qiu created MESOS-4682: --- Summary: ExamplesTest.PythonFramework fails on OSX Key: MESOS-4682 URL: https://issues.apache.org/jira/browse/MESOS-4682 Project: Mesos Issue Type: Bug Environment: OSX 10.10.05 Reporter: Jian Qiu Using temporary directory '/tmp/ExamplesTest_PythonFramework_ZvbuJl' Enabling authentication for the framework I0216 14:48:17.029909 2007810816 leveldb.cpp:174] Opened db in 3570us I0216 14:48:17.030324 2007810816 leveldb.cpp:181] Compacted db in 383us I0216 14:48:17.030375 2007810816 leveldb.cpp:196] Created db iterator in 24us I0216 14:48:17.030388 2007810816 leveldb.cpp:202] Seeked to beginning of db in 8us I0216 14:48:17.030411 2007810816 leveldb.cpp:271] Iterated through 0 keys in the db in 6us I0216 14:48:17.030468 2007810816 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0216 14:48:17.031262 138493952 recover.cpp:447] Starting replica recovery I0216 14:48:17.031478 138493952 recover.cpp:473] Replica is in EMPTY status I0216 14:48:17.031772 2007810816 local.cpp:239] Using 'local' authorizer I0216 14:48:17.032449 135274496 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (4)@9.110.49.144:57199 I0216 14:48:17.032662 137420800 recover.cpp:193] Received a recover response from a replica in EMPTY status I0216 14:48:17.032914 137957376 recover.cpp:564] Updating replica status to STARTING I0216 14:48:17.033349 136347648 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 316us I0216 14:48:17.033375 136347648 replica.cpp:320] Persisted replica status to STARTING I0216 14:48:17.033488 139030528 recover.cpp:473] Replica is in STARTING status I0216 14:48:17.034047 135811072 replica.cpp:673] Replica in STARTING status received a broadcasted recover request from (5)@9.110.49.144:57199 I0216 14:48:17.034220 139030528 recover.cpp:193] Received a recover response from a replica in STARTING status I0216 14:48:17.034494 135811072 recover.cpp:564] Updating 
replica status to VOTING I0216 14:48:17.034744 136884224 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 135us I0216 14:48:17.034764 136884224 replica.cpp:320] Persisted replica status to VOTING I0216 14:48:17.034814 137957376 recover.cpp:578] Successfully joined the Paxos group I0216 14:48:17.034934 137957376 recover.cpp:462] Recover process terminated I0216 14:48:17.069952 137957376 master.cpp:374] Master bd54ad91-3083-42a3-a39f-0c7e2e08b0a0 (9.110.49.144) started on 9.110.49.144:57199 I0216 14:48:17.070006 137957376 master.cpp:376] Flags at startup: --acls="permissive: false register_frameworks { principals { type: SOME values: "test-principal" } roles { type: SOME values: "*" } } run_tasks { principals { type: SOME values: "test-principal" } users { type: SOME values: "qiujian" } } " --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_http="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/ExamplesTest_PythonFramework_ZvbuJl/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/Users/qiujian/Documents/mesos/src/webui" --work_dir="/var/folders/_4/hjy0h2s15kv3mt9ft2ndx5jhgn/T/mesos-XX.FIXeeEhQ" --zk_session_timeout="10secs" I0216 14:48:17.070543 137957376 master.cpp:421] Master only allowing authenticated frameworks to register I0216 14:48:17.070564 137957376 master.cpp:428] 
Master allowing unauthenticated slaves to register I0216 14:48:17.070576 137957376 credentials.hpp:35] Loading credentials for authentication from '/tmp/ExamplesTest_PythonFramework_ZvbuJl/credentials' W0216 14:48:17.070650 137957376 credentials.hpp:50] Permissions on credentials file '/tmp/ExamplesTest_PythonFramework_ZvbuJl/credentials' are too open. It is recommended that your credentials file is NOT accessible by others. I0216 14:48:17.070757 137957376 master.cpp:466] Using default 'crammd5' authenticator I0216 14:48:17.070843 137957376 authenticator.cpp:518] Initializing server SASL I0216 14:48:17.071966 2007810816 containerizer.cpp:143] Using isolation: filesystem/posix,posix/cpu,posix/mem I0216 14:48:17.074555 136347648 slave.cpp:192] Slave started on 1)@9.110.49.144:57199 I
[jira] [Commented] (MESOS-4508) Make check fails on Ubuntu 15.04
[ https://issues.apache.org/jira/browse/MESOS-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116910#comment-15116910 ]

Jian Qiu commented on MESOS-4508:
---------------------------------

RR: https://reviews.apache.org/r/42792/

> Make check fails on Ubuntu 15.04
> --------------------------------
>
> Key: MESOS-4508
> URL: https://issues.apache.org/jira/browse/MESOS-4508
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.27.0
> Environment: ubuntu 15.04, gcc 4.9.2, --enable-libevent --enable-ssl
> --enable-debug
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Labels: test
[jira] [Assigned] (MESOS-4508) Make check fails on Ubuntu 15.04
[ https://issues.apache.org/jira/browse/MESOS-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu reassigned MESOS-4508:
-------------------------------
    Assignee: Jian Qiu

> Make check fails on Ubuntu 15.04
> --------------------------------
>
> Key: MESOS-4508
> URL: https://issues.apache.org/jira/browse/MESOS-4508
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.27.0
> Environment: ubuntu 15.04, gcc 4.9.2, --enable-libevent --enable-ssl
> --enable-debug
> Reporter: Jian Qiu
> Assignee: Jian Qiu
> Labels: test
[jira] [Created] (MESOS-4508) Make check fails on Ubuntu 15.04
Jian Qiu created MESOS-4508: --- Summary: Make check fails on Ubuntu 15.04 Key: MESOS-4508 URL: https://issues.apache.org/jira/browse/MESOS-4508 Project: Mesos Issue Type: Bug Affects Versions: 0.27.0 Environment: ubuntu 15.04, gcc 4.9.2, --enable-libevent --enable-ssl --enable-debug Reporter: Jian Qiu make check {code} In file included from ../3rdparty/libprocess/3rdparty/gmock-1.7.0/include/gmock/internal/gmock-internal-utils.h:47:0, from ../3rdparty/libprocess/3rdparty/gmock-1.7.0/include/gmock/gmock-actions.h:46, from ../3rdparty/libprocess/3rdparty/gmock-1.7.0/include/gmock/gmock.h:58, from ../../src/tests/container_logger_tests.cpp:21: ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperLE(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’: ../../src/tests/container_logger_tests.cpp:467:3: required from here ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1579:28: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] GTEST_IMPL_CMP_HELPER_(LE, <=); ^ ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1562:12: note: in definition of macro ‘GTEST_IMPL_CMP_HELPER_’ if (val1 op val2) {\ ^ ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperGE(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’: ../../src/tests/container_logger_tests.cpp:468:3: required from here ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1583:28: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] GTEST_IMPL_CMP_HELPER_(GE, >=); ^ ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1562:12: note: in definition of macro ‘GTEST_IMPL_CMP_HELPER_’ if (val1 op val2) {\ ^ mv -f 
tests/containerizer/.deps/mesos_tests-cgroups_tests.Tpo tests/containerizer/.deps/mesos_tests-cgroups_tests.Po g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.27.0\" -DPACKAGE_STRING=\"mesos\ 0.27.0\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.27.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBSASL2=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -DHAVE_EVENT2_EVENT_H=1 -DHAVE_LIBEVENT=1 -DHAVE_EVENT2_THREAD_H=1 -DHAVE_LIBEVENT_PTHREADS=1 -DHAVE_OPENSSL_SSL_H=1 -DHAVE_LIBSSL=1 -DHAVE_LIBCRYPTO=1 -DHAVE_EVENT2_BUFFEREVENT_SSL_H=1 -DHAVE_LIBEVENT_OPENSSL=1 -DUSE_SSL_SOCKET=1 -I. 
-I../../src -Wall -Werror -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include -I../../3rdparty/libprocess/include -I../../3rdparty/libprocess/3rdparty/stout/include -I../include -I../include/mesos -isystem ../3rdparty/libprocess/3rdparty/boost-1.53.0 -I../3rdparty/libprocess/3rdparty/picojson-1.3.0 -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include -I../3rdparty/zookeeper-3.4.5/src/c/generated -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src -DSOURCE_DIR=\"/home/qiujian/community/mesos/build/..\" -DBUILD_DIR=\"/home/qiujian/community/mesos/build\" -I../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include -I../3rdparty/libprocess/3rdparty/gmock-1.7.0/include -I/usr/lib/jvm/java-7-openjdk-amd64/include -I/usr/lib/jvm/java-7-openjdk-amd64/include/linux -DZOOKEEPER_VERSION=\"3.4.5\" -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 -pthread -g -O0 -Wno-unused-local-typedefs -std=c++11 -MT tests/containerizer/mesos_tests-filesystem_isolator_tests.o -MD -MP -MF tests/containerizer/.deps/mesos_tests-filesystem_isolator_tests.Tpo -c -o tests/containerizer/mesos_tests-filesystem_isolator_tests.o `test -f 'tests/containerizer/filesystem_isolator_tests.cpp' || echo '../../src/'`tests/containerizer/filesystem_isolator_tests.cpp cc1plus: all warnings being treated as errors
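
The errors above come from a templated comparison helper being instantiated with `T1 = int` and `T2 = long unsigned int`, which `-Werror=sign-compare` rejects. The sketch below is not the code from `container_logger_tests.cpp`; `lessOrEqual` and `nonEmptyWithin` are hypothetical names that mimic the pattern and show one common way to avoid it, namely keeping both operands unsigned.

```cpp
#include <cstddef>
#include <string>

// Mimics a templated comparison helper: each operand keeps its deduced
// type, so lessOrEqual(1, s.size()) would compare int with an unsigned
// type and trigger -Wsign-compare.
template <typename T1, typename T2>
bool lessOrEqual(const T1& a, const T2& b) {
  return a <= b;
}

// Hypothetical check: the string is non-empty and no longer than `limit`.
bool nonEmptyWithin(const std::string& s, std::size_t limit) {
  // The literal 1u is unsigned, so both comparisons stay unsigned.
  return lessOrEqual(1u, s.size()) && lessOrEqual(s.size(), limit);
}
```

In a gtest assertion the same idea would mean writing the constant as an unsigned literal or casting it to the type of the other operand; which of the two the actual patch used is not stated in this thread.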
[jira] [Commented] (MESOS-920) Set GLOG_drop_log_memory=false in environment prior to logging initialization.
[ https://issues.apache.org/jira/browse/MESOS-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114803#comment-15114803 ]

Jian Qiu commented on MESOS-920:

It works for me, thanks!

> Set GLOG_drop_log_memory=false in environment prior to logging initialization.
>
> Key: MESOS-920
> URL: https://issues.apache.org/jira/browse/MESOS-920
> Project: Mesos
> Issue Type: Improvement
> Components: technical debt
> Affects Versions: 0.15.0, 0.16.0
> Reporter: Benjamin Mahler
> Assignee: Kapil Arya
> Priority: Blocker
> Labels: mesosphere
>
> We've observed issues where the masters are slow to respond. Two perf traces
> collected while the masters were slow to respond:
> {noformat}
> 25.84% [kernel][k] default_send_IPI_mask_sequence_phys
> 20.44% [kernel][k] native_write_msr_safe
> 4.54% [kernel][k] _raw_spin_lock
> 2.95% libc-2.5.so [.] _int_malloc
> 1.82% libc-2.5.so [.] malloc
> 1.55% [kernel][k] apic_timer_interrupt
> 1.36% libc-2.5.so [.] _int_free
> {noformat}
> {noformat}
> 29.03% [kernel][k] default_send_IPI_mask_sequence_phys
> 9.64% [kernel][k] _raw_spin_lock
> 7.38% [kernel][k] native_write_msr_safe
> 2.43% libc-2.5.so [.] _int_malloc
> 2.05% libc-2.5.so [.] _int_free
> 1.67% [kernel][k] apic_timer_interrupt
> 1.58% libc-2.5.so [.] malloc
> {noformat}
> These have been found to be attributed to the posix_fadvise calls made by
> glog. We can disable these via the environment:
> {noformat}
> GLOG_DEFINE_bool(drop_log_memory, true, "Drop in-memory buffers of log contents. "
>                  "Logs can grow very quickly and they are rarely read before they "
>                  "need to be evicted from memory. Instead, drop them from memory "
>                  "as soon as they are flushed to disk.");
> {noformat}
> {code}
> if (FLAGS_drop_log_memory) {
>   if (file_length_ >= logging::kPageSize) {
>     // don't evict the most recent page
>     uint32 len = file_length_ & ~(logging::kPageSize - 1);
>     posix_fadvise(fileno(file_), 0, len, POSIX_FADV_DONTNEED);
>   }
> }
> {code}
> We should set GLOG_drop_log_memory=false prior to making our call to
> google::InitGoogleLogging, to avoid others running into this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
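The glog snippet quoted in MESOS-920 computes the length passed to posix_fadvise by masking off the low bits of file_length_. A minimal sketch of that arithmetic follows. The function name evictableLength and the 4096-byte page size are assumptions made for illustration (glog takes the real value from logging::kPageSize), and the mask only works when the page size is a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size for illustration; glog takes it from logging::kPageSize.
constexpr uint32_t kPageSize = 4096;
static_assert((kPageSize & (kPageSize - 1)) == 0,
              "the mask trick requires a power-of-two page size");

// Hypothetical helper: returns how many leading bytes of the log file the
// quoted code would hand to posix_fadvise(..., POSIX_FADV_DONTNEED).
uint32_t evictableLength(uint32_t fileLength) {
  // Mirrors the guard in the quoted code: files shorter than one page
  // are left alone.
  if (fileLength < kPageSize) {
    return 0;
  }
  // Clearing the low bits rounds the length down to a page boundary.
  const uint32_t len = fileLength & ~(kPageSize - 1);
  assert(len <= fileLength && len % kPageSize == 0);
  return len;
}
```

Under these assumptions a 10000-byte file gives 8192: the two full pages are dropped from the page cache and the trailing 1808 bytes stay cached.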
[jira] [Commented] (MESOS-920) Set GLOG_drop_log_memory=false in environment prior to logging initialization.
[ https://issues.apache.org/jira/browse/MESOS-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114785#comment-15114785 ]

Jian Qiu commented on MESOS-920:

When I run make check on OS X, I get the following error message:
{code}
./mesos-tests
dyld: Symbol not found: __ZN3fLB21FLAGS_drop_log_memoryE
  Referenced from: /Users/qiujian/Documents/mesos/build/src/.libs/libmesos-0.27.0.dylib
  Expected in: flat namespace
 in /Users/qiujian/Documents/mesos/build/src/.libs/libmesos-0.27.0.dylib
make[3]: *** [check-local] Trace/BPT trap: 5
make[2]: *** [check-am] Error 2
make[1]: *** [check] Error 2
make: *** [check-recursive] Error 1
{code}
Could this be related to this ticket?
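The symbol in the dyld error reported for MESOS-920 on OS X is a mangled C++ name. The sketch below shows one way to decode it, assuming a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h>. The helper name demangle is hypothetical, and stripping one leading underscore reflects the extra underscore that Mach-O symbol names carry on OS X.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: turns a mangled symbol into a readable C++ name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const std::string& symbol) {
  std::string name = symbol;

  // Mach-O prefixes symbols with an extra underscore ("__Z..."), while the
  // demangler expects the Itanium ABI form ("_Z...").
  if (name.compare(0, 3, "__Z") == 0) {
    name.erase(0, 1);
  }

  int status = 0;
  char* out = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
  if (status != 0) {
    return symbol;
  }

  // On success the result is a malloc'ed buffer that the caller must free.
  assert(out != nullptr);
  std::string result(out);
  std::free(out);
  return result;
}
```

Applied to __ZN3fLB21FLAGS_drop_log_memoryE this yields fLB::FLAGS_drop_log_memory, which suggests the missing symbol is the drop_log_memory flag variable that this ticket is about.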
[jira] [Created] (MESOS-4491) SSLTest.BasicSameProcess is flaky
Jian Qiu created MESOS-4491:

Summary: SSLTest.BasicSameProcess is flaky
Key: MESOS-4491
URL: https://issues.apache.org/jira/browse/MESOS-4491
Project: Mesos
Issue Type: Bug
Components: test
Environment: OSX, ./configure --enable-libevent --enable-ssl --enable-debug
Reporter: Jian Qiu

[--] 16 tests from SSLTest
[ RUN ] SSLTest.BasicSameProcess
*** Aborted at 1453689521 (unix time) try "date -d @1453689521" if you are using GNU date ***
PC: @0x10808d71a sk_num
*** SIGSEGV (@0x2) received by PID 40699 (TID 0x10893f000) stack trace: ***
@ 0x7fff8e958f1a _sigtramp
@ 0x (unknown)
@0x1081cb2e9 ssl3_send_certificate_request
@0x1081c8dd0 ssl3_accept
@0x1081dbbd2 ssl23_get_client_hello
@0x1081db502 ssl23_accept
@0x107ff3067 do_handshake
@0x107ff3753 be_openssl_handshakeeventcb
@0x10822c424 event_base_loop
@0x106f4535e process::EventLoop::run()
@0x106e62b76 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEPvS5_
@ 0x7fff885a _pthread_body
@ 0x7fff87d7 _pthread_start
@ 0x7fff87ffd3ed thread_start
make[5]: *** [check-local] Segmentation fault: 11
make[4]: *** [check-am] Error 2
make[3]: *** [check-recursive] Error 1
make[2]: *** [check-recursive] Error 1
make[1]: *** [check] Error 2
make: *** [check-recursive] Error 1
[jira] [Assigned] (MESOS-4404) SlaveTest.HTTPSchedulerSlaveRestart is flaky
[ https://issues.apache.org/jira/browse/MESOS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-4404: --- Assignee: Jian Qiu > SlaveTest.HTTPSchedulerSlaveRestart is flaky > > > Key: MESOS-4404 > URL: https://issues.apache.org/jira/browse/MESOS-4404 > Project: Mesos > Issue Type: Bug > Components: HTTP API, slave >Affects Versions: 0.26.0 > Environment: From the Jenkins CI: gcc,--verbose --enable-libevent > --enable-ssl,centos:7,docker >Reporter: Greg Mann >Assignee: Jian Qiu > Labels: flaky-test, mesosphere > > Saw this failure on the Jenkins CI: > {code} > [ RUN ] SlaveTest.HTTPSchedulerSlaveRestart > I0115 18:42:25.393354 1762 leveldb.cpp:174] Opened db in 3.456169ms > I0115 18:42:25.394310 1762 leveldb.cpp:181] Compacted db in 922588ns > I0115 18:42:25.394361 1762 leveldb.cpp:196] Created db iterator in 18529ns > I0115 18:42:25.394378 1762 leveldb.cpp:202] Seeked to beginning of db in > 1933ns > I0115 18:42:25.394390 1762 leveldb.cpp:271] Iterated through 0 keys in the > db in 280ns > I0115 18:42:25.394430 1762 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0115 18:42:25.394963 1791 recover.cpp:447] Starting replica recovery > I0115 18:42:25.395396 1791 recover.cpp:473] Replica is in EMPTY status > I0115 18:42:25.396589 1795 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (11302)@172.17.0.2:49129 > I0115 18:42:25.397101 1785 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0115 18:42:25.397721 1791 recover.cpp:564] Updating replica status to > STARTING > I0115 18:42:25.398764 1789 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 684584ns > I0115 18:42:25.398807 1789 replica.cpp:320] Persisted replica status to > STARTING > I0115 18:42:25.398947 1795 master.cpp:374] Master > 544823be-76b5-47be-b326-2cd6d6a700b8 (e648fe109cb1) started on > 172.17.0.2:49129 > I0115 18:42:25.399209 
1788 recover.cpp:473] Replica is in STARTING status > I0115 18:42:25.398980 1795 master.cpp:376] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/BOGaaq/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.27.0/_inst/share/mesos/webui" > --work_dir="/tmp/BOGaaq/master" --zk_session_timeout="10secs" > I0115 18:42:25.399435 1795 master.cpp:421] Master only allowing > authenticated frameworks to register > I0115 18:42:25.399451 1795 master.cpp:426] Master only allowing > authenticated slaves to register > I0115 18:42:25.399461 1795 credentials.hpp:35] Loading credentials for > authentication from '/tmp/BOGaaq/credentials' > I0115 18:42:25.399884 1795 master.cpp:466] Using default 'crammd5' > authenticator > I0115 18:42:25.400060 1795 master.cpp:535] Using default 'basic' HTTP > authenticator > I0115 18:42:25.400254 1795 master.cpp:569] Authorization enabled > I0115 18:42:25.400439 1785 hierarchical.cpp:147] Initialized hierarchical > allocator process > I0115 18:42:25.400470 1789 whitelist_watcher.cpp:77] No whitelist given > I0115 18:42:25.400656 1792 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (11303)@172.17.0.2:49129 > I0115 
18:42:25.400943 1781 recover.cpp:193] Received a recover response from > a replica in STARTING status > I0115 18:42:25.401612 1791 recover.cpp:564] Updating replica status to VOTING > I0115 18:42:25.402313 1785 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 458849ns > I0115 18:42:25.402345 1785 replica.cpp:320] Persisted replica status to > VOTING > I0115 18:42:25.402510 1788 recover.cpp:578] Successfully joined the Paxos > group > I0115 18:42:25.402848 1788 recover.cpp:462] Recover process terminated > I0115 18:42:25.402997 1784 master.cpp:1710] The newly elected leader is > master@172.17.0.2:49129 with id 544823be-76b5-47be-b326-2cd6d6a700b8 > I0115 18:42:25.403038 1784 master.cpp:1723] Elected as t
[jira] [Commented] (MESOS-4404) SlaveTest.HTTPSchedulerSlaveRestart is flaky
[ https://issues.apache.org/jira/browse/MESOS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103819#comment-15103819 ]

Jian Qiu commented on MESOS-4404:

After looking into the log, I think the problem is that the slave receives two FrameworkUpdateMessage messages, possibly because of a re-registration timeout triggered by advance(). The test waits for the first one and overwrites the framework pid, but the pid is overwritten again when the second FrameworkUpdateMessage arrives, and that causes the test failure.
{code}
I0115 18:42:25.696996 1791 slave.cpp:2176] Updating framework 544823be-76b5-47be-b326-2cd6d6a700b8- pid to @0.0.0.0:0
I0115 18:42:25.697836 1791 slave.cpp:2176] Updating framework 544823be-76b5-47be-b326-2cd6d6a700b8- pid to scheduler-5d55118d-2bca-4afb-b6f9-8b60fa1a5274@172.17.0.2:49129
{code}
Moving Clock::resume() to the end of the test should resolve it. However, I was unable to reproduce the bug on OS X, so I will try to reproduce it on CentOS.
[jira] [Issue Comment Deleted] (MESOS-4404) SlaveTest.HTTPSchedulerSlaveRestart is flaky
[ https://issues.apache.org/jira/browse/MESOS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4404:

Comment: was deleted (was: Sure, I will take a look at this.)
[jira] [Commented] (MESOS-4404) SlaveTest.HTTPSchedulerSlaveRestart is flaky
[ https://issues.apache.org/jira/browse/MESOS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103209#comment-15103209 ]

Jian Qiu commented on MESOS-4404:

Sure, I will take a look at this.
[jira] [Commented] (MESOS-4404) SlaveTest.HTTPSchedulerSlaveRestart is flaky
[ https://issues.apache.org/jira/browse/MESOS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103208#comment-15103208 ]

Jian Qiu commented on MESOS-4404:

Sure, I will take a look at this.
[jira] [Updated] (MESOS-4174) HookTest.VerifySlaveLaunchExecutorHook is slow
[ https://issues.apache.org/jira/browse/MESOS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4174:

Shepherd: Timothy Chen

> HookTest.VerifySlaveLaunchExecutorHook is slow
>
> Key: MESOS-4174
> URL: https://issues.apache.org/jira/browse/MESOS-4174
> Project: Mesos
> Issue Type: Improvement
> Components: technical debt, test
> Reporter: Alexander Rukletsov
> Assignee: Jian Qiu
> Priority: Minor
> Labels: mesosphere, newbie++, tech-debt
>
> The {{HookTest.VerifySlaveLaunchExecutorHook}} test takes more than {{5s}} to
> finish on my Mac OS 10.10.4:
> {code}
> HookTest.VerifySlaveLaunchExecutorHook (5061 ms)
> {code}
[jira] [Updated] (MESOS-4158) Speed up SlaveRecoveryTest.*
[ https://issues.apache.org/jira/browse/MESOS-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-4158:

Shepherd: Timothy Chen

> Speed up SlaveRecoveryTest.*
>
> Key: MESOS-4158
> URL: https://issues.apache.org/jira/browse/MESOS-4158
> Project: Mesos
> Issue Type: Epic
> Components: technical debt, test
> Reporter: Alexander Rukletsov
> Assignee: Jian Qiu
> Priority: Minor
> Labels: mesosphere, newbie++, tech-debt
>
> Execution times on Mac OS 10.10.4:
> {code}
> SlaveRecoveryTest/0.RecoverStatusUpdateManager (2260 ms)
> SlaveRecoveryTest/0.ReconnectExecutor (2261 ms)
> SlaveRecoveryTest/0.RecoverCompletedExecutor (1288 ms)
> SlaveRecoveryTest/0.CleanupExecutor (1290 ms)
> SlaveRecoveryTest/0.Reboot (1360 ms)
> SlaveRecoveryTest/0.ShutdownSlave (1321 ms)
> SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1 (1360 ms)
> SlaveRecoveryTest/0.ReconcileKillTask (3123 ms)
> SlaveRecoveryTest/0.ReconcileShutdownFramework (3353 ms)
> SlaveRecoveryTest/0.MasterFailover (1355 ms)
> SlaveRecoveryTest/0.MultipleFrameworks (1555 ms)
> SlaveRecoveryTest/0.MultipleSlaves (1444 ms)
> {code}
[jira] [Updated] (MESOS-4161) SlaveTest.CommandExecutorWithOverride is slow
[ https://issues.apache.org/jira/browse/MESOS-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu updated MESOS-4161: Shepherd: Timothy Chen > SlaveTest.CommandExecutorWithOverride is slow > - > > Key: MESOS-4161 > URL: https://issues.apache.org/jira/browse/MESOS-4161 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.CommandExecutorWithOverride}} test takes around {{1.3s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.CommandExecutorWithOverride (1311 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4174) HookTest.VerifySlaveLaunchExecutorHook is slow
[ https://issues.apache.org/jira/browse/MESOS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097444#comment-15097444 ] Jian Qiu commented on MESOS-4174: - RR https://reviews.apache.org/r/42241/ > HookTest.VerifySlaveLaunchExecutorHook is slow > -- > > Key: MESOS-4174 > URL: https://issues.apache.org/jira/browse/MESOS-4174 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{HookTest.VerifySlaveLaunchExecutorHook}} test takes more than {{5s}} to > finish on my Mac OS 10.10.4: > {code} > HookTest.VerifySlaveLaunchExecutorHook (5061 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4174) HookTest.VerifySlaveLaunchExecutorHook is slow
[ https://issues.apache.org/jira/browse/MESOS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-4174: --- Assignee: Jian Qiu > HookTest.VerifySlaveLaunchExecutorHook is slow > -- > > Key: MESOS-4174 > URL: https://issues.apache.org/jira/browse/MESOS-4174 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{HookTest.VerifySlaveLaunchExecutorHook}} test takes more than {{5s}} to > finish on my Mac OS 10.10.4: > {code} > HookTest.VerifySlaveLaunchExecutorHook (5061 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4158) Speed up SlaveRecoveryTest.*
[ https://issues.apache.org/jira/browse/MESOS-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082510#comment-15082510 ] Jian Qiu commented on MESOS-4158: - RR https://reviews.apache.org/r/41787/ > Speed up SlaveRecoveryTest.* > > > Key: MESOS-4158 > URL: https://issues.apache.org/jira/browse/MESOS-4158 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > SlaveRecoveryTest/0.RecoverStatusUpdateManager (2260 ms) > SlaveRecoveryTest/0.ReconnectExecutor (2261 ms) > SlaveRecoveryTest/0.RecoverCompletedExecutor (1288 ms) > SlaveRecoveryTest/0.CleanupExecutor (1290 ms) > SlaveRecoveryTest/0.Reboot (1360 ms) > SlaveRecoveryTest/0.ShutdownSlave (1321 ms) > SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1 (1360 ms) > SlaveRecoveryTest/0.ReconcileKillTask (3123 ms) > SlaveRecoveryTest/0.ReconcileShutdownFramework (3353 ms) > SlaveRecoveryTest/0.MasterFailover (1355 ms) > SlaveRecoveryTest/0.MultipleFrameworks (1555 ms) > SlaveRecoveryTest/0.MultipleSlaves (1444 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4161) SlaveTest.CommandExecutorWithOverride is slow
[ https://issues.apache.org/jira/browse/MESOS-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073739#comment-15073739 ] Jian Qiu commented on MESOS-4161: - SGTM :) Several other tests are also related to this issue; I will add comments for them as well. > SlaveTest.CommandExecutorWithOverride is slow > - > > Key: MESOS-4161 > URL: https://issues.apache.org/jira/browse/MESOS-4161 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.CommandExecutorWithOverride}} test takes around {{1.3s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.CommandExecutorWithOverride (1311 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4161) SlaveTest.CommandExecutorWithOverride is slow
[ https://issues.apache.org/jira/browse/MESOS-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073392#comment-15073392 ] Jian Qiu edited comment on MESOS-4161 at 12/29/15 2:49 AM: --- CommandExecutor::reaped sleeps for 1 second to avoid races: https://github.com/apache/mesos/blob/master/src/launcher/executor.cpp#L531 We can avoid this sleep by explicitly killing the executor. was (Author: qiujian): The reason for explicitly killing the executor is that CommandExecutor::reaped sleeps for 1 second to avoid races: https://github.com/apache/mesos/blob/master/src/launcher/executor.cpp#L531 Can we avoid this sleep in tests? > SlaveTest.CommandExecutorWithOverride is slow > - > > Key: MESOS-4161 > URL: https://issues.apache.org/jira/browse/MESOS-4161 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.CommandExecutorWithOverride}} test takes around {{1.3s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.CommandExecutorWithOverride (1311 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4161) SlaveTest.CommandExecutorWithOverride is slow
[ https://issues.apache.org/jira/browse/MESOS-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073392#comment-15073392 ] Jian Qiu commented on MESOS-4161: - The reason for explicitly killing the executor is that CommandExecutor::reaped sleeps for 1 second to avoid races: https://github.com/apache/mesos/blob/master/src/launcher/executor.cpp#L531 Can we avoid this sleep in tests? > SlaveTest.CommandExecutorWithOverride is slow > - > > Key: MESOS-4161 > URL: https://issues.apache.org/jira/browse/MESOS-4161 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.CommandExecutorWithOverride}} test takes around {{1.3s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.CommandExecutorWithOverride (1311 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4163) SlaveTest.HTTPSchedulerSlaveRestart is slow
[ https://issues.apache.org/jira/browse/MESOS-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073352#comment-15073352 ] Jian Qiu commented on MESOS-4163: - https://reviews.apache.org/r/41675/ > SlaveTest.HTTPSchedulerSlaveRestart is slow > --- > > Key: MESOS-4163 > URL: https://issues.apache.org/jira/browse/MESOS-4163 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.HTTPSchedulerSlaveRestart}} test takes more than {{2s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.HTTPSchedulerSlaveRestart (2307 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4158) Speed up SlaveRecoveryTest.*
[ https://issues.apache.org/jira/browse/MESOS-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066458#comment-15066458 ] Jian Qiu commented on MESOS-4158: - I think we need to add Clock::pause, Clock::settle and Clock::advance in test cases to accelerate them. > Speed up SlaveRecoveryTest.* > > > Key: MESOS-4158 > URL: https://issues.apache.org/jira/browse/MESOS-4158 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > SlaveRecoveryTest/0.RecoverStatusUpdateManager (2260 ms) > SlaveRecoveryTest/0.ReconnectExecutor (2261 ms) > SlaveRecoveryTest/0.RecoverCompletedExecutor (1288 ms) > SlaveRecoveryTest/0.CleanupExecutor (1290 ms) > SlaveRecoveryTest/0.Reboot (1360 ms) > SlaveRecoveryTest/0.ShutdownSlave (1321 ms) > SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1 (1360 ms) > SlaveRecoveryTest/0.ReconcileKillTask (3123 ms) > SlaveRecoveryTest/0.ReconcileShutdownFramework (3353 ms) > SlaveRecoveryTest/0.MasterFailover (1355 ms) > SlaveRecoveryTest/0.MultipleFrameworks (1555 ms) > SlaveRecoveryTest/0.MultipleSlaves (1444 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
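The suggestion in the comment above, pausing the clock and advancing it by hand instead of waiting in real time, can be illustrated with a small self-contained sketch. The `FakeClock` class and the `retryFiresWithoutWaiting` helper below are hypothetical and are not the libprocess `Clock` API; they only mimic the idea so that a timer with a 2 second delay fires without any real waiting.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a pausable test clock; NOT the libprocess API.
class FakeClock {
public:
  using Duration = std::chrono::milliseconds;

  // Register a callback to run once `delay` of virtual time has elapsed.
  void schedule(Duration delay, std::function<void()> callback) {
    timers_.emplace(now_ + delay, std::move(callback));
  }

  // Move virtual time forward and fire every timer that is now due.
  void advance(Duration delta) {
    now_ += delta;
    while (!timers_.empty() && timers_.begin()->first <= now_) {
      auto callback = std::move(timers_.begin()->second);
      timers_.erase(timers_.begin());
      callback();
    }
  }

  Duration now() const { return now_; }

private:
  Duration now_{0};
  std::multimap<Duration, std::function<void()>> timers_;
};

// Example: a 2 second timer fires as soon as virtual time reaches it,
// so the test never blocks on a real sleep.
inline bool retryFiresWithoutWaiting() {
  FakeClock clock;
  bool fired = false;
  clock.schedule(std::chrono::seconds(2), [&fired] { fired = true; });

  clock.advance(std::chrono::milliseconds(1999));
  if (fired) return false;  // Must not fire early.

  clock.advance(std::chrono::milliseconds(1));
  return fired;
}
```

In an actual Mesos test the real `Clock::pause`, `Clock::advance` and `Clock::settle` calls named in the comment would presumably take the place of this toy clock.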
[jira] [Assigned] (MESOS-4158) Speed up SlaveRecoveryTest.*
[ https://issues.apache.org/jira/browse/MESOS-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-4158: --- Assignee: Jian Qiu > Speed up SlaveRecoveryTest.* > > > Key: MESOS-4158 > URL: https://issues.apache.org/jira/browse/MESOS-4158 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > SlaveRecoveryTest/0.RecoverStatusUpdateManager (2260 ms) > SlaveRecoveryTest/0.ReconnectExecutor (2261 ms) > SlaveRecoveryTest/0.RecoverCompletedExecutor (1288 ms) > SlaveRecoveryTest/0.CleanupExecutor (1290 ms) > SlaveRecoveryTest/0.Reboot (1360 ms) > SlaveRecoveryTest/0.ShutdownSlave (1321 ms) > SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1 (1360 ms) > SlaveRecoveryTest/0.ReconcileKillTask (3123 ms) > SlaveRecoveryTest/0.ReconcileShutdownFramework (3353 ms) > SlaveRecoveryTest/0.MasterFailover (1355 ms) > SlaveRecoveryTest/0.MultipleFrameworks (1555 ms) > SlaveRecoveryTest/0.MultipleSlaves (1444 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4163) SlaveTest.HTTPSchedulerSlaveRestart is slow
[ https://issues.apache.org/jira/browse/MESOS-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-4163: --- Assignee: Jian Qiu > SlaveTest.HTTPSchedulerSlaveRestart is slow > --- > > Key: MESOS-4163 > URL: https://issues.apache.org/jira/browse/MESOS-4163 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.HTTPSchedulerSlaveRestart}} test takes more than {{2s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.HTTPSchedulerSlaveRestart (2307 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4155) Speed up ExamplesTest.*
[ https://issues.apache.org/jira/browse/MESOS-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059606#comment-15059606 ] Jian Qiu commented on MESOS-4155: - One of the problems is an authentication timeout: {code} W1216 15:02:52.761855 162168832 sched.cpp:429] Authentication timed out I1216 15:02:52.762025 162168832 sched.cpp:387] Failed to authenticate with master master@192.168.99.1:54409: Authentication discarded I1216 15:02:52.762082 162168832 sched.cpp:318] Authenticating with master master@192.168.99.1:54409 I1216 15:02:52.762097 162168832 sched.cpp:325] Using default CRAM-MD5 authenticatee {code} > Speed up ExamplesTest.* > --- > > Key: MESOS-4155 > URL: https://issues.apache.org/jira/browse/MESOS-4155 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > ExamplesTest.TestFramework (5225 ms) > ExamplesTest.NoExecutorFramework (5387 ms) > ExamplesTest.EventCallFramework (1238 ms) > ExamplesTest.PersistentVolumeFramework (3380 ms) > ExamplesTest.JavaFramework (6159 ms) > ExamplesTest.JavaException (1 ms) > ExamplesTest.JavaLog (1174 ms) > ExamplesTest.PythonFramework (7126 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4161) SlaveTest.CommandExecutorWithOverride is slow
[ https://issues.apache.org/jira/browse/MESOS-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-4161: --- Assignee: Jian Qiu > SlaveTest.CommandExecutorWithOverride is slow > - > > Key: MESOS-4161 > URL: https://issues.apache.org/jira/browse/MESOS-4161 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Jian Qiu >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.CommandExecutorWithOverride}} test takes around {{1.3s}} to > finish on my Mac OS 10.10.4: > {code} > SlaveTest.CommandExecutorWithOverride (1311 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3841) Master HTTP API support to get the leader
[ https://issues.apache.org/jira/browse/MESOS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050187#comment-15050187 ] Jian Qiu commented on MESOS-3841: - How about an endpoint {code} http://master:port/leader {code} with a return of {code} {"leader": {"hostname":"xxx","ip":"x.x.x.x","port":5050}} {code} > Master HTTP API support to get the leader > - > > Key: MESOS-3841 > URL: https://issues.apache.org/jira/browse/MESOS-3841 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Cosmin Lehene >Assignee: Jian Qiu > > There's currently no good way to query the current master ensemble leader. > Some workarounds to get the leader (and parse it from leader@ip) from > {{/state.json}} or to grep it from {{master/redirect}}. > The scheduler API does an HTTP redirect, but that requires an HTTP POST > coming from a framework as well > {{POST /api/v1/scheduler HTTP/1.1}} > There should be a lightweight API call to get the current master. > This could be part of a more granular representation (REST) of the current > state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
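The response shape proposed in the comment above can be sketched as a small formatting helper. The `LeaderInfo` struct and the `leaderJson` function are hypothetical names used only for illustration; this is not how the Mesos master builds its responses, and the sketch performs no JSON string escaping.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical description of the leading master; names are illustrative only.
struct LeaderInfo {
  std::string hostname;
  std::string ip;
  std::uint16_t port;
};

// Render the body proposed for a `/leader` endpoint.
// NOTE: no escaping is done, so inputs must not contain quotes or backslashes.
inline std::string leaderJson(const LeaderInfo& leader) {
  std::ostringstream out;
  out << "{\"leader\": {"
      << "\"hostname\":\"" << leader.hostname << "\","
      << "\"ip\":\"" << leader.ip << "\","
      << "\"port\":" << leader.port
      << "}}";
  return out.str();
}
```

Keeping the leader as a nested object, as the comment proposes, would leave room for further fields later without changing the top-level key.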
[jira] [Assigned] (MESOS-3841) Master HTTP API support to get the leader
[ https://issues.apache.org/jira/browse/MESOS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-3841: --- Assignee: Jian Qiu > Master HTTP API support to get the leader > - > > Key: MESOS-3841 > URL: https://issues.apache.org/jira/browse/MESOS-3841 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Cosmin Lehene >Assignee: Jian Qiu > > There's currently no good way to query the current master ensemble leader. > Some workarounds to get the leader (and parse it from leader@ip) from > {{/state.json}} or to grep it from {{master/redirect}}. > The scheduler API does an HTTP redirect, but that requires an HTTP POST > coming from a framework as well > {{POST /api/v1/scheduler HTTP/1.1}} > There should be a lightweight API call to get the current master. > This could be part of a more granular representation (REST) of the current > state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021574#comment-15021574 ] Jian Qiu commented on MESOS-3792: - RR https://reviews.apache.org/r/40224/ > flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher >Assignee: Jian Qiu > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. > Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3272) CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky.
[ https://issues.apache.org/jira/browse/MESOS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020864#comment-15020864 ] Jian Qiu commented on MESOS-3272: - [~tillt] I have a RR for this issue, could you be the shepherd? > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky. > > > Key: MESOS-3272 > URL: https://issues.apache.org/jira/browse/MESOS-3272 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Paul Brett >Assignee: Jian Qiu > Attachments: build.log > > > Test aborts when configured with python, libevent and SSL on Ubuntu12. > [ RUN ] > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer > *** Aborted at 1439667937 (unix time) try "date -d @1439667937" if you are > using GNU date *** > PC: @ 0x7feba972a753 (unknown) > *** SIGSEGV (@0x0) received by PID 4359 (TID 0x7febabf897c0) from PID 0; > stack trace: *** > @ 0x7feba8f7dcb0 (unknown) > @ 0x7feba972a753 (unknown) > @ 0x7febaaa69328 process::dispatch<>() > @ 0x7febaaa5e9a7 cgroups::freezer::thaw() > @ 0xba64ff > mesos::internal::tests::CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Test::TestBody() > @ 0xc199a3 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0xc0f947 testing::Test::Run() > @ 0xc0f9ee testing::TestInfo::Run() > @ 0xc0faf5 testing::TestCase::Run() > @ 0xc0fda8 testing::internal::UnitTestImpl::RunAllTests() > @ 0xc10064 testing::UnitTest::Run() > @ 0x4b3273 main > @ 0x7feba8bd176d (unknown) > @ 0x4bf1f1 (unknown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970753#comment-14970753 ] Jian Qiu edited comment on MESOS-3792 at 11/18/15 1:17 PM: --- Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints { paths: "/files/browse" paths: "/slave(0)/stats.json" }" {code} The reason is that the flag value returned by /state (and also /flag) is stringified; however, if the value is a protobuf message, it should be converted into a JSON object rather than a string. I would like to give it a try; can anyone help with shepherding? was (Author: qiujian): Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints { paths: "/files/browse" paths: "/slave(0)/stats.json" }" {code} The reason is that the flag value returned by /state (and also /flag) is stringified; however, if the value is a protobuf message, it should be converted into a JSON object rather than a string. Maybe we should add functions in stout/flag.hpp like: {code} struct Flag { ... bool protobuf; lambda::function(const FlagsBase&)> jsonfy; ... }; {code} So if a flag is a protobuf message, it will be converted to a JSON object. I would like to give it a try; can anyone help with shepherding? 
> flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher >Assignee: Jian Qiu > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. > Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-3792: --- Assignee: Jian Qiu > flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher >Assignee: Jian Qiu > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. > Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970753#comment-14970753 ] Jian Qiu edited comment on MESOS-3792 at 10/26/15 9:30 AM: --- Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints { paths: "/files/browse" paths: "/slave(0)/stats.json" }" {code} The reason is that the flag value returned by /state (and also /flag) is stringified; however, if the value is a protobuf message, it should be converted into a JSON object rather than a string. Maybe we should add functions in stout/flag.hpp like: {code} struct Flag { ... bool protobuf; lambda::function(const FlagsBase&)> jsonfy; ... }; {code} So if a flag is a protobuf message, it will be converted to a JSON object. I would like to give it a try; can anyone help with shepherding? was (Author: qiujian): Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints { paths: \"\/files\/browse\" paths: \"\/slave(0)\/stats.json\"}" {code} I think the reason is that the flag value returned by /state is stringified in protobuf format rather than JSON format? > flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. 
> Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
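The `jsonfy` proposal in the comment above can be sketched as follows. Everything here is a simplified, hypothetical model: the `Flag` struct, its `jsonify` member, and the `render` function are stand-ins that use plain strings, not the stout flags API or its JSON types. The point is only that a flag carrying a converter is emitted as a JSON object, while every other flag stays a quoted string.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical, simplified flag record; NOT the stout `Flag` definition.
struct Flag {
  std::string name;
  std::string value;  // Stringified value, as loaded from the command line.

  // Optional converter, set only for flags whose value is a structured message.
  std::function<std::string(const std::string&)> jsonify;
};

// Render one `"name":value` entry for a state endpoint.
// Flags with a converter are emitted as JSON objects; others as quoted strings.
inline std::string render(const Flag& flag) {
  if (flag.jsonify) {
    return "\"" + flag.name + "\":" + flag.jsonify(flag.value);
  }
  return "\"" + flag.name + "\":\"" + flag.value + "\"";
}
```

A per-flag converter keeps the default path (plain strings) untouched, which is one reason such a hook might be preferred over changing how every flag is stringified.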
[jira] [Comment Edited] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970753#comment-14970753 ] Jian Qiu edited comment on MESOS-3792 at 10/23/15 10:02 AM: Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints { paths: \"\/files\/browse\" paths: \"\/slave(0)\/stats.json\"}" {code} I think the reason is that the flag value returned by /state is stringified in protobuf format rather than JSON format? was (Author: qiujian): Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints {\n paths: \"\/files\/browse\"\n paths: \"\/slave(0)\/stats.json\"\n}\n" {code} I think the reason is that the flag value returned by /state is stringified in protobuf format rather than JSON format? > flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. > Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. 
[jira] [Commented] (MESOS-3792) flags.acls in /state.json response is not the flag value passed to Mesos master
[ https://issues.apache.org/jira/browse/MESOS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970753#comment-14970753 ] Jian Qiu commented on MESOS-3792: - Many values (maybe all) in flags have the same issue. I tried --firewall_rules and queried /state, which returns something like the following: {code} "firewall_rules":"disabled_endpoints {\n paths: \"\/files\/browse\"\n paths: \"\/slave(0)\/stats.json\"\n}\n" {code} I think the reason is that the flag value returned by /state is stringified in protobuf format rather than JSON format? > flags.acls in /state.json response is not the flag value passed to Mesos > master > --- > > Key: MESOS-3792 > URL: https://issues.apache.org/jira/browse/MESOS-3792 > Project: Mesos > Issue Type: Bug >Reporter: James Fisher > > Steps to reproduce: Start Mesos master with the `--acls` flag set to the > following value: > {code} > { "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { > "values": ["alice"] } } ] } > {code} > Then make a request to {{http://mesosmaster:5050/state.json}} and extract the > value for key `flags.acls` from the JSON body of the response. > Expected behavior: the value is the same JSON string passed on the > command-line. > Actual behavior: the value is this string in some unknown syntax: > {code} > run_tasks { > principals { > values: "foo" > values: "bar" > } > users { > values: "alice" > } > } > {code} > I don't know what this is, but it's not an ACL expression according to the > documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3550) Create a Executor Library based on the new Executor HTTP API
[ https://issues.apache.org/jira/browse/MESOS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958384#comment-14958384 ] Jian Qiu commented on MESOS-3550: - I think the Executor HTTP API interface will be pretty much the same as the Scheduler HTTP API? > Create a Executor Library based on the new Executor HTTP API > > > Key: MESOS-3550 > URL: https://issues.apache.org/jira/browse/MESOS-3550 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: mesosphere > > Similar to the Scheduler Library {{src/scheduler/scheduler.cpp}} , we would > need a Executor Library that speaks the new Executor HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3570) Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess
[ https://issues.apache.org/jira/browse/MESOS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958381#comment-14958381 ] Jian Qiu commented on MESOS-3570: - This is because internal::request uses a one-time connection, so we may actually need the pipelining abstraction in internal::request? > Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess > > > Key: MESOS-3570 > URL: https://issues.apache.org/jira/browse/MESOS-3570 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar > Labels: mesosphere, newbie > > Currently, the scheduler library sends calls in order by chaining them and > sending them only when it has received a response for the earlier call. This > was done because there was no HTTP Pipelining abstraction in Libprocess > {{process::post}}. > However once {{MESOS-3332}} is resolved, we should be now able to use the new > abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3215) CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04
[ https://issues.apache.org/jira/browse/MESOS-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906054#comment-14906054 ] Jian Qiu edited comment on MESOS-3215 at 9/24/15 8:57 AM: -- I tested in an Ubuntu 14.04 virtual environment and got the same error. The main reason is that perf does not support "cycle" in a virtual environment. perf stat ls cycles 0 stalled-cycles-frontend #0.00% frontend cycles idle 0 stalled-cycles-backend#0.00% backend cycles idle instructions branches branch-misses However, if I enable the virtual CPU performance counters for my VM, the test passes. was (Author: qiujian): I tested in an Ubuntu 14.04 virtual environment and got the same error. The main reason is that perf does not support "cycle" in a virtual environment. perf stat ls cycles 0 stalled-cycles-frontend #0.00% frontend cycles idle 0 stalled-cycles-backend#0.00% backend cycles idle instructions branches branch-misses However, if I enable the virtual CPU performance counters, the test passes. > CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04 > > > Key: MESOS-3215 > URL: https://issues.apache.org/jira/browse/MESOS-3215 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan > Labels: mesosphere > > [ RUN ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf > ../../src/tests/containerizer/cgroups_tests.cpp:172: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy > ../../src/tests/containerizer/cgroups_tests.cpp:190: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf (9 ms) > [--] 1 test from CgroupsAnyHierarchyWithPerfEventTest (9 ms total) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3215) CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04
[ https://issues.apache.org/jira/browse/MESOS-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906054#comment-14906054 ] Jian Qiu commented on MESOS-3215: - I tested in an Ubuntu 14.04 virtual environment and got the same error. The main cause is that perf does not support the "cycles" event in a virtual environment. perf stat ls cycles 0 stalled-cycles-frontend # 0.00% frontend cycles idle 0 stalled-cycles-backend # 0.00% backend cycles idle instructions branches branch-misses However, if I enable the virtual CPU performance counters, the test passes. > CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04 > > > Key: MESOS-3215 > URL: https://issues.apache.org/jira/browse/MESOS-3215 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan > Labels: mesosphere > > [ RUN ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf > ../../src/tests/containerizer/cgroups_tests.cpp:172: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy > ../../src/tests/containerizer/cgroups_tests.cpp:190: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf (9 ms) > [--] 1 test from CgroupsAnyHierarchyWithPerfEventTest (9 ms total) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3220) Offer ability to kill tasks from the API
[ https://issues.apache.org/jira/browse/MESOS-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901834#comment-14901834 ] Jian Qiu commented on MESOS-3220: - +1 for this API, and I share [~vinodkone]'s concern: can the task be killed in a more planned way, as with maintenance? Maybe this should be an API that schedules the task kill with a timer, so that the framework can be notified and kill the task itself. If the framework does not kill the task, Mesos can enforce the kill after the timer expires. > Offer ability to kill tasks from the API > > > Key: MESOS-3220 > URL: https://issues.apache.org/jira/browse/MESOS-3220 > Project: Mesos > Issue Type: Improvement > Components: python api >Reporter: Sunil Shah >Assignee: Marco Massenzio >Priority: Blocker > Labels: mesosphere > > We are investigating adding a `dcos task kill` command to our DCOS (and > Mesos) command line interface. Currently the ability to kill tasks is only > offered via the scheduler API so it would be useful to have some ability to > kill tasks directly. > This is a blocker for the DCOS CLI! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
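The timer-based kill proposed in the comment above could be modelled roughly as follows. This is only a sketch of the idea: `ScheduledKill`, `KillState`, `scheduleKill`, `frameworkKilled` and `enforceIfExpired` are hypothetical names invented for illustration and are not part of any Mesos API.

```cpp
#include <chrono>
#include <string>

// All names below are hypothetical; nothing here exists in Mesos.
using Clock = std::chrono::steady_clock;

enum class KillState { Pending, KilledByFramework, KilledByMesos };

struct ScheduledKill {
  std::string taskId;
  Clock::time_point deadline;  // When Mesos would enforce the kill.
  KillState state;
};

// Schedule a kill: the framework is expected to act before `now + grace`.
ScheduledKill scheduleKill(
    const std::string& taskId,
    Clock::time_point now,
    std::chrono::seconds grace) {
  return ScheduledKill{taskId, now + grace, KillState::Pending};
}

// The framework killed the task itself before the timer expired.
void frameworkKilled(ScheduledKill& kill) {
  if (kill.state == KillState::Pending) {
    kill.state = KillState::KilledByFramework;
  }
}

// Returns true if the deadline has passed and the kill is still pending,
// in which case the kill is recorded as enforced by Mesos.
bool enforceIfExpired(ScheduledKill& kill, Clock::time_point now) {
  if (kill.state != KillState::Pending || now < kill.deadline) {
    return false;
  }
  kill.state = KillState::KilledByMesos;
  return true;
}
```

Passing the current time in explicitly, rather than reading the clock inside the functions, keeps the sketch deterministic and easy to exercise in a test.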
[jira] [Commented] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791784#comment-14791784 ] Jian Qiu commented on MESOS-3293: - RR: https://reviews.apache.org/r/38454/ > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu updated MESOS-3293: Comment: was deleted (was: Thanks [~haosd...@gmail.com] I think it should work.) > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744626#comment-14744626 ] Jian Qiu commented on MESOS-3293: - Thanks [~haosd...@gmail.com] I think it should work. > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744627#comment-14744627 ] Jian Qiu commented on MESOS-3293: - Thanks [~haosd...@gmail.com] I think it should work. > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743530#comment-14743530 ] Jian Qiu edited comment on MESOS-3293 at 9/14/15 2:12 PM: -- [~vinodkone], [~marco-mesos] Any suggestion? was (Author: qiujian): [~vinodkone][~marco-mesos] Any suggestion? > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743530#comment-14743530 ] Jian Qiu commented on MESOS-3293: - [~vinodkone][~marco-mesos] Any suggestion? > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3272) CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky.
[ https://issues.apache.org/jira/browse/MESOS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740072#comment-14740072 ] Jian Qiu commented on MESOS-3272: - https://reviews.apache.org/r/38287/ > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky. > > > Key: MESOS-3272 > URL: https://issues.apache.org/jira/browse/MESOS-3272 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Paul Brett >Assignee: Jian Qiu > Attachments: build.log > > > Test aborts when configured with python, libevent and SSL on Ubuntu12. > [ RUN ] > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer > *** Aborted at 1439667937 (unix time) try "date -d @1439667937" if you are > using GNU date *** > PC: @ 0x7feba972a753 (unknown) > *** SIGSEGV (@0x0) received by PID 4359 (TID 0x7febabf897c0) from PID 0; > stack trace: *** > @ 0x7feba8f7dcb0 (unknown) > @ 0x7feba972a753 (unknown) > @ 0x7febaaa69328 process::dispatch<>() > @ 0x7febaaa5e9a7 cgroups::freezer::thaw() > @ 0xba64ff > mesos::internal::tests::CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Test::TestBody() > @ 0xc199a3 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0xc0f947 testing::Test::Run() > @ 0xc0f9ee testing::TestInfo::Run() > @ 0xc0faf5 testing::TestCase::Run() > @ 0xc0fda8 testing::internal::UnitTestImpl::RunAllTests() > @ 0xc10064 testing::UnitTest::Run() > @ 0x4b3273 main > @ 0x7feba8bd176d (unknown) > @ 0x4bf1f1 (unknown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-3293: --- Assignee: Jian Qiu > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Assignee: Jian Qiu >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. > [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3293) Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest
[ https://issues.apache.org/jira/browse/MESOS-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738278#comment-14738278 ] Jian Qiu commented on MESOS-3293: - ../../src/tests/containerizer/isolator_tests.cpp:731: Failure Value of: usage.get().processes() Actual: 2 Expected: 1U Which is: 1 ../../src/tests/containerizer/isolator_tests.cpp:732: Failure Value of: usage.get().threads() Actual: 2 Expected: 1U Which is: 1 The reason is that the test case runs 'sh -c "while true; do sleep 1; done;"' in the container, which spawns one additional child process: root 18581 0.0 0.0 4448 1496 pts/0 S+ 04:06 0:00 sh -c while true; do sleep 1; done; root 18592 0.0 0.0 7200 652 pts/0 S+ 04:06 0:00 sleep 1 So my proposal is to either change the test command to something like "exec sleep 60" or change the expected value to 2. > Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest > -- > > Key: MESOS-3293 > URL: https://issues.apache.org/jira/browse/MESOS-3293 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Priority: Blocker > Labels: mesosphere, tech-debt > Attachments: 20150818-mesos-tests.log > > > h2. LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
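The comment above on the process count turns on the difference between a shell that forks a child for each command and a shell that replaces itself via `exec`. A minimal sketch of that difference follows, assuming a POSIX system with `sh` on the PATH; the helper names `twoPids`, `withoutExec` and `withExec` are made up for this illustration and are not taken from the Mesos test suite.

```cpp
#include <stdio.h>  // popen, pclose, fscanf (POSIX)

#include <stdexcept>
#include <string>
#include <utility>

// Runs `command` via popen(3) and parses the two PIDs it prints.
std::pair<long, long> twoPids(const std::string& command) {
  FILE* pipe = popen(command.c_str(), "r");
  if (pipe == nullptr) {
    throw std::runtime_error("popen failed");
  }
  long first = -1;
  long second = -1;
  const int matched = fscanf(pipe, "%ld %ld", &first, &second);
  pclose(pipe);
  if (matched != 2) {
    throw std::runtime_error("expected two PIDs in the output");
  }
  return {first, second};
}

// The outer shell prints its PID, then runs an inner shell as a child
// process; the inner shell prints a different PID.
std::pair<long, long> withoutExec() {
  return twoPids(R"(sh -c 'echo $$; sh -c "echo \$\$"; true')");
}

// The outer shell prints its PID, then replaces itself with the inner
// shell via `exec`; no new process is created, so the PID is unchanged.
std::pair<long, long> withExec() {
  return twoPids(R"(sh -c 'echo $$; exec sh -c "echo \$\$"')");
}
```

Calling `withoutExec()` should yield two different PIDs, while `withExec()` should yield the same PID twice, which is why a command such as "exec sleep 60" leaves a single process in the container.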
[jira] [Issue Comment Deleted] (MESOS-3295) Failing ROOT_ tests on CentOS 7.1 - ContainerizerTest
[ https://issues.apache.org/jira/browse/MESOS-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu updated MESOS-3295: Comment: was deleted (was: ../../src/tests/containerizer/isolator_tests.cpp:731: Failure Value of: usage.get().processes() Actual: 2 Expected: 1U Which is: 1 ../../src/tests/containerizer/isolator_tests.cpp:732: Failure Value of: usage.get().threads() Actual: 2 Expected: 1U Which is: 1 The reason is that the test case run 'sh -c "while true; do sleep 1; done;"' in the container which generates one additional child process root 18581 0.0 0.0 4448 1496 pts/0S+ 04:06 0:00 sh -c while true; do sleep 1; done; root 18592 0.0 0.0 7200 652 pts/0S+ 04:06 0:00 sleep 1 So my proposal is either change the test command to something like "exec sleep 60" or change the expect value to 2) > Failing ROOT_ tests on CentOS 7.1 - ContainerizerTest > - > > Key: MESOS-3295 > URL: https://issues.apache.org/jira/browse/MESOS-3295 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Priority: Blocker > Labels: mesosphere, tech-debt > > h2. ContainerizerTest.ROOT_CGROUPS_BalloonFramework > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3295) Failing ROOT_ tests on CentOS 7.1 - ContainerizerTest
[ https://issues.apache.org/jira/browse/MESOS-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738258#comment-14738258 ] Jian Qiu commented on MESOS-3295: - ../../src/tests/containerizer/isolator_tests.cpp:731: Failure Value of: usage.get().processes() Actual: 2 Expected: 1U Which is: 1 ../../src/tests/containerizer/isolator_tests.cpp:732: Failure Value of: usage.get().threads() Actual: 2 Expected: 1U Which is: 1 The reason is that the test case runs 'sh -c "while true; do sleep 1; done;"' in the container, which spawns one additional child process: root 18581 0.0 0.0 4448 1496 pts/0 S+ 04:06 0:00 sh -c while true; do sleep 1; done; root 18592 0.0 0.0 7200 652 pts/0 S+ 04:06 0:00 sleep 1 So my proposal is to either change the test command to something like "exec sleep 60" or change the expected value to 2. > Failing ROOT_ tests on CentOS 7.1 - ContainerizerTest > - > > Key: MESOS-3295 > URL: https://issues.apache.org/jira/browse/MESOS-3295 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, test >Affects Versions: 0.23.0, 0.24.0 > Environment: CentOS Linux release 7.1 > Linux 3.10.0 >Reporter: Marco Massenzio >Priority: Blocker > Labels: mesosphere, tech-debt > > h2. ContainerizerTest.ROOT_CGROUPS_BalloonFramework > This is one of several ROOT failing tests: we want to track them > *individually* and for each of them decide whether to: > * fix; > * remove; OR > * redesign. > (full verbose logs attached) > h2. Steps to Reproduce > Completely cleaned the build, removed directory, clean pull from {{master}} > (SHA: {{fb93d93}}) - same results, 9 failed tests: > {noformat} > [==] 751 tests from 114 test cases ran. (231218 ms total) > [ PASSED ] 742 tests. 
> [ FAILED ] 9 tests, listed below: > [ FAILED ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess > [ FAILED ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [ FAILED ] > LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem > [ FAILED ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs > 9 FAILED TESTS > YOU HAVE 10 DISABLED TESTS > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3272) CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky.
[ https://issues.apache.org/jira/browse/MESOS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu reassigned MESOS-3272: --- Assignee: Jian Qiu > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky. > > > Key: MESOS-3272 > URL: https://issues.apache.org/jira/browse/MESOS-3272 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Paul Brett >Assignee: Jian Qiu > Attachments: build.log > > > Test aborts when configured with python, libevent and SSL on Ubuntu12. > [ RUN ] > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer > *** Aborted at 1439667937 (unix time) try "date -d @1439667937" if you are > using GNU date *** > PC: @ 0x7feba972a753 (unknown) > *** SIGSEGV (@0x0) received by PID 4359 (TID 0x7febabf897c0) from PID 0; > stack trace: *** > @ 0x7feba8f7dcb0 (unknown) > @ 0x7feba972a753 (unknown) > @ 0x7febaaa69328 process::dispatch<>() > @ 0x7febaaa5e9a7 cgroups::freezer::thaw() > @ 0xba64ff > mesos::internal::tests::CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Test::TestBody() > @ 0xc199a3 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0xc0f947 testing::Test::Run() > @ 0xc0f9ee testing::TestInfo::Run() > @ 0xc0faf5 testing::TestCase::Run() > @ 0xc0fda8 testing::internal::UnitTestImpl::RunAllTests() > @ 0xc10064 testing::UnitTest::Run() > @ 0x4b3273 main > @ 0x7feba8bd176d (unknown) > @ 0x4bf1f1 (unknown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3272) CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky.
[ https://issues.apache.org/jira/browse/MESOS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727632#comment-14727632 ] Jian Qiu commented on MESOS-3272: - I plan to fix this bug. [~jieyu], could you be the shepherd on this? A possible cause is that the freezer object in the dispatch method is NULL because the object has already been deleted in its initialize method. Hence we need to check whether freezer is NULL before calling dispatch. Does that make sense? > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky. > > > Key: MESOS-3272 > URL: https://issues.apache.org/jira/browse/MESOS-3272 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Paul Brett > Attachments: build.log > > > Test aborts when configured with python, libevent and SSL on Ubuntu12. > [ RUN ] > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer > *** Aborted at 1439667937 (unix time) try "date -d @1439667937" if you are > using GNU date *** > PC: @ 0x7feba972a753 (unknown) > *** SIGSEGV (@0x0) received by PID 4359 (TID 0x7febabf897c0) from PID 0; > stack trace: *** > @ 0x7feba8f7dcb0 (unknown) > @ 0x7feba972a753 (unknown) > @ 0x7febaaa69328 process::dispatch<>() > @ 0x7febaaa5e9a7 cgroups::freezer::thaw() > @ 0xba64ff > mesos::internal::tests::CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Test::TestBody() > @ 0xc199a3 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0xc0f947 testing::Test::Run() > @ 0xc0f9ee testing::TestInfo::Run() > @ 0xc0faf5 testing::TestCase::Run() > @ 0xc0fda8 testing::internal::UnitTestImpl::RunAllTests() > @ 0xc10064 testing::UnitTest::Run() > @ 0x4b3273 main > @ 0x7feba8bd176d (unknown) > @ 0x4bf1f1 (unknown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3272) CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky.
[ https://issues.apache.org/jira/browse/MESOS-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727019#comment-14727019 ] Jian Qiu commented on MESOS-3272: - I get the same error on Ubuntu 14.04: [ RUN ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer Using temporary directory '/tmp/CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Ryhca1' I0902 16:44:29.917786 18259 cgroups.cpp:2433] Freezing cgroup /sys/fs/cgroup/cpu/mesos_test *** Aborted at 1441183469 (unix time) try "date -d @1441183469" if you are using GNU date *** PC: @ 0x2ac37642b187 (unknown) *** SIGSEGV (@0x2ac37b876c32) received by PID 18259 (TID 0x2ac370cb8900) from PID 2072472626; stack trace: *** @ 0x2ac376bfe340 (unknown) @ 0x2ac37642b187 (unknown) @ 0x2ac374428289 process::Process<>::self() @ 0x2ac37442c22a process::dispatch<>() @ 0x2ac3744202fb cgroups::freezer::freeze() @ 0x12aef05 mesos::internal::tests::CgroupsAnyHierarchyWithCpuMemoryTest_ROOT_CGROUPS_FreezeNonFreezer_Test::TestBody() @ 0x135faf8 testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x135a96e testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x133c083 testing::Test::Run() @ 0x133c806 testing::TestInfo::Run() @ 0x133ce4c testing::TestCase::Run() @ 0x1343594 testing::internal::UnitTestImpl::RunAllTests() @ 0x136071d testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x135b4e4 testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x1342330 testing::UnitTest::Run() @ 0xcc5d41 RUN_ALL_TESTS() @ 0xcc59e9 main @ 0x2ac376e2dec5 (unknown) @ 0x8d5359 (unknown) make[3]: *** [check-local] Segmentation fault (core dumped) > CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer is flaky. 
[jira] [Commented] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths
[ https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724618#comment-14724618 ]

Jian Qiu commented on MESOS-3329:
---------------------------------

Appending the related review request: https://reviews.apache.org/r/37955/

> Unused hashmap::existsValue functions have incomplete code paths
> ----------------------------------------------------------------
>
> Key: MESOS-3329
> URL: https://issues.apache.org/jira/browse/MESOS-3329
> Project: Mesos
> Issue Type: Bug
> Components: stout
> Reporter: Jan Schlicht
> Assignee: Jian Qiu
> Priority: Trivial
> Labels: easyfix, mesosphere
>
>
> `stout/hashmap.hpp` defines functions `hashmap::existsValue`. These return
> true if a certain value exists in the hashmap instance. The control flow of
> these functions doesn't cover the case that the value is not found, which
> should result in false. Right now the result in this case is undefined.
> As the `existsValue` functions are never called this doesn't result in a
> compile error atm.
> Possible solutions:
> 1) Add `return false`
> 2) Remove function

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
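The control-flow gap described in the MESOS-3329 issue text above, and the first proposed solution (adding `return false`), can be sketched as follows. This is a minimal, self-contained illustration built on `std::unordered_map`; the free-function form and its signature are assumptions made for the example, not the actual code in `stout/hashmap.hpp`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the stout helper: reports whether any entry
// in `map` holds `value`.
template <typename K, typename V>
bool existsValue(const std::unordered_map<K, V>& map, const V& value)
{
  for (const auto& entry : map) {
    if (entry.second == value) {
      return true;
    }
  }

  // Without this statement, control would flow off the end of a non-void
  // function whenever the value is absent, which is undefined behavior.
  return false;
}
```

The commit referenced elsewhere in this thread takes the second option instead and removes the function, since it is never called.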
[jira] [Issue Comment Deleted] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths
[ https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu updated MESOS-3329:
----------------------------
Comment: was deleted

(was: commit 7ef97970ae4d5ea909abbf234a565ae7db62b192
Author: Jian Qiu
Date: Mon Aug 31 21:28:34 2015 +0800

Remove hashmap::existsValue since it is never called

Review: https://reviews.apache.org/r/37955
)
[jira] [Commented] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths
[ https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724614#comment-14724614 ]

Jian Qiu commented on MESOS-3329:
---------------------------------

[~haosd...@gmail.com] Thanks for reminding me :-)
[jira] [Commented] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths
[ https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723566#comment-14723566 ]

Jian Qiu commented on MESOS-3329:
---------------------------------

commit 7ef97970ae4d5ea909abbf234a565ae7db62b192
Author: Jian Qiu
Date: Mon Aug 31 21:28:34 2015 +0800

Remove hashmap::existsValue since it is never called

Review: https://reviews.apache.org/r/37955
[jira] [Assigned] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths
[ https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Qiu reassigned MESOS-3329:
-------------------------------

Assignee: Jian Qiu