[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky

2016-03-02 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176144#comment-15176144
 ] 

Alexander Rojas commented on MESOS-4047:


My previous 
[comment|https://issues.apache.org/jira/browse/MESOS-4047?focusedCommentId=15167418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15167418]
 still describes the situation. I believe the problem resides in the assumptions 
the test makes, which may be incorrect.

With what I know so far, I would recommend rethinking the whole test, but more 
investigation is advised before proceeding.

> MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
> ---
>
> Key: MESOS-4047
> URL: https://issues.apache.org/jira/browse/MESOS-4047
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.26.0
> Environment: Ubuntu 14, gcc 4.8.4
>Reporter: Joseph Wu
>Assignee: Alexander Rojas
>  Labels: flaky, flaky-test
> Fix For: 0.28.0
>
>
> {code:title=Output from passed test}
> [--] 1 test from MemoryPressureMesosTest
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB) copied, 0.000430889 s, 2.4 GB/s
> [ RUN  ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> I1202 11:09:14.319327  5062 exec.cpp:134] Version: 0.27.0
> I1202 11:09:14.17  5079 exec.cpp:208] Executor registered on slave 
> bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> Registered executor on ubuntu
> Starting task 4e62294c-cfcf-4a13-b699-c6a4b7ac5162
> sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done'
> Forked command at 5085
> I1202 11:09:14.391739  5077 exec.cpp:254] Received reconnect request from 
> slave bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> I1202 11:09:14.398598  5082 exec.cpp:231] Executor re-registered on slave 
> bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> Re-registered executor on ubuntu
> Shutting down
> Sending SIGTERM to process tree at pid 5085
> Killing the following process trees:
> [ 
> -+- 5085 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done 
>  \--- 5086 dd count=512 bs=1M if=/dev/zero of=./temp 
> ]
> [   OK ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (1096 ms)
> {code}
> {code:title=Output from failed test}
> [--] 1 test from MemoryPressureMesosTest
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB) copied, 0.000404489 s, 2.6 GB/s
> [ RUN  ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> I1202 11:09:15.509950  5109 exec.cpp:134] Version: 0.27.0
> I1202 11:09:15.568183  5123 exec.cpp:208] Executor registered on slave 
> 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0
> Registered executor on ubuntu
> Starting task 14b6bab9-9f60-4130-bdc4-44efba262bc6
> Forked command at 5132
> sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done'
> I1202 11:09:15.665498  5129 exec.cpp:254] Received reconnect request from 
> slave 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0
> I1202 11:09:15.670995  5123 exec.cpp:381] Executor asked to shutdown
> Shutting down
> Sending SIGTERM to process tree at pid 5132
> ../../src/tests/containerizer/memory_pressure_tests.cpp:283: Failure
> (usage).failure(): Unknown container: ebe90e15-72fa-4519-837b-62f43052c913
> *** Aborted at 1449083355 (unix time) try "date -d @1449083355" if you are 
> using GNU date ***
> {code}
> Notice that in the failed test, the executor is asked to shutdown when it 
> tries to reconnect to the agent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky

2016-02-29 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171670#comment-15171670
 ] 

Bernd Mathiske commented on MESOS-4047:
---

https://reviews.apache.org/r/43799/



[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky

2016-02-25 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167418#comment-15167418
 ] 

Alexander Rojas commented on MESOS-4047:


After fixing the issues raised in previous comments, I managed to reproduce 
the issue shown in the logs posted here. Apparently there is yet another 
race, where the executor exits before the line {{Future<ResourceStatistics> 
usage = containerizer2.get()->usage(containerId);}} runs. I managed to collect 
verbose logs for a good and a bad run; I include only the important sections. 
Pay attention to lines that look like {{I0224 13:53:53.169703 25060 
slave.cpp:3528] executor(1)@127.0.0.1:38732 exited}}.

The good run:

{noformat}
...
I0224 13:53:52.219846 25063 slave.cpp:1891] Asked to kill task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 31659
Sent SIGTERM to the following process trees:
[
-+- 31659 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done
 \--- 31661 dd count=512 bs=1M if=/dev/zero of=./temp
]
Command terminated with signal Terminated (pid: 31659)
I0224 13:53:52.369876 25062 slave.cpp:3002] Handling status update TASK_KILLED 
(UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- from executor(1)@127.0.0.1:38732
I0224 13:53:52.386056 25059 mem.cpp:353] Updated 'memory.soft_limit_in_bytes' 
to 32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.113471 25059 mem.cpp:388] Updated 'memory.limit_in_bytes' to 
32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.117938 25059 status_update_manager.cpp:320] Received status 
update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
I0224 13:53:53.118013 25059 status_update_manager.cpp:824] Checkpointing UPDATE 
for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for 
task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
I0224 13:53:53.146458 25058 slave.cpp:3400] Forwarding the update TASK_KILLED 
(UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- to master@127.0.0.1:57058
I0224 13:53:53.146702 25058 slave.cpp:3310] Sending acknowledgement for status 
update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- to executor(1)@127.0.0.1:38732
I0224 13:53:53.147956 25062 master.cpp:4794] Status update TASK_KILLED (UUID: 
4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- from slave 
92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058 
(localhost)
I0224 13:53:53.147989 25062 master.cpp:4842] Forwarding status update 
TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
I0224 13:53:53.148143 25062 master.cpp:6450] Updating the state of task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- (latest state: TASK_KILLED, status 
update state: TASK_KILLED)
I0224 13:53:53.149320 25061 master.cpp:3952] Processing ACKNOWLEDGE call 
4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8 for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c- (default) at 
scheduler-79245611-a7d2-4220-bae1-4702a34ecf14@127.0.0.1:57058 on slave 
92632338-e777-41c7-a9a3-39dc62fdea4c-S0
I0224 13:53:53.149684 25061 master.cpp:6516] Removing task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 with resources cpus(*):1; mem(*):256; 
disk(*):1024 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- on slave 
92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058 
(localhost)
I0224 13:53:53.150146 25061 status_update_manager.cpp:392] Received status 
update acknowledgement (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
I0224 13:53:53.150410 25061 status_update_manager.cpp:824] Checkpointing ACK 
for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for 
task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 
92632338-e777-41c7-a9a3-39dc62fdea4c-
I0224 13:53:53.153118 25056 sched.cpp:1903] Asked to stop the driver
I0224 13:53:53.153228 25064 sched.cpp:1143] Stopping framework 
'92632338-e777-41c7-a9a3-39dc62fdea4c-'
I0224 13:53:53.154057 25061 master.cpp:5926] Processing TEARDOWN call for 
framework 
{noformat}

[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky

2016-02-22 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157704#comment-15157704
 ] 

Alexander Rojas commented on MESOS-4047:


Reproduced again with the following message (CentOS 6.7):

{noformat}
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from MemoryPressureMesosTest
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.000394345 s, 2.7 GB/s
[ RUN  ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
I0222 09:32:20.622694 20868 leveldb.cpp:174] Opened db in 5.153509ms
I0222 09:32:20.624688 20868 leveldb.cpp:181] Compacted db in 1.914323ms
I0222 09:32:20.624778 20868 leveldb.cpp:196] Created db iterator in 24549ns
I0222 09:32:20.624795 20868 leveldb.cpp:202] Seeked to beginning of db in 2610ns
I0222 09:32:20.624804 20868 leveldb.cpp:271] Iterated through 0 keys in the db 
in 323ns
I0222 09:32:20.624874 20868 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0222 09:32:20.625977 20888 recover.cpp:447] Starting replica recovery
I0222 09:32:20.626901 20888 recover.cpp:473] Replica is in EMPTY status
I0222 09:32:20.634701 20889 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (11193)@127.0.0.1:54769
I0222 09:32:20.634953 20888 master.cpp:376] Master 
17b7da64-0c4d-4e46-ae1f-2b356dc5f266 (localhost) started on 127.0.0.1:54769
I0222 09:32:20.634986 20888 master.cpp:378] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/0rXncF/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/0rXncF/master" 
--zk_session_timeout="10secs"
W0222 09:32:20.635417 20888 master.cpp:381]
**
Master bound to loopback interface! Cannot communicate with remote schedulers 
or slaves. You might want to set '--ip' flag to a routable IP address.
**
I0222 09:32:20.635587 20888 master.cpp:423] Master only allowing authenticated 
frameworks to register
I0222 09:32:20.635601 20888 master.cpp:428] Master only allowing authenticated 
slaves to register
I0222 09:32:20.635622 20888 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/0rXncF/credentials'
I0222 09:32:20.636018 20888 master.cpp:468] Using default 'crammd5' 
authenticator
I0222 09:32:20.636190 20888 master.cpp:537] Using default 'basic' HTTP 
authenticator
I0222 09:32:20.636174 20887 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0222 09:32:20.636425 20888 master.cpp:571] Authorization enabled
I0222 09:32:20.637810 20885 recover.cpp:564] Updating replica status to STARTING
I0222 09:32:20.640805 20887 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 2.741248ms
I0222 09:32:20.640964 20887 replica.cpp:320] Persisted replica status to 
STARTING
I0222 09:32:20.641525 20885 recover.cpp:473] Replica is in STARTING status
I0222 09:32:20.642133 20888 master.cpp:1712] The newly elected leader is 
master@127.0.0.1:54769 with id 17b7da64-0c4d-4e46-ae1f-2b356dc5f266
I0222 09:32:20.642236 20888 master.cpp:1725] Elected as the leading master!
I0222 09:32:20.642253 20888 master.cpp:1470] Recovering from registrar
I0222 09:32:20.642496 20885 registrar.cpp:307] Recovering registrar
I0222 09:32:20.643162 20889 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (11195)@127.0.0.1:54769
I0222 09:32:20.643590 20885 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0222 09:32:20.644120 20887 recover.cpp:564] Updating replica status to VOTING
I0222 09:32:20.646817 20889 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 1.190281ms
I0222 09:32:20.646870 20889 replica.cpp:320] Persisted replica status to VOTING
I0222 09:32:20.647094 20885 recover.cpp:578] Successfully joined the Paxos group
I0222 09:32:20.647337 20885 recover.cpp:462] Recover process terminated
I0222 09:32:20.647781 20887 log.cpp:659] Attempting to start the writer
I0222 09:32:20.648854 20890 replica.cpp:493] Replica received 
{noformat}

[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky

2015-12-02 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036465#comment-15036465
 ] 

Joseph Wu commented on MESOS-4047:
--

Note: {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} has similar 
logic for restarting the agent, re-registering an executor, and [calling 
{{MesosContainerizer::usage}}|https://github.com/apache/mesos/blob/master/src/tests/slave_recovery_tests.cpp#L3267].
But that test is stable.

The flaky test waits on:
{code}
  Future<Nothing> _recover = FUTURE_DISPATCH(_, &Slave::_recover);

  Future<SlaveReregisteredMessage> slaveReregisteredMessage =
    FUTURE_PROTOBUF(SlaveReregisteredMessage(), _, _);
{code}

Whereas the stable test waits on:
{code}
  // Set up so we can wait until the new slave updates the container's
  // resources (this occurs after the executor has re-registered).
  Future<Nothing> update =
    FUTURE_DISPATCH(_, &MesosContainerizerProcess::update);
{code}
