-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55901/#review162972
-----------------------------------------------------------
It looks like there's a deadlock in the cleanup path of the Agent/Containerizer that manifests only under OS X. Here's the output of a failed test run on OS X:

```
[==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from HealthCheckTest [ RUN ] HealthCheckTest.DefaultExecutorCmdHealthCheck I0125 14:43:57.005580 3095696320 cluster.cpp:160] Creating default 'local' authorizer I0125 14:43:57.036818 31969280 master.cpp:383] Master 41942215-2f76-4ffa-9bfc-383fc4bc2b04 (172.18.9.98) started on 172.18.9.98:49526 I0125 14:43:57.036867 31969280 master.cpp:385] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/Users/gaston/mesos/master/share/mesos/webui" --work_dir="/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/master" --zk_session_timeout="10secs" I0125 
14:43:57.038107 31969280 master.cpp:435] Master only allowing authenticated frameworks to register I0125 14:43:57.038125 31969280 master.cpp:449] Master only allowing authenticated agents to register I0125 14:43:57.038132 31969280 master.cpp:462] Master only allowing authenticated HTTP frameworks to register I0125 14:43:57.038138 31969280 credentials.hpp:37] Loading credentials for authentication from '/private/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/Fi6dxT/credentials' I0125 14:43:57.038478 31969280 master.cpp:507] Using default 'crammd5' authenticator I0125 14:43:57.038516 31969280 authenticator.cpp:519] Initializing server SASL I0125 14:43:57.056694 31969280 auxprop.cpp:73] Initialized in-memory auxiliary property plugin I0125 14:43:57.056784 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I0125 14:43:57.056943 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readwrite' I0125 14:43:57.057091 31969280 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I0125 14:43:57.057209 31969280 master.cpp:587] Authorization enabled I0125 14:43:57.057353 29286400 hierarchical.cpp:151] Initialized hierarchical allocator process I0125 14:43:57.057349 31432704 whitelist_watcher.cpp:77] No whitelist given I0125 14:43:57.060212 33042432 master.cpp:2121] Elected as the leading master! 
I0125 14:43:57.060230 33042432 master.cpp:1643] Recovering from registrar I0125 14:43:57.060317 31432704 registrar.cpp:329] Recovering registrar I0125 14:43:57.064963 31432704 registrar.cpp:362] Successfully fetched the registry (0B) in 4.62208ms I0125 14:43:57.065105 31432704 registrar.cpp:461] Applied 1 operations in 36us; attempting to update the registry I0125 14:43:57.069882 31969280 registrar.cpp:506] Successfully updated the registry in 4.75008ms I0125 14:43:57.069962 31969280 registrar.cpp:392] Successfully recovered registrar I0125 14:43:57.070274 30896128 master.cpp:1759] Recovered 0 agents from the registry (131B); allowing 10mins for agents to re-register I0125 14:43:57.070297 29286400 hierarchical.cpp:178] Skipping recovery of hierarchical allocator: nothing to recover I0125 14:43:57.071858 3095696320 containerizer.cpp:220] Using isolation: posix/cpu,posix/mem,filesystem/posix I0125 14:43:57.074162 3095696320 cluster.cpp:446] Creating default 'local' authorizer I0125 14:43:57.074981 33042432 slave.cpp:209] Mesos agent started on (1)@172.18.9.98:49526 I0125 14:43:57.075031 33042432 slave.cpp:210] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/mesos/store/docker" 
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_credentials="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/http_credentials" --http_heartbeat_interval="30secs" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="posix" --launcher_dir="/Users/gaston/mesosphere/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --runtime_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg" I0125 14:43:57.075314 33042432 credentials.hpp:86] Loading credential for authentication from '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/credential' I0125 14:43:57.075451 33042432 slave.cpp:352] Agent using credential for: test-principal I0125 14:43:57.075469 33042432 credentials.hpp:37] Loading credentials for authentication from 
'/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/http_credentials' I0125 14:43:57.075603 33042432 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readonly' I0125 14:43:57.076303 33042432 slave.cpp:539] Agent resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0125 14:43:57.076386 33042432 slave.cpp:547] Agent attributes: [ ] I0125 14:43:57.076395 33042432 slave.cpp:552] Agent hostname: 172.18.9.98 I0125 14:43:57.076504 32505856 status_update_manager.cpp:177] Pausing sending status updates I0125 14:43:57.076628 3095696320 sched.cpp:232] Version: 1.2.0 I0125 14:43:57.076966 29286400 sched.cpp:336] New master detected at master@172.18.9.98:49526 I0125 14:43:57.077011 29286400 sched.cpp:407] Authenticating with master master@172.18.9.98:49526 I0125 14:43:57.077033 29286400 sched.cpp:414] Using default CRAM-MD5 authenticatee I0125 14:43:57.077142 29822976 authenticatee.cpp:97] Initializing client SASL I0125 14:43:57.077421 29286400 state.cpp:60] Recovering state from '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/meta' I0125 14:43:57.077636 33042432 status_update_manager.cpp:203] Recovering status update manager I0125 14:43:57.077785 31969280 containerizer.cpp:599] Recovering containerizer I0125 14:43:57.078785 31432704 provisioner.cpp:251] Provisioner recovery complete I0125 14:43:57.078979 31432704 slave.cpp:5422] Finished recovery I0125 14:43:57.079376 31432704 slave.cpp:5596] Querying resource estimator for oversubscribable resources I0125 14:43:57.079521 33042432 status_update_manager.cpp:177] Pausing sending status updates I0125 14:43:57.079525 31432704 slave.cpp:929] New master detected at master@172.18.9.98:49526 I0125 14:43:57.079567 31432704 slave.cpp:964] Detecting new master I0125 14:43:57.079654 31432704 slave.cpp:5610] Received oversubscribable resources {} from the resource estimator I0125 
14:43:57.079880 29822976 authenticatee.cpp:121] Creating new client SASL connection I0125 14:43:57.080003 29286400 master.cpp:7001] Authenticating scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 I0125 14:43:57.080070 30359552 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(1)@172.18.9.98:49526 I0125 14:43:57.080181 31432704 authenticator.cpp:98] Creating new server SASL connection I0125 14:43:57.080265 31969280 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I0125 14:43:57.080309 31969280 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I0125 14:43:57.080360 33042432 authenticator.cpp:204] Received SASL authentication start I0125 14:43:57.080399 33042432 authenticator.cpp:326] Authentication requires more steps I0125 14:43:57.080458 33042432 authenticatee.cpp:259] Received SASL authentication step I0125 14:43:57.080544 31432704 authenticator.cpp:232] Received SASL authentication step I0125 14:43:57.080564 31432704 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0125 14:43:57.080580 31432704 auxprop.cpp:181] Looking up auxiliary property '*userPassword' I0125 14:43:57.080605 31432704 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0125 14:43:57.080621 31432704 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0125 14:43:57.080633 31432704 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0125 14:43:57.080637 31432704 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since 
SASL_AUXPROP_AUTHZID == true I0125 14:43:57.080646 31432704 authenticator.cpp:318] Authentication success I0125 14:43:57.080687 29286400 authenticatee.cpp:299] Authentication success I0125 14:43:57.080730 30359552 master.cpp:7031] Successfully authenticated principal 'test-principal' at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 I0125 14:43:57.080780 32505856 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(1)@172.18.9.98:49526 I0125 14:43:57.080895 31969280 sched.cpp:513] Successfully authenticated with master master@172.18.9.98:49526 I0125 14:43:57.080906 31969280 sched.cpp:836] Sending SUBSCRIBE call to master@172.18.9.98:49526 I0125 14:43:57.080955 31969280 sched.cpp:869] Will retry registration in 788.765853ms if necessary I0125 14:43:57.081032 30359552 master.cpp:2734] Received SUBSCRIBE call for framework 'default' at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 I0125 14:43:57.081043 30359552 master.cpp:2157] Authorizing framework principal 'test-principal' to receive offers for role '*' I0125 14:43:57.081260 29286400 master.cpp:2810] Subscribing framework default with checkpointing disabled and capabilities [ ] I0125 14:43:57.088062 31969280 slave.cpp:991] Authenticating with master master@172.18.9.98:49526 I0125 14:43:57.096890 31969280 slave.cpp:1002] Using default CRAM-MD5 authenticatee I0125 14:43:57.097005 29822976 authenticatee.cpp:121] Creating new client SASL connection I0125 14:43:57.097234 29286400 master.cpp:7001] Authenticating slave(1)@172.18.9.98:49526 I0125 14:43:57.097245 33042432 hierarchical.cpp:271] Added framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.097241 31969280 sched.cpp:759] Framework registered with 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.097296 30896128 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(2)@172.18.9.98:49526 I0125 14:43:57.097301 33042432 hierarchical.cpp:1677] No allocations 
performed I0125 14:43:57.097332 33042432 hierarchical.cpp:1772] No inverse offers to send out! I0125 14:43:57.097352 33042432 hierarchical.cpp:1279] Performed allocation for 0 agents in 94us I0125 14:43:57.097369 31969280 sched.cpp:773] Scheduler::registered took 108us I0125 14:43:57.097446 29286400 authenticator.cpp:98] Creating new server SASL connection I0125 14:43:57.097564 31432704 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I0125 14:43:57.097605 31432704 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I0125 14:43:57.097694 31969280 authenticator.cpp:204] Received SASL authentication start I0125 14:43:57.097761 31969280 authenticator.cpp:326] Authentication requires more steps I0125 14:43:57.097836 30359552 authenticatee.cpp:259] Received SASL authentication step I0125 14:43:57.097904 29822976 authenticator.cpp:232] Received SASL authentication step I0125 14:43:57.097934 29822976 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0125 14:43:57.097944 29822976 auxprop.cpp:181] Looking up auxiliary property '*userPassword' I0125 14:43:57.097965 29822976 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0125 14:43:57.097980 29822976 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'Gastons-MacBook-Pro-2.local' server FQDN: 'Gastons-MacBook-Pro-2.local' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0125 14:43:57.097990 29822976 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0125 14:43:57.097997 29822976 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I0125 14:43:57.098006 29822976 authenticator.cpp:318] 
Authentication success I0125 14:43:57.098071 30359552 authenticatee.cpp:299] Authentication success I0125 14:43:57.098121 29286400 master.cpp:7031] Successfully authenticated principal 'test-principal' at slave(1)@172.18.9.98:49526 I0125 14:43:57.098161 31969280 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(2)@172.18.9.98:49526 I0125 14:43:57.098255 31432704 slave.cpp:1086] Successfully authenticated with master master@172.18.9.98:49526 I0125 14:43:57.098340 31432704 slave.cpp:1508] Will retry registration in 15.661984ms if necessary I0125 14:43:57.098412 29822976 master.cpp:5299] Registering agent at slave(1)@172.18.9.98:49526 (172.18.9.98) with id 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 I0125 14:43:57.098567 33042432 registrar.cpp:461] Applied 1 operations in 35us; attempting to update the registry I0125 14:43:57.103466 33042432 registrar.cpp:506] Successfully updated the registry in 4.877056ms I0125 14:43:57.103818 33042432 slave.cpp:4286] Received ping from slave-observer(1)@172.18.9.98:49526 I0125 14:43:57.103868 30896128 master.cpp:5370] Registered agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0125 14:43:57.103914 33042432 slave.cpp:1132] Registered with master master@172.18.9.98:49526; given agent ID 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 I0125 14:43:57.103934 33042432 fetcher.cpp:90] Clearing fetcher cache I0125 14:43:57.104035 31969280 hierarchical.cpp:478] Added agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 (172.18.9.98) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: {}) I0125 14:43:57.104058 30359552 status_update_manager.cpp:184] Resuming sending status updates I0125 14:43:57.104224 33042432 slave.cpp:1160] Checkpointing SlaveInfo to 
'/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/meta/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/slave.info' I0125 14:43:57.104449 31969280 hierarchical.cpp:1772] No inverse offers to send out! I0125 14:43:57.104480 31969280 hierarchical.cpp:1302] Performed allocation for agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 in 439us I0125 14:43:57.104595 33042432 slave.cpp:1198] Forwarding total oversubscribed resources {} I0125 14:43:57.104617 31432704 master.cpp:6830] Sending 1 offers to framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 I0125 14:43:57.104727 31432704 master.cpp:5863] Received update of agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) with total oversubscribed resources {} I0125 14:43:57.104867 29286400 hierarchical.cpp:548] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 (172.18.9.98) updated with oversubscribed resources {} (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) I0125 14:43:57.104959 29286400 hierarchical.cpp:1677] No allocations performed I0125 14:43:57.104972 29286400 hierarchical.cpp:1772] No inverse offers to send out! 
I0125 14:43:57.104961 32505856 sched.cpp:933] Scheduler::resourceOffers took 173us I0125 14:43:57.104995 29286400 hierarchical.cpp:1302] Performed allocation for agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 in 74us I0125 14:43:57.106262 29286400 master.cpp:3728] Processing ACCEPT call for offers: [ 41942215-2f76-4ffa-9bfc-383fc4bc2b04-O0 ] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 I0125 14:43:57.106317 29286400 master.cpp:3316] Authorizing framework principal 'test-principal' to launch task 1 I0125 14:43:57.107606 32505856 master.cpp:8882] Adding task 1 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) I0125 14:43:57.107693 32505856 master.cpp:4583] Launching task group { 1 } of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) I0125 14:43:57.107838 30359552 slave.cpp:1576] Got assigned task group containing tasks [ 1 ] for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.108158 30359552 slave.cpp:1736] Launching task group containing tasks [ 1 ] for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.112789 30359552 paths.cpp:547] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746' to user 'gaston' I0125 14:43:57.113065 30359552 slave.cpp:6331] Launching executor 
'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 with resources cpus(*):0.1; mem(*):32; disk(*):32 in work directory '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746' I0125 14:43:57.113381 30896128 containerizer.cpp:992] Starting container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 for executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.113417 30359552 slave.cpp:2058] Queued task group containing tasks [ 1 ] for executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.113462 30359552 slave.cpp:882] Successfully attached file '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746' I0125 14:43:57.115821 31969280 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" 
--launch_info="{"command":{"arguments":["mesos-default-executor","--launcher_dir=\/Users\/gaston\/mesosphere\/mesos\/build\/src"],"shell":false,"value":"\/Users\/gaston\/mesosphere\/mesos\/build\/src\/mesos-default-executor"},"environment":{"variables":[{"name":"LIBPROCESS_PORT","value":"0"},{"name":"MESOS_AGENT_ENDPOINT","value":"172.18.9.98:49526"},{"name":"MESOS_CHECKPOINT","value":"0"},{"name":"MESOS_DIRECTORY","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"},{"name":"MESOS_EXECUTOR_ID","value":"default"},{"name":"MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD","value":"5secs"},{"name":"MESOS_FRAMEWORK_ID","value":"41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000"},{"name":"MESOS_HTTP_COMMAND_EXECUTOR","value":"0"},{"name":"MESOS_SLAVE_ID","value":"41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0"},{"name":"MESOS_SLAVE_PID","value":"slave(1)@172.18.9.98:49526"},{"name":"MESOS_SANDBOX","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"}]},"err":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/stderr","type":"PATH"},"out":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4b 
c2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/stdout","type":"PATH"},"user":"gaston","working_directory":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"}" --pipe_read="6" --pipe_write="8" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746"' I0125 14:43:57.117321 31969280 launcher.cpp:135] Forked child with pid '85059' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746' I0125 14:43:57.119642 30896128 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746 I0125 14:43:57.222311 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor' I0125 14:43:57.223127 29822976 http.cpp:303] HTTP POST for /slave(1)/api/v1/executor from 172.18.9.98:49528 I0125 14:43:57.223256 29822976 slave.cpp:3109] Received Subscribe request for HTTP executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.224218 30896128 slave.cpp:2300] Sending queued task group task group containing tasks [ 1 ] to executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP) I0125 14:43:57.237958 30896128 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1' I0125 14:43:57.238396 30896128 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 
172.18.9.98:49530 I0125 14:43:57.238607 30896128 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER I0125 14:43:57.239876 29822976 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1 I0125 14:43:57.240108 29822976 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1' to user 'gaston' I0125 14:43:57.241133 32505856 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"sleep 120"},"environment":{"variables":[{"name":"MESOS_SANDBOX","value":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"}]},"err":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1\/stderr","type":"PATH"},"out":{"path":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/ 
T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1\/stdout","type":"PATH"},"user":"gaston","working_directory":"\/var\/folders\/jr\/17y2w4ld019bsn9vhx0c13f80000gn\/T\/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg\/slaves\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0\/frameworks\/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000\/executors\/default\/runs\/ed1df37e-c3cd-4cb7-9be0-3f57df46a746\/containers\/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"}" --pipe_read="10" --pipe_write="11" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1"' I0125 14:43:57.251711 32505856 launcher.cpp:135] Forked child with pid '85088' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1' I0125 14:43:57.253597 33042432 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1 E0125 14:43:57.257318 33579008 process.cpp:2419] Failed to shutdown socket with fd 9: Socket is not connected I0125 14:43:57.257848 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor' I0125 14:43:57.258191 29286400 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1' I0125 14:43:57.258460 29286400 http.cpp:303] HTTP POST for 
/slave(1)/api/v1/executor from 172.18.9.98:49529 I0125 14:43:57.258599 29286400 slave.cpp:3756] Handling status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.258846 29286400 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49531 I0125 14:43:57.259006 29286400 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1' I0125 14:43:57.259308 29286400 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION I0125 14:43:57.259663 30359552 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.259696 30359552 status_update_manager.cpp:500] Creating StatusUpdate stream for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 I0125 14:43:57.259893 29822976 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check I0125 14:43:57.259938 30359552 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to the agent I0125 14:43:57.259951 29286400 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49532 I0125 14:43:57.260105 29822976 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check' to user 'gaston' I0125 14:43:57.260159 29286400 slave.cpp:4196] Forwarding the update TASK_RUNNING (UUID: 
c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to master@172.18.9.98:49526
I0125 14:43:57.260329 29286400 http.cpp:464] Processing call WAIT_NESTED_CONTAINER
I0125 14:43:57.260387 31432704 master.cpp:6008] Status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 from agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.260434 31432704 master.cpp:6076] Forwarding status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.260524 29286400 slave.cpp:4090] Status update manager successfully handled status update TASK_RUNNING (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.260597 31432704 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I0125 14:43:57.260751 30896128 sched.cpp:1041] Scheduler::statusUpdate took 137us
I0125 14:43:57.260968 29286400 master.cpp:5015] Processing ACKNOWLEDGE call c8583885-558e-44a3-9e0b-c39874132e27 for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.261234 32505856 status_update_manager.cpp:395] Received status update acknowledgement (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.261389 31969280 slave.cpp:3045] Status update manager successfully handled status update acknowledgement (UUID: c8583885-558e-44a3-9e0b-c39874132e27) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.266468 30896128 switchboard.cpp:571] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-a719e861-62a0-42dd-9a18-8af6f13b856f" --stderr_from_fd="15" --stderr_to_fd="2" --stdin_to_fd="12" --stdout_from_fd="13" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.267304 30896128 switchboard.cpp:601] Created I/O switchboard server (pid: 85095) listening on socket file '/tmp/mesos-io-switchboard-a719e861-62a0-42dd-9a18-8af6f13b856f' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.268826 29822976 containerizer.cpp:1541] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"shell":true,"value":"exit $STATUS"},"environment":{"variables":[{"name":"STATUS","value":"0"}]},"err":{"fd":16,"type":"FD"},"in":{"fd":9,"type":"FD"},"out":{"fd":14,"type":"FD"},"user":"gaston"}" --pipe_read="12" --pipe_write="13" --runtime_directory="/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check"'
I0125 14:43:57.269855 29822976 launcher.cpp:135] Forked child with pid '85099' for container 'ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check'
I0125 14:43:57.271919 33042432 fetcher.cpp:349] Starting to fetch URIs for container: ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check, directory: /var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.330857 32505856 http.cpp:2647] Received EOF attach response for ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.331059 29286400 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check in RUNNING state
W0125 14:43:57.331065 32505856 http.cpp:2662] Launch nested container session connection for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check closed
I0125 14:43:57.331297 29286400 launcher.cpp:151] Asked to destroy container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.332167 30896128 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.332547 31969280 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49534
I0125 14:43:57.332726 31969280 http.cpp:464] Processing call WAIT_NESTED_CONTAINER
I0125 14:43:57.366807 33042432 switchboard.cpp:873] I/O switchboard server process for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check has terminated (status=0)
I0125 14:43:57.367132 31432704 containerizer.cpp:2482] Container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check has exited
I0125 14:43:57.367938 31969280 provisioner.cpp:322] Ignoring destroy request for unknown container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.368036 33042432 containerizer.cpp:2398] Checkpointing termination state to nested container's runtime directory '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_oFfwrZ/containers/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check/termination'
I0125 14:43:57.387104 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.387472 29822976 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49535
I0125 14:43:57.387668 29822976 http.cpp:464] Processing call LAUNCH_NESTED_CONTAINER_SESSION
I0125 14:43:57.387924 33042432 containerizer.cpp:1808] Starting nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.387951 29822976 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1/executor'
I0125 14:43:57.388016 33042432 containerizer.cpp:1832] Trying to chown '/var/folders/jr/17y2w4ld019bsn9vhx0c13f80000gn/T/HealthCheckTest_DefaultExecutorCmdHealthCheck_zSkvqg/slaves/41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0/frameworks/41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000/executors/default/runs/ed1df37e-c3cd-4cb7-9be0-3f57df46a746/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1/containers/dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check' to user 'gaston'
I0125 14:43:57.388273 31432704 http.cpp:303] HTTP POST for /slave(1)/api/v1/executor from 172.18.9.98:49529
I0125 14:43:57.388356 31432704 slave.cpp:3756] Handling status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389214 29822976 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389328 29822976 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to the agent
I0125 14:43:57.389411 30896128 slave.cpp:4196] Forwarding the update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 to master@172.18.9.98:49526
I0125 14:43:57.389511 30896128 slave.cpp:4090] Status update manager successfully handled status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389554 32505856 master.cpp:6008] Status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 from agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.389578 32505856 master.cpp:6076] Forwarding status update TASK_RUNNING (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 in health state healthy of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.389693 32505856 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING)
I0125 14:43:57.389803 31432704 sched.cpp:1041] Scheduler::statusUpdate took 76us
I0125 14:43:57.390027 3095696320 sched.cpp:2021] Asked to stop the driver
I0125 14:43:57.390022 29822976 master.cpp:5015] Processing ACKNOWLEDGE call 0a69f89b-35c9-4943-a6b4-2f9646b11338 for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
I0125 14:43:57.390105 31969280 sched.cpp:1203] Stopping framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390228 31432704 master.cpp:7543] Processing TEARDOWN call for framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390235 30359552 status_update_manager.cpp:395] Received status update acknowledgement (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390249 31432704 master.cpp:7555] Removing framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390262 31432704 master.cpp:3078] Deactivating framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (default) at scheduler-8fcc1ae4-1a01-4d5b-a2c8-444d1295751b@172.18.9.98:49526
I0125 14:43:57.390390 32505856 hierarchical.cpp:386] Deactivated framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390399 31969280 slave.cpp:3045] Status update manager successfully handled status update acknowledgement (UUID: 0a69f89b-35c9-4943-a6b4-2f9646b11338) for task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390424 31432704 master.cpp:8138] Updating the state of task 1 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I0125 14:43:57.390444 31969280 slave.cpp:2598] Asked to shut down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 by master@172.18.9.98:49526
I0125 14:43:57.390458 31969280 slave.cpp:2623] Shutting down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.390477 31969280 slave.cpp:5005] Shutting down executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP)
I0125 14:43:57.390702 31432704 master.cpp:8232] Removing task 1 with resources cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.390861 31432704 master.cpp:8261] Removing executor 'default' with resources cpus(*):0.1; mem(*):32; disk(*):32 of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:43:57.390902 33042432 hierarchical.cpp:1011] Recovered cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):0.1; mem(*):32; disk(*):32) on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 from framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391154 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 in RUNNING state
I0125 14:43:57.391171 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1 in RUNNING state
I0125 14:43:57.391186 32505856 containerizer.cpp:2119] Destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check in PREPARING state
I0125 14:43:57.391238 29822976 hierarchical.cpp:1011] Recovered cpus(*):0.1; mem(*):32; disk(*):32 (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: {}) on agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 from framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391365 29822976 hierarchical.cpp:337] Removed framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000
I0125 14:43:57.391458 32505856 containerizer.cpp:2190] Waiting for the isolators to complete preparing before destroying container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.394384 31432704 process.cpp:3697] Handling HTTP event for process 'slave(1)' with path: '/slave(1)/api/v1'
I0125 14:43:57.394676 30896128 http.cpp:303] HTTP POST for /slave(1)/api/v1 from 172.18.9.98:49536
I0125 14:43:57.394672 29286400 switchboard.cpp:571] Launching 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" --help="false" --socket_address="/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa" --stderr_from_fd="15" --stderr_to_fd="2" --stdin_to_fd="12" --stdout_from_fd="13" --stdout_to_fd="1" --tty="false" --wait_for_connection="true"' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
I0125 14:43:57.394807 30896128 http.cpp:464] Processing call KILL_NESTED_CONTAINER
I0125 14:43:57.395575 29286400 switchboard.cpp:601] Created I/O switchboard server (pid: 85132) listening on socket file '/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa' for container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check
W0125 14:43:57.396780 31432704 http.cpp:2226] Failed to launch nested container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check: Container is being destroyed during preparing
I0125 14:43:57.396934 30896128 process.cpp:3754] Failed to process request for '/slave(1)/api/v1': Container is being destroyed during preparing
I0125 14:43:57.396984 31432704 process.cpp:1456] Returning '500 Internal Server Error' for '/slave(1)/api/v1' (Container is being destroyed during preparing)
I0125 14:43:58.061504 30359552 hierarchical.cpp:1677] No allocations performed
I0125 14:43:58.061565 30359552 hierarchical.cpp:1279] Performed allocation for 1 agents in 189us
I0125 14:43:59.067081 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:43:59.067144 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 206us
I0125 14:44:00.068758 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:00.068822 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 182us
I0125 14:44:01.070752 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:01.070814 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 171us
I0125 14:44:02.071462 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:02.071526 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 175us
I0125 14:44:02.393194 32505856 slave.cpp:5078] Killing executor 'default' of framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 (via HTTP)
E0125 14:44:02.394971 33579008 process.cpp:2419] Failed to shutdown socket with fd 11: Socket is not connected
E0125 14:44:02.395092 33579008 process.cpp:2419] Failed to shutdown socket with fd 18: Socket is not connected
E0125 14:44:02.395231 33579008 process.cpp:2419] Failed to shutdown socket with fd 6: Socket is not connected
E0125 14:44:02.395437 33579008 process.cpp:2419] Failed to shutdown socket with fd 8: Socket is not connected
I0125 14:44:02.397722 33579008 switchboard.cpp:782] Sending SIGTERM to I/O switchboard server (pid: 85132) since container ed1df37e-c3cd-4cb7-9be0-3f57df46a746.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1.dc6d91c1-1a7d-4563-bef6-a9e3fcad3cb1-health-check is being destroyed
I0125 14:44:02.431910 33042432 containerizer.cpp:2482] Container ed1df37e-c3cd-4cb7-9be0-3f57df46a746 has exited
I0125 14:44:03.074764 29286400 hierarchical.cpp:1677] No allocations performed
I0125 14:44:03.074825 29286400 hierarchical.cpp:1279] Performed allocation for 1 agents in 223us
I0125 14:44:04.080370 29822976 hierarchical.cpp:1677] No allocations performed
I0125 14:44:04.080440 29822976 hierarchical.cpp:1279] Performed allocation for 1 agents in 190us
I0125 14:44:05.080876 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:05.080950 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 197us
I0125 14:44:06.082023 30896128 hierarchical.cpp:1677] No allocations performed
I0125 14:44:06.082082 30896128 hierarchical.cpp:1279] Performed allocation for 1 agents in 186us
I0125 14:44:07.086395 29286400 hierarchical.cpp:1677] No allocations performed
I0125 14:44:07.086458 29286400 hierarchical.cpp:1279] Performed allocation for 1 agents in 226us
I0125 14:44:08.089059 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:08.089119 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 168us
I0125 14:44:09.090726 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:09.090795 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 205us
I0125 14:44:10.096534 33042432 hierarchical.cpp:1677] No allocations performed
I0125 14:44:10.096603 33042432 hierarchical.cpp:1279] Performed allocation for 1 agents in 243us
I0125 14:44:11.098918 30359552 hierarchical.cpp:1677] No allocations performed
I0125 14:44:11.098978 30359552 hierarchical.cpp:1279] Performed allocation for 1 agents in 175us
I0125 14:44:12.080423 31432704 slave.cpp:5596] Querying resource estimator for oversubscribable resources
I0125 14:44:12.080592 30359552 slave.cpp:5610] Received oversubscribable resources {} from the resource estimator
I0125 14:44:12.099279 31969280 hierarchical.cpp:1677] No allocations performed
I0125 14:44:12.099328 31969280 hierarchical.cpp:1279] Performed allocation for 1 agents in 180us
I0125 14:44:12.104923 31969280 slave.cpp:4286] Received ping from slave-observer(1)@172.18.9.98:49526
../../src/tests/cluster.cpp:570: Failure
Failed to wait 15secs for wait
I0125 14:44:12.395627 3095696320 slave.cpp:801] Agent terminating
I0125 14:44:12.395679 3095696320 slave.cpp:2598] Asked to shut down framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 by @0.0.0.0:0
W0125 14:44:12.395700 3095696320 slave.cpp:2619] Ignoring shutdown framework 41942215-2f76-4ffa-9bfc-383fc4bc2b04-0000 because it is terminating
I0125 14:44:12.395988 31969280 master.cpp:1261] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98) disconnected
I0125 14:44:12.396028 31969280 master.cpp:3115] Disconnecting agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:44:12.396126 31969280 master.cpp:3134] Deactivating agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 at slave(1)@172.18.9.98:49526 (172.18.9.98)
I0125 14:44:12.396370 32505856 hierarchical.cpp:577] Agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0 deactivated
I0125 14:44:12.401453 29822976 master.cpp:1100] Master terminating
I0125 14:44:12.401763 31432704 hierarchical.cpp:510] Removed agent 41942215-2f76-4ffa-9bfc-383fc4bc2b04-S0
[ FAILED ] HealthCheckTest.DefaultExecutorCmdHealthCheck (15404 ms)
[----------] 1 test from HealthCheckTest (15407 ms total)
[----------] Global test environment tear-down
../../src/tests/environment.cpp:843: Failure
Failed
Tests completed with child processes remaining:
-+- 85038 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-tests --gtest_filter=HealthCheckTest.DefaultExecutorCmd* --break-on-error --verbose
 |-+- 85088 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-containerizer launch
 | --- 85130 sleep 120
 --- 85132 /Users/gaston/mesosphere/mesos/build/src/.libs/mesos-io-switchboard --heartbeat_interval=30secs --help=false --socket_address=/tmp/mesos-io-switchboard-9127715f-8aed-454e-8ec3-458fbec784fa --stderr_from_fd=15 --stderr_to_fd=2 --stdin_to_fd=12 --stdout_from_fd=13 --stdout_to_fd=1 --tty=false --wait_for_connection=true
[==========] 1 test from 1 test case ran. (15418 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] HealthCheckTest.DefaultExecutorCmdHealthCheck

1 FAILED TEST
```


- Gastón Kleiman


On Jan. 25, 2017, 12:33 a.m., Gastón Kleiman wrote:

> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55901/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2017, 12:33 a.m.)
>
>
> Review request for mesos, Alexander Rukletsov, Anand Mazumdar, haosdent huang, and Vinod Kone.
>
>
> Bugs: MESOS-6280
>     https://issues.apache.org/jira/browse/MESOS-6280
>
>
> Repository: mesos
>
>
> Description
> -------
>
> [WIP] Added support for command health checks to the default executor.
>
> This is still a work in progress, but I am posting it in order to get some feedback on the general approach.
>
> Open questions/issues:
>
> - The code that handles communication with the agent lives in the `DefaultExecutor`, so the health checker calls `evolve` when serializing the API calls. Should it use the v1 protos instead and skip the `evolve` call?
> - CMD health checks for command tasks inherit the env from the executor, but allow overriding env variables. I don't think that inheriting the executor env makes sense for cmd health checks launched by the `DefaultExecutor`: each task can have its own env, so the health check command should inherit the task env and merge it with the one in the check's `CommandInfo`.
>   I added some code that takes the task env from `TaskInfo` and merges it with the check's env, but I don't think that this is an ideal approach, since it will miss env variables set/modified by containerizers/hooks/isolators. Do you think that it would make sense for all Debug Containers to inherit the environment of the parent container?
> - There's a TODO for stopping/resuming health checking when the `DefaultExecutor` is disconnected from the agent. After talking to AlexR, I believe that we should do this for all types of checks, not just for `CMD`.
> - The test that I introduced passes on Linux but not on macOS; I still have to do some more debugging.
>
>
> Diffs
> -----
>
>   src/checks/health_checker.hpp 6e558f2061a9e31157c47d31cb64b3a8568aace3
>   src/checks/health_checker.cpp 50aa2858e807b27bbab58a3618f5200cfe4eca9e
>   src/launcher/default_executor.cpp a03794934adb93868734f8cf00b337a1bff9b5ab
>   src/tests/health_check_tests.cpp debbd3c09b7555145aaf3f62a24d795d1423a269
>
> Diff: https://reviews.apache.org/r/55901/diff/
>
>
> Testing
> -------
>
> Introduced a new test: `HealthCheckTest.DefaultExecutorCmdHealthCheck`. It passes on Linux but not on macOS.
>
>
> Thanks,
>
> Gastón Kleiman
>