I think the executor tries to register by communicating with the Mesos master, and that fails because of the network restriction. How can I change the /tmp/ path? I have already specified /var/lib/mesos as my work_dir.
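To make it easier to find the executor's stdout/stderr, here is a minimal Python sketch (not part of Mesos) that pulls the executor and framework IDs out of a "did not register" log line and rebuilds the sandbox directory under a given work_dir. The helper names `sandbox_path` and `timed_out_executor` are hypothetical; the directory layout and the log wording are assumptions taken from the agent log quoted further down in this thread, and the regular expression may need adjusting for other Mesos versions.

```python
import posixpath
import re

# Matches the agent log message seen in this thread, including its odd quoting:
#   Terminating executor ''<executor id>' of framework <framework id>'
#   because it did not register within 1mins
TIMEOUT_RE = re.compile(
    r"Terminating executor '+([^']+)' of framework '?([^' ]+)'? "
    r"because it did not register"
)


def timed_out_executor(log_line):
    """Return (executor_id, framework_id) if the line reports a registration
    timeout, otherwise None."""
    match = TIMEOUT_RE.search(log_line)
    return (match.group(1), match.group(2)) if match else None


def sandbox_path(work_dir, slave_id, framework_id, executor_id, container_id):
    """Build the sandbox directory using the layout visible in the log:
    <work_dir>/slaves/<slave>/frameworks/<fw>/executors/<exec>/runs/<container>
    """
    return posixpath.join(
        work_dir, "slaves", slave_id,
        "frameworks", framework_id,
        "executors", executor_id,
        "runs", container_id,
    )


if __name__ == "__main__":
    line = (
        "I0829 14:27:38.322805 2700 slave.cpp:4307] Terminating executor "
        "''test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework "
        "c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' because it did not "
        "register within 1mins"
    )
    executor_id, framework_id = timed_out_executor(line)
    # Slave ID and container ID are copied from the same quoted log.
    print(sandbox_path(
        "/tmp/mesos",
        "d6f0e3e2-d144-4275-9d38-82327408622b-S12",
        framework_id,
        executor_id,
        "dff399f0-beb1-4c49-bd8e-c19621de2f71",
    ))
```

The stdout and stderr files of the failed executor should then be inside the printed directory, provided the run has not been garbage-collected yet.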
*I am explaining my setup here:*

I have a Mesos setup where the master and the agent both run on my university campus network. The agent node sits behind a firewall, and only ports 5000 to 6000 are open for incoming traffic; the master has no such restriction. The master service runs on master:5050 and the agent on agent:5051, the defaults. I can see that the agent communicates correctly with the master and offers its available resources.

I set the ports available to the agent to ports:[5001-6000] in *src/slave/constants.cpp*, so that frameworks communicate only through ports that are open on my agent machine behind the firewall. When I launch jobs through the Mesosphere Marathon framework, I can see that all jobs connect to mesos-agent through that port range [5001-6000], but the jobs are not getting submitted.

While debugging, I realised that when a job is launched, the Mesos agent creates and launches an executor (*src/executor/executor.cpp*), which communicates with the Mesos master through a random port that is outside my open range of 5000-6000. Because my agent machine cannot accept requests on those ports, the executor times out, and it is restarted again and again each time the 1 minute limit expires.

I could not find where exactly that random port is assigned. Is there a socket setting that we can change so that the executor connection happens on the desired range of ports? Please let me know whether my understanding is correct and how I can change the ports used for executor registration.

On Wed, Aug 31, 2016 at 3:09 AM, haosdent <[email protected]> wrote: > >I0829 14:27:38.322805 2700 slave.cpp:4307] *Terminating executor > ''test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' because it did not register > within 1mins > > This log looks weird. Could you find anything in the stdout/stderr of the > executor.
For the executor 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' > above, it should be under the folder '/tmp/mesos/slaves/d6f0e3e2- > d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb- > 46fa-90a2-72ad649c5dd3-0000/executors/test.1fb85a35-6e16- > 11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71/' > > Apart from that, run mesos under '/tmp' is not recommended. > > On Tue, Aug 30, 2016 at 2:32 AM, Pankaj Saha <[email protected]> > wrote: > > > here is the log: > > > > > > > > I0829 14:24:21.727960 2679 main.cpp:223] Build: 2016-08-28 13:39:46 by > > root > > I0829 14:24:21.728159 2679 main.cpp:225] Version: 0.28.2 > > I0829 14:24:21.733256 2679 containerizer.cpp:149] Using isolation: > > posix/cpu,posix/mem,filesystem/posix > > I0829 14:24:21.738895 2679 linux_launcher.cpp:101] Using > > /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher > > I0829 14:24:21.748019 2679 main.cpp:328] Starting Mesos slave > > I0829 14:24:21.750063 2679 slave.cpp:193] Slave started on 1)@ > > 128.226.116.69:8082 > > I0829 14:24:21.750114 2679 slave.cpp:194] Flags at startup: > > --advertise_ip="128.226.116.69" --appc_simple_discovery_uri_ > > prefix="http://" > > --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" > > --cgroups_cpu_enable_pids_and_tids_count="false" > > --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" > > --cgroups_limit_swap="false" --cgroups_root="mesos" > > --container_disk_watch_interval="15secs" --containerizers="mesos" > > --default_role="*" --disk_watch_interval="1mins" --docker="docker" > > --docker_kill_orphans="true" --docker_registry="https:// > > registry-1.docker.io" > > --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" > > --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" > > --enforce_container_disk_quota="false" > > --executor_registration_timeout="1mins" > > --executor_shutdown_grace_period="5secs" > > --fetcher_cache_dir="/tmp/mesos/fetch" 
--fetcher_cache_size="2GB" > > --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" > > --hadoop_home="" --help="false" --hostname_lookup="true" > > --image_provisioner_backend="copy" --initialize_driver_logging="true" > > --isolation="posix/cpu,posix/mem" > > --launcher_dir="/home/pankaj/mesos-0.28.2/build/src" --logbufsecs="0" > > --logging_level="INFO" --master="129.114.110.143:5050" > > --oversubscribed_resources_interval="15secs" --perf_duration="10secs" > > --perf_interval="1mins" --port="8082" --qos_correction_interval_min= > "0ns" > > --quiet="false" --recover="reconnect" --recovery_timeout="15mins" > > --registration_backoff_factor="1secs" --revocable_cpu_low_priority=" > true" > > --sandbox_directory="/mnt/mesos/sandbox" --strict="true" > > --switch_user="true" --systemd_enable_support="true" > > --systemd_runtime_directory="/run/systemd/system" --version="false" > > --work_dir="/tmp/mesos" > > I0829 14:24:21.753572 2679 slave.cpp:464] Slave resources: cpus(*):2; > > mem(*):2855; disk(*):84691; ports(*):[8081-8081] > > I0829 14:24:21.753706 2679 slave.cpp:472] Slave attributes: [ ] > > I0829 14:24:21.753762 2679 slave.cpp:477] Slave hostname: > > venom.cs.binghamton.edu > > I0829 14:24:21.770992 2696 state.cpp:58] Recovering state from > > '/tmp/mesos/meta' > > I0829 14:24:21.771304 2696 state.cpp:698] No checkpointed resources > found > > at '/tmp/mesos/meta/resources/resources.info' > > I0829 14:24:21.771644 2696 state.cpp:101] Failed to find the latest > slave > > from '/tmp/mesos/meta' > > I0829 14:24:21.772583 2696 status_update_manager.cpp:200] Recovering > > status update manager > > I0829 14:24:21.773082 2698 containerizer.cpp:407] Recovering > containerizer > > I0829 14:24:21.777489 2702 provisioner.cpp:245] Provisioner recovery > > complete > > I0829 14:24:21.778149 2699 slave.cpp:4550] Finished recovery > > I0829 14:24:21.779564 2699 slave.cpp:796] New master detected at > > [email protected]:5050 > > I0829 14:24:21.779742 2697 
status_update_manager.cpp:174] Pausing > sending > > status updates > > I0829 14:24:21.780607 2699 slave.cpp:821] No credentials provided. > > Attempting to register without authentication > > I0829 14:24:21.781394 2699 slave.cpp:832] Detecting new master > > I0829 14:24:22.698812 2702 slave.cpp:971] Registered with master > > [email protected]:5050; given slave ID > > d6f0e3e2-d144-4275-9d38-82327408622b-S12 > > I0829 14:24:22.699113 2698 status_update_manager.cpp:181] Resuming > sending > > status updates > > I0829 14:24:22.700258 2702 slave.cpp:1030] Forwarding total > oversubscribed > > resources > > I0829 14:24:43.638958 2695 http.cpp:190] HTTP GET for /slave(1)/state > from > > 128.226.119.78:59261 with User-Agent='Mozilla/5.0 (X11; Linux x86_64) > > AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.86 Safari/537.36' > > I0829 14:25:21.764268 2702 slave.cpp:4359] Current disk usage 9.67%. Max > > allowed age: 5.622868987169502days > > I0829 14:26:21.778849 2695 slave.cpp:4359] Current disk usage 9.67%. 
Max > > allowed age: 5.622860462326585days > > I0829 14:26:38.271085 2698 slave.cpp:1361] Got assigned task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c for framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:26:38.311063 2698 slave.cpp:1480] Launching task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c for framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:26:38.314755 2698 paths.cpp:528] Trying to chown > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/ > > dff399f0-beb1-4c49-bd8e-c19621de2f71' > > to user 'root' > > I0829 14:26:38.320300 2698 slave.cpp:5352] Launching executor > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; > > mem(*):32 in work directory > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/ > > dff399f0-beb1-4c49-bd8e-c19621de2f71' > > I0829 14:26:38.321523 2702 containerizer.cpp:666] Starting container > > 'dff399f0-beb1-4c49-bd8e-c19621de2f71' for executor > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework > > 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > I0829 14:26:38.322588 2698 slave.cpp:1698] Queuing task > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' for executor > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:26:38.358906 2702 linux_launcher.cpp:304] Cloning child process > > with flags = > > I0829 14:26:38.366492 2702 containerizer.cpp:1179] Checkpointing > > executor's forked pid 2758 to > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2- > > 72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9- > > 
c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71/pids/forked.pid' > > I0829 14:27:21.779755 2701 slave.cpp:4359] Current disk usage 9.67%. Max > > allowed age: 5.622850415190289days > > I0829 14:27:38.322805 2700 slave.cpp:4307] > > *Terminating executor ''test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of > > framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' because it did not > > register within 1minsI0829 14:27:38.323226 2700 containerizer.cpp:1453] > > Destroying container 'dff399f0-beb1-4c49-bd8e-c19621de2f71'* > > I0829 14:27:38.329186 2702 cgroups.cpp:2427] Freezing cgroup > > /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 > > I0829 14:27:38.331509 2699 cgroups.cpp:1409] Successfully froze cgroup > > /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 after > > 2.19392ms > > I0829 14:27:38.334520 2698 cgroups.cpp:2445] Thawing cgroup > > /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 > > I0829 14:27:38.337821 2698 cgroups.cpp:1438] Successfullly thawed cgroup > > /sys/fs/cgroup/freezer/mesos/dff399f0-beb1-4c49-bd8e-c19621de2f71 after > > 3.194112ms > > I0829 14:27:38.435214 2696 containerizer.cpp:1689] Executor for > container > > 'dff399f0-beb1-4c49-bd8e-c19621de2f71' has exited > > I0829 14:27:38.441556 2695 provisioner.cpp:306] Ignoring destroy request > > for unknown container dff399f0-beb1-4c49-bd8e-c19621de2f71 > > I0829 14:27:38.442186 2695 slave.cpp:3871] Executor > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 terminated with signal Killed > > I0829 14:27:38.445689 2695 slave.cpp:3012] Handling status update > > TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 from @0.0.0.0:0 > > W0829 14:27:38.447599 2702 containerizer.cpp:1295] Ignoring update for > > unknown container: dff399f0-beb1-4c49-bd8e-c19621de2f71 > > 
I0829 14:27:38.448391 2702 status_update_manager.cpp:320] Received > status > > update TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.449525 2702 status_update_manager.cpp:824] Checkpointing > > UPDATE for status update TASK_FAILED (UUID: > > 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.523027 2696 slave.cpp:3410] Forwarding the update > > TASK_FAILED (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 to [email protected]:5050 > > I0829 14:27:38.627722 2698 status_update_manager.cpp:392] Received > status > > update acknowledgement (UUID: 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for > > task test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.627943 2698 status_update_manager.cpp:824] Checkpointing > > ACK for status update TASK_FAILED (UUID: > > 154a67ea-f6d5-4e9c-ad3d-9b09161ba34d) for task > > test.1fb85a35-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.698822 2701 slave.cpp:3975] Cleaning up executor > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.699582 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/ > > dff399f0-beb1-4c49-bd8e-c19621de2f71' > > for gc 6.99999190486222days in the future > > I0829 14:27:38.700202 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > 
S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' > > for gc 6.99999190029037days in the future > > I0829 14:27:38.700382 2701 slave.cpp:4063] Cleaning up framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.700443 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2- > > 72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9- > > c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71' > > for gc 6.99999189796148days in the future > > I0829 14:27:38.700649 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2- > > 72ad649c5dd3-0000/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' > > for gc 6.99999189622815days in the future > > I0829 14:27:38.700845 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > for gc 6.99999189143704days in the future > > I0829 14:27:38.701015 2701 status_update_manager.cpp:282] Closing status > > update streams for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:38.701161 2698 gc.cpp:55] Scheduling > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > for gc 6.9999918900237days in the future > > I0829 14:27:39.651463 2697 slave.cpp:1361] Got assigned task > > test.445696e6-6e16-11e6-bec9-c27afc834a0c for framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:39.655815 2696 gc.cpp:83] Unscheduling > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > from gc > > I0829 14:27:39.656445 2696 gc.cpp:83] Unscheduling > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 
82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > from gc > > I0829 14:27:39.656855 2702 slave.cpp:1480] Launching task > > test.445696e6-6e16-11e6-bec9-c27afc834a0c for framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:39.660585 2702 paths.cpp:528] Trying to chown > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.445696e6-6e16-11e6-bec9-c27afc834a0c/runs/ > > ea676570-0a2a-49c3-a75c-14e045eb842b' > > to user 'root' > > I0829 14:27:39.666008 2702 slave.cpp:5352] Launching executor > > test.445696e6-6e16-11e6-bec9-c27afc834a0c of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; > > mem(*):32 in work directory > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b- > > S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > executors/test.445696e6-6e16-11e6-bec9-c27afc834a0c/runs/ > > ea676570-0a2a-49c3-a75c-14e045eb842b' > > I0829 14:27:39.667603 2702 slave.cpp:1698] Queuing task > > 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' for executor > > 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' of framework > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > I0829 14:27:39.668207 2702 containerizer.cpp:666] Starting container > > 'ea676570-0a2a-49c3-a75c-14e045eb842b' for executor > > 'test.445696e6-6e16-11e6-bec9-c27afc834a0c' of framework > > 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > I0829 14:27:39.678665 2702 linux_launcher.cpp:304] Cloning child process > > with flags = > > I0829 14:27:39.681824 2702 containerizer.cpp:1179] Checkpointing > > executor's forked pid 2799 to > > '/tmp/mesos/meta/slaves/d6f0e3e2-d144-4275-9d38- > > 82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2- > > 72ad649c5dd3-0000/executors/test.445696e6-6e16-11e6-bec9- > > c27afc834a0c/runs/ea676570-0a2a-49c3-a75c-14e045eb842b/pids/forked.pid' > > > > Thanks > > Pankaj > > > > > > On Mon, Aug 29, 2016 at 7:25 AM, 
haosdent <[email protected]> wrote: > > > > > Hi, @Pankaj, Could you provide logs during " the job is getting > restarted > > > and a new container is created with a new process id. ". The logs you > > > provided looks normal. > > > > > > On Mon, Aug 29, 2016 at 5:26 AM, Pankaj Saha <[email protected]> > > > wrote: > > > > > > > Hi > > > > I am facing an issue with a launched jobs into my mesos agents. I am > > > trying > > > > to launch a job through marathon framework and job is staying in > > stagged > > > > state and not running. > > > > I could see the log message at the agent console as below: > > > > > > > > Scheduling > > > > '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275- > > > 9d38-82327408622b-S8/ > > > > frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > > > for gc 6.99999884239407days in the future > > > > I0828 16:20:36.053483 28512 slave.cpp:1361] *Got assigned task > > > > test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c > > > > for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > > > I0828 16:20:36.056224 28510 gc.cpp:83] Unscheduling > > > > '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38- > > > > 82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2- > 72ad649c5dd3-0000' > > > > from gc > > > > I0828 16:20:36.056715 28510 gc.cpp:83] Unscheduling > > > > '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275- > > > 9d38-82327408622b-S8/ > > > > frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > > > from gc > > > > I0828 16:20:36.057231 28509 slave.cpp:1480] *Launching task > > > > test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c > > > > for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > > > I0828 16:20:36.058661 28509 paths.cpp:528]* Trying to chown* > > > > '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38- > > > > 82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2- > > > > 72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c- > > > > 11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d' > > > > 
to user 'root' > > > > I0828 16:20:36.067807 28509 slave.cpp:5352]* Launching executor > > > > test-crixus*.eb66a42b-6d5c-11e6-bec9-c27afc834a0c > > > > of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with > resources > > > > cpus(*):0.1; mem(*):32 in work directory > > > > '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38- > > > > 82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2- > > > > 72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c- > > > > 11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d' > > > > I0828 16:20:36.069314 28509 slave.cpp:1698] *Queuing task > > > > 'test-crixus.*eb66a42b-6d5c-11e6-bec9-c27afc834a0c' > > > > for executor 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of > > > > framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 > > > > I0828 16:20:36.069902 28509 containerizer.cpp:666] *Starting > container* > > > > '99620406-87b5-406c-a88b-13adb145c12d' for executor > > > > 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of framework > > > > 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' > > > > I0828 16:20:36.080713 28509 linux_launcher.cpp:304] *Cloning child > > > process* > > > > with flags = > > > > I0828 16:20:36.084738 28509 containerizer.cpp:1179] *Checkpointing > > > > executor's forked pid 29629* to > > > > '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275- > > > 9d38-82327408622b-S8/ > > > > frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/ > > > > executors/test-crixus.eb66a42b-6d5c-11e6-bec9- > > > c27afc834a0c/runs/99620406- > > > > 87b5-406c-a88b-13adb145c12d/pids/forked.pid' > > > > > > > > > > > > But after that, the job is getting restarted and a new container is > > > created > > > > with a new process id. It happening infinitely which is keeping the > job > > > in > > > > stagged state to mesos-master. > > > > > > > > This job is nothing but a simle echo "hello world" kind of shell > > command. > > > > Can anyone please point out where its failing or I am doing wrong. 
> > > >
> > > > Thanks
> > > > Pankaj
> > >
> > > --
> > > Best Regards,
> > > Haosdent Huang
>
> --
> Best Regards,
> Haosdent Huang
