Hi @Pankaj, could you provide the logs from the period when "the job is getting restarted and a new container is created with a new process id"? The logs you provided look normal.

On Mon, Aug 29, 2016 at 5:26 AM, Pankaj Saha <[email protected]> wrote:

> Hi,
>
> I am facing an issue with jobs launched on my Mesos agents. I am trying to launch a job through the Marathon framework, and the job stays in the staging state and does not run.
>
> I can see the following log messages on the agent console:
>
> Scheduling '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' for gc 6.99999884239407days in the future
> I0828 16:20:36.053483 28512 slave.cpp:1361] Got assigned task test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> I0828 16:20:36.056224 28510 gc.cpp:83] Unscheduling '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
> I0828 16:20:36.056715 28510 gc.cpp:83] Unscheduling '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000' from gc
> I0828 16:20:36.057231 28509 slave.cpp:1480] Launching task test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c for framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> I0828 16:20:36.058661 28509 paths.cpp:528] Trying to chown '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d' to user 'root'
> I0828 16:20:36.067807 28509 slave.cpp:5352] Launching executor test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/var/lib/mesos-8082/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d'
> I0828 16:20:36.069314 28509 slave.cpp:1698] Queuing task 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' for executor 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of framework c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000
> I0828 16:20:36.069902 28509 containerizer.cpp:666] Starting container '99620406-87b5-406c-a88b-13adb145c12d' for executor 'test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c' of framework 'c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000'
> I0828 16:20:36.080713 28509 linux_launcher.cpp:304] Cloning child process with flags =
> I0828 16:20:36.084738 28509 containerizer.cpp:1179] Checkpointing executor's forked pid 29629 to '/var/lib/mesos-8082/meta/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S8/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-0000/executors/test-crixus.eb66a42b-6d5c-11e6-bec9-c27afc834a0c/runs/99620406-87b5-406c-a88b-13adb145c12d/pids/forked.pid'
>
> But after that, the job is getting restarted and a new container is created with a new process id. This happens indefinitely, which keeps the job in the staging state as seen by the Mesos master.
>
> The job is nothing more than a simple echo "hello world" style shell command. Can anyone please point out where it is failing or what I am doing wrong?
>
> Thanks,
> Pankaj

--
Best Regards,
Haosdent Huang
