Hmm, after checking src/launcher/executor.cpp, I think it is a bug that the
executor didn't reap the task status.
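
To make "reap" concrete, here is a minimal Python sketch of the idea. It is not the C++ code in src/launcher/executor.cpp. The names `run_and_reap` and `terminal_state` are hypothetical, and the mapping from exit status to task state is a simplifying assumption. The point is only that the executor has to wait on the task process before it can report a terminal state and exit itself.

```python
import subprocess
import sys


def terminal_state(returncode, kill_requested=False):
    """Map an exit status to a terminal task state (simplified assumption)."""
    if kill_requested:
        return "TASK_KILLED"
    return "TASK_FINISHED" if returncode == 0 else "TASK_FAILED"


def run_and_reap(argv, kill_requested=False):
    """Launch a task process, reap it, and return its terminal state.

    Hypothetical helper for illustration only.
    """
    task = subprocess.Popen(argv)
    # wait() reaps the child: it blocks until exit and collects the status.
    returncode = task.wait()
    return terminal_state(returncode, kill_requested)


# Run a trivial task that exits with status 0 and print the resulting state.
print(run_and_reap([sys.executable, "-c", "pass"]))
```

If the wait step is never set up, or its result is never handled, nothing tells the executor that the task is gone, so it keeps running with no tasks.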

On Sat, Jun 4, 2016 at 6:50 PM, Tomek Janiszewski <[email protected]> wrote:

> Too late. Sandbox was collected by GC.
>
> On Sat, Jun 4, 2016 at 12:45, haosdent <[email protected]> wrote:
>
> > Usually the executor terminates itself once it reaps the task status as
> > killed or finished.
> > Otherwise, either the reap callback has not been registered yet, or our
> > executor has a bug when reaping the task status. Could you find anything
> > in the executor stdout/stderr?
> >
> > On Sat, Jun 4, 2016 at 6:08 PM, Tomek Janiszewski <[email protected]>
> > wrote:
> >
> > > Thanks. I just manually found that executor's pid and killed it. Any idea
> > > why it was still running without tasks?
> > >
> > > On Sat, Jun 4, 2016 at 05:35, haosdent <[email protected]> wrote:
> > >
> > > > > 13:33:39.031054  [slave.cpp:2643] Got registration for executor
> > > > 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000 from executor(1)@
> > > > 10.55.97.170:60083
> > > >
> > > > Yes, according to your log, your executor is still running. If your
> > > > executor is http_command_executor, you could use
> > > > https://github.com/apache/mesos/blob/master/docs/executor-http-api.md#shutdown
> > > > to shut it down.
> > > > If it is another type of executor, as far as I know there is no API to
> > > > shut it down. I am not sure whether killing the executor on the agent
> > > > would resolve your problem or not.
> > > >
> > > > On Fri, Jun 3, 2016 at 4:33 PM, Tomek Janiszewski <[email protected]> wrote:
> > > >
> > > > > Here is the truncated response from slave(1)/state:
> > > > >
> > > > > {
> > > > >     "attributes": {...},
> > > > >     "completed_frameworks": [],
> > > > >     "flags": {...},
> > > > >     "frameworks": [
> > > > >         {
> > > > >             "checkpoint": true,
> > > > >             "completed_executors": [...],
> > > > >             "executors": [
> > > > >               {
> > > > >                   "queued_tasks": [],
> > > > >                   "tasks": [],
> > > > >                   "completed_tasks": [
> > > > >                       {
> > > > >                           "discovery": {...},
> > > > >                           "executor_id": "",
> > > > >                           "framework_id":
> > > > > "f65b163c-0faf-441f-ac14-91739fa4394c-0000",
> > > > >                           "id":
> > > > > "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
> > > > >                           "labels": [...],
> > > > >                           "name": "service",
> > > > >                           "resources": {...},
> > > > >                           "slave_id":
> > > > > "ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13",
> > > > >                           "state": "TASK_KILLED",
> > > > >                           "statuses": []
> > > > >                       }
> > > > >                   ],
> > > > >                   "container":
> > "ead42e63-ac92-4ad0-a99c-4af9c3fa5e31",
> > > > >                   "directory": "...",
> > > > >                   "id":
> > "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
> > > > >                   "name": "Command Executor (Task:
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e) (Command: sh -c 'cd
> > > > > service...')",
> > > > >                   "resources": {...},
> > > > >                   "source":
> > > > "service.a3b609b8-27ec-11e6-8044-02c89eb9127e"
> > > > >
> > > > >               },
> > > > >               ...
> > > > >             ],
> > > > >         }
> > > > >     ],
> > > > >     "git_sha": "961edbd82e691a619a4c171a7aadc9c32957fa73",
> > > > >     "git_tag": "0.28.0",
> > > > >     "version": "0.28.0",
> > > > >     ...
> > > > > }
> > > > >
> > > > > Here is the log for this container:
> > > > >
> > > > > > 13:33:19.479182  [slave.cpp:1361] Got assigned task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > > 13:33:19.482566  [slave.cpp:1480] Launching task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e for framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > > 13:33:19.483921  [paths.cpp:528] Trying to chown
> > > > >
> > > > >
> > > >
> > >
> >
> '/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-0000/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
> > > > > to user 'mesosuser'
> > > > > > 13:33:19.504173  [slave.cpp:5367] Launching executor
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000 with resources
> cpus(*):0.1;
> > > > > mem(*):32 in work directory
> > > > >
> > > > >
> > > >
> > >
> >
> '/tmp/mesos/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-0000/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31'
> > > > > > 13:33:19.505537  [containerizer.cpp:666] Starting container
> > > > > 'ead42e63-ac92-4ad0-a99c-4af9c3fa5e31' for executor
> > > > > 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> > > > > 'f65b163c-0faf-441f-ac14-91739fa4394c-0000'
> > > > > > 13:33:19.505734  [slave.cpp:1698] Queuing task
> > > > > 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' for executor
> > > > > 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > ...
> > > > > > 13:33:19.977483  [containerizer.cpp:1118] Checkpointing
> executor's
> > > > forked
> > > > > pid 25576 to
> > > > >
> > > > >
> > > >
> > >
> >
> '/tmp/mesos/meta/slaves/ef232fd9-5114-4d8f-adc3-1669c1e6fdc5-S13/frameworks/f65b163c-0faf-441f-ac14-91739fa4394c-0000/executors/service.a3b609b8-27ec-11e6-8044-02c89eb9127e/runs/ead42e63-ac92-4ad0-a99c-4af9c3fa5e31/pids/forked.pid'
> > > > > > 13:33:35.775195  [slave.cpp:1891] Asked to kill task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > > 13:33:35.775645  [slave.cpp:3002] Handling status update
> > TASK_KILLED
> > > > > (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000 f
> > > > > rom @0.0.0.0:0
> > > > > > 13:33:35.778105  [cpushare.cpp:389] Updated 'cpu.shares' to 102
> > (cpus
> > > > > 0.1) for container ead42e63-ac92-4ad0-a99c-4af9c3fa5e31
> > > > > > 13:33:35.778488  [disk.cpp:169] Updating the disk resources for
> > > > container
> > > > > ead42e63-ac92-4ad0-a99c-4af9c3fa5e31 to cpus(*):0.1
> > > > > ; mem(*):32
> > > > > > 13:33:35.780349  [mem.cpp:353] Updated
> 'memory.soft_limit_in_bytes'
> > > to
> > > > > 32MB for container ead42e63-ac92-4ad0-a99c-4af9c3fa5e3
> > > > > 1
> > > > > > 13:33:35.782573  [status_update_manager.cpp:320] Received status
> > > update
> > > > > TASK_KILLED (UUID: eba64915-7df2-483d-8982-a9a46a48a8
> > > > > 1b) for task service.a3b609b8-27ec-11e6-8044-02c89eb9127e of
> > framework
> > > > > f65b163c-0faf-441f-ac14-9173
> > > > > 9fa4394c-0000
> > > > > > 13:33:35.783860  [status_update_manager.cpp:824] Checkpointing
> > UPDATE
> > > > for
> > > > > status update TASK_KILLED (UUID:
> > eba64915-7df2-483d-8982-a9a46a48a81b)
> > > > for
> > > > > task service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > > 13:33:35.788767  [slave.cpp:3400] Forwarding the update
> TASK_KILLED
> > > > > (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000 to
> > [email protected]:5050
> > > > > > 13:33:35.917932  [status_update_manager.cpp:392] Received status
> > > update
> > > > > acknowledgement (UUID: eba64915-7df2-483d-8982-a9a46a48a81b) for
> task
> > > > > service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > > 13:33:35.918143  [status_update_manager.cpp:824] Checkpointing
> ACK
> > > for
> > > > > status update TASK_KILLED (UUID:
> > eba64915-7df2-483d-8982-a9a46a48a81b)
> > > > for
> > > > > task service.a3b609b8-27ec-11e6-8044-02c89eb9127e of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000
> > > > > ...
> > > > > > 13:33:39.031054  [slave.cpp:2643] Got registration for executor
> > > > > 'service.a3b609b8-27ec-11e6-8044-02c89eb9127e' of framework
> > > > > f65b163c-0faf-441f-ac14-91739fa4394c-0000 from executor(1)@
> > > > > 10.55.97.170:60083
> > > > >
> > > > >
> > > > > > As you can see, the container is no longer running, yet it is still
> > > > > > reported as running. What should I do with it?
> > > > >
> > > > > Thanks
> > > > > Tomek
> > > > >
> > > > >
> > > > > > On Thu, Jun 2, 2016 at 15:55, Tomek Janiszewski <[email protected]> wrote:
> > > > >
> > > > > > > Yes. I see a dead executor in executors. Its tasks and queued_tasks are
> > > > > > > empty, but there is one task in completed_tasks.
> > > > > > > frameworks.completed_executors is filled with other executors.
> > > > > >
> > > > > > > On Thu, Jun 2, 2016 at 15:39, haosdent <[email protected]> wrote:
> > > > > >
> > > > > > >> Hi @janiszt, on my side the completed executors only exist
> > > > > > >> in completed_frameworks.completed_executors
> > > > > > >> or frameworks.completed_executors.
> > > > > > >>
> > > > > > >> On your side, do completed executors appear in any other fields?
> > > > > >>
> > > > > > >> On Thu, Jun 2, 2016 at 5:39 PM, Tomek Janiszewski <[email protected]> wrote:
> > > > > >>
> > > > > >> > Hi
> > > > > >> >
> > > > > > >> > I'm running Mesos 0.28.0. The Mesos slave(1)/state endpoint returns
> > > > > > >> > some completed executors not in frameworks.completed_executors but
> > > > > > >> > in frameworks.executors.
> > > > > > >> > Is this normal behavior? How can I force Mesos to move completed
> > > > > > >> > executors into frameworks.completed_executors?
> > > > > >> >
> > > > > >> > Thanks
> > > > > >> > Tomek
> > > > > >> >
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Best Regards,
> > > > > >> Haosdent Huang
> > > > > >>
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Haosdent Huang
> > > >
> > >
> >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
> >
>
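
For anyone who wants to spot such executors from the state output quoted above, here is a rough Python sketch. It only relies on the fields visible in the truncated response (`frameworks`, `executors`, `tasks`, `queued_tasks`, `completed_tasks`, and the task's `framework_id`). The function name `idle_executors` and the trimmed sample document are assumptions for illustration, not part of Mesos.

```python
import json


def idle_executors(state):
    """Return (framework_id, executor_id) pairs for executors that are still
    listed under frameworks.executors although they have no running or queued
    tasks and at least one completed task."""
    found = []
    for framework in state.get("frameworks", []):
        for executor in framework.get("executors", []):
            completed = executor.get("completed_tasks") or []
            if executor.get("tasks") or executor.get("queued_tasks"):
                continue  # executor still has work, so it is not idle
            if completed:
                # The framework id is taken from the completed task entry,
                # since that field is shown in the quoted output.
                found.append((completed[0].get("framework_id"),
                              executor.get("id")))
    return found


# Trimmed sample shaped like the state response in this thread.
sample = json.loads("""
{
  "frameworks": [
    {
      "executors": [
        {
          "id": "service.a3b609b8-27ec-11e6-8044-02c89eb9127e",
          "queued_tasks": [],
          "tasks": [],
          "completed_tasks": [
            {
              "framework_id": "f65b163c-0faf-441f-ac14-91739fa4394c-0000",
              "state": "TASK_KILLED"
            }
          ]
        }
      ]
    }
  ]
}
""")

for framework_id, executor_id in idle_executors(sample):
    print(framework_id, executor_id)
```

An executor reported this way is only a candidate for the problem discussed here. A healthy executor can briefly look the same between its last task finishing and its own exit.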



-- 
Best Regards,
Haosdent Huang
