On Tue, May 30, 2017 at 3:43 PM, Neil Conway <neil.con...@gmail.com> wrote:

> On Tue, May 30, 2017 at 2:36 PM, Vinod Kone <vinodk...@apache.org> wrote:
> > Ran on ASF CI.
> >
> > Found following issues.
> >
> > Failed test: CommandExecutorCheckTest.CommandCheckDeliveredAndReconciled
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/36/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> 20MESOS_VERBOSE=1,OS=centos:7,label_exp=(docker%7C%7CHadoop)
> &&(!ubuntu-us1)&&(!ubuntu-eu2)/console>
>
> Attached is the test log for this failure. From a quick look, seems as
> though the agent starts to launch the task, including forking the
> child process, but no subsequent task status updates or error messages
> are observed. Gaston, have you seen this before?
>
> I filed https://issues.apache.org/jira/browse/MESOS-7589 to track this.
>

Nope, I haven't seen this before.

It looks like the executor wasn't launched or something went wrong at a
very early stage of the executor's launch.

This is what the logs from a successful run look like:

I0530 20:19:52.679759 11282 launcher.cpp:140] Forked child with pid '11304'
>> for container '703bded9-43de-4950-986f-0eaec7bbe664'
>
> I0530 20:19:52.692080 11268 fetcher.cpp:324] Starting to fetch URIs for
>> container: 703bded9-43de-4950-986f-0eaec7bbe664, directory:
>> /tmp/CommandExecutorCheckTest_CommandCheckDeliveredAndReconciled_fqE44y/slaves/7c74b451-3e64-4c95-8570-ad3b5b60962b-S0/frameworks/7c74b451-3e64-4c95-8570-ad3b5b60962b-0000/executors/55b387b7-c1f1-47d5-9f19-2d676cd4ef2e/runs/703bded9-43de-4950-986f-0eaec7bbe664
>
> I0530 20:19:53.517201 11268 hierarchical.cpp:2095] Filtered offer with
>> cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0 for role * of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000
>
> I0530 20:19:53.517283 11268 hierarchical.cpp:1861] No allocations performed
>
> I0530 20:19:53.517318 11268 hierarchical.cpp:1951] No inverse offers to
>> send out!
>
> I0530 20:19:53.517367 11268 hierarchical.cpp:1445] Performed allocation
>> for 1 agents in 654478ns
>
> I0530 20:19:54.518506 11277 hierarchical.cpp:2095] Filtered offer with
>> cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0 for role * of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000
>
> I0530 20:19:54.518584 11277 hierarchical.cpp:1861] No allocations performed
>
> I0530 20:19:54.518616 11277 hierarchical.cpp:1951] No inverse offers to
>> send out!
>
> I0530 20:19:54.518662 11277 hierarchical.cpp:1445] Performed allocation
>> for 1 agents in 568919ns
>
> I0530 20:19:55.520282 11276 hierarchical.cpp:2095] Filtered offer with
>> cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0 for role * of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000
>
> I0530 20:19:55.520364 11276 hierarchical.cpp:1861] No allocations performed
>
> I0530 20:19:55.520409 11276 hierarchical.cpp:1951] No inverse offers to
>> send out!
>
> I0530 20:19:55.520463 11276 hierarchical.cpp:1445] Performed allocation
>> for 1 agents in 621097ns
>
> I0530 20:19:56.521884 11271 hierarchical.cpp:2095] Filtered offer with
>> cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0 for role * of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000
>
> I0530 20:19:56.521966 11271 hierarchical.cpp:1861] No allocations performed
>
> I0530 20:19:56.522001 11271 hierarchical.cpp:1951] No inverse offers to
>> send out!
>
> I0530 20:19:56.522047 11271 hierarchical.cpp:1445] Performed allocation
>> for 1 agents in 597601ns
>
> I0530 20:19:57.523320 11267 hierarchical.cpp:2095] Filtered offer with
>> cpus(*):1.9; mem(*):992; disk(*):992; ports(*):[31000-32000] on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0 for role * of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000
>
> I0530 20:19:57.523398 11267 hierarchical.cpp:1861] No allocations performed
>
> I0530 20:19:57.523432 11267 hierarchical.cpp:1951] No inverse offers to
>> send out!
>
> I0530 20:19:57.523478 11267 hierarchical.cpp:1445] Performed allocation
>> for 1 agents in 573802ns
>
> I0530 20:19:57.641808 11358 exec.cpp:162] Version: 1.4.0
>
> I0530 20:19:57.648437 11268 slave.cpp:3809] Got registration for executor
>> '55b387b7-c1f1-47d5-9f19-2d676cd4ef2e' of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000 from executor(1)@
>> 192.99.40.208:43797
>
> I0530 20:19:57.650589 11279 slave.cpp:2526] Sending queued task
>> '55b387b7-c1f1-47d5-9f19-2d676cd4ef2e' to executor
>> '55b387b7-c1f1-47d5-9f19-2d676cd4ef2e' of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000 at executor(1)@
>> 192.99.40.208:43797
>
> I0530 20:19:57.650804 11356 exec.cpp:237] Executor registered on agent
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-S0
>
> Received SUBSCRIBED event
>
> Subscribed executor on gru1.hw.ca1.mesosphere.com
>
> Received LAUNCH event
>
> Starting task 55b387b7-c1f1-47d5-9f19-2d676cd4ef2e
>
> Running '/home/gaston/mesos-custom/build/src/mesos-containerizer launch
>> <POSSIBLY-SENSITIVE-DATA>'
>
> Forked command at 11361
>
> I0530 20:19:57.672251 11273 slave.cpp:4279] Handling status update
>> TASK_RUNNING (UUID: b2e0d7b6-0f15-41d6-ae59-7c915a639672) for task
>> 55b387b7-c1f1-47d5-9f19-2d676cd4ef2e of framework
>> 7c74b451-3e64-4c95-8570-ad3b5b60962b-0000 from executor(1)@
>> 192.99.40.208:43797
>
>
The line saying "Forked command at <pid>" is logged by the command executor
before creating a checker, since it is missing from these logs, I believe
that the failure is not related to checks.

-Gastón

Reply via email to