[ 
https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476944#comment-15476944
 ] 

Kevin Klues commented on MESOS-6118:
------------------------------------

Brian, those logs you posted appear to be incomplete. Some of them don't 
contain the ID of the mount point where the cycle occurred, and the last one 
seems to be cut off halfway through the printed line. Could you please 
double-check that all of the logs are included?

Unfortunately, I am not able to reproduce the bug locally either. Could you 
point me at some code that I could run to reproduce it?
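
For reference, here is a minimal sketch of how a parent cycle in a mount
table can be spotted outside the agent. It is not the Mesos implementation
in fs.cpp. The function names parse_mountinfo and find_cycle are
hypothetical, and the sketch assumes the /proc/[pid]/mountinfo layout
described in proc(5), where the first field is the mount ID and the second
is the parent ID. proc(5) also says the root of the mount tree may list
itself as its parent, so a self-reference is treated here as the end of the
chain rather than as a cycle.

```python
def parse_mountinfo(text):
    """Map mount ID -> parent ID from mountinfo-formatted text.

    Assumes fields 1 and 2 of each line are the mount ID and parent ID.
    """
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or truncated line
        try:
            parents[int(fields[0])] = int(fields[1])
        except ValueError:
            continue  # line does not start with two integer IDs
    return parents


def find_cycle(parents):
    """Return the mount IDs forming a parent cycle, or None if there is none."""
    for start in parents:
        seen = []
        current = start
        while current in parents:
            if parents[current] == current:
                break  # root entry listing itself as parent
            if current in seen:
                # Everything from the first repeat onward is the cycle.
                return seen[seen.index(current):]
            seen.append(current)
            current = parents[current]
    return None
```

Running find_cycle(parse_mountinfo(...)) over a copy of the agent host's
/proc/self/mountinfo, captured around the time of a crash, would show which
mount IDs are involved if the table really does contain a cycle.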

> Agent would crash with docker container tasks due to host mount table read.
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-6118
>                 URL: https://issues.apache.org/jira/browse/MESOS-6118
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>    Affects Versions: 1.0.1
>         Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
>  Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 
> UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Jamie Briant
>            Assignee: Kevin Klues
>            Priority: Critical
>              Labels: linux, slave
>             Fix For: 1.1.0, 1.0.2
>
>         Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, 
> cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few 
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, 
> the slave process crashes every few minutes (with systemd restarting it).
> The crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678  1232 
> fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: 
> ***
> Version 1.0.0 works without this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
