Re: Catching kill due to oom

Ben Parees Mon, 03 Jul 2017 11:42:40 -0700

In this case the container being killed is probably the assemble container,
which unfortunately is not even part of the build pod.  It's a container
that the build pod's container launches manually via direct access to the
docker socket.  It is subject to the same cgroup constraints as the build
pod.  A frequent issue is therefore running something like maven as part of
your assemble script and having it try to use more memory than the cgroup
allows, because maven sees the entire host memory as available.  Our s2i
images now try to configure maven more appropriately to avoid that.
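
To make the maven problem concrete, here is a minimal sketch of sizing the
JVM heap from the cgroup limit rather than from host memory.  This is not
what the s2i images actually do; the cgroup v1 path, the 50% ratio, the
"unlimited" threshold and the function names are all assumptions made for
illustration.

```python
# Assumed cgroup v1 location of the container memory limit.
CGROUP_LIMIT_FILE = "/sys/fs/cgroup/memory/memory.limit_in_bytes"

# Assumption: anything this large means "no limit was set" on the cgroup.
UNLIMITED_THRESHOLD = 1 << 62


def read_cgroup_limit(path=CGROUP_LIMIT_FILE):
    """Return the cgroup memory limit in bytes, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def maven_heap_opts(limit_bytes, ratio=0.5):
    """Build a -Xmx flag that uses a fraction of the cgroup limit.

    The ratio is an illustrative default, not a tuned value.
    Returns an empty string when no usable limit is known.
    """
    if limit_bytes is None or limit_bytes <= 0:
        return ""
    if limit_bytes >= UNLIMITED_THRESHOLD:
        return ""  # effectively unlimited: leave the JVM defaults alone
    heap_mb = int(limit_bytes * ratio) // (1024 * 1024)
    if heap_mb < 1:
        return ""
    return "-Xmx%dm" % heap_mb


if __name__ == "__main__":
    # An assemble script could export this value as MAVEN_OPTS before
    # invoking mvn.
    print(maven_heap_opts(read_cgroup_limit()))
```

With a 512 MiB limit this yields -Xmx256m, which leaves the other half of
the cgroup for non-heap JVM memory and the rest of the assemble script.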


However, if the container is being OOM killed by the system itself (vs. the
process inside the container hitting an OOM and failing), I'm not sure what
options we have to report that back on the build.  Perhaps there is a way
for us to retrieve that information from the terminated container (as
apparently k8s does for the pod-managed containers).  Can you open an issue
against origin and we'll track it there?
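
One possible way to get at that information, sketched under assumptions:
docker records an OOMKilled flag in the State section of `docker inspect`
output, so whatever launches the assemble container could check it once the
container has exited.  The helper names and message text below are
hypothetical, and this is not how origin works today.

```python
import json
import subprocess


def oom_killed(inspect_json):
    """Return True if `docker inspect` output reports State.OOMKilled."""
    data = json.loads(inspect_json)
    # docker inspect prints a JSON array with one object per container.
    if isinstance(data, list):
        if not data:
            return False
        data = data[0]
    return bool(data.get("State", {}).get("OOMKilled", False))


def inspect_container(container_id):
    """Run `docker inspect` for one container and return its raw JSON."""
    return subprocess.check_output(["docker", "inspect", container_id]).decode()


def build_failure_message(container_id):
    """Hypothetical helper: turn the flag into a line for the build log."""
    if oom_killed(inspect_container(container_id)):
        return "container %s was killed by the kernel OOM killer" % container_id
    return "container %s exited without an OOM kill" % container_id
```

Only oom_killed() is pure; the other two helpers need access to the docker
socket, which the build pod's container already has.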



On Mon, Jul 3, 2017 at 11:20 AM, Seth Jennings <sjenn...@redhat.com> wrote:

> Hey Andrew,  It is true that we don't generate a pod-level event when
> a container in the pod is OOM killed.   There is a container status in
> the pod status that indicates the OOM, with
> status.containerStatuses[].state.terminated.reason set to OOMKilled.
>
> status:
> ...
>   containerStatuses:
>   - containerID: docker://f2389dccd11a6575aeccbc12d360bc02eb0d2cf67c0f8d439fda57637e916628
> ...
>     state:
>       terminated:
>         containerID: docker://f2389dccd11a6575aeccbc12d360bc02eb0d2cf67c0f8d439fda57637e916628
>         exitCode: 1
>         finishedAt: 2017-07-03T15:08:40Z
>         reason: OOMKilled
>         startedAt: 2017-07-03T15:08:40Z
>
> Since builds have restartPolicy: Never, the status isn't changed on
> a restart, and you can see this on the Pods tab in the Status
> column in the web console.
>
> Thanks,
> Seth
>
>
>
>
>
> On Sun, Jul 2, 2017 at 7:23 PM, Andrew Lau <and...@andrewklau.com> wrote:
> > Hi,
> >
> > I'm often seeing issues where builds are getting killed due to oom. I'm
> > hoping to get some ideas on ways we could perhaps catch the OOM for the
> > purpose of displaying some sort of useful message.
> >
> > Based on what I am seeing, a SIGKILL is being sent to the container, so it's
> > not possible to catch anything like a SIGTERM from within the container to
> > at least display an error message in the logs. Users are often left confused
> > wondering why their build suddenly died.
> >
> > It's also not currently possible to configure the memory limit for the
> > buildconfig in the web console.
> >
> > _______________________________________________
> > users mailing list
> > users@lists.openshift.redhat.com
> > http://lists.openshift.redhat.com/openshiftmm/listinfo/users
> >
>



-- 
Ben Parees | OpenShift
