Re: On adding a debug endpoint for Mesos containerizer

2019-06-07 Thread Benno Evers
I agree, this looks pretty nice. Maybe we can elevate it to libprocess
itself if it proves useful.

On Thu, Jun 6, 2019 at 1:07 AM James Peach wrote:

> I really like this proposal and I think that it would help operational
> teams a lot. Let’s make sure that it is well documented :)
>
> > On Jun 5, 2019, at 1:05 AM, Andrei Budnik wrote:
> >
> > Hi folks,
> >
> > We have been running into stuck containers for quite a long time.
> Some of these issues are caused by external components such as CNI/CSI
> plugins, custom Mesos modules, etc. There were also cases where a container
> became stuck due to a Linux kernel bug. Because the causes vary so widely,
> stuck containers are difficult to debug.
> >
> > We are proposing a container debug endpoint for the Mesos agent [1],
> which is based on a new mechanism for tracking pending libprocess futures
> [2].
> >
> > Please review both of them.
> >
> > [1] Container debug endpoint:
> https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
> > [2] Tracking libprocess futures:
> https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY
>
>

-- 
Benno Evers
Software Engineer, Mesosphere


Re: On adding a debug endpoint for Mesos containerizer

2019-06-05 Thread James Peach
I really like this proposal and I think that it would help operational teams a
lot. Let’s make sure that it is well documented :)

> On Jun 5, 2019, at 1:05 AM, Andrei Budnik wrote:
> 
> Hi folks,
> 
> We have been running into stuck containers for quite a long time. Some
> of these issues are caused by external components such as CNI/CSI plugins,
> custom Mesos modules, etc. There were also cases where a container became
> stuck due to a Linux kernel bug. Because the causes vary so widely, stuck
> containers are difficult to debug.
> 
> We are proposing a container debug endpoint for the Mesos agent [1], which is 
> based on a new mechanism for tracking pending libprocess futures [2].
> 
> Please review both of them.
> 
> [1] Container debug endpoint: 
> https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
> [2] Tracking libprocess futures: 
> https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY



On adding a debug endpoint for Mesos containerizer

2019-06-04 Thread Andrei Budnik
Hi folks,

We have been running into stuck containers for quite a long time.
Some of these issues are caused by external components such as CNI/CSI
plugins, custom Mesos modules, etc. There were also cases where a container
became stuck due to a Linux kernel bug. Because the causes vary so widely,
stuck containers are difficult to debug.

We are proposing a container debug endpoint for the Mesos agent [1], which
is based on a new mechanism for tracking pending libprocess futures [2].

Please review both of them.

[1] Container debug endpoint:
https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
[2] Tracking libprocess futures:
https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY
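
For anyone skimming the thread, the general idea of tracking pending futures
can be illustrated roughly as below. This is only a sketch under assumptions,
not the design from [1] or [2]: it uses Python's concurrent.futures rather
than libprocess, and the PendingTracker class, its method names, and the
description strings are all hypothetical.

```python
import itertools
import threading
from concurrent.futures import Future


class PendingTracker:
    """Hypothetical registry of futures that have not completed yet."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = itertools.count()
        self._pending = {}  # tracking id -> human-readable description

    def track(self, future, description):
        """Record the future as pending until it completes or is cancelled."""
        key = next(self._ids)
        with self._lock:
            self._pending[key] = description
        # The callback runs when the future finishes (immediately if it
        # already has), so finished work drops out of the registry.
        future.add_done_callback(lambda _: self._forget(key))
        return future

    def _forget(self, key):
        with self._lock:
            self._pending.pop(key, None)

    def snapshot(self):
        """Return descriptions of everything still pending."""
        with self._lock:
            return sorted(self._pending.values())


tracker = PendingTracker()
# Hypothetical operations for one container; the names are made up.
cni_attach = tracker.track(Future(), "container c1: CNI plugin attach")
fetch = tracker.track(Future(), "container c1: fetch image")
fetch.set_result(None)  # this step finished; the CNI step is still stuck

print(tracker.snapshot())  # ['container c1: CNI plugin attach']
```

A debug endpoint could then return such a snapshot, so an operator can see
which step a stuck container is waiting on.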