On Mon, Dec 11, 2017 at 7:34 PM, Benjamin Mahler <[email protected]> wrote:

> 1) My guess is that we only added the errors because we alerted on there
> being an error increase. I assume you also care about every error? Having a
> 'success' count and 'total' count sounds reasonable to me.
>

Thanks, I filed https://issues.apache.org/jira/browse/MESOS-8324

>
> 2) Not sure, have you read the code? What would you want to be the case?
> Would you need them split apart or would you want all container launches
> (nested and non-nested) to be included?
>

It seems like `container_launch_errors` is only incremented in
Slave::executorLaunched, so standalone container errors are not tracked by
this metric either.

>
> On Fri, Dec 1, 2017 at 9:34 AM, Zhitao Li <[email protected]> wrote:
>
> > Hi,
> >
> > We are working on defining containerizer SLA metrics in our cluster. I
> > found that Mesos agent only provides a "slave/container_launch_errors"
> > counter right now.
> >
> > A couple of questions:
> >
> > 1) is there a "container launch success/count" on the agent which can be
> > used as the comparative baseline of the above metric? If not, should we
> > add one?
> > 2) does the `errors` metric above also cover nested and standalone
> > containers?
> >
> > Thanks!
> >
> > --
> > Cheers,
> >
> > Zhitao Li
> >
>

--
Cheers,

Zhitao Li
