(google groups is taking days when I use my non-gmail email, so I'm sending via gmail again)
On Fri, Oct 27, 2017 at 6:17 PM, David Rosenstrauch <dar...@darose.net> wrote:
> I'm trying to make sure that as I'm deploying new services on our cluster,
> that failures/restarts get handled in a way that's most optimal for
> resiliency/uptime.
>
>
> I'm simplifying things a bit, but if a piece of code running inside a
> container crashes, there's more or less 2 possibilities: 1) bug in the code
> (and/or it's trying to process data that causes an error), or 2) problems

It can also be a "random" issue (like a network burp, etc.), or the container exceeding its memory limit and getting restarted (that happens when some event X is processed in the container and uses tons of memory), etc. Those cases will, most probably, work if the container is restarted.

> with the hardware/network (full disk, bad disk, network outage, etc.) If

As Tim said, a network outage (for example, the node <--> master network not working) is handled just fine. A full disk should also be handled fine, and has been for a few versions of Kubernetes now: the inodes and space used by containers are accounted for, so they can be reclaimed too.

> the issue is #1, then it doesn't matter whether you restart the container or
> the pod. But if the issue is #2, then restarting the pod (i.e., on another
> host) would fix the problem, while restarting the container probably
> wouldn't.
>
> So I guess this is sort of alluding to a bigger question, then: does k8s
> have any ability to detect if a host is having hardware problems and, if so,
> avoid scheduling new pods on it, move pods off of it if their containers are
> crashing, etc.

I know of https://github.com/kubernetes/node-problem-detector, which I think tries to solve exactly that, but I have not used it myself. So I guess the answer is "yes" :-)

Thanks,
Rodrigo