I started my cluster with the kubeadm init command and joined two nodes to it. For testing purposes I wanted an nginx server inside a container, just to test the high availability of the cluster. I created a replication controller YAML file that creates a pod with 3 replicas, each running an nginx container. With a service YAML file I created a service so that I only need to browse to one IP address to reach the nginx server.
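For context, the manifests look roughly like this (a sketch, not my exact files; the names and labels are placeholders):

```yaml
# Illustrative ReplicationController: 3 replicas of an nginx container.
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-rc
spec:
  replicas: 3
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
# Illustrative Service: one stable IP in front of the nginx pods.
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
```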
So far so good: I can see the pods and which nodes they are running on with the kubectl get pods -o wide command. But I am struggling with a high-availability problem. Right now I am testing inside virtual machines until my Raspberry Pis arrive, but the goal is a highly available Raspberry Pi cluster.

When I kill one node (power off one VM), it takes 5 minutes before Kubernetes notices that the node is down. It reschedules the pods correctly, but 5 minutes is too long, especially in production on a Raspberry Pi cluster. Online (https://stackoverflow.com/a/47317892/6351302) I found an answer explaining how to change the default detection time from 5 minutes to a value I can choose, with the following flags:

```
--node-monitor-period=2s (default 5s)
--node-monitor-grace-period=16s (default 40s)
--pod-eviction-timeout=30s (default 5m)
```

Because I try to do everything in YAML files to automate the entire process, I looked for the kube-controller-manager manifest and found it at /etc/kubernetes/manifests/kube-controller-manager.yaml. When I add those flags under spec.containers.command, my entire cluster freezes:

- I cannot delete previous replication controllers.
- When I try to delete pods, Kubernetes won't create new ones.
- When I create a new replication controller, the pods are not created.

I thought OK, maybe the new parameters are not picked up yet and I have to restart the kube-controller-manager service, but there isn't one (it runs as a static pod managed by the kubelet). I rebooted my VM and it still does not use the new parameters.
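For reference, this is roughly how I added the flags to the manifest (abbreviated sketch; all the other flags kubeadm generated are left in place):

```yaml
# Excerpt of /etc/kubernetes/manifests/kube-controller-manager.yaml
# with the three extra flags appended to the command list.
spec:
  containers:
  - name: kube-controller-manager
    image: gcr.io/google_containers/kube-controller-manager-amd64:v1.9.3
    command:
    - kube-controller-manager
    - --leader-elect=true
    # ...other kubeadm-generated flags kept as-is...
    - --node-monitor-period=2s
    - --node-monitor-grace-period=16s
    - --pod-eviction-timeout=30s
```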
With the command journalctl -xeu kubelet I got this error:

```
Feb 28 11:56:36 kubeMaster kubelet[448]: I0228 11:56:36.984643  448 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kubemaster_kube-system(e410a4a570860693452a4d5d29069eba)
Feb 28 11:56:36 kubeMaster kubelet[448]: E0228 11:56:36.984671  448 pod_workers.go:186] Error syncing pod e410a4a570860693452a4d5d29069eba ("kube-controller-manager-kubemaster_kube-system(e410a4a570860693452a4d5d29069eba)"), skipping: failed to "StartContainer
Feb 28 11:56:48 kubeMaster kubelet[448]: I0228 11:56:48.987153  448 kuberuntime_manager.go:514] Container {Name:kube-controller-manager Image:gcr.io/google_containers/kube-controller-manager-amd64:v1.9.3 Command:[kube-controller-manager --leader-elect=true --u
```
