I upgraded Go's build system to GKE 1.4.6 the other day. I'm now finding Docker dying and the GKE nodes going unhealthy (their kubelet.service crash looping on boot, looking for Docker).

$ kubectl get nodes
NAME                                       STATUS     AGE
gke-buildlets-default-pool-a4a240f8-0pf0   NotReady   10h
gke-buildlets-default-pool-a4a240f8-7uwv   NotReady   10h
gke-buildlets-default-pool-a4a240f8-spbd   NotReady   10h

..
Ready   Unknown   Wed, 23 Nov 2016 10:12:50 +0000   Wed, 23 Nov 2016 10:13:35 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
....

If I ssh into one of the nodes and run sudo journalctl -f, I see kubelet.service repeatedly crash looping with:

Nov 23 16:24:22 gke-buildlets-default-pool-a4a240f8-7uwv systemd[602]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Nov 23 16:24:22 gke-buildlets-default-pool-a4a240f8-7uwv systemd[602]: kubelet.service: Unit entered failed state.
Nov 23 16:24:22 gke-buildlets-default-pool-a4a240f8-7uwv systemd[602]: kubelet.service: Failed with result 'exit-code'.
Nov 23 16:24:22 gke-buildlets-default-pool-a4a240f8-7uwv kubelet[9136]: error: failed to run Kubelet: failed to create kubelet: failed to get runtime version: docker: failed to get docker version: Cannot connect to the Docker daemon. Is the docker daemon running on this host?

And yup, Docker is dead:

# journalctl -u docker.service --since="10 hours ago"
....
....
Nov 23 11:40:28 gke-buildlets-default-pool-a4a240f8-7uwv docker[18440]: time="2016-11-23T11:40:28.546703577Z" level=warning msg="container 464da48e2869e518e7bc5e052ced1e25b8ce70ba15168d1d34599b825dc61519 restart canceled"
Nov 23 11:40:28 gke-buildlets-default-pool-a4a240f8-7uwv docker[18440]: time="2016-11-23T11:40:28.766556953Z" level=warning msg="container d1f1425cfe29a34783668a39bb50ad5db3f218a7d0caae12d8ba81387776b708 restart canceled"
*Nov 23 11:40:33 gke-buildlets-default-pool-a4a240f8-7uwv docker[18440]: time="2016-11-23T11:40:33.516533433Z" level=error msg="Force shutdown daemon"*
*Nov 23 11:40:33 gke-buildlets-default-pool-a4a240f8-7uwv docker[18440]: time="2016-11-23T11:40:33Z" level=info msg="stopping containerd after receiving terminated"*
*Nov 23 11:40:33 gke-buildlets-default-pool-a4a240f8-7uwv docker[18440]: time="2016-11-23T11:40:33Z" level=fatal msg="containerd: serve grpc" error="accept unix /var/run/docker/libcontainerd/docker-containerd.sock: use of closed network connection"*
Nov 23 11:40:34 gke-buildlets-default-pool-a4a240f8-7uwv sh[4983]: + [ ! -s /var/lib/docker/repositories-overlay ]
Nov 23 11:40:34 gke-buildlets-default-pool-a4a240f8-7uwv sh[4983]: + rm -f /var/lib/docker/repositories-overlay
Nov 23 11:40:59 gke-buildlets-default-pool-a4a240f8-7uwv sh[5113]: + [ ! -s /var/lib/docker/repositories-overlay ]
Nov 23 11:40:59 gke-buildlets-default-pool-a4a240f8-7uwv sh[5113]: + rm -f /var/lib/docker/repositories-overlay
....
Nov 23 11:47:35 gke-buildlets-default-pool-a4a240f8-7uwv sh[15220]: + [ ! -s /var/lib/docker/repositories-overlay ]
Nov 23 11:47:35 gke-buildlets-default-pool-a4a240f8-7uwv sh[15220]: + rm -f /var/lib/docker/repositories-overlay
Nov 23 11:47:36 gke-buildlets-default-pool-a4a240f8-7uwv docker[15226]: time="2016-11-23T11:47:36.494367508Z" level=fatal msg=*"Error starting daemon: Error initializing network controller: Error creating default \"bridge\" network: failed to allocate gateway (169.254.123.1): Address already in use"*
Nov 23 11:47:36 gke-buildlets-default-pool-a4a240f8-7uwv sh[15264]: + [ ! -s /var/lib/docker/repositories-overlay ]
Nov 23 11:47:36 gke-buildlets-default-pool-a4a240f8-7uwv sh[15264]: + rm -f /var/lib/docker/repositories-overlay
Nov 23 11:47:37 gke-buildlets-default-pool-a4a240f8-7uwv docker[15270]: time="2016-11-23T11:47:37.607995725Z" level=fatal msg="Error starting daemon: Error initializing network controller: Error creating default \"bridge\" network: failed to allocate gateway (169.254.123.1): Address already in use"

And I can't restart docker:

# systemctl start docker.service
Nov 23 16:53:55 gke-buildlets-default-pool-a4a240f8-7uwv sh[15731]: + [ ! -s /var/lib/docker/repositories-overlay ]
Nov 23 16:53:55 gke-buildlets-default-pool-a4a240f8-7uwv sh[15731]: + rm -f /var/lib/docker/repositories-overlay
Nov 23 16:53:56 gke-buildlets-default-pool-a4a240f8-7uwv docker[15737]: time="2016-11-23T16:53:56.413732453Z" level=fatal msg="Error starting daemon: Error initializing network controller: Error creating default \"bridge\" network: failed to allocate gateway (169.254.123.1): Address already in use"
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.

# docker --version
Docker version 1.11.2, build 4dc5990

Any clues? Is this a known issue?
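
Since the daemon's fatal error is "failed to allocate gateway (169.254.123.1): Address already in use", one check that might narrow things down is whether some interface on the node still holds that address when Docker tries to start. Below is a minimal sketch, assuming iproute2's ip command is available on the node. The function name find_addr_owner is made up for illustration and is not part of Docker, Kubernetes, or GKE tooling. It reads `ip -o -4 addr show` output on stdin, so it can also be run against saved output.

```shell
#!/bin/sh
# find_addr_owner (hypothetical helper): print the name of every interface
# that holds the given IPv4 address.
# Expects `ip -o -4 addr show` output on stdin. In that one-line-per-address
# format, field 2 is the interface name and field 4 is ADDRESS/PREFIXLEN.
find_addr_owner() {
    awk -v want="$1" '{
        split($4, parts, "/")          # strip the /prefix from the address
        if (parts[1] == want) print $2
    }'
}

# Intended use on a node (not executed here):
#   ip -o -4 addr show | find_addr_owner 169.254.123.1
```

If this prints an interface name, that interface is a candidate for what the bridge allocation is colliding with; if it prints nothing, the conflict is presumably somewhere else, and this check does not rule anything out by itself.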