On Aug 15, 2016 11:08, "Skarbek, John" <john.skar...@ca.com> wrote:
>
> So I figured it out. NTP went kaboom on one of our master nodes.
>
> ERROR: [DCli0015 from diagnostic ConfigContexts@openshift/origin/pkg/diagnostics/client/config_contexts.go:285]
>        For client config context 'default/cluster:8443/system:admin':
>        The server URL is 'https://cluster:8443'
>        The user authentication is 'system:admin/cluster:8443'
>        The current project is 'default'
>        (*url.Error) Get https://cluster:8443/api: x509: certificate has expired or is not yet valid
>        Diagnostics does not have an explanation for what this means. Please report this error so one can be added.
>
> I ended up finding that the master node clock just…. I have no idea:
>
> [/etc/origin/master]# date
> Wed Feb 14 12:23:13 UTC 2001
>
> I’d like to suggest that diagnostics check the date and time of all the certificates, perhaps do some sort of NTP check, and maybe even go the extra mile and compare the time on the server to …life. I have no idea why my master node decided to go back to Valentine’s Day in 2001. I think I was single way back when.

Good idea. At minimum it seems like a good idea to record the build date for the binary and check against that. I think Chrome does something similar - perhaps figuring out how Chrome handles this is a reasonable starting point.

>
> --
> John Skarbek
>
> On August 15, 2016 at 13:32:13, Skarbek, John (john.skar...@ca.com) wrote:
>>
>> It would appear the certificate is valid until 2018:
>>
>> [/etc/origin/node]# openssl x509 -enddate -in system:node:node-001.crt
>> notAfter=Mar 21 15:18:10 2018 GMT
>>
>> Got any other ideas?
>>
>> --
>> John Skarbek
>>
>> On August 15, 2016 at 13:27:57, Clayton Coleman (ccole...@redhat.com) wrote:
>>>
>>> The node's client certificate may have expired - that's a common failure mode.
>>>
>>> On Aug 15, 2016, at 1:23 PM, Skarbek, John <john.skar...@ca.com> wrote:
>>>
>>>> Good Morning,
>>>>
>>>> We recently had a node go down. Upon trying to get it back online, the origin-node service fails to start. The rest of the cluster appears to be just fine, so with the desire to troubleshoot, what can I look at to determine the root cause of the following error:
>>>>
>>>> Aug 15 17:12:59 node-001 origin-node[14536]: E0815 17:12:59.469682 14536 common.go:194] Failed to obtain ClusterNetwork: the server has asked for the client to provide credentials (get clusterNetworks default)
>>>> Aug 15 17:12:59 node-001 origin-node[14536]: F0815 17:12:59.469705 14536 node.go:310] error: SDN node startup failed: the server has asked for the client to provide credentials (get clusterNetworks default)
>>>>
>>>> --
>>>> John Skarbek
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
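
For illustration, the build-date idea from the reply above could look roughly like the sketch below. It is written in Python rather than the Go used by the diagnostics code, and it is not how OpenShift diagnostics actually works. The name `check_clock`, the `BUILD_DATE` value, the one-day slack, and the certificate's `not_before` date are all assumptions made up for the example; only the clock reading (Feb 14 2001) and the `notAfter` date (Mar 21 2018) come from the thread.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical build timestamp; a real binary would have this stamped in at build time.
BUILD_DATE = datetime(2016, 8, 1, tzinfo=timezone.utc)


def check_clock(now, build_date, not_before, not_after, slack=timedelta(days=1)):
    """Return a list of findings about the clock and one certificate's validity window.

    All arguments are timezone-aware datetimes. An empty list means nothing looked wrong.
    """
    findings = []

    # A clock that reads earlier than the moment the binary was built cannot be right.
    if now < build_date - slack:
        findings.append(
            f"system clock reads {now:%Y-%m-%d}, which is before the build date "
            f"{build_date:%Y-%m-%d}; the clock is probably wrong"
        )

    # Distinguish the two cases that x509 reports with a single message.
    if now < not_before:
        findings.append(f"certificate is not yet valid (valid from {not_before:%Y-%m-%d})")
    elif now > not_after:
        findings.append(f"certificate has expired (valid until {not_after:%Y-%m-%d})")

    return findings


if __name__ == "__main__":
    # Clock reading and notAfter are taken from the thread; not_before is assumed.
    master_clock = datetime(2001, 2, 14, 12, 23, 13, tzinfo=timezone.utc)
    not_before = datetime(2016, 3, 21, 15, 18, 10, tzinfo=timezone.utc)  # assumption
    not_after = datetime(2018, 3, 21, 15, 18, 10, tzinfo=timezone.utc)

    for finding in check_clock(master_clock, BUILD_DATE, not_before, not_after):
        print("WARN:", finding)
```

With the clock reading from the thread, this would flag both the implausible clock and the "not yet valid" certificate, which points at the clock rather than at the certificate itself.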