Re: Node startup Failure on SDN

Jonathan Yu Mon, 15 Aug 2016 23:51:13 -0700

On Aug 15, 2016 11:08, "Skarbek, John" <john.skar...@ca.com> wrote:
>
> So I figured it out. NTP went kaboom on one of our master nodes.
>
> ERROR: [DCli0015 from diagnostic ConfigContexts@openshift/origin/pkg/diagnostics/client/config_contexts.go:285]
> For client config context 'default/cluster:8443/system:admin':
> The server URL is 'https://cluster:8443'
> The user authentication is 'system:admin/cluster:8443'
> The current project is 'default'
> (*url.Error) Get https://cluster:8443/api: x509: certificate has expired or is not yet valid
> Diagnostics does not have an explanation for what this means.
> Please report this error so one can be added.
>
> I ended up finding that the master node clock just…. I have no idea:
>
> [/etc/origin/master]# date
> Wed Feb 14 12:23:13 UTC 2001
>
> I’d like to suggest that diagnostics check the validity dates of all the
> certificates, perhaps do some sort of NTP check, and maybe even go the
> extra mile and compare the time on the server to …life. I have no idea why
> my master node decided to go back to Valentine's Day in 2001. I think I was
> single way back when.


Good idea. At a minimum, it seems worthwhile to record the build date of
the binary and check the system clock against it: a clock that reads
earlier than the build date is certainly wrong. I think Chrome does
something similar, so figuring out how Chrome handles this might be a
reasonable starting point.
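
To make the idea concrete, here is a rough sketch in Python (for illustration only; Origin itself is written in Go and I have not looked at how diagnostics would wire this in). The `BUILD_TIME` value, `check_clock`, and `parse_openssl_time` are all made-up names and values for this example. It flags a clock that is earlier than the build date, and also compares the clock against a certificate's validity dates as printed by `openssl x509 -dates`.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical build timestamp; a real binary would embed its own at build time.
BUILD_TIME = datetime(2016, 8, 1, tzinfo=timezone.utc)


def parse_openssl_time(text):
    """Parse a date as printed by openssl, e.g. 'Mar 21 15:18:10 2018 GMT'."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)


def check_clock(now, build_time, not_before=None, not_after=None):
    """Return a list of problems found; an empty list means nothing obviously wrong."""
    problems = []
    if now < build_time:
        # The binary cannot run before it was built, so the clock must be wrong.
        problems.append(
            f"system clock ({now.isoformat()}) is earlier than the build date "
            f"({build_time.isoformat()}); check NTP"
        )
    if not_before is not None and now < not_before:
        problems.append("certificate is not yet valid according to the system clock")
    if not_after is not None and now > not_after:
        problems.append("certificate has expired according to the system clock")
    return problems


if __name__ == "__main__":
    # Clock value and notAfter date taken from the messages quoted in this thread.
    broken_clock = datetime(2001, 2, 14, 12, 23, 13, tzinfo=timezone.utc)
    not_after = parse_openssl_time("Mar 21 15:18:10 2018 GMT")
    for problem in check_clock(broken_clock, BUILD_TIME, not_after=not_after):
        print(problem)
```

With the 2001 clock from this thread, the build-date check is the one that fires, which points at the clock rather than at the certificate.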
>
>
>
> --
> John Skarbek
>
> On August 15, 2016 at 13:32:13, Skarbek, John (john.skar...@ca.com) wrote:
>>
>> It would appear the certificate is valid until 2018:
>>
>> [/etc/origin/node]# openssl x509 -enddate -in system:node:node-001.crt
>> notAfter=Mar 21 15:18:10 2018 GMT
>>
>> Got any other ideas?
>>
>>
>>
>> --
>> John Skarbek
>>
>> On August 15, 2016 at 13:27:57, Clayton Coleman (ccole...@redhat.com) wrote:
>>>
>>> The node's client certificate may have expired; that's a common failure
>>> mode.
>>>
>>> On Aug 15, 2016, at 1:23 PM, Skarbek, John <john.skar...@ca.com> wrote:
>>>
>>>> Good Morning,
>>>>
>>>> We recently had a node go down. Upon trying to get it back online, the
>>>> origin-node service fails to start. The rest of the cluster appears to be
>>>> just fine. To troubleshoot, what can I look at to determine the root
>>>> cause of the following error?
>>>>
>>>> Aug 15 17:12:59 node-001 origin-node[14536]: E0815 17:12:59.469682 14536 common.go:194] Failed to obtain ClusterNetwork: the server has asked for the client to provide credentials (get clusterNetworks default)
>>>> Aug 15 17:12:59 node-001 origin-node[14536]: F0815 17:12:59.469705 14536 node.go:310] error: SDN node startup failed: the server has asked for the client to provide credentials (get clusterNetworks default)
>>>>
>>>>
>>>>
>>>> --
>>>> John Skarbek
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
