Maybe an uncanny coincidence, but we think the cluster was created
almost EXACTLY 1 year before it failed.
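
To test that idea against any one certificate, here is a rough Python sketch (my own, not part of the OKD tooling). It takes the notBefore/notAfter dates in the form printed by `openssl x509 -noout -dates` and reports which side of the validity window a given time falls on. The function name `classify_validity` is made up for illustration, and the caller has to supply the dates.

```python
import ssl
import time


def classify_validity(not_before, not_after, now=None):
    """Return 'not yet valid', 'expired' or 'valid' for a validity window.

    not_before / not_after use the date format printed by
    `openssl x509 -noout -dates`, e.g. "Mar 31 12:00:00 2019 GMT".
    now is seconds since the epoch; it defaults to this host's clock,
    so a badly skewed clock changes the answer.
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not yet valid"
    if now > end:
        return "expired"
    return "valid"
```

Called with a hypothetical one-year window ending the day before the failure and a `now` taken from the log timestamps, it should report "expired". Leaving `now` unset uses the local clock, which is also a crude way to see whether a host with the wrong time would reject a certificate that is in fact fine.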
On 31/03/2020 16:17, Ben Holmes wrote:
Hi Tim,
Can you verify that the hosts' clocks are being synced correctly, as
per Simon's other suggestion?
Ben
On Tue, 31 Mar 2020 at 16:05, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
Hi Simon,
we've run those playbooks and all certs are reported as still
being valid.
Tim
On 31/03/2020 15:59, Simon Krenger wrote:
> Hi Tim,
>
> Note that there are multiple sets of certificates, both external and
> internal. So it would be worth checking the certificates again using
> the Certificate Expiration Playbooks (see link below). The
> documentation also has an overview of what can be done to renew
> certain certificates:
>
> - [ Redeploying Certificates ]
>   https://docs.okd.io/3.11/install_config/redeploying_certificates.html
>
> Apart from checking all certificates, I'd certainly review the time
> synchronisation for the whole cluster, as we see the message "x509:
> certificate has expired or is not yet valid".
>
> I hope this helps.
>
> Kind regards
> Simon
>
> On Tue, Mar 31, 2020 at 4:33 PM Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>> One of our OKD 3.11 clusters has suddenly stopped working without any
>> obvious reason.
>>
>> The origin-node service on the nodes does not start (times out).
>> The master-api pod is running on the master.
>> The nodes can access the master-api endpoints.
>>
>> The logs of the master-api pod look mostly OK other than a huge number
>> of warnings about certificates that don't really make sense, as the
>> certificates are valid (we use named certificates from Let's Encrypt;
>> they were renewed about 2 weeks ago and all appear to be correct).
>>
>> Examples of errors from the master-api pod are:
>>
>> I0331 12:46:57.065147 1 establishing_controller.go:73] Starting EstablishingController
>> I0331 12:46:57.065561 1 logs.go:49] http: TLS handshake error from 192.168.160.17:58024: EOF
>> I0331 12:46:57.071932 1 logs.go:49] http: TLS handshake error from 192.168.160.19:48102: EOF
>> I0331 12:46:57.072036 1 logs.go:49] http: TLS handshake error from 192.168.160.19:37178: EOF
>> I0331 12:46:57.072141 1 logs.go:49] http: TLS handshake error from 192.168.160.17:58022: EOF
>>
>> E0331 12:47:37.855023 1 memcache.go:147] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
>> E0331 12:47:37.856569 1 memcache.go:147] couldn't get resource list for servicecatalog.k8s.io/v1beta1: the server is currently unable to handle the request
>> E0331 12:47:44.115290 1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
>> E0331 12:47:44.118976 1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
>> E0331 12:47:44.122276 1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
>>
>> There are a huge number of errors of this second sort.
>>
>> Any ideas what is wrong?
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users@lists.openshift.redhat.com
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>
--
BENJAMIN HOLMES
SENIOR Solution ARCHITECT
Red Hat UKI Presales
bhol...@redhat.com M: 07876-885388