Maybe an uncanny coincidence, but we think the cluster was created almost EXACTLY 1 year before it failed.
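
If some internally generated certificate has a one-year lifetime, that timing would fit the "x509: certificate has expired or is not yet valid" errors. A rough sketch for checking a single certificate by hand, assuming you paste in the two lines printed by `openssl x509 -noout -dates -in <cert file>`. The dates in `sample` are hypothetical, not taken from our cluster, and the helper names are made up for illustration:

```python
import ssl
from datetime import datetime, timezone


def parse_openssl_dates(text):
    """Parse the notBefore=/notAfter= lines printed by `openssl x509 -noout -dates`."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            fields[key] = value

    def to_utc(cert_time):
        # ssl.cert_time_to_seconds understands the "Mar 31 12:00:00 2019 GMT" form.
        seconds = ssl.cert_time_to_seconds(cert_time)
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return to_utc(fields["notBefore"]), to_utc(fields["notAfter"])


def classify(not_before, not_after, now):
    """Say which half of "has expired or is not yet valid" applies, if either."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical dates, chosen only to illustrate a one-year certificate.
sample = "notBefore=Mar 31 12:00:00 2019 GMT\nnotAfter=Mar 30 12:00:00 2020 GMT\n"
not_before, not_after = parse_openssl_dates(sample)
now = datetime(2020, 3, 31, 12, 47, tzinfo=timezone.utc)
print(classify(not_before, not_after, now), "| lifetime:", not_after - not_before)
```

Passing `now` explicitly rather than reading the system clock means the result does not depend on whether the machine running the check has the right time.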

On 31/03/2020 16:17, Ben Holmes wrote:
Hi Tim,

Can you verify that the hosts' clocks are being synced correctly as per Simon's other suggestion?
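
One rough way to compare them, assuming you can collect the output of `date -u +%s` from every host at about the same moment. The host names, readings and tolerance below are hypothetical, and this is only a sketch, not a replacement for checking the time sync service itself:

```python
from statistics import median


def find_skewed_hosts(epoch_by_host, tolerance_seconds=5.0):
    """Return {host: offset} for hosts whose clock is far from the median reading."""
    reference = median(epoch_by_host.values())
    return {
        host: reading - reference
        for host, reading in epoch_by_host.items()
        if abs(reading - reference) > tolerance_seconds
    }


# Hypothetical readings (seconds since the epoch), one per host.
readings = {
    "master": 1585658817,
    "node-1": 1585658818,
    "node-2": 1585658816,
    "node-3": 1585662417,  # roughly one hour ahead of the others
}
print(find_skewed_hosts(readings))
```

The median is used as the reference so that a single badly wrong clock does not drag the baseline with it. The tolerance needs to allow for the time it takes to collect the readings.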

Ben

On Tue, 31 Mar 2020 at 16:05, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

    Hi Simon,

    we've run those playbooks and all certs are reported as still
    being valid.

    Tim

    On 31/03/2020 15:59, Simon Krenger wrote:
    > Hi Tim,
    >
    > Note that there are multiple sets of certificates, both external and
    > internal. So it would be worth checking the certificates again using
    > the Certificate Expiration Playbooks (see link below). The
    > documentation also has an overview of what can be done to renew
    > certain certificates:
    >
    > - [ Redeploying Certificates ]
    >   https://docs.okd.io/3.11/install_config/redeploying_certificates.html
    >
    > Apart from checking all certificates, I'd certainly review the time
    > synchronisation for the whole cluster, as we see the message "x509:
    > certificate has expired or is not yet valid".
    >
    > I hope this helps.
    >
    > Kind regards
    > Simon
    >
    > On Tue, Mar 31, 2020 at 4:33 PM Tim Dudgeon <tdudgeon...@gmail.com> wrote:
    >> One of our OKD 3.11 clusters has suddenly stopped working without any
    >> obvious reason.
    >>
    >> The origin-node service on the nodes does not start (times out).
    >> The master-api pod is running on the master.
    >> The nodes can access the master-api endpoints.
    >>
    >> The logs of the master-api pod look mostly OK other than a huge number
    >> of warnings about certificates that don't really make sense as the
    >> certificates are valid (we use named certificates from Let's Encrypt and
    >> they were renewed about 2 weeks ago and all appear to be correct).
    >>
    >> Examples of errors from the master-api pod are:
    >>
    >> I0331 12:46:57.065147       1 establishing_controller.go:73] Starting EstablishingController
    >> I0331 12:46:57.065561       1 logs.go:49] http: TLS handshake error from 192.168.160.17:58024: EOF
    >> I0331 12:46:57.071932       1 logs.go:49] http: TLS handshake error from 192.168.160.19:48102: EOF
    >> I0331 12:46:57.072036       1 logs.go:49] http: TLS handshake error from 192.168.160.19:37178: EOF
    >> I0331 12:46:57.072141       1 logs.go:49] http: TLS handshake error from 192.168.160.17:58022: EOF
    >>
    >> E0331 12:47:37.855023       1 memcache.go:147] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
    >> E0331 12:47:37.856569       1 memcache.go:147] couldn't get resource list for servicecatalog.k8s.io/v1beta1: the server is currently unable to handle the request
    >> E0331 12:47:44.115290       1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
    >> E0331 12:47:44.118976       1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
    >> E0331 12:47:44.122276       1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
    >>
    >> Huge number of this second sort.
    >>
    >> Any ideas what is wrong?
    >>
    >>
    >>
    >> _______________________________________________
    >> users mailing list
    >> users@lists.openshift.redhat.com
    >> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
    >
    >

--

BENJAMIN HOLMES

SENIOR Solution ARCHITECT

Red Hat UKI Presales <https://www.redhat.com/>

bhol...@redhat.com M: 07876-885388

