Trevor,

I responded to your email with the logs, but that message is still pending moderator approval before it can be posted to the list (it exceeded the size cap). While a moderator makes up his or her mind, I thought I'd repeat some of my observations:
E1006 11:39:34.041543       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:39:34.041641       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
W1006 11:40:12.831667       1 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1.ConfigMap ended with: too old resource version: 17351 (19676)
W1006 11:40:30.881153       1 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1.ConfigMap ended with: too old resource version: 18133 (19758)
E1006 11:45:13.003476       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:45:13.003613       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
W1006 11:46:25.850256       1 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1.ConfigMap ended with: too old resource version: 19478 (21246)
E1006 11:46:26.883137       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:46:26.883217       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
W1006 11:47:00.837769       1 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1.ConfigMap ended with: too old resource version: 19826 (21404)
W1006 11:47:03.833901       1 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1.Deployment ended with: too old resource version: 13636 (14162)
E1006 11:47:28.977942       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:47:28.978052       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
E1006 11:47:29.001738       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:47:29.001853       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
E1006 11:47:29.031826       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:47:29.031924       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []
E1006 11:47:29.455882       1 status.go:71] RouteSyncProgressing FailedHost route is not available at canonical host []
E1006 11:47:29.456081       1 controller.go:129] {Console Console} failed with: route is not available at canonical host []

This is a default install with three masters. Previously, I had done an install with one master, and even though that install was hanging, the console route had been defined (and so, I imagine, an A record would have been set up in DNS). Currently, I see an A record for the master, but not for the console. I'm wondering whether a mistake in my DNS setup is contributing to the problem.

As of right now (probably another 30 minutes past the install timing out), I still see the following messages in the cluster-version operator log:

I1006 12:31:16.462865       1 sync_worker.go:745] Update error 294 of 432: ClusterOperatorNotAvailable Cluster operator console has not yet reported success (*errors.errorString: cluster operator console is not done; it is available=false, progressing=true, degraded=false)
I1006 12:31:16.462876       1 sync_worker.go:745] Update error 135 of 432: ClusterOperatorNotAvailable Cluster operator authentication is still updating (*errors.errorString: cluster operator authentication is still updating)
I1006 12:31:16.462884       1 sync_worker.go:745] Update error 260 of 432: ClusterOperatorNotAvailable Cluster operator monitoring is still updating (*errors.errorString: cluster operator monitoring is still updating)
I1006 12:31:16.462890       1 sync_worker.go:745] Update error 183 of 432: ClusterOperatorNotAvailable Cluster operator ingress is still updating (*errors.errorString: cluster operator ingress is still updating)
I1006 12:31:16.462897       1 sync_worker.go:745] Update error 170 of 432: ClusterOperatorNotAvailable Cluster operator image-registry is still updating (*errors.errorString: cluster operator image-registry is still updating)
E1006 12:31:16.462944       1 sync_worker.go:311] unable to synchronize image (waiting 2m52.525702462s): Some cluster operators are still updating: authentication, console, image-registry, ingress, monitoring

Regards,
Marvin

On Sat, Oct 5, 2019 at 6:22 PM W. Trevor King <wk...@redhat.com> wrote:
> On Sat, Oct 5, 2019 at 11:22 AM Just Marvin wrote:
> > INFO Destroying the bootstrap resources...
> > INFO Waiting up to 30m0s for the cluster at https://api.one.discworld.a.random.domain:6443 to initialize...
> > FATAL failed to initialize the cluster: Working towards 4.2.0-0.nightly-2019-10-01-210901: 99% complete
> > ...
> > How do I track down what went wrong? And at this point, is it just a matter of waiting for a while? Suppose I let it go for a few hours, will there be a way to see if the initialization did complete?
>
> Might be. You can launch additional waiters with 'openshift-install wait-for install-complete'. You can also see exactly what the cluster-version operator is stuck on by looking in the cluster-version operator pod's logs. Or you can inspect the ClusterOperator resources and see if any of the core operators has more-specific complaints. Or you can 'oc adm must-gather' to get a tarball of OpenShift components to send to us if you'd rather have us poke around.
>
> Cheers,
> Trevor
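For the record, here is roughly how I have been poking at this, combining Trevor's suggestions with a DNS check. The domain below is the placeholder from my install log, and I'm assuming the console hostname follows the usual console-openshift-console.apps.<base-domain> convention; adjust for your own environment:

```shell
# Status of the core operators; 'describe' shows the console operator's
# specific complaints (RouteSyncProgressing, FailedHost, etc.)
oc get clusteroperators
oc describe clusteroperator console

# What the cluster-version operator is currently stuck on
oc logs -n openshift-cluster-version deployment/cluster-version-operator | tail -n 50

# Does the console route exist, and does it have a host assigned?
oc get route console -n openshift-console

# Does the console hostname resolve? (assumes the usual *.apps wildcard;
# "one.discworld.a.random.domain" is the placeholder domain from my logs)
dig +short console-openshift-console.apps.one.discworld.a.random.domain

# Launch another waiter, per Trevor's suggestion
openshift-install wait-for install-complete

# Collect a support tarball to send to the list
oc adm must-gather
```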
_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users