Hi Wally

Could you please check the following items:

1. Check that you have set the global setting “Endpoint url (endpoint.url)” to the
management server IP address.
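If the setting needs updating, it can also be done from the CloudMonkey (cmk) CLI instead of the UI. A sketch, assuming cmk is configured against your management server; replace 10.0.0.10 with your actual management server IP:

```shell
# Hypothetical example: point endpoint.url at the management server
# (10.0.0.10 is a placeholder -- use your management server's IP)
cmk update configuration name=endpoint.url value=http://10.0.0.10:8080/client/api
```

Restart is not usually required for this setting, but verify the new value afterwards with `cmk list configurations name=endpoint.url`.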

2. Log in to the control node and check that the nodes are in Ready state and
the pods are in Running state.

cloud@k1-control-18d9cc9f10d:~$ sudo su -
root@k1-control-18d9cc9f10d:~#
Execute the kubectl commands:

root@k1-control-18d9cc9f10d:~# cd /opt/cloud/bin

Make sure the nodes are in Ready state:

root@k1-control-18d9cc9f10d:/opt/cloud/bin# kubectl get nodes
NAME                     STATUS   ROLES           AGE     VERSION
k1-control-18d9cc9f10d   Ready    control-plane   8m37s   v1.28.4
k1-node-18d9cca4fad      Ready    <none>          8m22s   v1.28.4

Make sure all pods are in Running state:

root@k1-control-18d9cc9f10d:/opt/cloud/bin# kubectl get pods --all-namespaces
NAMESPACE              NAME                                             READY   STATUS    RESTARTS       AGE
kube-system            cloud-controller-manager-574bcb86c-tz9cj         1/1     Running   0              8m36s
kube-system            coredns-5dd5756b68-245tn                         1/1     Running   0              9m12s
kube-system            coredns-5dd5756b68-jplbr                         1/1     Running   0              9m12s
kube-system            etcd-k1-control-18d9cc9f10d                      1/1     Running   0              9m19s
kube-system            kube-apiserver-k1-control-18d9cc9f10d            1/1     Running   0              9m15s
kube-system            kube-controller-manager-k1-control-18d9cc9f10d   1/1     Running   0              9m15s
kube-system            kube-proxy-4qq2h                                 1/1     Running   0              9m5s
kube-system            kube-proxy-jfq7k                                 1/1     Running   0              9m12s
kube-system            kube-scheduler-k1-control-18d9cc9f10d            1/1     Running   0              9m19s
kube-system            weave-net-77lcj                                  2/2     Running   1 (9m9s ago)   9m12s
kube-system            weave-net-k8cnk                                  2/2     Running   0              9m5s
kubernetes-dashboard   dashboard-metrics-scraper-5657497c4c-gmt6m       1/1     Running   0              9m12s
kubernetes-dashboard   kubernetes-dashboard-5b749d9495-vth8j            1/1     Running   0              9m12s


3. Please provide the following logs so we can investigate at which step it is failing:

cat /var/log/daemon.log

cat /var/log/messages
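To narrow things down before attaching the full files, grepping for obvious failures may help. A sketch, using the same log paths as above:

```shell
# Show the most recent error/failure lines from each log (paths from step 3)
grep -iE 'error|fail' /var/log/daemon.log | tail -n 50
grep -iE 'error|fail' /var/log/messages | tail -n 50
```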

Regards
Kiran


From: Wally B <wvbauman...@gmail.com>
Date: Thursday, 15 February 2024 at 4:35 AM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Kubernetes Clusters Failing to Start 4.19.0
Hello Everyone!

We are currently attempting to deploy k8s clusters and are running into
issues with the deployment.


Current CS Environment:

CloudStack Version: 4.19.0 (same issue before we upgraded from 4.18.1).
Hypervisor Type: Ubuntu 20.04.03 KVM
Attempted K8s Bins: 1.23.3, 1.27.3



======== ISSUE =========

For some reason, when we attempt the cluster provisioning, all of the VMs
start up and SSH keys are installed, but then on at least one, sometimes two,
of the VMs (control and/or worker) we get:

[FAILED] Failed to start deploy-kube-system.service.
[FAILED] Failed to start Execute cloud user/final scripts.

The CloudStack UI just says:
Create Kubernetes cluster test-cluster in progress
for about an hour (I assume this is the 3600-second timeout) and then
fails.

In the user's event log it stays on:
INFO KUBERNETES.CLUSTER.CREATE
Scheduled
Creating Kubernetes cluster. Cluster Id: XXX



I can SSH into the VMs with their assigned private keys. I attempted to run
the deploy-kube-system script, but it just says "already provisioned!" I'm not
sure how I would run "Execute cloud user/final scripts". If I stop the
cluster and start it again, nothing seems to change.



Any help would be appreciated, I can provide any details as they are needed!

Thanks!
Wally

 
