kiranchavala opened a new issue, #12783:
URL: https://github.com/apache/cloudstack/issues/12783

   ### problem
   
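   A CKS cluster goes into Alert state after one of its worker instances is unmanaged. The cluster then cannot be deleted or stopped, and the instance cannot be added back to the cluster after it is re-imported.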
   
   ### versions
   
   ACS 4.22, KVM hypervisor
   
   ### The steps to reproduce the bug
   
   1. Create a K8s cluster with 1 controller and 3 workers
   
   2. Make sure the CKS cluster is in Running state
   
   3. Navigate to one of the worker instances and unmanage it
   
   <img width="1619" height="108" alt="Image" src="https://github.com/user-attachments/assets/51ec4106-f2e9-4a4c-8431-e5723876eb7c" />
   
   
   4. The CKS cluster went into Alert state
   
   
   <img width="1634" height="190" alt="Image" src="https://github.com/user-attachments/assets/f501ad43-12be-44e2-b176-c962a959b0f0" />
   
   5. Delete the CKS cluster. The following error is returned:
   
   Cannot invoke "com.cloud.vm.VMInstanceVO.getBackupOfferingId()" because "vm" is null
   
   <img width="1603" height="321" alt="Image" src="https://github.com/user-attachments/assets/76fd2e61-90ad-49fc-963b-fc69f2a9d1b4" />
   
   logs 
   
   ```
   2026-03-10 11:39:56,869 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-55:[ctx-53050b87, job-73]) (logid:755e27ea) Unexpected exception while executing org.apache.cloudstack.api.command.user.kubernetes.cluster.DeleteKubernetesClusterCmd
   java.lang.NullPointerException: Cannot invoke "com.cloud.vm.VMInstanceVO.getBackupOfferingId()" because "vm" is null
           at com.cloud.kubernetes.cluster.KubernetesClusterManagerImpl.checkIfVmsAssociatedWithBackupOffering(KubernetesClusterManagerImpl.java:2018)
   
   ```
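
The stack trace above suggests that the backup-offering check dereferences a VM lookup result that is null once the instance has been unmanaged. Below is a minimal sketch of a null-tolerant version of such a check. `VmRecord`, `BackupOfferingCheck`, and the map-based lookup are hypothetical stand-ins, not the actual `VMInstanceVO` or `KubernetesClusterManagerImpl` code, and the real fix may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a VM row; not the real com.cloud.vm.VMInstanceVO.
class VmRecord {
    final long id;
    final Long backupOfferingId; // null when no backup offering is assigned

    VmRecord(long id, Long backupOfferingId) {
        this.id = id;
        this.backupOfferingId = backupOfferingId;
    }
}

class BackupOfferingCheck {
    // Returns the IDs of cluster VMs that still have a backup offering.
    // IDs with no matching VM row (for example, an instance that was
    // unmanaged) are collected in missingVmIds instead of being dereferenced.
    static List<Long> vmsWithBackupOffering(List<Long> clusterVmIds,
                                            Map<Long, VmRecord> vmsById,
                                            List<Long> missingVmIds) {
        List<Long> result = new ArrayList<>();
        for (Long vmId : clusterVmIds) {
            VmRecord vm = vmsById.get(vmId); // assumed lookup; may return null
            if (vm == null) {
                missingVmIds.add(vmId);
                continue;
            }
            if (vm.backupOfferingId != null) {
                result.add(vm.id);
            }
        }
        return result;
    }
}
```

With a guard like this, the caller could decide whether a missing VM should block the delete or only be logged, rather than failing with a `NullPointerException`.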
       
   6. Stop the CKS cluster. The following error is returned:
   
   Failed to find all VMs in Kubernetes cluster : 
   
   <img width="1626" height="348" alt="Image" src="https://github.com/user-attachments/assets/6adb347b-bd82-4318-953a-3dade688ef77" />
   
   
   logs 
   
   ```
   2026-03-10 11:40:47,436 ERROR [c.c.k.c.a.KubernetesClusterStopWorker] (API-Job-Executor-56:[ctx-37106f8f, job-74, ctx-f1370fcc]) (logid:eeffc52f) Failed to find all VMs in Kubernetes cluster : test
   
   ```
   
   7. Import the unmanaged instance back using the API, selecting the same CKS network
   
   
https://cloudstack.apache.org/api/apidocs-4.22/apis/importUnmanagedInstance.html
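
For illustration, the import call in step 7 could be assembled roughly as follows. This is a sketch that only builds the unsigned query string. The parameter names (`clusterid`, `name`, `serviceofferingid`, and the `nicnetworklist` map with `nic` and `network` keys) are given here as assumptions to be checked against the API page linked above, and all argument values are placeholders. Request signing and the HTTP call are left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class ImportRequestSketch {
    // Builds the unsigned query string for an importUnmanagedInstance call.
    // Parameter names are assumptions; verify them against the API docs.
    static String buildQuery(String clusterId, String name, String serviceOfferingId,
                             String nicId, String networkId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", "importUnmanagedInstance");
        params.put("clusterid", clusterId);
        params.put("name", name);
        params.put("serviceofferingid", serviceOfferingId);
        // Map the instance NIC to the same network the CKS cluster uses.
        params.put("nicnetworklist[0].nic", nicId);
        params.put("nicnetworklist[0].network", networkId);
        params.put("response", "json");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // Only values are URL-encoded; keys are fixed literals.
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }
}
```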
   
      
   8. Add the imported worker node instance back to the CKS cluster. The operation fails
   
   <img width="1450" height="563" alt="Image" src="https://github.com/user-attachments/assets/4310496c-544e-49c1-a691-0d06ccac4f21" />
   
   logs 
   
   ```
   2026-03-10 12:52:22,913 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-61:[ctx-3eab1e04, job-81]) (logid:fba281a3) Complete async job-81, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed to add nodes to cluster ID 1 due to: No valid nodes found to be added to the Kubernetes cluster"}
   ```
   
   <img width="1615" height="308" alt="Image" src="https://github.com/user-attachments/assets/3184cc41-7348-48ae-a5de-94682f018274" />
   
   Other related issues about CKS clusters going into Alert state:
   
   https://github.com/apache/cloudstack/issues/12641
   https://github.com/apache/cloudstack/issues/12633
   https://github.com/apache/cloudstack/issues/11581
   
   
   
   ### What to do about it?
   
   CloudStack should not allow an instance to be unmanaged if it is part of a CKS cluster,
   
   or a CloudStack CKS cluster should support the addition of an unmanaged instance.
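
The first option could take the form of a pre-check that rejects the unmanage request. The sketch below is only an illustration of that idea: `UnmanageGuard` and the `clusterNameByVmId` map are hypothetical and stand in for whatever lookup CloudStack uses to map instances to CKS clusters.

```java
import java.util.Map;

class UnmanageGuard {
    // Throws if the VM is still mapped to a CKS cluster. clusterNameByVmId is a
    // hypothetical stand-in for a lookup of the cluster-to-VM mapping.
    static void checkCanUnmanage(long vmId, Map<Long, String> clusterNameByVmId) {
        String clusterName = clusterNameByVmId.get(vmId);
        if (clusterName != null) {
            throw new IllegalStateException(
                "Instance " + vmId + " is part of Kubernetes cluster " + clusterName
                + " and cannot be unmanaged");
        }
    }
}
```

Rejecting the request up front would keep the cluster out of Alert state, at the cost of requiring the user to remove the node from the cluster before unmanaging it.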

