I don’t know that it’s acceptable as a known bug for 4.19 unless the workaround that I did for my cluster (whack last_host_id in the database if it points at removed hosts) is documented.
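As a rough illustration of that workaround, the sketch below models it against an in-memory SQLite database rather than a real CloudStack database. The table and column names (`vm_instance.last_host_id`, `host.removed`) are assumed to mirror the CloudStack schema, and the helper names `repoint_stale_last_host` and `build_demo_db` are hypothetical. Treat it as a sketch only: check your own schema and take a backup before running anything similar in production.

```python
import sqlite3


def repoint_stale_last_host(conn, replacement_host_id):
    """Point last_host_id at replacement_host_id wherever it references a removed host.

    Returns the number of VM rows that were changed.
    """
    cur = conn.execute(
        """
        UPDATE vm_instance
           SET last_host_id = ?
         WHERE removed IS NULL
           AND last_host_id IN (SELECT id FROM host WHERE removed IS NOT NULL)
        """,
        (replacement_host_id,),
    )
    conn.commit()
    return cur.rowcount


def build_demo_db():
    """Create a tiny stand-in for the two tables involved (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT, removed TEXT);
        CREATE TABLE vm_instance (
            id INTEGER PRIMARY KEY, name TEXT, last_host_id INTEGER, removed TEXT
        );
        -- Host A (62) has been deleted; B, C and D are still live.
        INSERT INTO host VALUES
            (62, 'A', '2025-01-09 12:00:00'),
            (63, 'B', NULL), (64, 'C', NULL), (65, 'D', NULL);
        INSERT INTO vm_instance VALUES
            (1, 'vm-last-on-a', 62, NULL),
            (2, 'vm-last-on-b', 63, NULL);
        """
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    changed = repoint_stale_last_host(demo, 63)
    rows = demo.execute(
        "SELECT id, last_host_id FROM vm_instance ORDER BY id"
    ).fetchall()
    print(changed, rows)
```

Only rows whose `last_host_id` points at a host with a non-NULL `removed` timestamp are touched, so VMs last run on live hosts keep their placement hint.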
But honestly, any bug that requires manually whacking things in the database as a workaround probably needs to be fixed, especially if it is in the n-1 release. I, and many other people using Cloudstack in production, don’t update to the very latest and greatest release; we stay back on the n-1 release, which has had more testing and bug fixing. Usually.

> On Jan 10, 2025, at 12:22 AM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
>
> Wei, Eric,
>
> This sounds like we must reapply the fix for 4.19, and make sure it doesn't get
> pulled forward. The 4.20 code for this class has been heavily refactored.
> We could also accept this as a known bug for 4.19?
>
> On Fri, Jan 10, 2025 at 8:47 AM Eric Green <eric.lee.gr...@gmail.com> wrote:
>
>> Thanks, Wei. The log definitely stated that ‘considerlasthost’ was
>> getting defaulted to true (and honestly that should be configurable because
>> I really want the planner to start up the vm for users on whatever host has
>> the most excess capacity at the moment), so I was confused
>> when Kiran claimed it was not. That’s not what the log said! (“This VM has
>> last host_id specified, trying to choose the same host: 62”).
>>
>> So it appears that this is a bug that was fixed in 4.18.x, was supposedly
>> pulled into 4.19.x, but somehow did not actually get pulled into 4.19.x.
>> Since 4.20.x does not appear to have the bug, the solution appears to be to
>> upgrade to 4.20.x before I delete another host 😊. (I do not plan to delete
>> any hosts anytime soon; I just got all the old hosts retired.)
>>
>> I resolved the issue in my own Cloudstack deployment by whacking
>> last_host_id in the database to a current host if it pointed at a retired
>> host. That allowed my users to start virtual machines last started on the
>> retired hosts without involving an administrator. But it does appear that I
>> happened upon a bug here, from what you say. I presume that since you know
>> the exact cause, you will file an actual bug in Github that this fix has
>> not been integrated into the 4.19.x branch?
>>
>> From: Wei ZHOU <ustcweiz...@gmail.com>
>> Date: Thursday, January 9, 2025 at 11:27 PM
>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>, Daan Hoogland <daan.hoogl...@shapeblue.com>, suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
>> Subject: Re: A possible bug in instance start for end users when a host is deleted
>> Hi Kiran,
>>
>> For backward compatibility, `considerlasthost` defaults to true if it
>> is not set.
>>
>> The issue seems to be fixed in 4.19.1.0 by
>> https://github.com/apache/cloudstack/pull/9037
>>
>> However, I cannot find the changes in the code base of the 4.19 branch.
>>
>> https://github.com/apache/cloudstack/blob/4.19/server/src/main/java/com/cloud/deploy/DeploymentPlanningManagerImpl.java#L449-L453
>>
>> It looks like something went wrong during the merge of the 4.18 branch
>> into the 4.19 branch.
>> cc @Daan Hoogland <daan.hoogl...@shapeblue.com>
>> @suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
>>
>>
>> -Wei
>>
>> On Fri, Jan 10, 2025 at 7:58 AM Kiran Chavala <kiran.chav...@shapeblue.com>
>> wrote:
>>
>>> Hi Eric
>>>
>>> I am not hitting the issue on the latest 4.20 release of cloudstack
>>>
>>> For a normal end user, it doesn’t consider the last host to start the
>>> virtual machine
>>>
>>> The startVirtualMachine API call has the parameter “considerlasthost” for
>>> the admin user
>>>
>>> https://cloudstack.apache.org/api/apidocs-4.20/apis/startVirtualMachine.html
>>>
>>> For a normal user there is no “considerlasthost” parameter; the
>>> deployment planner picks an available host
>>>
>>> Normal user
>>>
>>> (localcloud2) 🐱 > start virtualmachine
>>> bootintosetup= clusterid= filter= hostid= id=
>>> podid=
>>>
>>>
>>> Regards
>>> Kiran
>>>
>>> From: Eric Green <eric.lee.gr...@gmail.com>
>>> Date: Friday, 10 January 2025 at 4:05 AM
>>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>
>>> Subject: A possible bug in instance start for end users when a host is deleted
>>> This appears to me to be a bug. Is it a known bug? I tried searching the
>>> Github bug list but Github's search function is awful.
>>>
>>> Cloudstack 4.19.1.3, Ubuntu 20.04 management server, Ubuntu 22.04 compute
>>> hosts, KVM hypervisor.
>>>
>>> Scenario: There are four physical hosts, A, B, C, D. Let's say they're
>>> host ID 62,63,64,65 in the hosts table. Each host has at least one virtual
>>> machine running on it.
>>>
>>> Shut down the virtual machines on host A (without starting them elsewhere),
>>> which sets the last_host_id to 62, and delete host A in the Infrastructure
>>> tab.
>>>
>>> As an end user (not an admin), try to start one of the virtual machines
>>> that were formerly on host A.
>>>
>>> End result: The deployment planner returns the last_host_id of host A as
>>> the host to start the VM on. Since that host is marked deleted, the
>>> deployment executor removes it from the list of available hosts to start
>>> the VM on, and then throws an exception. The exception then appears on the
>>> GUI as a red box that basically gives the end user zero useful information.
>>>
>>> Expected result: The same as for admin users: the deployment planner
>>> notices that host A is deleted and cannot be used to deploy the VM, and
>>> instead picks a host from [B,C,D].
>>>
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding pods to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding clusters to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding hosts to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) DeploymentPlanner allocation algorithm: null
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Trying to allocate a host and storage pools from dc:1, pod:null,cluster:null, requested cpu: 4000, requested ram: (4.00 GB) 4294967296
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Is ROOT volume READY (pool already allocated)?: Yes
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.a.ApiServer] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) CIDRs from which account 'Account [{"accountName":"eric.green","id":8,"uuid":"e8baeda0-90a5-4c97-aa51-0adbfa3c0e4c"}]' is allowed to perform API calls: 0.0.0.0/0,::/0
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy avoids pods: [], clusters: [], hosts: []
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy hosts with priorities {} , hosts have NORMAL priority by default
>>> 2025-01-09 13:28:35,467 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) This VM has last host_id specified, trying to choose the same host: 62
>>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.a.StaticRoleBasedAPIAccessChecker] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) RoleService is enabled. We will use it instead of StaticRoleBasedAPIAccessChecker.
>>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.r.ApiRateLimitServiceImpl] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) API rate limiting is disabled. We will not use ApiRateLimitService.
>>> 2025-01-09 13:28:35,473 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-81:ctx-4b9a2500 job-418252) (logid:086f0ba0) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
>>> java.lang.NullPointerException
>>>     at com.cloud.host.dao.HostDaoImpl.loadHostTags(HostDaoImpl.java:879)
>>>     at jdk.internal.reflect.GeneratedMethodAccessor336.invoke(Unknown Source)
>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>>>     at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
>>>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
>>>     at com.sun.proxy.$Proxy96.loadHostTags(Unknown Source)
>>>     at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:451)
>>>     at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:206)
>>>     at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:202)
>>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5445)
>>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5299)
>>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3224)
>>>     at jdk.internal.reflect.GeneratedMethodAccessor787.invoke(Unknown Source)
>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>>>     at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>     at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
>>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
>>>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
>>>     at com.sun.proxy.$Proxy186.startVirtualMachine(Unknown Source)
>>>     at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:181)
>>>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:172)
>>>     at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:112)
>>>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:654)
>>>     at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
>>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
>>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
>>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
>>>     at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
>>>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:602)
>>>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>     at java.base/java.lang.Thread.run(Thread.java:829)
>
> --
> Daan
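
The expected behaviour described in the original report (fall back to another host when the last host is gone, instead of dereferencing it) can be sketched roughly as below. This is a simplified Python model, not CloudStack's planner code: the `Host` class, the `choose_host` function and the most-free-CPU fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Host:
    id: int
    free_cpu_mhz: int
    removed: bool = False


def choose_host(
    last_host_id: Optional[int],
    hosts: Dict[int, Host],
    requested_cpu_mhz: int,
) -> Optional[Host]:
    """Prefer the last host when it is usable; otherwise pick the live host
    with the most free CPU. Returns None if no host has enough capacity."""
    last = hosts.get(last_host_id) if last_host_id is not None else None
    # Guard: a missing or removed last host is skipped, never dereferenced.
    if last is not None and not last.removed and last.free_cpu_mhz >= requested_cpu_mhz:
        return last
    candidates = [
        h for h in hosts.values()
        if not h.removed and h.free_cpu_mhz >= requested_cpu_mhz
    ]
    return max(candidates, key=lambda h: h.free_cpu_mhz, default=None)


if __name__ == "__main__":
    hosts = {
        62: Host(62, 16000, removed=True),  # host A, deleted
        63: Host(63, 8000),
        64: Host(64, 12000),
        65: Host(65, 2000),
    }
    picked = choose_host(62, hosts, requested_cpu_mhz=4000)
    print(picked.id if picked else None)
```

The same guard covers the case where the lookup for the removed host returns nothing at all, which is the situation the NullPointerException in the trace above suggests.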