Eric, do you have the means to test a PR? If so, please have a look at [10175](https://github.com/apache/cloudstack/pull/10175/).
On 2025/01/10 08:54:30 Daan Hoogland wrote:
> Makes sense Eric, I'll reapply this old PR on 4.19
>
> On Fri, Jan 10, 2025 at 9:31 AM Eric Green <eric.lee.gr...@gmail.com> wrote:
>
> > I don’t know that it’s acceptable as a known bug for 4.19 unless the
> > workaround that I did for my cluster (whack last_host_id in the database if
> > it points at removed hosts) is documented.
> >
> > But honestly, any bug that requires manually whacking things in the
> > database as a workaround probably needs to be fixed. Especially if it is in
> > the n-1 release. I, and many other people using Cloudstack in production,
> > don’t update to the very latest and greatest release, we stick back on the
> > n-1 release, which has had more testing and bug fixing. Usually.
> >
> > > On Jan 10, 2025, at 12:22 AM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> > >
> > > Wei, Eric,
> > >
> > > This sounds like we must reapply for 4.19, and make sure it doesn't get
> > > pulled forward. The 4.20 code for this class has been heavily refactored.
> > > We could also accept this as a known bug for 4.19?
> > >
> > > On Fri, Jan 10, 2025 at 8:47 AM Eric Green <eric.lee.gr...@gmail.com> wrote:
> > >
> > >> Thanks, Wei. The log definitely stated that ‘considerlasthost’ was
> > >> getting defaulted to true (and honestly that should be configurable because
> > >> I really want the planner to start up the vm for users on whatever host has
> > >> the most excess capacity at the moment), so I was confused
> > >> when Kiran claimed it was not. That’s not what the log said! (“This VM has
> > >> last host_id specified, trying to choose the same host: 62”).
> > >>
> > >> So it appears that this is a bug that was fixed in 4.18.x, was supposedly
> > >> pulled into 4.19.x, but somehow did not actually get pulled into 4.19.x.
> > >> Since 4.20.x does not appear to have the bug, the solution appears to be to
> > >> upgrade to 4.20.x before I delete another host 😊.
> > >> (I do not plan to delete
> > >> any hosts anytime soon, I just got all the old hosts retired).
> > >>
> > >> I resolved the issue in my own Cloudstack deployment by whacking
> > >> last_host_id in the database to a current host if it pointed at a retired
> > >> host. That allowed my users to start virtual machines last started on the
> > >> retired hosts without involving an administrator. But it does appear that I
> > >> happened upon a bug here, from what you say. I presume that since you know
> > >> the exact cause, you will file an actual bug in Github that this fix has
> > >> not been integrated to 4.19.x branch?
> > >>
> > >> From: Wei ZHOU <ustcweiz...@gmail.com>
> > >> Date: Thursday, January 9, 2025 at 11:27 PM
> > >> To: users@cloudstack.apache.org <users@cloudstack.apache.org>, Daan
> > >> Hoogland <daan.hoogl...@shapeblue.com>, suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
> > >> Subject: Re: A possible bug in instance start for end users when a host is deleted
> > >>
> > >> Hi Kiran,
> > >>
> > >> For backward compatibility, `considerlasthost` defaults to true if it
> > >> is not set.
> > >>
> > >> The issue seems to be fixed in 4.19.1.0 by
> > >> https://github.com/apache/cloudstack/pull/9037
> > >>
> > >> However, I cannot find the changes in the code base of 4.19 branch.
> > >>
> > >> https://github.com/apache/cloudstack/blob/4.19/server/src/main/java/com/cloud/deploy/DeploymentPlanningManagerImpl.java#L449-L453
> > >>
> > >> It looks like something went wrong during the merge of the 4.18 to 4.19
> > >> branch.
> > >> cc @Daan Hoogland <daan.hoogl...@shapeblue.com>
> > >> @suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
> > >>
> > >> -Wei
> > >>
> > >> On Fri, Jan 10, 2025 at 7:58 AM Kiran Chavala <kiran.chav...@shapeblue.com> wrote:
> > >>
> > >>> Hi Eric
> > >>>
> > >>> I am not hitting the issue on the latest 4.20 release of cloudstack
> > >>>
> > >>> For a normal end user, it doesn’t consider the last host to start the
> > >>> virtual machine
> > >>>
> > >>> the startVirtualMachine Api call has the parameter “considerlasthost” for
> > >>> the admin user
> > >>>
> > >>> https://cloudstack.apache.org/api/apidocs-4.20/apis/startVirtualMachine.html
> > >>>
> > >>> For a normal user there is no parameter “considerlasthost”, the
> > >>> deployment planner picks up the available host
> > >>>
> > >>> Normal user
> > >>>
> > >>> (localcloud2) 🐱 > start virtualmachine
> > >>> bootintosetup= clusterid= filter= hostid= id= podid=
> > >>>
> > >>> Regards
> > >>> Kiran
> > >>>
> > >>> From: Eric Green <eric.lee.gr...@gmail.com>
> > >>> Date: Friday, 10 January 2025 at 4:05 AM
> > >>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>
> > >>> Subject: A possible bug in instance start for end users when a host is deleted
> > >>>
> > >>> This appears to me to be a bug. Is it a known bug? I tried searching the
> > >>> Github bug list but Github's search function is awful.
> > >>>
> > >>> Cloudstack 4.19.1.3, Ubuntu 20.04 management server, Ubuntu 22.04 compute
> > >>> hosts, KVM hypervisor.
> > >>>
> > >>> Scenario: There are four physical hosts, A, B, C, D. Let's say they're
> > >>> host ID 62,63,64,65 in the hosts table. Each host has at least one virtual
> > >>> machine running on it.
> > >>>
> > >>> Shut down the virtual machines on host A (without starting them elsewhere)
> > >>> which sets the last_host_id to 62, and delete host A in the Infrastructure
> > >>> tab.
> > >>>
> > >>> As an end user (not an admin), try to start one of the virtual machines
> > >>> that were formerly on host A.
> > >>>
> > >>> End result: The deployment planner returns the last_host_id of host A as
> > >>> the host to start the VM on. Since that host is marked deleted, the
> > >>> deployment executor removes it from the list of available hosts to start
> > >>> the VM on, and then throws an exception. The exception then appears on the
> > >>> GUI as a red box that basically gives the end user zero useful information.
> > >>>
> > >>> Expected result: The same as for admin users: the deployment planner
> > >>> notices that host A is deleted and cannot be used to deploy the VM, and
> > >>> instead picks a host from [B,C,D].
> > >>>
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding pods to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding clusters to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding hosts to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) DeploymentPlanner allocation algorithm: null
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Trying to allocate a host and storage pools from dc:1, pod:null,cluster:null, requested cpu: 4000, requested ram: (4.00 GB) 4294967296
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Is ROOT volume READY (pool already allocated)?: Yes
> > >>> 2025-01-09 13:28:35,466 DEBUG [c.c.a.ApiServer] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) CIDRs from which account 'Account [{"accountName":"eric.green","id":8,"uuid":"e8baeda0-90a5-4c97-aa51-0adbfa3c0e4c"}]' is allowed to perform API calls: 0.0.0.0/0,::/0
> > >>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy avoids pods: [], clusters: [], hosts: []
> > >>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy hosts with priorities {} , hosts have NORMAL priority by default
> > >>> 2025-01-09 13:28:35,467 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) This VM has last host_id specified, trying to choose the same host: 62
> > >>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.a.StaticRoleBasedAPIAccessChecker] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) RoleService is enabled. We will use it instead of StaticRoleBasedAPIAccessChecker.
> > >>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.r.ApiRateLimitServiceImpl] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) API rate limiting is disabled. We will not use ApiRateLimitService.
> > >>> 2025-01-09 13:28:35,473 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-81:ctx-4b9a2500 job-418252) (logid:086f0ba0) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
> > >>> java.lang.NullPointerException
> > >>>     at com.cloud.host.dao.HostDaoImpl.loadHostTags(HostDaoImpl.java:879)
> > >>>     at jdk.internal.reflect.GeneratedMethodAccessor336.invoke(Unknown Source)
> > >>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> > >>>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
> > >>>     at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
> > >>>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
> > >>>     at com.sun.proxy.$Proxy96.loadHostTags(Unknown Source)
> > >>>     at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:451)
> > >>>     at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:206)
> > >>>     at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:202)
> > >>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5445)
> > >>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5299)
> > >>>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3224)
> > >>>     at jdk.internal.reflect.GeneratedMethodAccessor787.invoke(Unknown Source)
> > >>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> > >>>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
> > >>>     at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>     at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
> > >>>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
> > >>>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
> > >>>     at com.sun.proxy.$Proxy186.startVirtualMachine(Unknown Source)
> > >>>     at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:181)
> > >>>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:172)
> > >>>     at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:112)
> > >>>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:654)
> > >>>     at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
> > >>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
> > >>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
> > >>>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
> > >>>     at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
> > >>>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:602)
> > >>>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> > >>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > >>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > >>>     at java.base/java.lang.Thread.run(Thread.java:829)
> > >
> > > --
> > > Daan
>
> --
> Daan
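
For anyone testing the fix, the expected result from Eric's original report (skip a deleted last host and pick one of the remaining hosts) can be sketched roughly as below. This is only an illustration in Python, not the CloudStack code: `Host`, `pick_host`, `free_cpu_mhz` and the "most spare CPU" fallback are hypothetical names and choices made up for the example, and the real planner is more involved.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Host:
    # Hypothetical stand-in for a row in the hosts table.
    host_id: int
    free_cpu_mhz: int
    removed: bool = False  # True once the host has been deleted


def pick_host(hosts: Dict[int, Host], last_host_id: Optional[int],
              consider_last_host: bool = True) -> Optional[Host]:
    """Return a host to start the VM on, or None if no host is usable."""
    if consider_last_host and last_host_id is not None:
        last = hosts.get(last_host_id)
        # Reuse the last host only if it still exists and is not removed.
        if last is not None and not last.removed:
            return last
    # Otherwise fall back to the non-removed host with the most spare CPU
    # (an assumed policy, chosen only to make the example concrete).
    usable = [h for h in hosts.values() if not h.removed]
    return max(usable, key=lambda h: h.free_cpu_mhz, default=None)


# Scenario from the report: host 62 (A) was deleted, 63-65 (B, C, D) remain.
hosts = {
    62: Host(62, 9000, removed=True),
    63: Host(63, 1000),
    64: Host(64, 5000),
    65: Host(65, 3000),
}
print(pick_host(hosts, last_host_id=62).host_id)  # prints 64
```

The point of the check on `removed` is that a stale last_host_id should lead to a fallback rather than to an exception for the end user.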