Eric,
Do you have the means to test a PR? If so, please have a look at
[10175](https://github.com/apache/cloudstack/pull/10175/)
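
For anyone following the thread below, here is a minimal sketch of the behaviour Eric describes as the expected result: reuse the last host only if it still exists and has not been removed, otherwise fall back to another host. This is not the code from the PR above or from PR 9037. Every name in it (`LastHostFallbackSketch`, `Host`, `pickHost`, `hostsById`) is hypothetical, and the fallback policy of "most free RAM" is only an assumption taken from Eric's stated preference.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only: these types are hypothetical stand-ins, not CloudStack classes.
final class LastHostFallbackSketch {

    // Hypothetical host model; "removed" stands for a host deleted in the Infrastructure tab.
    static final class Host {
        final long id;
        final boolean removed;
        final long freeRamBytes;

        Host(long id, boolean removed, long freeRamBytes) {
            this.id = id;
            this.removed = removed;
            this.freeRamBytes = freeRamBytes;
        }
    }

    // hostsById stands in for a lookup that may return null (or a removed host)
    // when last_host_id is stale.
    static Optional<Host> pickHost(Long lastHostId, Map<Long, Host> hostsById, long requestedRamBytes) {
        if (lastHostId != null) {
            Host last = hostsById.get(lastHostId);
            // Guard: never dereference or reuse a missing or removed last host.
            if (last != null && !last.removed && last.freeRamBytes >= requestedRamBytes) {
                return Optional.of(last);
            }
        }
        // Fallback (assumed policy): the usable host with the most free RAM.
        Host best = null;
        for (Host h : hostsById.values()) {
            if (h.removed || h.freeRamBytes < requestedRamBytes) {
                continue;
            }
            if (best == null || h.freeRamBytes > best.freeRamBytes) {
                best = h;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With hosts 62 to 65 as in Eric's scenario and host 62 marked removed, calling `pickHost(62L, hosts, requestedRam)` in this sketch skips host 62 and returns one of the remaining hosts instead of failing on the stale ID.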

On 2025/01/10 08:54:30 Daan Hoogland wrote:
> Makes sense Eric, I'll reapply this old PR on 4.19
> 
> On Fri, Jan 10, 2025 at 9:31 AM Eric Green <eric.lee.gr...@gmail.com> wrote:
> 
> > I don’t know that it’s acceptable as a known bug for 4.19 unless the
> > workaround that I did for my cluster (whack last_host_id in the database if
> > it points at removed hosts) is documented.
> >
> > But honestly, any bug that requires manually whacking things in the
> > database as a workaround probably needs to be fixed. Especially if it is in
> > the n-1 release. I, and many other people using Cloudstack in production,
> > don’t update to the very latest and greatest release, we stick back on the
> > n-1 release, which has had more testing and bug fixing. Usually.
> >
> > > On Jan 10, 2025, at 12:22 AM, Daan Hoogland <daan.hoogl...@gmail.com>
> > wrote:
> > >
> > > Wei, Eric,
> > >
> > > This sounds like we must reapply for 4.19, and make sure it doesn't get
> > > pulled forward. The 4.20 code for this class has been heavily
> > refactored.
> > > We could also accept this as a known bug for 4.19?
> > >
> > > On Fri, Jan 10, 2025 at 8:47 AM Eric Green <eric.lee.gr...@gmail.com>
> > wrote:
> > >
> > >> Thanks, Wei.  The log definitely stated that ‘considerlasthost’ was
> > >> getting defaulted to true (and honestly that should be configurable
> > because
> > >> I really want the planner to start up the vm for users on whatever host
> > has
> > >> the most excess capacity at the moment), so I was confused
> > >> when Kiran claimed it was not. That’s not what the log said! (“This VM
> > has
> > >> last host_id specified, trying to choose the same host: 62”).
> > >>
> > >> So it appears that this is a bug that was fixed in 4.18.x, was
> > supposedly
> > >> pulled into 4.19.x, but somehow did not actually get pulled into 4.19.x.
> > >> Since 4.20.x does not appear to have the bug, the solution appears to be
> > to
> > >> upgrade to 4.20.x before I delete another host 😊. (I do not plan to
> > delete
> > >> any hosts anytime soon, I just got all the old hosts retired).
> > >>
> > >> I resolved the issue in my own Cloudstack deployment by whacking
> > >> last_host_id in the database to a current host if it pointed at a
> > retired
> > >> host. That allowed my users to start virtual machines last started on
> > the
> > >> retired hosts without involving an administrator. But it does appear
> > that I
> > >> happened upon a bug here, from what you say. I presume that since you
> > know
> > >> the exact cause, you will file an actual bug in Github that this fix has
> > >> not been integrated to 4.19.x branch?
> > >>
> > >> From: Wei ZHOU <ustcweiz...@gmail.com>
> > >> Date: Thursday, January 9, 2025 at 11:27 PM
> > >> To: users@cloudstack.apache.org <users@cloudstack.apache.org>, Daan
> > >> Hoogland <daan.hoogl...@shapeblue.com>, suresh.anapa...@shapeblue.com <
> > >> suresh.anapa...@shapeblue.com>
> > >> Subject: Re: A possible bug in instance start for end users when a host
> > is
> > >> deleted
> > >> Hi Kiran,
> > >>
> > >> For backward compatibility, `considerlasthost` defaults to true
> > if it
> > >> is not set.
> > >>
> > >> The issue seems to be fixed in 4.19.1.0 by
> > >> https://github.com/apache/cloudstack/pull/9037
> > >>
> > >> However, I cannot find the changes in the code base of 4.19 branch.
> > >>
> > >>
> > https://github.com/apache/cloudstack/blob/4.19/server/src/main/java/com/cloud/deploy/DeploymentPlanningManagerImpl.java#L449-L453
> > >>
> > >> It looks like something is wrong during the merge of the 4.18 to 4.19
> > >> branch.
> > >> cc @Daan Hoogland <daan.hoogl...@shapeblue.com>
> > >> @suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
> > >>
> > >>
> > >> -Wei
> > >>
> > >> On Fri, Jan 10, 2025 at 7:58 AM Kiran Chavala <
> > kiran.chav...@shapeblue.com
> > >>>
> > >> wrote:
> > >>
> > >>> Hi Eric
> > >>>
> > >>> I am not hitting the issue on the latest 4.20 release of cloudstack
> > >>>
> > >>> For a normal end user, it  doesn’t consider the last host to start the
> > >>> virtual machine
> > >>>
> > >>> The startVirtualMachine API call has the parameter “considerlasthost”
> > >> for
> > >>> the  admin user
> > >>>
> > >>>
> > >>>
> > >>
> > https://cloudstack.apache.org/api/apidocs-4.20/apis/startVirtualMachine.html
> > >>>
> > >>> For a normal user there is no parameter “considerlasthost”; the
> > >>> deployment planner picks up the available host
> > >>>
> > >>> Normal user
> > >>>
> > >>> (localcloud2) 🐱 > start virtualmachine
> > >>> bootintosetup= clusterid=     filter=        hostid=        id=
> > >>> podid=
> > >>>
> > >>>
> > >>> Regards
> > >>> Kiran
> > >>>
> > >>> From: Eric Green <eric.lee.gr...@gmail.com>
> > >>> Date: Friday, 10 January 2025 at 4:05 AM
> > >>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>
> > >>> Subject: A possible bug in instance start for end users when a host is
> > >>> deleted
> > >>> This appears to me to be a bug. Is it a known bug? I tried searching
> > the
> > >>> Github bug list but Github's search function is awful.
> > >>>
> > >>> Cloudstack 4.19.1.3, Ubuntu 20.04 management server, Ubuntu 22.04
> > compute
> > >>> hosts, KVM hypervisor.
> > >>>
> > >>> Scenario: There are four physical hosts, A, B, C, D.  Let's say they're
> > >>> host ID 62,63,64,65 in the hosts table.  Each host has at least one
> > >> virtual
> > >>> machine running on it.
> > >>>
> > >>> Shut down the virtual machines on host A (without starting them
> > >> elsewhere)
> > >>> which sets the last_host_id to 62, and delete host A in the
> > >> Infrastructure
> > >>> tab.
> > >>>
> > >>> As an end user (not an admin), try to start one of the virtual machines
> > >>> that were formerly on host A.
> > >>>
> > >>> End result: The deployment planner returns the last_host_id of host A
> > as
> > >>> the host to start the VM on. Since that host is marked deleted, the
> > >>> deployment executor removes it from the list of available hosts to
> > start
> > >>> the VM on, and then throws an exception. The exception then appears on
> > >> the
> > >>> GUI as a red box that basically gives the end user zero useful
> > >> information.
> > >>>
> > >>> Expected result:  The same as for admin users — the deployment planner
> > >>> notices that host A is deleted and cannot be used to deploy the VM, and
> > >>> instead picks a host from [B,C,D].
> > >>>
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Adding pods to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Adding clusters to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Adding hosts to avoid lists for non-explicit VM deployment: []
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> DeploymentPlanner allocation algorithm: null
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Trying to allocate a host and storage pools from dc:1,
> > >>> pod:null,cluster:null, requested cpu: 4000, requested ram: (4.00 GB)
> > >>> 4294967296
> > >>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Is ROOT volume READY (pool already allocated)?: Yes
> > >>> 2025-01-09 13:28:35,466 DEBUG [c.c.a.ApiServer]
> > >>> (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) CIDRs
> > from
> > >>> which account 'Account
> > >>>
> > >>
> > [{"accountName":"eric.green","id":8,"uuid":"e8baeda0-90a5-4c97-aa51-0adbfa3c0e4c"}]'
> > >>> is allowed to perform API calls: 0.0.0.0/0,::/0
> > >>> 2025-01-09 13:28:35,466 DEBUG
> > >>> [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500
> > >>> job-418252 ctx-34040330) (logid:086f0ba0) Deploy avoids pods: [],
> > >> clusters:
> > >>> [], hosts: []
> > >>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> Deploy hosts with priorities {} , hosts have NORMAL priority by default
> > >>> 2025-01-09 13:28:35,467 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330)
> > >> (logid:086f0ba0)
> > >>> This VM has last host_id specified, trying to choose the same host: 62
> > >>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.a.StaticRoleBasedAPIAccessChecker]
> > >>> (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491)
> > >> RoleService
> > >>> is enabled. We will use it instead of StaticRoleBasedAPIAccessChecker.
> > >>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.r.ApiRateLimitServiceImpl]
> > >>> (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) API rate
> > >>> limiting is disabled. We will not use ApiRateLimitService.
> > >>> 2025-01-09 13:28:35,473 ERROR [c.c.a.ApiAsyncJobDispatcher]
> > >>> (API-Job-Executor-81:ctx-4b9a2500 job-418252) (logid:086f0ba0)
> > Unexpected
> > >>> exception while executing
> > >>> org.apache.cloudstack.api.command.user.vm.StartVMCmd
> > >>> java.lang.NullPointerException
> > >>>        at
> > >>> com.cloud.host.dao.HostDaoImpl.loadHostTags(HostDaoImpl.java:879)
> > >>>        at
> > jdk.internal.reflect.GeneratedMethodAccessor336.invoke(Unknown
> > >>> Source)
> > >>>        at
> > >>>
> > >>
> > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
> > >>>        at
> > >>>
> > >>
> > com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
> > >>>        at com.sun.proxy.$Proxy96.loadHostTags(Unknown Source)
> > >>>        at
> > >>>
> > >>
> > com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:451)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:206)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:202)
> > >>>        at
> > >>>
> > >>
> > com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5445)
> > >>>        at
> > >>>
> > >>
> > com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5299)
> > >>>        at
> > >>>
> > >>
> > com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3224)
> > >>>        at
> > jdk.internal.reflect.GeneratedMethodAccessor787.invoke(Unknown
> > >>> Source)
> > >>>        at
> > >>>
> > >>
> > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>        at
> > >>>
> > >>
> > com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
> > >>>        at
> > >>>
> > >>
> > org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
> > >>>        at com.sun.proxy.$Proxy186.startVirtualMachine(Unknown Source)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:181)
> > >>>        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:172)
> > >>>        at
> > >>>
> > >>
> > com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:112)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:654)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
> > >>>        at
> > >>>
> > >>
> > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:602)
> > >>>        at
> > >>>
> > >>
> > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> > >>>        at
> > >>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> > >>>        at
> > >>>
> > >>
> > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> > >>>        at
> > >>>
> > >>
> > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> > >>>        at java.base/java.lang.Thread.run(Thread.java:829)
> > >>>
> > >>>
> > >>>
> > >>>
> > >>
> > >
> > >
> > > --
> > > Daan
> >
> >
> 
> -- 
> Daan
> 
