I don’t know that it’s acceptable as a known bug for 4.19 unless the workaround 
that I did for my cluster (whack last_host_id in the database if it points at 
removed hosts) is documented. 
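
A rough sketch of what such a workaround could look like, as an illustration only. It uses an in-memory SQLite database as a stand-in for the real CloudStack database. The table and column names other than last_host_id (host, vm_instance, removed) are assumptions, and the helper name repoint_stale_last_hosts is made up for this example. Check the actual schema and take a backup before touching a production database.

```python
import sqlite3


def repoint_stale_last_hosts(conn: sqlite3.Connection, fallback_host_id: int) -> int:
    """Set last_host_id to fallback_host_id wherever it references a removed host.

    Hypothetical helper; returns the number of VM rows changed.
    """
    # Refuse to repoint VMs at a host that is itself removed or unknown.
    row = conn.execute(
        "SELECT 1 FROM host WHERE id = ? AND removed IS NULL",
        (fallback_host_id,),
    ).fetchone()
    if row is None:
        raise ValueError(f"host {fallback_host_id} is not a current host")

    # Only rows whose last host has a non-NULL 'removed' marker are touched.
    cur = conn.execute(
        """
        UPDATE vm_instance
           SET last_host_id = ?
         WHERE last_host_id IN (SELECT id FROM host WHERE removed IS NOT NULL)
        """,
        (fallback_host_id,),
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Build a toy schema (assumed, simplified) and apply the workaround."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE host (id INTEGER PRIMARY KEY, removed TEXT);
        CREATE TABLE vm_instance (id INTEGER PRIMARY KEY, last_host_id INTEGER);
        -- Host 62 has been deleted; 63, 64 and 65 are still current.
        INSERT INTO host VALUES (62, '2025-01-09 13:00:00'), (63, NULL), (64, NULL), (65, NULL);
        INSERT INTO vm_instance VALUES (1, 62), (2, 62), (3, 64);
        """
    )
    changed = repoint_stale_last_hosts(conn, fallback_host_id=63)
    rows = conn.execute(
        "SELECT id, last_host_id FROM vm_instance ORDER BY id"
    ).fetchall()
    return [changed, rows]
```

In this toy data set, demo() repoints the two VMs that last ran on removed host 62 to host 63 and leaves the VM that last ran on host 64 alone. On a real deployment the equivalent UPDATE would have to be adapted to the actual schema and run against the management server's database.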

But honestly, any bug that requires manually whacking things in the database as 
a workaround probably needs to be fixed. Especially if it is in the n-1 
release. I, and many other people using Cloudstack in production, don’t update 
to the very latest and greatest release; we stay on the n-1 release, 
which has had more testing and bug fixing. Usually. 

> On Jan 10, 2025, at 12:22 AM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> 
> Wei, Eric,
> 
> This sounds like we must reapply for 4.19, and make sure it doesn't get
> pulled forward. The 4.20 code for this class has been heavily refactored.
> We could also accept this as a known bug for 4.19?
> 
> On Fri, Jan 10, 2025 at 8:47 AM Eric Green <eric.lee.gr...@gmail.com> wrote:
> 
>> Thanks, Wei.  The log definitely stated that ‘considerlasthost’ was
>> getting defaulted to true (and honestly that should be configurable, because
>> I really want the planner to start up the VM for users on whatever host has
>> the most excess capacity at the moment), so I was confused
>> when Kiran claimed it was not. That’s not what the log said! (“This VM has
>> last host_id specified, trying to choose the same host: 62”).
>> 
>> So it appears that this is a bug that was fixed in 4.18.x, was supposedly
>> pulled into 4.19.x, but somehow did not actually get pulled into 4.19.x.
>> Since 4.20.x does not appear to have the bug, the solution appears to be to
>> upgrade to 4.20.x before I delete another host 😊. (I do not plan to delete
>> any hosts anytime soon, I just got all the old hosts retired).
>> 
>> I resolved the issue in my own Cloudstack deployment by whacking
>> last_host_id in the database to a current host if it pointed at a retired
>> host. That allowed my users to start virtual machines last started on the
>> retired hosts without involving an administrator. But it does appear that I
>> happened upon a bug here, from what you say. I presume that since you know
>> the exact cause, you will file an actual bug on GitHub noting that this fix has
>> not been integrated into the 4.19.x branch?
>> 
>> From: Wei ZHOU <ustcweiz...@gmail.com>
>> Date: Thursday, January 9, 2025 at 11:27 PM
>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>, Daan
>> Hoogland <daan.hoogl...@shapeblue.com>, suresh.anapa...@shapeblue.com <
>> suresh.anapa...@shapeblue.com>
>> Subject: Re: A possible bug in instance start for end users when a host is
>> deleted
>> 
>> Hi Kiran,
>> 
>> For backward compatibility, `considerlasthost` defaults to true if it
>> is not set.
>> 
>> The issue seems to be fixed in 4.19.1.0 by
>> https://github.com/apache/cloudstack/pull/9037
>> 
>> However, I cannot find the changes in the code base of 4.19 branch.
>> 
>> https://github.com/apache/cloudstack/blob/4.19/server/src/main/java/com/cloud/deploy/DeploymentPlanningManagerImpl.java#L449-L453
>> 
>> It looks like something went wrong during the merge of the 4.18 branch into
>> the 4.19 branch.
>> cc @Daan Hoogland <daan.hoogl...@shapeblue.com>
>> @suresh.anapa...@shapeblue.com <suresh.anapa...@shapeblue.com>
>> 
>> 
>> -Wei
>> 
>> On Fri, Jan 10, 2025 at 7:58 AM Kiran Chavala <kiran.chav...@shapeblue.com>
>> wrote:
>> 
>>> Hi Eric
>>> 
>>> I am not hitting the issue on the latest 4.20 release of CloudStack.
>>> 
>>> For a normal end user, it doesn’t consider the last host when starting the
>>> virtual machine.
>>> 
>>> The startVirtualMachine API call has the parameter “considerlasthost” for
>>> the admin user:
>>> 
>>> https://cloudstack.apache.org/api/apidocs-4.20/apis/startVirtualMachine.html
>>> 
>>> For a normal user there is no “considerlasthost” parameter; the
>>> deployment planner picks an available host.
>>> 
>>> Normal user
>>> 
>>> (localcloud2) 🐱 > start virtualmachine
>>> bootintosetup= clusterid=     filter=        hostid=        id=            podid=
>>> 
>>> 
>>> Regards
>>> Kiran
>>> 
>>> From: Eric Green <eric.lee.gr...@gmail.com>
>>> Date: Friday, 10 January 2025 at 4:05 AM
>>> To: users@cloudstack.apache.org <users@cloudstack.apache.org>
>>> Subject: A possible bug in instance start for end users when a host is
>>> deleted
>>> 
>>> This appears to me to be a bug. Is it a known bug? I tried searching the
>>> Github bug list but Github's search function is awful.
>>> 
>>> Cloudstack 4.19.1.3, Ubuntu 20.04 management server, Ubuntu 22.04 compute
>>> hosts, KVM hypervisor.
>>> 
>>> Scenario: There are four physical hosts, A, B, C, D.  Let's say they're
>>> host ID 62,63,64,65 in the hosts table.  Each host has at least one virtual
>>> machine running on it.
>>> 
>>> Shut down the virtual machines on host A (without starting them elsewhere),
>>> which sets the last_host_id to 62, and delete host A in the Infrastructure
>>> tab.
>>> 
>>> As an end user (not an admin), try to start one of the virtual machines
>>> that were formerly on host A.
>>> 
>>> End result: The deployment planner returns the last_host_id of host A as
>>> the host to start the VM on. Since that host is marked deleted, the
>>> deployment executor removes it from the list of available hosts to start
>>> the VM on, and then throws an exception. The exception then appears on the
>>> GUI as a red box that basically gives the end user zero useful information.
>>> 
>>> Expected result:  The same as for admin users — the deployment planner
>>> notices that host A is deleted and cannot be used to deploy the VM, and
>>> instead picks a host from [B,C,D].
>>> 
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding pods to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding clusters to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Adding hosts to avoid lists for non-explicit VM deployment: []
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) DeploymentPlanner allocation algorithm: null
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Trying to allocate a host and storage pools from dc:1, pod:null,cluster:null, requested cpu: 4000, requested ram: (4.00 GB) 4294967296
>>> 2025-01-09 13:28:35,463 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Is ROOT volume READY (pool already allocated)?: Yes
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.a.ApiServer] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) CIDRs from which account 'Account [{"accountName":"eric.green","id":8,"uuid":"e8baeda0-90a5-4c97-aa51-0adbfa3c0e4c"}]' is allowed to perform API calls: 0.0.0.0/0,::/0
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy avoids pods: [], clusters: [], hosts: []
>>> 2025-01-09 13:28:35,466 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) Deploy hosts with priorities {} , hosts have NORMAL priority by default
>>> 2025-01-09 13:28:35,467 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-81:ctx-4b9a2500 job-418252 ctx-34040330) (logid:086f0ba0) This VM has last host_id specified, trying to choose the same host: 62
>>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.a.StaticRoleBasedAPIAccessChecker] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) RoleService is enabled. We will use it instead of StaticRoleBasedAPIAccessChecker.
>>> 2025-01-09 13:28:35,471 DEBUG [o.a.c.r.ApiRateLimitServiceImpl] (qtp364604394-5358:ctx-c649c5e4 ctx-618420eb) (logid:14ea3491) API rate limiting is disabled. We will not use ApiRateLimitService.
>>> 2025-01-09 13:28:35,473 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-81:ctx-4b9a2500 job-418252) (logid:086f0ba0) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
>>> java.lang.NullPointerException
>>>        at com.cloud.host.dao.HostDaoImpl.loadHostTags(HostDaoImpl.java:879)
>>>        at jdk.internal.reflect.GeneratedMethodAccessor336.invoke(Unknown Source)
>>>        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>>>        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
>>>        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
>>>        at com.sun.proxy.$Proxy96.loadHostTags(Unknown Source)
>>>        at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:451)
>>>        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:206)
>>>        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:202)
>>>        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5445)
>>>        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:5299)
>>>        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3224)
>>>        at jdk.internal.reflect.GeneratedMethodAccessor787.invoke(Unknown Source)
>>>        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>>>        at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
>>>        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
>>>        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
>>>        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
>>>        at com.sun.proxy.$Proxy186.startVirtualMachine(Unknown Source)
>>>        at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:181)
>>>        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:172)
>>>        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:112)
>>>        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:654)
>>>        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
>>>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
>>>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
>>>        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
>>>        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
>>>        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:602)
>>>        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>        at java.base/java.lang.Thread.run(Thread.java:829)
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> -- 
> Daan
