TimServers commented on issue #12921:
URL: https://github.com/apache/cloudstack/issues/12921#issuecomment-4203449950

   @kiranchavala 
   Redfish is configured without issue: all power actions and status checks work fine from the UI. We only see the above errors during a host HA / fencing event.
   
   You can see it's configured normally.
   
   ```
   mysql> select * from oobm;
   +----+---------+---------+-------------+---------+------------+------+------------+--------------------------------------------------------------+--------------+---------------------+-----------------+
   | id | host_id | enabled | power_state | driver  | address    | port | username   | password                                                     | update_count | update_time         | mgmt_server_id  |
   +----+---------+---------+-------------+---------+------------+------+------------+--------------------------------------------------------------+--------------+---------------------+-----------------+
   |  1 |       5 |       1 | On          | redfish | 10.9.3.173 | 443  | CLOUDSTACK | wa1h6RRckH33xp59QJpiHEhajwv6Crl+xxxxKkyFmqGB0SV195WSsggA= |           75 | 2026-04-08 02:23:43 | 248902281439561 |
   |  2 |       6 |       1 | On          | redfish | 10.9.3.166 | 443  | CLOUDSTACK | 1f9UAW941qi953SJ+QWxuVVrfC53BOKxxxx13kLfyUddrW3/G6Bv4DOw= |           63 | 2026-04-08 02:23:03 | 248902281439561 |
   +----+---------+---------+-------------+---------+------------+------+------------+--------------------------------------------------------------+--------------+---------------------+-----------------+
   2 rows in set (0.00 sec)
   ```
   
   OOBM works fine until an HA event occurs.
   
   The HA fencing workflow is the only time we see any errors.
   
   All power actions via the UI / web interface work fine.
   
   
   As you can see, the host never properly completes fencing; it stays in the Fencing HA state.
   
   ```
   Out-of-band management power state
   Off
   Out-of-band management driver
   redfish
   Out-of-band management address
   10.9.3.173
   Out-of-band management port
   443
   HA enabled
   true
   HA state
   Fencing
   HA provider
   kvmhaprovider
   UEFI supported
   true
   Dedicated
   No
   ```
   
   
   
   As you can see, curl returns a 'successful' result, but wrapped in an "error" block. I think CloudStack is misinterpreting this as a failure.
   
   ```
   2026-04-08 02:40:31,761 WARN  [o.a.c.k.h.KVMHAProvider] (pool-6-thread-8:[ctx-c6509ce8]) (logid:) OOBM service is not configured or enabled for this host Host {"id":5,"name":"kvm-6-3.servercontrol.com.au","type":"Routing","uuid":"784ce3fb-3960-4659-bd28-a0295c59663b"} error is Failed to execute System power command for host by performing 'POST' request on URL 'https://10.9.3.173/redfish/v1/Systems/1/Actions/ComputerSystem.Reset' and host address '10.9.3.173'. The expected HTTP status code is '2XX' but it got '400'.
   2026-04-08 02:40:31,761 WARN  [o.a.c.h.t.FenceTask] (pool-6-thread-19:[]) (logid:) Exception occurred while running FenceTask on a resource: org.apache.cloudstack.ha.provider.HAFenceException: OBM service is not configured or enabled for this host kvm-6-3.servercontrol.com.au org.apache.cloudstack.ha.provider.HAFenceException: OBM service is not configured or enabled for this host kvm-6-3.servercontrol.com.au
           at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:98)
           at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:41)
           at org.apache.cloudstack.ha.task.FenceTask.performAction(FenceTask.java:42)
           at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:87)
           at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:84)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
           at java.base/java.lang.Thread.run(Thread.java:840)
   Caused by: org.apache.cloudstack.utils.redfish.RedfishException: Failed to execute System power command for host by performing 'POST' request on URL 'https://10.9.3.173/redfish/v1/Systems/1/Actions/ComputerSystem.Reset' and host address '10.9.3.173'. The expected HTTP status code is '2XX' but it got '400'.
           at org.apache.cloudstack.utils.redfish.RedfishClient.executeComputerSystemReset(RedfishClient.java:312)
           at org.apache.cloudstack.outofbandmanagement.driver.redfish.RedfishOutOfBandManagementDriver.execute(RedfishOutOfBandManagementDriver.java:88)
           at org.apache.cloudstack.outofbandmanagement.driver.redfish.RedfishOutOfBandManagementDriver.execute(RedfishOutOfBandManagementDriver.java:64)
           at org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:422)
           at jdk.internal.reflect.GeneratedMethodAccessor608.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:569)
           at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:109)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
           at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
           at jdk.proxy3/jdk.proxy3.$Proxy423.executePowerOperation(Unknown Source)
           at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:90)
           ... 8 more
   
   2026-04-08 02:40:31,763 WARN  [c.c.a.AlertManagerImpl] (pool-6-thread-19:[]) (logid:) alertType=[30] dataCenterId=[1] podId=[2] clusterId=[null] message=[HA Fencing of host id=5, in dc id=1 performed].
   ```
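
The 400 in the trace above comes from the `POST` to `ComputerSystem.Reset`, and the log does not show which `ResetType` the fencing path sent. One way to narrow this down could be to compare the reset types the BMC advertises against the value being posted. The sketch below is not CloudStack code; it assumes the system resource exposes the standard `ResetType@Redfish.AllowableValues` annotation (some firmware uses an `@Redfish.ActionInfo` link instead), and the helper names are hypothetical. Authentication and TLS settings are deliberately left out.

```python
import json
import urllib.request

# Standard Redfish annotation listing the ResetType values a system accepts.
# Assumption: the BMC includes it inline under Actions/#ComputerSystem.Reset.
ALLOWABLE_KEY = "ResetType@Redfish.AllowableValues"


def allowable_reset_types(system: dict) -> list:
    """Extract the advertised ResetType values from a decoded
    /redfish/v1/Systems/<id> resource (empty list if not advertised)."""
    action = system.get("Actions", {}).get("#ComputerSystem.Reset", {})
    return list(action.get(ALLOWABLE_KEY, []))


def build_reset_request(host: str, system_id: str, reset_type: str) -> urllib.request.Request:
    """Build (but do not send) the POST that a reset action uses.
    Credentials must be added by the caller before sending."""
    url = f"https://{host}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    data = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

If the value used during fencing is missing from the advertised list, or is rejected in the host's current power state, a 400 from the BMC would be the expected outcome rather than a parsing problem on the CloudStack side.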
   
   
   
   Here is the result of a successful curl:
   
   ```
   {"error":{"code":"iLO.0.10.ExtendedInfo","message":"See @Message.ExtendedInfo for more information.","@Message.ExtendedInfo":[{"MessageId":"Base.1.18.Success"}]}}
   ```
   
   It's a "success" response; however, it may be misinterpreted by CloudStack as an error?
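
To make the question concrete, here is a minimal sketch of how a client could classify a response like the one above. It is an illustration only, not how CloudStack's `RedfishClient` works; going by the exception text in the fencing log, that check appears to be on the HTTP status code (`2XX` expected, `400` received) rather than on the body. The function names are hypothetical.

```python
import json


def redfish_message_ids(body: str) -> list:
    """Return every MessageId listed under error/@Message.ExtendedInfo.
    An empty or non-JSON body yields an empty list."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return []
    if not isinstance(payload, dict):
        return []
    error = payload.get("error")
    if not isinstance(error, dict):
        return []
    info = error.get("@Message.ExtendedInfo") or []
    return [m["MessageId"] for m in info if isinstance(m, dict) and "MessageId" in m]


def is_reset_success(status_code: int, body: str) -> bool:
    """Treat the call as successful only when the HTTP status is 2xx and
    every reported MessageId ends in 'Success' (no messages also passes)."""
    if not 200 <= status_code < 300:
        return False
    return all(mid.split(".")[-1] == "Success" for mid in redfish_message_ids(body))
```

With this logic, the iLO payload above counts as a success when it arrives with a 2xx status, even though it sits inside an `"error"` object. The same payload with a 400 status would still be a failure, so the status code returned during fencing is worth capturing alongside the body.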
   
   Here are the logs when doing power actions via the GUI. You can see it's configured fine and there are no errors; all power actions work fine. We're running the latest version of iLO.
   
   ```
   2026-04-08 02:29:15,255 DEBUG [c.c.a.ApiServlet] (qtp1438988851-42184:[ctx-4389f345, ctx-ed4a9052]) (logid:879c23e3) ===END===  221.121.128.73 -- POST
   command=issueOutOfBandManagementPowerAction
   response=json
   action=OFF
   hostid=784ce3fb-3960-4659-bd28-a0295c59663b
   sessionkey=oTvqEoVn91XqdoddtJUUKaxV6OY
   
   2026-04-08 02:29:16,347 DEBUG [o.a.c.u.r.RedfishClient] (API-Job-Executor-15:[ctx-8395114c, job-616, ctx-7de6e440]) (logid:b2bc657a) Retrieved System ID '1' with request 'GET: https://10.9.3.173/redfish/v1/Systems'
   
   2026-04-08 02:29:17,775 DEBUG [o.a.c.u.r.RedfishClient] (API-Job-Executor-15:[ctx-8395114c, job-616, ctx-7de6e440]) (logid:b2bc657a) Sending ComputerSystem.Reset Command 'GracefulShutdown' to host '10.9.3.173' with request 'POST https://10.9.3.173/redfish/v1/Systems/1/Actions/ComputerSystem.Reset'
   
   2026-04-08 02:29:17,810 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-15:[ctx-8395114c, job-616, ctx-7de6e440]) (logid:b2bc657a) Complete async job-616, jobStatus: SUCCEEDED, resultCode: 0, result: org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"784ce3fb-3960-4659-bd28-a0295c59663b","powerstate":"On","enabled":"true","driver":"redfish","address":"10.9.3.173","port":"443","username":"CLOUDSTACK","password":"M*****","action":"OFF","description":"200","status":"true"}
   
   2026-04-08 02:29:17,848 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] (API-Job-Executor-15:[ctx-8395114c, job-616]) (logid:b2bc657a) Done executing org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd for job-616
   
   2026-04-08 02:29:17,848 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-15:[ctx-8395114c, job-616]) (logid:b2bc657a) Remove job-616 from job monitoring
   ```
   

