[ https://issues.apache.org/jira/browse/CLOUDSTACK-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
exion updated CLOUDSTACK-10355:
-------------------------------
Description:
On a perfectly working 4.10 node with KVM hypervisor and Ceph RBD primary
storage, after upgrading to 4.11, the CloudStack agent is unable to connect to
the RBD pool in libvirt, giving just a generic "operation not supported" error
in its logs:
2018-04-06 16:27:37,650 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Attempting to create storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt
2018-04-06 16:27:37,652 WARN [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Storage pool
be80af6a-7201-3410-8da4-9b3b58c4954f was not found running in libvirt. Need to
create it.
2018-04-06 16:27:37,653 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Didn't find an existing storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f by UUID, checking for pools with
duplicate paths
2018-04-06 16:27:37,664 ERROR [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Failed to create RBD storage
pool: org.libvirt.LibvirtException: failed to connect to the RADOS monitor on:
storagepool1:6789,: Operation not supported
2018-04-06 16:27:42,762 INFO [cloud.agent.Agent] (Agent-Handler-4:null)
(logid:) Lost connection to the server. Dealing with the remaining commands...
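To pull only the storage adaptor warnings and errors out of a busy agent log, a small filter like the one below may help. This is just a sketch: the function name `filter_storage_errors` is made up for illustration, and it assumes each log entry sits on a single line in the file (the entries above are wrapped by the mail formatting).

```shell
#!/bin/sh
# Keep only WARN/ERROR entries logged by LibvirtStorageAdaptor.
# Reads the log on stdin, so it works regardless of where the file lives.
filter_storage_errors() {
    grep -E '(WARN|ERROR) +\[kvm\.storage\.LibvirtStorageAdaptor\]'
}
```

For example, `filter_storage_errors < /var/log/cloudstack/agent/agent.log`, assuming the agent log is at that path on the host.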
Exactly the same pool was working before the upgrade:
2018-04-06 12:53:52,847 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Attempting to create storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt
2018-04-06 12:53:52,850 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Found existing defined storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f, using it.
2018-04-06 12:53:52,850 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Trying to fetch storage pool
be80af6a-7201-3410-8da4-9b3b58c4954f from libvirt
2018-04-06 12:53:53,171 INFO [cloud.agent.Agent] (agentRequest-Handler-2:null)
(logid:14dace5e) Proccess agent ready command, agent id = 46
To work around the issue, I took the following XML config (dumped from another
node where the pool is running correctly) and defined the pool directly in
libvirt, and it worked as expected:
<pool type="rbd">
  <name>be80af6a-7201-3410-8da4-9b3b58c4954f</name>
  <uuid>be80af6a-7201-3410-8da4-9b3b58c4954f</uuid>
  <source>
    <name>cephstor1</name>
    <host name='storagepool1' port='6789'/>
    <auth username='admin' type='ceph'>
      <secret uuid='be80af6a-7201-3410-8da4-9b3b58c4954f'/>
    </auth>
  </source>
</pool>
root@compute6:~# virsh pool-define test.xml
Pool be80af6a-7201-3410-8da4-9b3b58c4954f defined from test.xml
root@compute6:~# virsh pool-start be80af6a-7201-3410-8da4-9b3b58c4954f
Pool be80af6a-7201-3410-8da4-9b3b58c4954f started
root@compute6:~# virsh pool-info be80af6a-7201-3410-8da4-9b3b58c4954f
Name: be80af6a-7201-3410-8da4-9b3b58c4954f
UUID: be80af6a-7201-3410-8da4-9b3b58c4954f
State: running
Persistent: yes
Autostart: no
Capacity: 10.05 TiB
Allocation: 2.22 TiB
Available: 2.71 TiB
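The workaround above could be scripted roughly as follows. This is a sketch, not a tested fix: the function names `rbd_pool_xml` and `define_rbd_pool` are hypothetical, the monitor port 6789 and the `admin` user are copied from the XML above, and the libvirt secret named by the last argument is assumed to exist on the host already. It uses `pool-define` with an explicit `<uuid>` so that the pool is persistent and carries the UUID the agent looks up.

```shell
#!/bin/sh
# Print an RBD pool definition on stdout.
# Arguments (all site-specific): pool UUID, Ceph pool name,
# monitor host, libvirt secret UUID.
rbd_pool_xml() {
    pool_uuid=$1; ceph_pool=$2; mon_host=$3; secret_uuid=$4
    cat <<EOF
<pool type="rbd">
  <name>${pool_uuid}</name>
  <uuid>${pool_uuid}</uuid>
  <source>
    <name>${ceph_pool}</name>
    <host name='${mon_host}' port='6789'/>
    <auth username='admin' type='ceph'>
      <secret uuid='${secret_uuid}'/>
    </auth>
  </source>
</pool>
EOF
}

# Define, start, and inspect the pool. Takes the same four arguments.
# Run this only on the affected KVM host; it calls virsh.
define_rbd_pool() {
    xml_file=$(mktemp) || return 1
    rbd_pool_xml "$@" > "$xml_file"
    virsh pool-define "$xml_file" &&
        virsh pool-start "$1" &&
        virsh pool-info "$1"
    status=$?
    rm -f "$xml_file"
    return $status
}
```

Calling `define_rbd_pool be80af6a-7201-3410-8da4-9b3b58c4954f cephstor1 storagepool1 be80af6a-7201-3410-8da4-9b3b58c4954f` would repeat the steps shown above. Since the `pool-info` output reports `Autostart: no`, `virsh pool-autostart` may be worth considering if the pool should come back by itself after libvirtd restarts; that was not tried in this report.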
The CloudStack agent now starts correctly:
2018-04-09 10:29:19,989 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:f0021131) Attempting to create storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt
2018-04-09 10:29:19,990 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:f0021131) Found existing defined storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f, using it.
2018-04-09 10:29:19,991 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:f0021131) Trying to fetch storage pool
be80af6a-7201-3410-8da4-9b3b58c4954f from libvirt
2018-04-09 10:29:20,372 INFO [cloud.agent.Agent] (agentRequest-Handler-2:null)
(logid:f0021131) Proccess agent ready command, agent id = 56
was:
On a perfectly working 4.10 node with KVM hypervisor and Ceph RBD primary
storage, after upgrading to 4.11, the CloudStack agent is unable to connect to
the RBD pool in libvirt, giving just a generic "operation not supported" error
in its logs:
2018-04-06 16:27:37,650 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Attempting to create storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt
2018-04-06 16:27:37,652 WARN [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Storage pool
be80af6a-7201-3410-8da4-9b3b58c4954f was not found running in libvirt. Need to
create it.
2018-04-06 16:27:37,653 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Didn't find an existing storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f by UUID, checking for pools with
duplicate paths
2018-04-06 16:27:37,664 ERROR [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Failed to create RBD storage
pool: org.libvirt.LibvirtException: failed to connect to the RADOS monitor on:
storagepool1:6789,: Operation not supported
2018-04-06 16:27:42,762 INFO [cloud.agent.Agent] (Agent-Handler-4:null)
(logid:) Lost connection to the server. Dealing with the remaining commands...
Exactly the same pool was working before the upgrade:
2018-04-06 12:53:52,847 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Attempting to create storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt
2018-04-06 12:53:52,850 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Found existing defined storage
pool be80af6a-7201-3410-8da4-9b3b58c4954f, using it.
2018-04-06 12:53:52,850 INFO [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-3:null) (logid:14dace5e) Trying to fetch storage pool
be80af6a-7201-3410-8da4-9b3b58c4954f from libvirt
2018-04-06 12:53:53,171 INFO [cloud.agent.Agent] (agentRequest-Handler-2:null)
(logid:14dace5e) Proccess agent ready command, agent id = 46
To rule out system-related issues, I tried attaching the pool directly to
libvirt with the following XML config, and it worked as expected:
<pool type="rbd">
  <name>be80af6a-7201-3410-8da4-9b3b58c4954f</name>
  <source>
    <name>cephstor1</name>
    <host name='storagepool1' port='6789'/>
    <auth username='admin' type='ceph'>
      <secret uuid='XXXXX'/>
    </auth>
  </source>
</pool>
root@compute6:~# virsh pool-create test.xml
Pool be80af6a-7201-3410-8da4-9b3b58c4954f created from test.xml
root@compute6:~# virsh pool-info be80af6a-7201-3410-8da4-9b3b58c4954f
Name: be80af6a-7201-3410-8da4-9b3b58c4954f
UUID: 47afe7d4-61cb-46c5-a642-93712c758b5c
State: running
Persistent: no
Autostart: no
Capacity: 10.05 TiB
Allocation: 2.22 TiB
Available: 2.71 TiB
That said, the issue appears to be related to the way the CloudStack scripts
interface with the libvirt daemon.
> After upgrade to 4.11, Ceph RBD primary storage fails connection and renders
> node unusable
> ------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-10355
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10355
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: cloudstack-agent
> Affects Versions: 4.11.0.0
> Reporter: exion
> Priority: Blocker
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)