Re: Connection issue with the master
Hi Jayapal, in the configuration table I see this field: ssl.keystore ("SSL Keystore for the management servers"). Do you mean I need to null these values? Thanks again, S.

2015-02-10 10:12 GMT+01:00 Salvatore Sciacco scia...@iperweb.com:

Also, I'd like to understand why some agents connect just fine:

2015-02-10 07:07:02,179 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 192.168.11.9:8250
2015-02-10 07:07:07,318 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2015-02-10 07:07:07,319 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to 192.168.11.9:8250

2015-02-10 10:05 GMT+01:00 Salvatore Sciacco scia...@iperweb.com:

Hi Jayapal, do you mean the keystore in /etc/cloudstack/management or the ssl setting in the config? I already tried removing the keystore, but the same one was generated (copied?) in place of the existing file. Thank you very much! S.

2015-02-10 9:53 GMT+01:00 Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com:

Hi, this issue is related to the SSL keys. Can you remove the keys, then try restarting the MS and recreating the system VMs? Thanks, Jayapal

On 10-Feb-2015, at 1:52 PM, Salvatore Sciacco scia...@iperweb.com wrote:

Hi, I have a few hosts and the system VM (console proxy) which have stopped connecting to the master with an SSL error:

2015-02-10 08:43:46,057 INFO [cloud.agent.Agent] (Agent-Handler-4:null) Reconnecting...
2015-02-10 08:43:46,058 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 111.11.1.1:8250
2015-02-10 08:43:56,191 ERROR [utils.nio.NioConnection] (Agent-Selector:null) Unable to initialize the threads.
java.io.IOException: SSL: Fail to init SSL! java.io.IOException: Connection closed with -1 on reading size.
at com.cloud.utils.nio.NioClient.init(NioClient.java:87)
at com.cloud.utils.nio.NioConnection.run(NioConnection.java:111)
at java.lang.Thread.run(Thread.java:745)

Can anybody suggest how I can debug the SSL layer? Clients are able to connect to port 8250, but they are disconnected just after the connection is established. Thank you very much, S.
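On the "how can I debug the SSL layer" question: one low-level option is a small handshake probe against port 8250. This is a generic sketch (the host and port in the example comment are placeholders, and this is not CloudStack tooling); it distinguishes a TCP-level failure from a handshake that starts and is then torn down, which is the symptom described above:

```python
import socket
import ssl

def probe_tls(host, port, timeout=5.0):
    """Try a TLS handshake against host:port and report what happened.

    Certificate verification is disabled on purpose: we only want to know
    whether the handshake itself completes, not whether the cert is trusted.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return "handshake ok: %s" % tls.version()
    except ssl.SSLError as e:
        # The TCP connect worked but the TLS layer failed: this matches
        # "connected, then disconnected right away".
        return "handshake failed: %s" % (e.reason or e)
    except OSError as e:
        return "connect failed: %s" % e

# Example (placeholder address):
# print(probe_tls("192.168.11.9", 8250))
```

`openssl s_client -connect host:8250` gives a similar but more verbose view from the shell.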
Re: Connection issue with the master
Hi Jayapal, do you mean the keystore in /etc/cloudstack/management or the ssl setting in the config? I already tried removing the keystore, but the same one was generated (copied?) in place of the existing file. Thank you very much! S.

2015-02-10 9:53 GMT+01:00 Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com:

Hi, this issue is related to the SSL keys. Can you remove the keys, then try restarting the MS and recreating the system VMs? Thanks, Jayapal

On 10-Feb-2015, at 1:52 PM, Salvatore Sciacco scia...@iperweb.com wrote:

Hi, I have a few hosts and the system VM (console proxy) which have stopped connecting to the master with an SSL error:

2015-02-10 08:43:46,057 INFO [cloud.agent.Agent] (Agent-Handler-4:null) Reconnecting...
2015-02-10 08:43:46,058 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 111.11.1.1:8250
2015-02-10 08:43:56,191 ERROR [utils.nio.NioConnection] (Agent-Selector:null) Unable to initialize the threads.
java.io.IOException: SSL: Fail to init SSL! java.io.IOException: Connection closed with -1 on reading size.
at com.cloud.utils.nio.NioClient.init(NioClient.java:87)
at com.cloud.utils.nio.NioConnection.run(NioConnection.java:111)
at java.lang.Thread.run(Thread.java:745)

Can anybody suggest how I can debug the SSL layer? Clients are able to connect to port 8250, but they are disconnected just after the connection is established. Thank you very much, S.
Connection issue with the master
Hi, I have a few hosts and the system VM (console proxy) which have stopped connecting to the master with an SSL error:

2015-02-10 08:43:46,057 INFO [cloud.agent.Agent] (Agent-Handler-4:null) Reconnecting...
2015-02-10 08:43:46,058 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 111.11.1.1:8250
2015-02-10 08:43:56,191 ERROR [utils.nio.NioConnection] (Agent-Selector:null) Unable to initialize the threads.
java.io.IOException: SSL: Fail to init SSL! java.io.IOException: Connection closed with -1 on reading size.
at com.cloud.utils.nio.NioClient.init(NioClient.java:87)
at com.cloud.utils.nio.NioConnection.run(NioConnection.java:111)
at java.lang.Thread.run(Thread.java:745)

Can anybody suggest how I can debug the SSL layer? Clients are able to connect to port 8250, but they are disconnected just after the connection is established. Thank you very much, S.
Re: Connection issue with the master
Also, I'd like to understand why some agents connect just fine:

2015-02-10 07:07:02,179 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 192.168.11.9:8250
2015-02-10 07:07:07,318 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2015-02-10 07:07:07,319 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to 192.168.11.9:8250

2015-02-10 10:05 GMT+01:00 Salvatore Sciacco scia...@iperweb.com:

Hi Jayapal, do you mean the keystore in /etc/cloudstack/management or the ssl setting in the config? I already tried removing the keystore, but the same one was generated (copied?) in place of the existing file. Thank you very much! S.

2015-02-10 9:53 GMT+01:00 Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com:

Hi, this issue is related to the SSL keys. Can you remove the keys, then try restarting the MS and recreating the system VMs? Thanks, Jayapal

On 10-Feb-2015, at 1:52 PM, Salvatore Sciacco scia...@iperweb.com wrote:

Hi, I have a few hosts and the system VM (console proxy) which have stopped connecting to the master with an SSL error:

2015-02-10 08:43:46,057 INFO [cloud.agent.Agent] (Agent-Handler-4:null) Reconnecting...
2015-02-10 08:43:46,058 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 111.11.1.1:8250
2015-02-10 08:43:56,191 ERROR [utils.nio.NioConnection] (Agent-Selector:null) Unable to initialize the threads.
java.io.IOException: SSL: Fail to init SSL! java.io.IOException: Connection closed with -1 on reading size.
at com.cloud.utils.nio.NioClient.init(NioClient.java:87)
at com.cloud.utils.nio.NioConnection.run(NioConnection.java:111)
at java.lang.Thread.run(Thread.java:745)

Can anybody suggest how I can debug the SSL layer? Clients are able to connect to port 8250, but they are disconnected just after the connection is established. Thank you very much, S.
Re: KVM - Migration of CLVM volumes to another primary storage fail
Hello Lucian, did you have any chance to try to reproduce my setup? :-) Best, S.

2014-04-20 15:26 GMT+02:00 Nux! n...@li.nux.ro:

On 20.04.2014 13:24, Salvatore Sciacco wrote:

2014-04-20 12:31 GMT+02:00 Nux! n...@li.nux.ro:

It looks like a bug, qemu-img convert should be used instead of cp -f, among others.

I suppose that some code was added to do a simple copy when the format is the same; this wasn't the case with the 4.1.1 version.

Do you mind opening an issue in https://issues.apache.org/jira ?

Already did :-) https://issues.apache.org/jira/browse/CLOUDSTACK-6462 Thanks, S.

Cool, I'll try to find out after the holidays if the problem exists in 4.3 as well and if yes, bug some people about it. Happy Easter :-) Lucian

--
Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
Re: KVM - Migration of CLVM volumes to another primary storage fail
Thanks! :-) I suppose there aren't many people running different CLVM pools in the same zone... S.

2014-04-20 15:26 GMT+02:00 Nux! n...@li.nux.ro:

On 20.04.2014 13:24, Salvatore Sciacco wrote:

2014-04-20 12:31 GMT+02:00 Nux! n...@li.nux.ro:

It looks like a bug, qemu-img convert should be used instead of cp -f, among others.

I suppose that some code was added to do a simple copy when the format is the same; this wasn't the case with the 4.1.1 version.

Do you mind opening an issue in https://issues.apache.org/jira ?

Already did :-) https://issues.apache.org/jira/browse/CLOUDSTACK-6462 Thanks, S.

Cool, I'll try to find out after the holidays if the problem exists in 4.3 as well and if yes, bug some people about it. Happy Easter :-) Lucian

--
Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro
KVM - Migration of CLVM volumes to another primary storage fail
ACS version: 4.2.1
Hypervisor: KVM
Storage pool type: CLVM

Since we upgraded from 4.1 to 4.2.1, moving volumes to a different primary storage pool fails. I've enabled debug on the agent side and I think there is a problem with the format type conversion. The volume on the database has format QCOW2; these are the parameters for the first step (CLVM -> NFS):

srcTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:cda46430-52d7-4bf0-b0c2-adfc78dd011c,volumeType:ROOT,dataStore:{org.apache.cloudstack.storage.to.PrimaryDataStoreTO:{uuid:655d6965-b3f3-4118-a970-d50cf6afc365,id:211,poolType:CLVM,host:localhost,path:/FC10KY1,port:0}},name:ROOT-4450,size:5368709120,path:39a25daf-23a1-4b65-99ac-fb98469ac197,volumeId:5937,vmName:i-402-4450-VM,accountId:402,format:QCOW2,id:5937,hypervisorType:KVM}}

destTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:cda46430-52d7-4bf0-b0c2-adfc78dd011c,volumeType:ROOT,dataStore:{com.cloud.agent.api.to.NfsTO:{_url:nfs://192.168.11.6/home/a1iwstack,_role:Image}},name:ROOT-4450,size:5368709120,path:volumes/402/5937,volumeId:5937,vmName:i-402-4450-VM,accountId:402,format:QCOW2,id:5937,hypervisorType:KVM}}

These commands are executed by the agent:

DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Executing: qemu-img info /dev/FC10KY1/39a25daf-23a1-4b65-99ac-fb98469ac197
DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Execution is successful.
DEBUG [utils.script.Script] (agentRequest-Handler-1:null) Executing: /bin/bash -c cp -f /dev/FC10KY1/39a25daf-23a1-4b65-99ac-fb98469ac197 /mnt/b8311c72-fe75-3832-98fc-975445028a12/5c713376-c418-478c-8a31-89c4181cb48e.qcow2

With the result that the output file isn't a qcow2 file but a raw partition, which in turn makes the next step fail.
(NFS -> CLVM)

DEBUG [utils.script.Script] (agentRequest-Handler-2:) Executing: qemu-img info /mnt/b8311c72-fe75-3832-98fc-975445028a12/b9303d8d-cd51-4b6c-a244-43c405df4238.qcow2
DEBUG [utils.script.Script] (agentRequest-Handler-2:) Execution is successful.
DEBUG [utils.script.Script] (agentRequest-Handler-2:) Executing: qemu-img convert -f qcow2 -O raw /mnt/b8311c72-fe75-3832-98fc-975445028a12/b9303d8d-cd51-4b6c-a244-43c405df4238.qcow2 /dev/FCSTORAGE/da162325-467b-4e78-af07-4bad85470d66
DEBUG [utils.script.Script] (agentRequest-Handler-2:) Exit value is 1
DEBUG [utils.script.Script] (agentRequest-Handler-2:) qemu-img: Could not open '/mnt/b8311c72-fe75-3832-98fc-975445028a12/b9303d8d-cd51-4b6c-a244-43c405df4238.qcow2'
ERROR [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:) Failed to convert /mnt/b8311c72-fe75-3832-98fc-975445028a12/b9303d8d-cd51-4b6c-a244-43c405df4238.qcow2 to /dev/FCSTORAGE/da162325-467b-4e78-af07-4bad85470d66 the error was: qemu-img: Could not open '/mnt/b8311c72-fe75-3832-98fc-975445028a12/b9303d8d-cd51-4b6c-a244-43c405df4238.qcow2'

If I change the format of the volume to RAW on the database, the effect is even worse, as *data is lost* in the process!
These are the parameters for the first step (CLVM -> NFS):

srcTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:cda46430-52d7-4bf0-b0c2-adfc78dd011c,volumeType:ROOT,dataStore:{org.apache.cloudstack.storage.to.PrimaryDataStoreTO:{uuid:655d6965-b3f3-4118-a970-d50cf6afc365,id:211,poolType:CLVM,host:localhost,path:/FC10KY1,port:0}},name:ROOT-4450,size:5368709120,path:39a25daf-23a1-4b65-99ac-fb98469ac197,volumeId:5937,vmName:i-402-4450-VM,accountId:402,format:RAW,id:5937,hypervisorType:KVM}}

destTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:cda46430-52d7-4bf0-b0c2-adfc78dd011c,volumeType:ROOT,dataStore:{com.cloud.agent.api.to.NfsTO:{_url:nfs://192.168.11.6/home/a1iwstack,_role:Image}},name:ROOT-4450,size:5368709120,path:volumes/402/5937,volumeId:5937,vmName:i-402-4450-VM,accountId:402,format:RAW,id:5937,hypervisorType:KVM}}

This time the output is converted to qcow2!

DEBUG [utils.script.Script] (agentRequest-Handler-3:null) Executing: qemu-img info /dev/FC10KY1/39a25daf-23a1-4b65-99ac-fb98469ac197
DEBUG [utils.script.Script] (agentRequest-Handler-3:null) Execution is successful.
DEBUG [utils.script.Script] (agentRequest-Handler-3:null) Executing: qemu-img convert -f raw -O qcow2 /dev/FC10KY1/39a25daf-23a1-4b65-99ac-fb98469ac197 /mnt/b8311c72-fe75-3832-98fc-975445028a12/01ab129f-aaf6-4b1a-8e2a-093bee0b811c.raw

and *data is lost* in the next step (NFS -> CLVM):

srcTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:cda46430-52d7-4bf0-b0c2-adfc78dd011c,volumeType:ROOT,dataStore:{com.cloud.agent.api.to.NfsTO:{_url:nfs://192.168.11.6/home/a1iwstack,_role:Image}},name:ROOT-4450,size:5368709120,path:volumes/402/5937/01ab129f-aaf6-4b1a-8e2a-093bee0b811c.raw,volumeId:5937,vmName:i-402-4450-VM,accountId:402,format:RAW,id:5937,hypervisorType:KVM}}
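The pattern in the logs suggests the copy path trusts the format recorded in the database: when source and destination formats match it falls back to a plain cp, and when they differ it converts, but in both cases the recorded source format is wrong for CLVM, where a logical volume is always raw on disk. A hypothetical sketch of the decision a correct fix would have to make (this is an illustration, not the actual LibvirtStorageAdaptor code):

```python
def choose_copy_command(src_path, src_db_format, dst_path, dst_format, src_pool_type):
    """Decide how to move a volume image between storage pools.

    Hypothetical illustration: CLVM pools hold raw logical volumes
    regardless of what the volumes table claims, so the recorded format
    must be overridden before comparing source and destination.
    """
    src_format = "raw" if src_pool_type == "CLVM" else src_db_format.lower()
    dst_format = dst_format.lower()
    if src_format == dst_format:
        # A plain copy is only safe when both sides really are the same format.
        return ["cp", "-f", src_path, dst_path]
    return ["qemu-img", "convert",
            "-f", src_format, "-O", dst_format,
            src_path, dst_path]
```

With this rule, the QCOW2-labelled CLVM volume above would be converted raw -> qcow2 on the way to NFS instead of byte-copied into a file with a misleading .qcow2 name.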
Re: KVM - Migration of CLVM volumes to another primary storage fail
2014-04-20 12:31 GMT+02:00 Nux! n...@li.nux.ro:

It looks like a bug, qemu-img convert should be used instead of cp -f, among others.

I suppose that some code was added to do a simple copy when the format is the same; this wasn't the case with the 4.1.1 version.

Do you mind opening an issue in https://issues.apache.org/jira ?

Already did :-) https://issues.apache.org/jira/browse/CLOUDSTACK-6462 Thanks, S.
Missing resize volume button for users in 4.1.1?
Hello, I noticed that after the upgrade to 4.1.1 the button to resize a data volume is missing for normal users, but it is present for ROOT admin users. IIRC the button was present when I installed 4.1.0. Thank you, Salvatore
Re: HA not working - CloudStack 4.1.0 and KVM hypervisor hosts
Is there a workaround / database update to declare a host dead so that HA operations can be triggered?

2013/7/25 Lennert den Teuling lenn...@pcextreme.nl

On 25-07-13 07:48, Bryan Whitehead wrote:

Starting off, there is never going to be a way to conclusively decide if a host is down. This is just the nature of complex systems. We can only hope our software does well, and if "well" is wrong, we have a way to clean up the mess created.

That said, I like the old behavior 3.0.x has. As I mentioned in -3535, I've had a host lose its network (e1000 oops in the kernel) and HA got triggered. The storage (in this case Gluster using a sharedmountpoint) wouldn't let qemu-kvm start on another host because the underlying qcow2 file was locked by an already running qemu-kvm process (on the machine that lost network). So HA being triggered didn't ruin any VM disks. Gluster was running on InfiniBand, so the shared storage with working locks prevented HA from screwing things up.

Further, even if Gluster lost connectivity, Gluster itself would split-brain and later I could decide which qcow2/disk image should be truth. Do I keep the VM that kept on running? Or do I keep the version HA booted and fscked? That's for me, the user, to decide. As a CloudStack admin/user I understand the risks of HA and I choose to live with them; I've even made sure that should such a disaster happen I can recover (Gluster will split-brain as well). The #1 reason for choosing HA is that I want the VM to be available as much as possible.

Right now 4.1 DOES NOT have HA... I don't know how emailing the admin to figure out what to do is being entertained as an option. That's just nonsense and is NOT HIGH AVAILABILITY. IMHO, if one is so terrified of HA screwing up, they should probably pass on HA and manually start things up. When a simple reproducible test like pulling the plug on a host can't trigger an HA event, then that feature doesn't exist. It is as simple as that.
I would like to add that when testing this on our development cluster, something bizarre happened.

First, when I killed the VMs _and_ the agent on the host, the HA worked just fine: after 10 minutes everything was restarted on a working host. The second time I turned off the host, nothing happened:

2013-07-25 15:31:41,347 DEBUG [cloud.ha.AbstractInvestigatorImpl] (AgentTaskPool-3:null) host (192.168.122.32) cannot be pinged, returning null ('I don't know')
2013-07-25 15:31:41,348 DEBUG [cloud.ha.UserVmDomRInvestigator] (AgentTaskPool-3:null) could not reach agent, could not reach agent's host, returning that we don't have enough information
2013-07-25 15:31:41,348 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-3:null) null unable to determine the state of the host. Moving on.
2013-07-25 15:31:41,348 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-3:null) null unable to determine the state of the host. Moving on.
2013-07-25 15:31:41,349 WARN [agent.manager.AgentManagerImpl] (AgentTaskPool-3:null) Agent state cannot be determined, do nothing

So when the host is still pingable it's OK to do a HA, but when it is totally unreachable it's not? My third try was even worse: I killed the agent, forgot to kill the VMs, and the management server restarted the VMs on another host, and it seems that all images are corrupted.

2013-07-25 15:37:31,614 DEBUG [agent.manager.AgentManagerImpl] (HA-Worker-2:work-29) Details from executing class com.cloud.agent.api.PingTestCommand:
PING 192.168.122.170 (192.168.122.170): 56 data bytes
64 bytes from 192.168.122.161: Destination Host Unreachable
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst Data
4 5 00 5400 0 0040 40 01 0cc4 192.168.122.161 192.168.122.170
--- 192.168.122.170 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
Unable to ping the vm, exiting
2013-07-25 15:37:31,614 DEBUG [cloud.ha.UserVmDomRInvestigator] (HA-Worker-2:work-29) VM[User|c88924e9-a8c9-4705-acc8-3237ffcf009d] could not be pinged, returning that it is unknown

Ping is disabled by default if you use security groups, so a ping test is not reliable. Concluding that a VM is down from a simple ping test is, when you use security groups for example, not the right option. (It's even dangerous.) I will do some more tests, but if it's true that my last HA was based on a failed ping, I will need to turn ping on on all my production instances asap.

I do agree with Bryan that HA needs to go automatically without intervention of a sysadmin. I think you could base a HA operation on:
- an unreachable agent
- an unpingable host
- a file with a timestamp on the network storage which updates every X seconds; when it's not updated, something is wrong

Ideally the management server would turn off the host using IPMI to make sure it's dead; then you are sure no corruption will happen.

On Wed, Jul 24, 2013 at 9:31 PM,
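The last suggestion above (a timestamp file on shared storage) is easy to prototype. A minimal sketch, assuming each host periodically touches its own heartbeat file and the checking side compares the file's age against a threshold; file names and the 60-second threshold are made up for illustration:

```python
import os
import time

def write_heartbeat(heartbeat_file):
    """What each host would run every few seconds: touch its heartbeat file
    on the shared storage mount."""
    with open(heartbeat_file, "a"):
        os.utime(heartbeat_file, None)

def host_seems_dead(heartbeat_file, max_age_s=60.0, now=None):
    """True if the host's heartbeat file is missing or has not been
    touched within max_age_s seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(heartbeat_file)
    except OSError:
        return True  # no heartbeat written at all: presume dead
    return (now - mtime) > max_age_s
```

A stale heartbeat is still only evidence; as the message notes, fencing the host (e.g. via IPMI) before restarting its VMs is what actually prevents disk corruption.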
Re: Does cloudstack support memory-overcommitting?
There is a global setting parameter for that, afaik...

On 08 Jul 2013 11:45, Tao Lin linba...@gmail.com wrote:

Hi there: does CloudStack support memory overcommitting, so that memory resources can be scheduled among VMs using the balloon device of KVM? And how does CloudStack handle this? Best regards.
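For context, the setting in question in 4.x is mem.overprovisioning.factor: capacity calculations multiply a host's physical RAM by that factor, while the KVM balloon device handles reclaiming memory inside the guests. The allocator-side arithmetic is roughly the sketch below (an illustration only, not CloudStack's actual capacity-manager code):

```python
def schedulable_memory_mb(physical_mb, overprovisioning_factor, allocated_mb):
    """Memory the allocator still considers free on a host once an
    overprovisioning factor is applied (illustrative arithmetic only)."""
    effective_total = physical_mb * overprovisioning_factor
    return max(effective_total - allocated_mb, 0)
```

So with a factor of 2.0, a 32 GiB host can accept VM reservations well past its physical RAM, on the assumption that ballooning keeps actual usage below it.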
Re: Enable HTTPS for CloudStack Web Interface
It worked on 4.1 for me, so I think it will be fine for 4.0. S.

On 02 Jul 2013 05:14, CSG - Ashley Lester ash...@computer-services.com.au wrote:

Hello, is anybody able to advise if this is valid for CS 4.0.2, or can anyone suggest a way to get HTTPS working? http://support.citrix.com/article/CTX132008 Best Regards, Ashley Lester
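The CTX article essentially amounts to adding an HTTPS connector to the management server's Tomcat configuration. A typical Tomcat connector stanza looks like the sketch below; the keystore path and password are placeholders, and the exact file to edit (server.xml vs. a packaged variant) differs between CloudStack releases, so treat this as a starting point rather than a verified recipe:

```xml
<!-- Hypothetical example: HTTPS connector for the CloudStack management UI.
     keystoreFile and keystorePass are placeholders; adjust to your setup. -->
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
           maxThreads="150" scheme="https" secure="true"
           clientAuth="false" sslProtocol="TLS"
           keystoreFile="/etc/cloudstack/management/web.keystore"
           keystorePass="changeme" />
```

After restarting the management server, the UI should answer on https://management-server:8443/client.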