instances using virtual router for DNS instead of DNS servers
When I set up CloudStack, I chose my physical DNS servers as both the internal and external DNS servers. They perform recursion, so they are suitable for queries about hosts on the LAN as well as on the rest of the internet. However, the /etc/resolv.conf file in my instances lists the virtual router first, followed by the physical servers I chose during setup. The virtual router does not successfully return answers about internal hosts, causing the instances to be unable to reach each other. I'm aware of the use.external.dns option but the last time I set that to true and restarted the virtual router, it failed to start up again. Why is DHCP assigning the virtual router as the first name server instead of using the ones I selected during setup?
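A quick way to see exactly what the DHCP lease wrote inside an instance is to list the nameserver entries in the order the resolver will try them. The sketch below is only illustrative: `nameservers` is a hypothetical helper, and the sample file contents are an assumption built from addresses that appear in logs elsewhere in this thread (a router address followed by the two physical DNS servers).

```python
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return nameserver addresses in the order the resolver will try them."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments introduced by '#' or ';' and surrounding whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


# Illustrative contents only (assumption): a virtual router address first,
# then the two physical DNS servers.
SAMPLE = """\
search lax.ratespecial.com
nameserver 192.168.102.222
nameserver 192.168.100.2
nameserver 192.168.100.3
"""

if __name__ == "__main__":
    for position, address in enumerate(nameservers(SAMPLE), start=1):
        print(f"{position}. {address}")
```

On a real instance the text would come from reading /etc/resolv.conf instead of the sample string.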
Re: services not running after reboot
I didn't find anything like that. Everything's been running OK over the weekend, so I will leave it as is.

On Mon, Oct 13, 2014 at 2:18 AM, Daan Hoogland wrote:

> Good going Ian, sorry you didn't get any assistance on the way. Did you
> find a setting that should have a different default? Like the router
> service offering memory :P or doesn't that make any sense?
Re: services not running after reboot
Aha! I restarted cloudstack-agent, which caused the virtual router to change to a "stopped" status in the management console. However, the console viewer icon was still visible, so I clicked it. The router had run out of memory and caused a kernel panic. I created a new system service offering with 500 MB of memory, changed the router's service offering, and started it. It booted with no problem. The default memory size of 128 MB is not enough. This is the system VM template I was using:

http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2
Re: services not running after reboot
I dropped all the cloud* databases, deleted everything in primary and secondary storage, and reinstalled the management server, following the guide I wrote for myself the last time I built a stable CloudStack system. Then I imported one of my backed up instances as a template and tried to create a new VM. Same problem as before. How is this possible?

2014-10-10 19:17:44,075 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:
2014-10-10 19:18:05,078 WARN [kvm.resource.LibvirtComputingResource] (Script-3:null) Interrupting script.
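The `-p` argument in the timed-out patchviasocket.pl call is easier to read once it is split into fields. The sketch below assumes the string is a flat list of `key=value` pairs separated by `%`, which is how it appears in the agent log; `parse_boot_args` is a hypothetical helper for reading the log, not part of CloudStack.

```python
from typing import Dict


def parse_boot_args(arg_string: str) -> Dict[str, str]:
    """Split a '%key=value%key=value' string into a dictionary."""
    args = {}
    for field in arg_string.split("%"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece before the leading '%'
        key, _, value = field.partition("=")
        args[key] = value
    return args


# Argument string copied from the r-4-VM agent log entry in this thread,
# with the mail line-wrap space after "domain=" removed.
BOOT_ARGS = (
    "%template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0"
    "%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24"
    "%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0"
    "%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3"
)

if __name__ == "__main__":
    for key, value in parse_boot_args(BOOT_ARGS).items():
        print(f"{key:18} {value}")
```

Read this way, the log shows the router being handed dns1=192.168.100.2 and dns2=192.168.100.3 at boot.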
Re: services not running after reboot
I've restarted all the services and restarted the servers too. The SSVM and CP start with no trouble. Every time I try to start or create an instance, I see repeated messages like these:

/var/log/cloudstack/agent/cloudstack-agent.out:
2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-19-VM -p %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:

/var/log/cloudstack/agent/security_group.log:
2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
Re: services not running after reboot
I tried to restart the network with the "clean up" option, via the web console. After several minutes, it failed to restart the network. The SSVM and CP are still running but the VR no longer exists. Why would these be able to start but not the virtual router?
Re: services not running after reboot
I restarted the libvirtd service and the management service is now fully started (there are services listening on ports 8250 and 9090). The SSVM health check script now reports no problems.

However, I tried starting an instance and both the instance and the virtual router are in a "starting" state but have been so for almost 10 minutes. In the catalina.out log I see:

INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 4, postpone power-change report by resetting power-change counters
INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 13, postpone power-change report by resetting power-change counters

I'm also seeing this in the agent.log:

2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (Script-6:null) Interrupting script.
2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:

And in the security_group.log:

2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!

What does this mean?
services not running after reboot
This morning I was unable to start new instances. I discovered that I could SSH into the SSVM and the console proxy but not the virtual router. Something strange was happening, so I thought it might be a good time to gracefully stop all the instances and reboot the hypervisor to see if the VR would start working again. I also rebooted the management server (a separate machine) to have a clean slate. Now that they've both been rebooted, the following symptoms exist:

* On the management server, there are no services listening on 9090 or 8250.
* When I run the SSVM health check script, it says NFS is not currently mounted.
* The management server log is reporting that Zone 1 is not ready to launch SSVM/CP yet, even though both of those are running.

The NFS server is running just fine. I can mount it in the management server with no problems. I've restarted cloudstack-management and cloudstack-agent but the problems persist. The "not ready to launch SSVM/CP yet" messages sound like the management server and the hypervisor are not communicating, or some information about the system state is out of sync. How can I confirm this?
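For the first symptom, one rough check is to attempt a TCP connection to each port and see whether anything accepts it. This is a minimal sketch, not a CloudStack tool: `port_is_open` is a hypothetical helper, and "localhost" assumes the script is run on the management server itself. It only shows whether something is listening, not whether the agent and management server agree on state.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the message above; the host name is an assumption.
    for port in (8250, 9090):
        state = "listening" if port_is_open("localhost", port) else "not reachable"
        print(f"port {port}: {state}")
```

Running the same check from the hypervisor against the management server's address would also show whether the agent can reach port 8250 across the network.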
Re: unable to start virtual router
I restarted the agent about a half dozen times and the router magically started itself. What's the best way to make my instances use our internal DNS servers?
unable to start virtual router
I wanted to bypass the virtual router as the first DNS server so that my instances would use our existing physical DNS servers. I followed the instructions here:

http://support.citrix.com/article/CTX138970

I set "use.external.dns" to true and then restarted the virtual router. The VR remained in a "starting" state indefinitely. Eventually it timed out. Now I can't start the VR. The management log says:

2014-10-09 14:00:13,275 ERROR [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Failed to start instance VM[DomainRouter|r-4-VM]
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95) Unable to complete AsyncJobVO {id:95, userId: 2, accountId: 2, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo: rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIABHQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 132241037805012, completeMsid: null, lastUpdated: null, lastPolled: null, created: Thu Oct 09 13:39:51 PDT 2014}, job origin:94
2014-10-09 14:00:13,763 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-1:ctx-e206f887 job-94) Unexpected exception while executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd
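The long cmdInfo value in that log entry is not noise. Its leading characters decode, as URL-safe base64, to the header of a Java serialization stream followed by a class name. The sketch below decodes only the first 44 characters of the logged value; `serialized_class_name` is a hypothetical helper that reads just the first class descriptor and makes no attempt to parse the rest of the object.

```python
import base64

# First 44 characters of the cmdInfo value from the management log above.
CMDINFO_PREFIX = "rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9"


def serialized_class_name(b64_text: str) -> str:
    """Decode URL-safe base64 and read the first class name from a Java serialization stream."""
    raw = base64.urlsafe_b64decode(b64_text)
    # Stream header: magic 0xACED, version 0x0005, then the markers for
    # "new object" (0x73) and "new class descriptor" (0x72).
    if raw[:6] != b"\xac\xed\x00\x05\x73\x72":
        raise ValueError("not a Java-serialized object with a class descriptor")
    # The class name is stored as a 2-byte big-endian length followed by the name.
    name_length = int.from_bytes(raw[6:8], "big")
    return raw[8:8 + name_length].decode("utf-8")


if __name__ == "__main__":
    print(serialized_class_name(CMDINFO_PREFIX))  # prints com.cloud.vm.VmWorkStart
```

The decoded name matches the `cmd:` field printed just before cmdInfo in the same log line, so the blob appears to be the serialized work item for the router start job rather than an additional error message.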
Re: upgraded to 4.4.1, management console is broken
Since the database had allegedly not been upgraded, I downgraded to 4.4.0. The management console is now available, but I can't start instances. Thinking I would have to back up all the instances and reinstall CloudStack from scratch again, I converted one of the instances to a template but was unable to download it. The browser times out while trying to connect to the SSVM. I can't connect to the SSVM using the RSA key either, although the console proxy shows it up and running, with the login prompt. I port scanned the SSVM and the only ports open were 80 and 443, not 22.

On Fri, Oct 3, 2014 at 1:51 PM, Ian Young wrote:
> It looks like this is the root of the problem:
>
> 2014-10-03 13:44:47,131 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating System Vm template IDs
> 2014-10-03 13:44:47,136 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating LXC System Vms
> 2014-10-03 13:44:47,137 WARN [c.c.u.d.Upgrade440to441] (main:null) 4.4.0 LXC SystemVm template not found. LXC hypervisor is not used, so not failing upgrade
> 2014-10-03 13:44:47,138 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating KVM System Vms
> 2014-10-03 13:44:47,141 ERROR [c.c.u.DatabaseUpgradeChecker] (main:null) Unable to upgrade the database
> com.cloud.utils.exception.CloudRuntimeException: 4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms
>
> Any ideas about how I can fix this? I had a 4.4.0 KVM SystemVm template prior to the upgrade.
>
> On Fri, Oct 3, 2014 at 1:39 PM, Ian Young wrote:
>> I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try to access the management console. The localhost.2014-10-03.log shows this error:
>>
>> Caused by: java.io.IOException: Resource
>> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
>> and
>> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
>> do not appear to be the same resource, please ensure the name property is correct or that the module is not defined twice
>>
>> Where is this value defined? The full log can be viewed here:
>>
>> pastebin.com/nrdEsxZK
>>
>> I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms." The URL for the system VM template in the 4.4.1 upgrade instructions is the same as the one I used when I installed 4.4.0 initially. Is there really a need to install the same template again?
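For anyone repeating the port check described above, here is a minimal Python sketch of a TCP connect scan. The function name `open_ports` is made up for illustration, and the port list is an assumption: 22, 80, and 443 come from the message, while 3922 is included because CloudStack's documentation describes SSH access to system VMs on that port rather than 22. Whether that applies to this particular setup is not confirmed by the thread.

```python
import socket

def open_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass
    return found

# Hypothetical usage; replace the address with the SSVM's IP:
# print(open_ports("192.0.2.10", [22, 80, 443, 3922]))
```

A closed or filtered port is simply omitted from the result, so an empty list means none of the listed ports answered within the timeout.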
Re: upgraded to 4.4.1, management console is broken
It looks like this is the root of the problem:

2014-10-03 13:44:47,131 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating System Vm template IDs
2014-10-03 13:44:47,136 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating LXC System Vms
2014-10-03 13:44:47,137 WARN [c.c.u.d.Upgrade440to441] (main:null) 4.4.0 LXC SystemVm template not found. LXC hypervisor is not used, so not failing upgrade
2014-10-03 13:44:47,138 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating KVM System Vms
2014-10-03 13:44:47,141 ERROR [c.c.u.DatabaseUpgradeChecker] (main:null) Unable to upgrade the database
com.cloud.utils.exception.CloudRuntimeException: 4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms

Any ideas about how I can fix this? I had a 4.4.0 KVM SystemVm template prior to the upgrade.

On Fri, Oct 3, 2014 at 1:39 PM, Ian Young wrote:
> I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try to access the management console. The localhost.2014-10-03.log shows this error:
>
> Caused by: java.io.IOException: Resource
> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
> and
> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
> do not appear to be the same resource, please ensure the name property is correct or that the module is not defined twice
>
> Where is this value defined? The full log can be viewed here:
>
> pastebin.com/nrdEsxZK
>
> I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms." The URL for the system VM template in the 4.4.1 upgrade instructions is the same as the one I used when I installed 4.4.0 initially. Is there really a need to install the same template again?
upgraded to 4.4.1, management console is broken
I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try to access the management console. The localhost.2014-10-03.log shows this error:

Caused by: java.io.IOException: Resource
[jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
and
[jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
do not appear to be the same resource, please ensure the name property is correct or that the module is not defined twice

Where is this value defined? The full log can be viewed here:

pastebin.com/nrdEsxZK

I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms." The URL for the system VM template in the 4.4.1 upgrade instructions is the same as the one I used when I installed 4.4.0 initially. Is there really a need to install the same template again?
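The IOException above names two jars in the same WEB-INF/lib directory, cloud-api-4.4.1.jar and cloud-api-4.4.0.jar, which suggests that files from both releases are present side by side. The following Python sketch, under that assumption, lists artifacts that appear in more than one version. The function name `find_duplicate_jars` and the name-version.jar filename pattern are illustrative choices, not part of CloudStack.

```python
import re
from collections import defaultdict

# Matches names such as "cloud-api-4.4.1.jar" -> name "cloud-api", version "4.4.1".
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)*)\.jar$")

def find_duplicate_jars(filenames):
    """Return {artifact name: sorted versions} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for filename in filenames:
        match = JAR_PATTERN.match(filename)
        if match:
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}

# Hypothetical usage against the directory named in the error message:
# import os
# lib = "/usr/share/cloudstack-management/webapps/client/WEB-INF/lib"
# print(find_duplicate_jars(os.listdir(lib)))
```

This only reports what is on disk; it does not say which copy is stale or how the leftover files got there.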
system capacity not updating
I set some new over-provisioning values, stopped all running instances, and restarted the management service. When I logged back in, the system capacity had not changed. All instances are stopped, yet the dashboard still reports the same resource usage as before I shut them down. How do I refresh this information?
Re: basic zone setup
Do you know which MySQL tables need to be updated to reference the new template? I'm worried that if I miss one the system will break unexpectedly the next time I launch a system VM. It might be worthwhile for me to simply reinstall the entire thing to be certain everything's set up correctly. On Thu, Jul 31, 2014 at 12:31 AM, Erik Weber wrote: > Yes, if you don't want to reinstall/re-seed the system vm template, you > should also download the new ones and do the mysql queries so that it is > used for any future system vm deployments. > > > Erik > > > On Thu, Jul 31, 2014 at 3:17 AM, Ian Young wrote: > > > Yes, that makes the ssvm-check pass all the tests. Thanks. Should I > > repeat that upgrade with the console proxy? > > > > > > On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui > > wrote: > > > > > There have been some messages going around about the template needing a > > fix > > > for this. > > > > > > Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c) > > > from > > > one of the messages you can try the following on the ssvm itself: > > > > > > apt-get update && apt-get -y install openjdk-7-jre-headless > > > openjdk-7-jre-lib && apt-get -y remove openjdk-6-jre-headless > > > > > > then you may also need to: > > > service cloud stop && sleep 3 && service cloud start > > > > > > try the ssvm-check again after that. > > > > > > My understanding is that this is not a permanent fix, but should get > you > > > going for now. 
> > > > > > > > > On Wed, Jul 30, 2014 at 6:00 PM, Ian Young > > wrote: > > > > > > > I found this in the cloud.out log in the SSVM: > > > > > > > > Exception in thread "main" java.lang.UnsupportedClassVersionError: > > > > com/cloud/agent/AgentShell : Unsupported major.minor version 51.0 > > > > at java.lang.ClassLoader.defineClass1(Native Method) > > > > at java.lang.ClassLoader.defineClass(ClassLoader.java:634) > > > > at > > > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > > > > at java.net.URLClassLoader.defineClass(URLClassLoader.java:277) > > > > at java.net.URLClassLoader.access$000(URLClassLoader.java:73) > > > > at java.net.URLClassLoader$1.run(URLClassLoader.java:212) > > > > at java.security.AccessController.doPrivileged(Native Method) > > > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205) > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:321) > > > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266) > > > > Could not find the main class: com.cloud.agent.AgentShell. Program > will > > > > exit. > > > > > > > > It seems to have to do with a Java version mismatch. I'm using JDK 7 > > on > > > > both the management server and hypervisor but the SSVM is using > version > > > 6. > > > > Is this the most current system VM template? 
> > > > > > > > > > > > > > > > > > http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 > > > > > > > > > > > > On Wed, Jul 30, 2014 at 5:25 PM, Ian Young > > > wrote: > > > > > > > > > root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh > > > > > > > > > > First DNS server is 192.168.100.2 > > > > > PING 192.168.100.2 (192.168.100.2): 48 data bytes > > > > > 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms > > > > > 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms > > > > > --- 192.168.100.2 ping statistics --- > > > > > 2 packets transmitted, 2 packets received, 0% packet loss > > > > > round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms > > > > > Good: Can ping DNS server > > > > > > > > > > Good: DNS resolves download.cloud.com > > > > > > > > > > ERROR: NFS is not currently mounted > > > > > Try manually mounting from inside the VM > > > > > NFS server is 169.254.1.0 > > > > > PING 169.254.1.0
Re: basic zone setup
Yes, that makes the ssvm-check pass all the tests. Thanks. Should I repeat that upgrade with the console proxy? On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui wrote: > There have been some messages going around about the template needing a fix > for this. > > Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c) > from > one of the messages you can try the following on the ssvm itself: > > apt-get update && apt-get -y install openjdk-7-jre-headless > openjdk-7-jre-lib && apt-get -y remove openjdk-6-jre-headless > > then you may also need to: > service cloud stop && sleep 3 && service cloud start > > try the ssvm-check again after that. > > My understanding is that this is not a permanent fix, but should get you > going for now. > > > On Wed, Jul 30, 2014 at 6:00 PM, Ian Young wrote: > > > I found this in the cloud.out log in the SSVM: > > > > Exception in thread "main" java.lang.UnsupportedClassVersionError: > > com/cloud/agent/AgentShell : Unsupported major.minor version 51.0 > > at java.lang.ClassLoader.defineClass1(Native Method) > > at java.lang.ClassLoader.defineClass(ClassLoader.java:634) > > at > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > > at java.net.URLClassLoader.defineClass(URLClassLoader.java:277) > > at java.net.URLClassLoader.access$000(URLClassLoader.java:73) > > at java.net.URLClassLoader$1.run(URLClassLoader.java:212) > > at java.security.AccessController.doPrivileged(Native Method) > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205) > > at java.lang.ClassLoader.loadClass(ClassLoader.java:321) > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266) > > Could not find the main class: com.cloud.agent.AgentShell. Program will > > exit. > > > > It seems to have to do with a Java version mismatch. I'm using JDK 7 on > > both the management server and hypervisor but the SSVM is using version > 6. 
> > Is this the most current system VM template? > > > > > > > http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 > > > > > > On Wed, Jul 30, 2014 at 5:25 PM, Ian Young > wrote: > > > > > root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh > > > > > > First DNS server is 192.168.100.2 > > > PING 192.168.100.2 (192.168.100.2): 48 data bytes > > > 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms > > > 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms > > > --- 192.168.100.2 ping statistics --- > > > 2 packets transmitted, 2 packets received, 0% packet loss > > > round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms > > > Good: Can ping DNS server > > > > > > Good: DNS resolves download.cloud.com > > > > > > ERROR: NFS is not currently mounted > > > Try manually mounting from inside the VM > > > NFS server is 169.254.1.0 > > > PING 169.254.1.0 (169.254.1.0): 48 data bytes > > > 56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms > > > 56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms > > > --- 169.254.1.0 ping statistics --- > > > 2 packets transmitted, 2 packets received, 0% packet loss > > > round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms > > > Good: Can ping nfs server > > > > > > Management server is 192.168.101.3. Checking connectivity. > > > Good: Can connect to management server port 8250 > > > > > > ERROR: Java process not running. Try restarting the SSVM. > > > > > > It says the NFS server is 169.254.1.0 which is the SSVM's link local > > > address. How did it decide that? During the zone configuration I > > > specified "virthost1.lax.ratespecial.com" as the NFS server and that > > > resolves to 192.168.101.4. Also, in what path does it expect the NFS > > > volume to be mounted? > > > > > > > > > On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui > > > wrote: > > > > > >> If the template is not ready then your ssvm may be having problems > > >> downloading it. 
Have you followed the info here: > > >> > > >
Re: basic zone setup
I found this in the cloud.out log in the SSVM: Exception in thread "main" java.lang.UnsupportedClassVersionError: com/cloud/agent/AgentShell : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:634) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:277) at java.net.URLClassLoader.access$000(URLClassLoader.java:73) at java.net.URLClassLoader$1.run(URLClassLoader.java:212) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:205) at java.lang.ClassLoader.loadClass(ClassLoader.java:321) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) at java.lang.ClassLoader.loadClass(ClassLoader.java:266) Could not find the main class: com.cloud.agent.AgentShell. Program will exit. It seems to have to do with a Java version mismatch. I'm using JDK 7 on both the management server and hypervisor but the SSVM is using version 6. Is this the most current system VM template? 
http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 On Wed, Jul 30, 2014 at 5:25 PM, Ian Young wrote: > root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh > > First DNS server is 192.168.100.2 > PING 192.168.100.2 (192.168.100.2): 48 data bytes > 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms > 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms > --- 192.168.100.2 ping statistics --- > 2 packets transmitted, 2 packets received, 0% packet loss > round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms > Good: Can ping DNS server > > Good: DNS resolves download.cloud.com > > ERROR: NFS is not currently mounted > Try manually mounting from inside the VM > NFS server is 169.254.1.0 > PING 169.254.1.0 (169.254.1.0): 48 data bytes > 56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms > 56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms > --- 169.254.1.0 ping statistics --- > 2 packets transmitted, 2 packets received, 0% packet loss > round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms > Good: Can ping nfs server > > Management server is 192.168.101.3. Checking connectivity. > Good: Can connect to management server port 8250 > > ERROR: Java process not running. Try restarting the SSVM. > > It says the NFS server is 169.254.1.0 which is the SSVM's link local > address. How did it decide that? During the zone configuration I > specified "virthost1.lax.ratespecial.com" as the NFS server and that > resolves to 192.168.101.4. Also, in what path does it expect the NFS > volume to be mounted? > > > On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui > wrote: > >> If the template is not ready then your ssvm may be having problems >> downloading it. Have you followed the info here: >> >> https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting >> to make sure that the ssvm is actually working properly? 
>> >> >> On Wed, Jul 30, 2014 at 4:39 PM, Ian Young >> wrote: >> >> > I've reinstalled CloudStack 4.4 again, configuring the network as >> follows: >> > >> > management server: >> > p4p1 http://pastebin.com/skMXxVtk >> > >> > hypervisor/storage server: >> > eth0 http://pastebin.com/LxUxFdpe >> > eth1 http://pastebin.com/K5si1L4d >> > cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic) >> > cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic) >> > >> > When I logged into the management GUI for the first time, I skipped the >> > wizard and went straight to the dashboard. There, I set up a basic >> zone as >> > follows: >> > >> > http://imgur.com/a/R1dX0#0 >> > >> > Now that the infrastructure has been launched and the SSVM and console >> > proxy are running, I noticed that the CentOS template is not ready. >> > Neither the management server or the hypervisor are downloading >> anything, >> > so it doesn't appear the CentOS template will be ready. If I try to >> > register my own templates, I fill out all the fields but the window just >> > disappears when I click OK and no template is added. I don't see any >> new >> > messages in the management server l
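On the "Unsupported major.minor version 51.0" error discussed above: the number is the class-file format version, and major version 51 corresponds to Java 7, while a Java 6 runtime accepts class files only up to major version 50. As a rough sketch, the Python function below reads that version from the first 8 bytes of a .class file. The function name and the small lookup table are illustrative, and the table covers only a few releases.

```python
import struct

# Class-file major versions for a few Java releases (per the JVM specification).
MAJOR_TO_JAVA = {49: "5", 50: "6", 51: "7", 52: "8"}

def class_file_java_version(header: bytes) -> str:
    """Return the Java release a .class file targets, given at least its first 8 bytes."""
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version, big-endian.
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return MAJOR_TO_JAVA.get(major, f"unknown (major {major})")

# Hypothetical usage on a class extracted from a jar:
# with open("AgentShell.class", "rb") as f:
#     print(class_file_java_version(f.read(8)))
```

A result of "7" for the agent classes combined with a Java 6 runtime in the SSVM would match the mismatch described in the message.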
Re: basic zone setup
root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh First DNS server is 192.168.100.2 PING 192.168.100.2 (192.168.100.2): 48 data bytes 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms --- 192.168.100.2 ping statistics --- 2 packets transmitted, 2 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms Good: Can ping DNS server Good: DNS resolves download.cloud.com ERROR: NFS is not currently mounted Try manually mounting from inside the VM NFS server is 169.254.1.0 PING 169.254.1.0 (169.254.1.0): 48 data bytes 56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms 56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms --- 169.254.1.0 ping statistics --- 2 packets transmitted, 2 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms Good: Can ping nfs server Management server is 192.168.101.3. Checking connectivity. Good: Can connect to management server port 8250 ERROR: Java process not running. Try restarting the SSVM. It says the NFS server is 169.254.1.0 which is the SSVM's link local address. How did it decide that? During the zone configuration I specified "virthost1.lax.ratespecial.com" as the NFS server and that resolves to 192.168.101.4. Also, in what path does it expect the NFS volume to be mounted? On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui wrote: > If the template is not ready then your ssvm may be having problems > downloading it. Have you followed the info here: > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting > to make sure that the ssvm is actually working properly? 
> > > On Wed, Jul 30, 2014 at 4:39 PM, Ian Young wrote: > > > I've reinstalled CloudStack 4.4 again, configuring the network as > follows: > > > > management server: > > p4p1 http://pastebin.com/skMXxVtk > > > > hypervisor/storage server: > > eth0 http://pastebin.com/LxUxFdpe > > eth1 http://pastebin.com/K5si1L4d > > cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic) > > cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic) > > > > When I logged into the management GUI for the first time, I skipped the > > wizard and went straight to the dashboard. There, I set up a basic zone > as > > follows: > > > > http://imgur.com/a/R1dX0#0 > > > > Now that the infrastructure has been launched and the SSVM and console > > proxy are running, I noticed that the CentOS template is not ready. > > Neither the management server or the hypervisor are downloading > anything, > > so it doesn't appear the CentOS template will be ready. If I try to > > register my own templates, I fill out all the fields but the window just > > disappears when I click OK and no template is added. I don't see any new > > messages in the management server log at the time this occurs. I suspect > > there is a storage problem. However, I can mount the NFS shares onto the > > management server with no problems. That's how I was able to manually > > download the system VM template, as the installation guide indicated. > > What's wrong with this setup? I don't see any obvious errors in the > > management log besides these repetitive messages, which seem to > contradict > > the fact that there is a SSVM and console proxy running: > > > > http://pastebin.com/yvW5GmSB > > >
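The observation above that 169.254.1.0 is a link-local address, while the intended NFS server at 192.168.101.4 is an ordinary private address, can be checked with Python's standard ipaddress module. This is only a small sketch with a made-up helper name, `describe_address`; it does not explain why ssvm-check.sh reported that address.

```python
import ipaddress

def describe_address(address: str) -> str:
    """Classify an IP address as link-local, private, or public."""
    ip = ipaddress.ip_address(address)
    # Check link-local first: 169.254.0.0/16 also counts as private in this module.
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "private"
    return "public"

for addr in ("169.254.1.0", "192.168.101.4"):
    print(addr, "is", describe_address(addr))
```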
basic zone setup
I've reinstalled CloudStack 4.4 again, configuring the network as follows:

management server:
p4p1 http://pastebin.com/skMXxVtk

hypervisor/storage server:
eth0 http://pastebin.com/LxUxFdpe
eth1 http://pastebin.com/K5si1L4d
cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic)
cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic)

When I logged into the management GUI for the first time, I skipped the wizard and went straight to the dashboard. There, I set up a basic zone as follows:

http://imgur.com/a/R1dX0#0

Now that the infrastructure has been launched and the SSVM and console proxy are running, I noticed that the CentOS template is not ready. Neither the management server nor the hypervisor is downloading anything, so it doesn't appear the CentOS template will ever be ready. If I try to register my own templates, I fill out all the fields but the window just disappears when I click OK and no template is added. I don't see any new messages in the management server log at the time this occurs. I suspect there is a storage problem. However, I can mount the NFS shares onto the management server with no problems. That's how I was able to manually download the system VM template, as the installation guide indicated.

What's wrong with this setup? I don't see any obvious errors in the management log besides these repetitive messages, which seem to contradict the fact that there is an SSVM and a console proxy running:

http://pastebin.com/yvW5GmSB
Re: dual NIC VLAN configuration
Is private traffic the same thing as management/storage traffic? On Fri, Jul 25, 2014 at 11:17 PM, Geoff Higginbottom < geoff.higginbot...@shapeblue.com> wrote: > Hi Ian, > > As you are deploying a Basic network there will be no public traffic. > > The private traffic, assuming you allocate an IP range to the POD which is > in the same CIDR as the Management Server would typically be assigned to > cloudbr0 > > private.network.device=cloudbr0 > > Guest traffic would then be assigned to cloudbr1 > > guest.network.device=cloudbr1 > > > > Regards > > Geoff Higginbottom > CTO / Cloud Architect > > D: +44 20 3603 0542 | S: +44 20 3603 0540 +442036030540> | M: +447968161581 > > geoff.higginbot...@shapeblue.com<mailto:geoff.higginbot...@shapeblue.com> > | www.shapeblue.com | Twitter:@cloudstackguru< > https://twitter.com/#!/cloudstackguru> > > ShapeBlue Ltd, 53 Chandos Place, Covent Garden, London, WC2N > 4HS > > > On 25 Jul 2014, at 19:18, "Ian Young" iyo...@ratespecial.com>> wrote: > > So if management/storage traffic is on cloudbr0 and guest VMs are on > cloudbr1, would these be the correct settings in agent.properties? > > guest.network.device=cloudbr1 > private.network.device=cloudbr1 > public.network.device=cloudbr1 > > > On Fri, Jul 25, 2014 at 10:11 AM, Ian Young <mailto:iyo...@ratespecial.com>> wrote: > > Thank you, Geoff. That was precisely the answer I was looking for. I > knew I was doing something wrong. I didn't realize the second adapter > could be used without an IP address explicitly assigned to it. Yes, this > is a basic zone (just an internal project so we don't need any public IP > addresses). I was planning to set up an NFS server on the > 192.168.101.0/24 network so this is exactly what I was trying to > accomplish. Thanks. 
> > > On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom < > geoff.higginbot...@shapeblue.com<mailto:geoff.higginbot...@shapeblue.com>> > wrote: > > Ian, > > It looks like you are trying to setup a basic zone and have a Management > Server on IP 192.168.101.3 and a Host on IP 192.168.101.4. > > The second interface on the host does not need any IP configuration on > the Host as it will not be used by the Host so remove the 192.168.102.4 > mapping.. This interface will be used by the Guest VMs running on the Host > who will have their own IP schema. > > Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway > of 192.168.102.1 > > The Management Serve will talk to the Host via the 1st Interface, and > Guest VMs will use the 2nd. > > You have not mentioned storage, but assuming you are using NFS for > Primary and Secondary, put the NFS Server on the 192.168.101.0/24 > network, and then all storage traffic will also go over the 1st interface. > > Regards > > Geoff Higginbottom > > D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 > > geoff.higginbot...@shapeblue.com<mailto:geoff.higginbot...@shapeblue.com> > > -Original Message- > From: Daan Hoogland [mailto:daan.hoogl...@gmail.com] > Sent: 25 July 2014 08:47 > To: users@cloudstack.apache.org<mailto:users@cloudstack.apache.org> > Subject: Re: dual NIC VLAN configuration > > Ian, I would imagine that guest traffic can't go out to the net this way. > Maybe you should swap them. This is only guessing however. What are you > seeing? > > On Fri, Jul 25, 2014 at 2:00 AM, Ian Young iyo...@ratespecial.com>> > wrote: > Here's the less verbose version: My hypervisor has two NICs and I've > set up a label on each. Traffic to and from cloudbr0 works perfectly. > Traffic going into cloudbr1 goes out cloudbr0 because that interface > has a default gateway. Will this pose a problem when I try to set up > separate management and guest networks in CloudStack? 
> > > On Thu, Jul 24, 2014 at 10:56 AM, Ian Young <mailto:iyo...@ratespecial.com>> > wrote: > > I am trying to set up a server with two NICs as a hypervisor. I > would like to use the two interfaces to separate management and guest > traffic, as recommended by the CloudStack installation guide. This > server is connected to a managed switch, which is connected to a > hardware firewall, both of which are set up with tagged VLANs. Some > of the ports on the switch are designated as VLAN 6 and some are VLAN > 7. I've confirmed the VLANs are set up correctly by configuring eth0 > and eth1 (one at a time) with the appropriate IP address, netmask, and > gateway. > > However, the difficulty arises when I try to configure both > interfaces simultaneously. The return traffic ten
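To compare an agent.properties file against the mapping Geoff suggests in the quoted reply (private traffic on cloudbr0, guest traffic on cloudbr1), a small Python sketch along these lines may help. The parser handles only plain key=value lines, and the helper names `parse_properties` and `mismatches` are made up for illustration. The expected values are taken from the quoted reply and apply to this basic-zone layout only.

```python
# Mapping suggested in the thread for this basic zone (an assumption for any other setup).
EXPECTED = {
    "private.network.device": "cloudbr0",
    "guest.network.device": "cloudbr1",
}

def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def mismatches(props, expected=EXPECTED):
    """Return {key: (actual, expected)} for every key that differs from the expected mapping."""
    return {k: (props.get(k), v) for k, v in expected.items() if props.get(k) != v}

# The settings proposed earlier in the thread, all pointing at cloudbr1:
proposed = parse_properties(
    "guest.network.device=cloudbr1\n"
    "private.network.device=cloudbr1\n"
    "public.network.device=cloudbr1\n"
)
print(mismatches(proposed))
```

With the proposed settings, only private.network.device is flagged, which is the one value Geoff's reply changes.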
Re: dual NIC VLAN configuration
So if management/storage traffic is on cloudbr0 and guest VMs are on cloudbr1, would these be the correct settings in agent.properties? guest.network.device=cloudbr1 private.network.device=cloudbr1 public.network.device=cloudbr1 On Fri, Jul 25, 2014 at 10:11 AM, Ian Young wrote: > Thank you, Geoff. That was precisely the answer I was looking for. I > knew I was doing something wrong. I didn't realize the second adapter > could be used without an IP address explicitly assigned to it. Yes, this > is a basic zone (just an internal project so we don't need any public IP > addresses). I was planning to set up an NFS server on the > 192.168.101.0/24 network so this is exactly what I was trying to > accomplish. Thanks. > > > On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom < > geoff.higginbot...@shapeblue.com> wrote: > >> Ian, >> >> It looks like you are trying to setup a basic zone and have a Management >> Server on IP 192.168.101.3 and a Host on IP 192.168.101.4. >> >> The second interface on the host does not need any IP configuration on >> the Host as it will not be used by the Host so remove the 192.168.102.4 >> mapping.. This interface will be used by the Guest VMs running on the Host >> who will have their own IP schema. >> >> Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway >> of 192.168.102.1 >> >> The Management Serve will talk to the Host via the 1st Interface, and >> Guest VMs will use the 2nd. >> >> You have not mentioned storage, but assuming you are using NFS for >> Primary and Secondary, put the NFS Server on the 192.168.101.0/24 >> network, and then all storage traffic will also go over the 1st interface. 
>> >> Regards >> >> Geoff Higginbottom >> >> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 >> >> geoff.higginbot...@shapeblue.com >> >> -Original Message- >> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com] >> Sent: 25 July 2014 08:47 >> To: users@cloudstack.apache.org >> Subject: Re: dual NIC VLAN configuration >> >> Ian, I would imagine that guest traffic can't go out to the net this way. >> Maybe you should swap them. This is only guessing however. What are you >> seeing? >> >> On Fri, Jul 25, 2014 at 2:00 AM, Ian Young >> wrote: >> > Here's the less verbose version: My hypervisor has two NICs and I've >> > set up a label on each. Traffic to and from cloudbr0 works perfectly. >> > Traffic going into cloudbr1 goes out cloudbr0 because that interface >> > has a default gateway. Will this pose a problem when I try to set up >> > separate management and guest networks in CloudStack? >> > >> > >> > On Thu, Jul 24, 2014 at 10:56 AM, Ian Young >> wrote: >> > >> >> I am trying to set up a server with two NICs as a hypervisor. I >> >> would like to use the two interfaces to separate management and guest >> >> traffic, as recommended by the CloudStack installation guide. This >> >> server is connected to a managed switch, which is connected to a >> >> hardware firewall, both of which are set up with tagged VLANs. Some >> >> of the ports on the switch are designated as VLAN 6 and some are VLAN >> >> 7. I've confirmed the VLANs are set up correctly by configuring eth0 >> >> and eth1 (one at a time) with the appropriate IP address, netmask, and >> gateway. >> >> >> >> However, the difficulty arises when I try to configure both >> >> interfaces simultaneously. The return traffic tends to go out >> >> whichever interface is associated with the default gateway, a typical >> >> issue when using multiple network interfaces. 
I've followed numerous >> >> guides, which all basically say the same thing: Don't set a default >> >> gateway; use iproute2 to control the flow of traffic with route-eth0, >> >> rule-eth0, and rt_tables. I've tried setting this up numerous times >> >> to no avail, probably because the guides I'm reading don't involve >> >> VLANs. Add to that the the cloudbr0 and cloudbr1 bridges that >> >> CloudStack requires and now I'm really confused as to how to set up >> >> the network. I can't be the first person to have set up CloudStack >> >> this way; it sounds pretty common. Can someone explain to me the >> correct way to configure these interfaces? >> >> >> >> Here is my network information: >> >&g
Re: dual NIC VLAN configuration
Thank you, Geoff. That was precisely the answer I was looking for. I knew I was doing something wrong. I didn't realize the second adapter could be used without an IP address explicitly assigned to it. Yes, this is a basic zone (just an internal project, so we don't need any public IP addresses). I was planning to set up an NFS server on the 192.168.101.0/24 network, so this is exactly what I was trying to accomplish. Thanks.

On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom < geoff.higginbot...@shapeblue.com> wrote:

> Ian,
>
> It looks like you are trying to set up a basic zone and have a Management
> Server on IP 192.168.101.3 and a Host on IP 192.168.101.4.
>
> The second interface on the host does not need any IP configuration on the
> Host as it will not be used by the Host, so remove the 192.168.102.4
> mapping. This interface will be used by the Guest VMs running on the Host,
> which will have their own IP schema.
>
> Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway
> of 192.168.102.1.
>
> The Management Server will talk to the Host via the 1st interface, and
> Guest VMs will use the 2nd.
>
> You have not mentioned storage, but assuming you are using NFS for Primary
> and Secondary, put the NFS server on the 192.168.101.0/24 network, and
> then all storage traffic will also go over the 1st interface.
>
> Regards
>
> Geoff Higginbottom
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
>
> -----Original Message-----
> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com]
> Sent: 25 July 2014 08:47
> To: users@cloudstack.apache.org
> Subject: Re: dual NIC VLAN configuration
>
> Ian, I would imagine that guest traffic can't go out to the net this way.
> Maybe you should swap them. This is only guessing however. What are you
> seeing?
>
> On Fri, Jul 25, 2014 at 2:00 AM, Ian Young wrote:
> > Here's the less verbose version: My hypervisor has two NICs and I've
> > set up a label on each. Traffic to and from cloudbr0 works perfectly.
> > Traffic going into cloudbr1 goes out cloudbr0 because that interface
> > has a default gateway. Will this pose a problem when I try to set up
> > separate management and guest networks in CloudStack?
> >
> > On Thu, Jul 24, 2014 at 10:56 AM, Ian Young wrote:
> >
> >> I am trying to set up a server with two NICs as a hypervisor. I
> >> would like to use the two interfaces to separate management and guest
> >> traffic, as recommended by the CloudStack installation guide. This
> >> server is connected to a managed switch, which is connected to a
> >> hardware firewall, both of which are set up with tagged VLANs. Some
> >> of the ports on the switch are designated as VLAN 6 and some are VLAN
> >> 7. I've confirmed the VLANs are set up correctly by configuring eth0
> >> and eth1 (one at a time) with the appropriate IP address, netmask, and
> >> gateway.
> >>
> >> However, the difficulty arises when I try to configure both
> >> interfaces simultaneously. The return traffic tends to go out
> >> whichever interface is associated with the default gateway, a typical
> >> issue when using multiple network interfaces. I've followed numerous
> >> guides, which all basically say the same thing: Don't set a default
> >> gateway; use iproute2 to control the flow of traffic with route-eth0,
> >> rule-eth0, and rt_tables. I've tried setting this up numerous times
> >> to no avail, probably because the guides I'm reading don't involve
> >> VLANs. Add to that the cloudbr0 and cloudbr1 bridges that
> >> CloudStack requires and now I'm really confused as to how to set up
> >> the network. I can't be the first person to have set up CloudStack
> >> this way; it sounds pretty common. Can someone explain to me the
> >> correct way to configure these interfaces?
> >>
> >> Here is my network information:
> >>
> >> VLAN 6 (management)
> >> 192.168.101.0/24
> >> gateway: 192.168.101.1
> >>
> >> VLAN 7 (guest)
> >> 192.168.102.0/24
> >> gateway: 192.168.102.1
> >>
> >> current hypervisor settings:
> >> eth0: 192.168.101.4
> >> eth1: 192.168.102.4
> >>
> >> current management server settings (this is a separate machine):
> >> p4p1: 192.168.101.3
> >>
>
> --
> Daan
Re: dual NIC VLAN configuration
Here's the less verbose version: My hypervisor has two NICs and I've set up a label on each. Traffic to and from cloudbr0 works perfectly. Traffic going into cloudbr1 goes out cloudbr0 because that interface has a default gateway. Will this pose a problem when I try to set up separate management and guest networks in CloudStack?

On Thu, Jul 24, 2014 at 10:56 AM, Ian Young wrote:

> I am trying to set up a server with two NICs as a hypervisor. I would
> like to use the two interfaces to separate management and guest traffic, as
> recommended by the CloudStack installation guide. This server is connected
> to a managed switch, which is connected to a hardware firewall, both of
> which are set up with tagged VLANs. Some of the ports on the switch are
> designated as VLAN 6 and some are VLAN 7. I've confirmed the VLANs are set
> up correctly by configuring eth0 and eth1 (one at a time) with the
> appropriate IP address, netmask, and gateway.
>
> However, the difficulty arises when I try to configure both interfaces
> simultaneously. The return traffic tends to go out whichever interface is
> associated with the default gateway, a typical issue when using multiple
> network interfaces. I've followed numerous guides, which all basically say
> the same thing: Don't set a default gateway; use iproute2 to control the
> flow of traffic with route-eth0, rule-eth0, and rt_tables. I've tried
> setting this up numerous times to no avail, probably because the guides I'm
> reading don't involve VLANs. Add to that the cloudbr0 and cloudbr1
> bridges that CloudStack requires and now I'm really confused as to how to
> set up the network. I can't be the first person to have set up CloudStack
> this way; it sounds pretty common. Can someone explain to me the correct
> way to configure these interfaces?
>
> Here is my network information:
>
> VLAN 6 (management)
> 192.168.101.0/24
> gateway: 192.168.101.1
>
> VLAN 7 (guest)
> 192.168.102.0/24
> gateway: 192.168.102.1
>
> current hypervisor settings:
> eth0: 192.168.101.4
> eth1: 192.168.102.4
>
> current management server settings (this is a separate machine):
> p4p1: 192.168.101.3
dual NIC VLAN configuration
I am trying to set up a server with two NICs as a hypervisor. I would like to use the two interfaces to separate management and guest traffic, as recommended by the CloudStack installation guide. This server is connected to a managed switch, which is connected to a hardware firewall, both of which are set up with tagged VLANs. Some of the ports on the switch are designated as VLAN 6 and some are VLAN 7. I've confirmed the VLANs are set up correctly by configuring eth0 and eth1 (one at a time) with the appropriate IP address, netmask, and gateway.

However, the difficulty arises when I try to configure both interfaces simultaneously. The return traffic tends to go out whichever interface is associated with the default gateway, a typical issue when using multiple network interfaces. I've followed numerous guides, which all basically say the same thing: Don't set a default gateway; use iproute2 to control the flow of traffic with route-eth0, rule-eth0, and rt_tables. I've tried setting this up numerous times to no avail, probably because the guides I'm reading don't involve VLANs. Add to that the cloudbr0 and cloudbr1 bridges that CloudStack requires, and now I'm really confused as to how to set up the network. I can't be the first person to have set up CloudStack this way; it sounds pretty common. Can someone explain to me the correct way to configure these interfaces?

Here is my network information:

VLAN 6 (management)
192.168.101.0/24
gateway: 192.168.101.1

VLAN 7 (guest)
192.168.102.0/24
gateway: 192.168.102.1

current hypervisor settings:
eth0: 192.168.101.4
eth1: 192.168.102.4

current management server settings (this is a separate machine):
p4p1: 192.168.101.3
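For anyone following along, the address plan from Geoff's reply earlier in this thread can be sanity-checked with a few lines of Python. This is only a sketch: the networks and host addresses come from the messages above, while the function name and the NFS address 192.168.101.5 are made-up placeholders (the thread never gives an NFS address).

```python
import ipaddress

# Networks taken from the thread.
MGMT_NET = ipaddress.ip_network("192.168.101.0/24")   # VLAN 6, management
GUEST_NET = ipaddress.ip_network("192.168.102.0/24")  # VLAN 7, guest

# Addresses that should live on the management network.
# The NFS address is a hypothetical placeholder, not from the thread.
mgmt_addresses = {
    "management server (p4p1)": "192.168.101.3",
    "hypervisor (eth0)": "192.168.101.4",
    "nfs server (hypothetical)": "192.168.101.5",
}

def misplaced(addresses, network):
    """Return the labels whose address is not inside `network`."""
    return [label for label, addr in addresses.items()
            if ipaddress.ip_address(addr) not in network]

# Everything host-side belongs on the management network.
print("outside management net:", misplaced(mgmt_addresses, MGMT_NET))

# The guest gateway must sit inside the guest CIDR; the hypervisor itself
# gets no address there (the old 192.168.102.4 mapping is removed).
guest_gateway = ipaddress.ip_address("192.168.102.1")
print("guest gateway in guest net:", guest_gateway in GUEST_NET)
```

The old eth1 address (192.168.102.4) would show up as misplaced if it were added to the management list, which matches the advice to drop it from the host configuration.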
local storage for system VMs
My CloudStack 4.3 system is a single server (for the time being, at least). Since a system VM malfunction is a show-stopper, I would like to host those on local storage to avoid issues with NFS mounts. I have changed the value of system.vm.use.local.storage to true. I don't see an option for use.local.storage, so maybe that's been removed. The system offerings for SSVM, console proxy, and software router are now set to Storage Type = local. Do I need to create new compute offerings, or is that for regular instances? I want to keep normal instances on shared storage.

How do I make sure the system VMs are running on local storage? I've restarted them but the qemu process still says

    -drive file=/mnt/2a7ec307-d797-3287-aa31-7e280afb56cf/d8668fbc-dd3b-4c85-952e-40947eda7b99,if=none,id=drive-virtio-disk0,format=qcow2,cache=none

which is a shared volume. Do I need to destroy them and create new ones?
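A quick way to check which backing file a VM is really using is to pull the file= value out of the qemu command line. The sketch below does that for the fragment quoted in the message; the function name is made up, and the rule that a path under /mnt/ means the NFS-mounted pool is an assumption based on that one example, not a general CloudStack guarantee.

```python
import re

def drive_files(cmdline):
    """Return the file= path of every -drive option in a qemu command line."""
    return re.findall(r"-drive\s+file=([^,\s]+)", cmdline)

# Fragment copied from the message above.
cmdline = ("-drive file=/mnt/2a7ec307-d797-3287-aa31-7e280afb56cf/"
           "d8668fbc-dd3b-4c85-952e-40947eda7b99,if=none,"
           "id=drive-virtio-disk0,format=qcow2,cache=none")

for path in drive_files(cmdline):
    # Assumption: volumes under /mnt/<pool-uuid>/ sit on the shared NFS pool.
    kind = "shared (NFS mount)" if path.startswith("/mnt/") else "other"
    print(path, "->", kind)
```

The full command line of a running system VM could be passed to drive_files in place of the fragment.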
Re: console proxy times out
I'm still getting a lot of these, though. 2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) HA on VM[ConsoleProxy|v-2-VM] 2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) VM VM[ConsoleProxy|v-2-VM] has been changed. Current State = Running Previous State = Starting last updated = 571 previous updated = 568 2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) Completed HAWork[873-HA-2-Starting-Investigating] On Fri, May 23, 2014 at 10:50 AM, Ian Young wrote: > I destroyed the SSVM and then tried hacking the database to make > CloudStack realize that the console proxy is in fact stopped. > > mysql> update vm_instance set state='Stopped' where name='v-2-VM'; > mysql> update host set status='Up' where name='v-2-VM'; > > Now they're both running and I can see the console. There's got to be a > better way to use this system without having to reboot or hack the database > daily. > > > On Fri, May 23, 2014 at 10:42 AM, Ian Young wrote: > >> Also, is this normal? Every time the server is rebooted, it adds another >> record to the mshost table but the "removed" field is always NULL. >> >> http://pastebin.com/q5zDCu4b >> >> >> On Fri, May 23, 2014 at 10:39 AM, Ian Young wrote: >> >>> The SSVM is stopped. If I try to start it, it complains about >>> insufficient capacity. CPU? RAM? I have plenty of both available. 
>>> >>> 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] >>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of >>> aggregate capacity, that have (atleast one host with) enough CPU and RAM >>> capacity under this Pod: 1 >>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] >>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list >>> these clusters from avoid set: [1] >>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] >>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing >>> disabled clusters and clusters in avoid list, returning. >>> 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] >>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from >>> :Starting to Stopped with event: OperationFailedvm's original host id: 1 >>> new host id: null host id before state transition: null >>> 2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] >>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start >>> secondary storage vm >>> com.cloud.exception.InsufficientServerCapacityException: Unable to >>> create a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface >>> com.cloud.dc.DataCenter; id=1 >>> >>> >>> On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote: >>> >>>> I rebooted it and now it's in an even more broken state. It's >>>> repeatedly trying to stop the console proxy but can't because its state is >>>> "Starting." Here is an excerpt from the management log: >>>> >>>> http://pastebin.com/FiaDzKXb >>>> >>>> The agent log keeps repeating these messages: >>>> >>>> http://pastebin.com/yDidSbrz >>>> >>>> What's wrong with it? >>>> >>>> >>>> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote: >>>> >>>>> I wonder if something is wrong with the NFS mount. 
I see this error >>>>> periodically in /var/log/messages even though I have set the Domain in >>>>> /etc/idmapd.conf to the host's FQDN: >>>>> >>>>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' >>>>> does not map into domain 'redacted.com' >>>>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' >>>>> does not map into domain 'redacted.com' >>>>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' >>>>> does not map into domain 'redacted.com' >>>>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' >>>>> does not map into domain 'redacted.com' >>>>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' >>>>> does not map into domain 'redacted.com' >>>>>
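When the management log is this noisy, it can help to split each entry into fields and count entries per logger. The following is a rough sketch: the line layout (timestamp, level, logger in square brackets, thread in parentheses, message) is inferred only from the lines quoted in this thread, so the real log pattern may differ, and the names LINE and parse are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the quoted lines; treat it as an assumption.
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+(?P<msg>.*)$"
)

# Three entries copied from the thread, one per line.
sample = """\
2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) HA on VM[ConsoleProxy|v-2-VM]
2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) Completed HAWork[873-HA-2-Starting-Investigating]
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing disabled clusters and clusters in avoid list, returning.
"""

def parse(text):
    """Yield a dict of fields for each line that matches the layout."""
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            yield m.groupdict()

records = list(parse(sample))
per_logger = Counter(r["logger"] for r in records)
for logger, n in per_logger.items():
    print(logger, n)
```

Lines that do not match, such as wrapped stack traces, are simply skipped.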
Re: console proxy times out
I destroyed the SSVM and then tried hacking the database to make CloudStack realize that the console proxy is in fact stopped. mysql> update vm_instance set state='Stopped' where name='v-2-VM'; mysql> update host set status='Up' where name='v-2-VM'; Now they're both running and I can see the console. There's got to be a better way to use this system without having to reboot or hack the database daily. On Fri, May 23, 2014 at 10:42 AM, Ian Young wrote: > Also, is this normal? Every time the server is rebooted, it adds another > record to the mshost table but the "removed" field is always NULL. > > http://pastebin.com/q5zDCu4b > > > On Fri, May 23, 2014 at 10:39 AM, Ian Young wrote: > >> The SSVM is stopped. If I try to start it, it complains about >> insufficient capacity. CPU? RAM? I have plenty of both available. >> >> 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] >> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of >> aggregate capacity, that have (atleast one host with) enough CPU and RAM >> capacity under this Pod: 1 >> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] >> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list >> these clusters from avoid set: [1] >> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] >> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing >> disabled clusters and clusters in avoid list, returning. 
>> 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] >> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from >> :Starting to Stopped with event: OperationFailedvm's original host id: 1 >> new host id: null host id before state transition: null >> 2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] >> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start >> secondary storage vm >> com.cloud.exception.InsufficientServerCapacityException: Unable to create >> a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface >> com.cloud.dc.DataCenter; id=1 >> >> >> On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote: >> >>> I rebooted it and now it's in an even more broken state. It's >>> repeatedly trying to stop the console proxy but can't because its state is >>> "Starting." Here is an excerpt from the management log: >>> >>> http://pastebin.com/FiaDzKXb >>> >>> The agent log keeps repeating these messages: >>> >>> http://pastebin.com/yDidSbrz >>> >>> What's wrong with it? >>> >>> >>> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote: >>> >>>> I wonder if something is wrong with the NFS mount. 
I see this error >>>> periodically in /var/log/messages even though I have set the Domain in >>>> /etc/idmapd.conf to the host's FQDN: >>>> >>>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>>> does not map into domain 'redacted.com' >>>> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>>> does not map into domain 'redacted.com' >>>> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted.com' >>>> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>>> does not map into domain 'redacted.com' >>>> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>>> not map into domain 'redacted
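The rpc.idmapd messages quoted in this thread are easier to compare once they are tallied by the name that failed to map. A minimal sketch, using four of the quoted lines as input; the names NAME and tally are made up, and nothing here explains why the mapping fails.

```python
import re
from collections import Counter

# Captures the quoted name in each "does not map into domain" message.
NAME = re.compile(r"nss_getpwnam: name '([^']+)' does not map into domain")

# Four entries copied from the thread, one per line.
sample = """\
May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
"""

def tally(text):
    """Count how often each unmapped name appears."""
    return Counter(NAME.findall(text))

counts = tally(sample)
for name, n in counts.most_common():
    print(name, n)
```

Running the same function over the whole of /var/log/messages would show whether any names besides '0' and '107' appear.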
Re: console proxy times out
Also, is this normal? Every time the server is rebooted, it adds another record to the mshost table but the "removed" field is always NULL. http://pastebin.com/q5zDCu4b On Fri, May 23, 2014 at 10:39 AM, Ian Young wrote: > The SSVM is stopped. If I try to start it, it complains about > insufficient capacity. CPU? RAM? I have plenty of both available. > > 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] > (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of > aggregate capacity, that have (atleast one host with) enough CPU and RAM > capacity under this Pod: 1 > 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] > (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list > these clusters from avoid set: [1] > 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] > (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing > disabled clusters and clusters in avoid list, returning. > 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] > (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from > :Starting to Stopped with event: OperationFailedvm's original host id: 1 > new host id: null host id before state transition: null > 2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] > (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start > secondary storage vm > com.cloud.exception.InsufficientServerCapacityException: Unable to create > a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface > com.cloud.dc.DataCenter; id=1 > > > On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote: > >> I rebooted it and now it's in an even more broken state. It's repeatedly >> trying to stop the console proxy but can't because its state is "Starting." >> Here is an excerpt from the management log: >> >> http://pastebin.com/FiaDzKXb >> >> The agent log keeps repeating these messages: >> >> http://pastebin.com/yDidSbrz >> >> What's wrong with it? 
>> >> >> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote: >> >>> I wonder if something is wrong with the NFS mount. I see this error >>> periodically in /var/log/messages even though I have set the Domain in >>> /etc/idmapd.conf to the host's FQDN: >>> >>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>> does not map into domain 'redacted.com' >>> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>> does not map into domain 'redacted.com' >>> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>> does not map into domain 'redacted.com' >>> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >>> not map into domain 'redacted.com' >>> May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' >>> does not map into domain 'redacted.com' >>> >>> name '107' just started appearing in the log yesterday, which looks >>> unusual. Up until then, the error was always name '0'. 
>>> >>> >>> On Thu, May 22, 2014 at 11:15 AM, Andrija Panic >> > wrote: >>> >>>> I have observed this kind of problems ("process blocked for more than xx >>>> sec...") when I had access with storage - check your disks, smartctl >>>> etc... >>>> best >>>> >>>> Sent from Google Nexus 4 >>>> On May 22, 2014 7:49 PM, "Ian Young"
Re: console proxy times out
The SSVM is stopped. If I try to start it, it complains about insufficient capacity. CPU? RAM? I have plenty of both available. 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list these clusters from avoid set: [1] 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing disabled clusters and clusters in avoid list, returning. 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: 1 new host id: null host id before state transition: null 2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start secondary storage vm com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface com.cloud.dc.DataCenter; id=1 On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote: > I rebooted it and now it's in an even more broken state. It's repeatedly > trying to stop the console proxy but can't because its state is "Starting." > Here is an excerpt from the management log: > > http://pastebin.com/FiaDzKXb > > The agent log keeps repeating these messages: > > http://pastebin.com/yDidSbrz > > What's wrong with it? > > > On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote: > >> I wonder if something is wrong with the NFS mount. 
I see this error >> periodically in /var/log/messages even though I have set the Domain in >> /etc/idmapd.conf to the host's FQDN: >> >> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does >> not map into domain 'redacted.com' >> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does >> not map into domain 'redacted.com' >> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does >> not map into domain 'redacted.com' >> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does >> not map into domain 'redacted.com' >> May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does >> not map into domain 'redacted.com' >> >> name '107' just started appearing in the log yesterday, which looks >> unusual. Up until then, the error was always name '0'. >> >> >> On Thu, May 22, 2014 at 11:15 AM, Andrija Panic >> wrote: >> >>> I have observed this kind of problems ("process blocked for more than xx >>> sec...") when I had access with storage - check your disks, smartctl >>> etc... 
>>> best >>> >>> Sent from Google Nexus 4 >>> On May 22, 2014 7:49 PM, "Ian Young" wrote: >>> >>> > And this is in /var/log/messages right before that event: >>> > >>> > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for >>> more >>> > than 120 seconds. >>> > May 22 10:16:07 virthost1 kernel: Not tainted >>> > 2.6.32-431.11.2.el6.x86_64 #1 >>> > May 22 10:16:07 virthost1 kernel: "echo 0 > >>> > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >>> > May 22 10:16:07 virthost1 kernel: qemu-kvm
Re: console proxy times out
I rebooted it and now it's in an even more broken state. It's repeatedly trying to stop the console proxy but can't because its state is "Starting." Here is an excerpt from the management log: http://pastebin.com/FiaDzKXb The agent log keeps repeating these messages: http://pastebin.com/yDidSbrz What's wrong with it? On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote: > I wonder if something is wrong with the NFS mount. I see this error > periodically in /var/log/messages even though I have set the Domain in > /etc/idmapd.conf to the host's FQDN: > > May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does > not map into domain 'redacted.com' > May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does > not map into domain 'redacted.com' > May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does > not map into domain 'redacted.com' > May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does > not map into domain 'redacted.com' > > name '107' just started appearing in the log yesterday, which looks > unusual. 
Up until then, the error was always name '0'. > > > On Thu, May 22, 2014 at 11:15 AM, Andrija Panic > wrote: > >> I have observed this kind of problems ("process blocked for more than xx >> sec...") when I had access with storage - check your disks, smartctl >> etc... >> best >> >> Sent from Google Nexus 4 >> On May 22, 2014 7:49 PM, "Ian Young" wrote: >> >> > And this is in /var/log/messages right before that event: >> > >> > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for >> more >> > than 120 seconds. >> > May 22 10:16:07 virthost1 kernel: Not tainted >> > 2.6.32-431.11.2.el6.x86_64 #1 >> > May 22 10:16:07 virthost1 kernel: "echo 0 > >> > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> > May 22 10:16:07 virthost1 kernel: qemu-kvm D 0002 0 >> > 2971 1 0x0080 >> > May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082 >> > 88106b6529d8 >> > May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0 >> > 8100bb8e 8810724e9be8 >> > May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8 >> > fbc8 881073525058 >> > May 22 10:16:07 virthost1 kernel: Call Trace: >> > May 22 10:16:07 virthost1 kernel: [] ? >> > apic_timer_interrupt+0xe/0x20 >> > May 22 10:16:07 virthost1 kernel: [] ? >> > mutex_spin_on_owner+0x9f/0xc0 >> > May 22 10:16:07 virthost1 kernel: [] >> > __mutex_lock_slowpath+0x13e/0x180 >> > May 22 10:16:07 virthost1 kernel: [] >> mutex_lock+0x2b/0x50 >> > May 22 10:16:07 virthost1 kernel: [] >> > memory_access_ok+0x7f/0xc0 [vhost_net] >> > May 22 10:16:07 virthost1 kernel: [] >> > vhost_dev_ioctl+0x2ec/0xa50 [vhost_net] >> > May 22 10:16:07 virthost1 kernel: [] ? >> > vhost_work_flush+0xe1/0x120 [vhost_net] >> > May 22 10:16:07 virthost1 kernel: [] ? >> > avc_has_perm+0x71/0x90 >> > May 22 10:16:07 virthost1 kernel: [] >> > vhost_net_ioctl+0x7a/0x5d0 [vhost_net] >> > May 22 10:16:07 virthost1 kernel: [] ? >> > inode_has_perm+0x54/0xa0 >> > May 22 10:16:07 virthost1 kernel: [] ? 
>> > kvm_vcpu_ioctl+0x1e7/0x580 [kvm] >> > May 22 10:16:07 virthost1 kernel:
Re: console proxy times out
I wonder if something is wrong with the NFS mount. I see this error periodically in /var/log/messages even though I have set the Domain in /etc/idmapd.conf to the host's FQDN: May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com' May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com' May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com' May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com' May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com' name '107' just started appearing in the log yesterday, which looks unusual. Up until then, the error was always name '0'. On Thu, May 22, 2014 at 11:15 AM, Andrija Panic wrote: > I have observed this kind of problems ("process blocked for more than xx > sec...") when I had access with storage - check your disks, smartctl > etc... 
> best > > Sent from Google Nexus 4 > On May 22, 2014 7:49 PM, "Ian Young" wrote: > > > And this is in /var/log/messages right before that event: > > > > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for > more > > than 120 seconds. > > May 22 10:16:07 virthost1 kernel: Not tainted > > 2.6.32-431.11.2.el6.x86_64 #1 > > May 22 10:16:07 virthost1 kernel: "echo 0 > > > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > May 22 10:16:07 virthost1 kernel: qemu-kvm D 0002 0 > > 2971 1 0x0080 > > May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082 > > 88106b6529d8 > > May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0 > > 8100bb8e 8810724e9be8 > > May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8 > > fbc8 881073525058 > > May 22 10:16:07 virthost1 kernel: Call Trace: > > May 22 10:16:07 virthost1 kernel: [] ? > > apic_timer_interrupt+0xe/0x20 > > May 22 10:16:07 virthost1 kernel: [] ? > > mutex_spin_on_owner+0x9f/0xc0 > > May 22 10:16:07 virthost1 kernel: [] > > __mutex_lock_slowpath+0x13e/0x180 > > May 22 10:16:07 virthost1 kernel: [] > mutex_lock+0x2b/0x50 > > May 22 10:16:07 virthost1 kernel: [] > > memory_access_ok+0x7f/0xc0 [vhost_net] > > May 22 10:16:07 virthost1 kernel: [] > > vhost_dev_ioctl+0x2ec/0xa50 [vhost_net] > > May 22 10:16:07 virthost1 kernel: [] ? > > vhost_work_flush+0xe1/0x120 [vhost_net] > > May 22 10:16:07 virthost1 kernel: [] ? > > avc_has_perm+0x71/0x90 > > May 22 10:16:07 virthost1 kernel: [] > > vhost_net_ioctl+0x7a/0x5d0 [vhost_net] > > May 22 10:16:07 virthost1 kernel: [] ? > > inode_has_perm+0x54/0xa0 > > May 22 10:16:07 virthost1 kernel: [] ? > > kvm_vcpu_ioctl+0x1e7/0x580 [kvm] > > May 22 10:16:07 virthost1 kernel: [] ? > > send_signal+0x3e/0x90 > > May 22 10:16:07 virthost1 kernel: [] > vfs_ioctl+0x22/0xa0 > > May 22 10:16:07 virthost1 kernel: [] > > do_vfs_ioctl+0x84/0x580 > > May 22 10:16:07 virthost1 kernel: [] > sys_ioctl+0x81/0xa0 > > May 22 10:16:07 virthost1 kernel: [] ? 
> > __audit_syscall_exit+0x25e/0x290 > > May 22 10:16:07 virthost1 kernel: [] > > system_call_fastpath+0x16/0x1b > > > > > > On Thu, May 22, 2014 at 10:39 AM, Ian Young > > wrote: > > > > > The console proxy became unavailable again yesterday afternoon. I > could > > > SSH into it via its link local address and nothing seemed to be wrong > > > inside the VM itself. However, the qemu-kv
Re: console proxy times out
And this is in /var/log/messages right before that event:

May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for more than 120 seconds.
May 22 10:16:07 virthost1 kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
May 22 10:16:07 virthost1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 22 10:16:07 virthost1 kernel: qemu-kvm D 0002 0 2971 1 0x0080
May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082 88106b6529d8
May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0 8100bb8e 8810724e9be8
May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8 fbc8 881073525058
May 22 10:16:07 virthost1 kernel: Call Trace:
May 22 10:16:07 virthost1 kernel: [] ? apic_timer_interrupt+0xe/0x20
May 22 10:16:07 virthost1 kernel: [] ? mutex_spin_on_owner+0x9f/0xc0
May 22 10:16:07 virthost1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
May 22 10:16:07 virthost1 kernel: [] mutex_lock+0x2b/0x50
May 22 10:16:07 virthost1 kernel: [] memory_access_ok+0x7f/0xc0 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] vhost_dev_ioctl+0x2ec/0xa50 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ? vhost_work_flush+0xe1/0x120 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ? avc_has_perm+0x71/0x90
May 22 10:16:07 virthost1 kernel: [] vhost_net_ioctl+0x7a/0x5d0 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ? inode_has_perm+0x54/0xa0
May 22 10:16:07 virthost1 kernel: [] ? kvm_vcpu_ioctl+0x1e7/0x580 [kvm]
May 22 10:16:07 virthost1 kernel: [] ? send_signal+0x3e/0x90
May 22 10:16:07 virthost1 kernel: [] vfs_ioctl+0x22/0xa0
May 22 10:16:07 virthost1 kernel: [] do_vfs_ioctl+0x84/0x580
May 22 10:16:07 virthost1 kernel: [] sys_ioctl+0x81/0xa0
May 22 10:16:07 virthost1 kernel: [] ? __audit_syscall_exit+0x25e/0x290
May 22 10:16:07 virthost1 kernel: [] system_call_fastpath+0x16/0x1b
Re: console proxy times out
The console proxy became unavailable again yesterday afternoon. I could SSH into it via its link-local address and nothing seemed to be wrong inside the VM itself. However, the qemu-kvm process for that VM was at almost 100% CPU. Inside the VM, the CPU usage was minimal and the java process was running and listening on port 443. So there seems to be something wrong with it down at the KVM/QEMU level. It's weird how this keeps happening to the console proxy only and not any of the other VMs. I tried to reboot it from the management UI and after about 15 minutes, it finally did. Now the console proxy is working but I don't know how long it will last before it breaks again.

I found this in libvirtd.log, which corresponds with the time the console proxy rebooted:

2014-05-22 17:17:04.362+: 25195: info : libvirt version: 0.10.2, package: 29.el6_5.7 (CentOS BuildSystem <http://bugs.centos.org>, 2014-04-07-07:42:04, c6b9.bsys.dev.centos.org)
2014-05-22 17:17:04.362+: 25195: error : qemuMonitorIO:614 : internal error End of file from monitor
Re: console proxy times out
I built and installed a libvirt 1.0.4 package from the Fedora src rpm. It installed fine inside a test VM but installing it on the real hypervisor was a bad idea and I doubt I'll be pursuing it further. All VMs promptly stopped and this appeared in libvirtd.log:

2014-05-21 20:36:19.260+: 23567: info : libvirt version: 1.0.4, package: 1.el6 (Unknown, 2014-05-21-11:36:09, redacted.com)
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nodedev.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_secret.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_interface.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so not accessible
2014-05-21 20:36:49.471+: 23570: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.472+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.473+: 23571: error : do_open:1220 : no connection driver available for lxc:///
2014-05-21 20:36:49.474+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.475+: 23568: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.476+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.678+: 23575: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.678+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.681+: 23572: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.682+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
Re: console proxy times out
I was able to get it working by following these steps:

1. stop all instances
2. service cloudstack-management stop
3. service cloudstack-agent stop
4. virsh shutdown {domain} (for each of the system VMs)
5. service libvirtd stop
6. umount primary and secondary
7. reboot

The console proxy is working again. I expect it will probably break again in a day or two. I have a feeling it's a result of this libvirtd bug, since I've seen the "cannot acquire state change lock" several times.

https://bugs.launchpad.net/nova/+bug/1254872

I might try building my own libvirtd 1.0.3 for EL6.
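The shutdown sequence listed in this message can be sketched as a dry-run helper that only builds the ordered command list for review and executes nothing. The domain names and mount points passed in are assumptions for illustration (the post does not give them), and step 1, stopping all instances, is left to the CloudStack UI.

```python
def build_shutdown_plan(system_vm_domains, mount_points):
    """Return the shell commands from the numbered steps, in order.

    Nothing is executed; the caller prints or reviews the list.
    Step 1 (stop all instances) is done in the UI and is not included.
    """
    plan = [
        "service cloudstack-management stop",
        "service cloudstack-agent stop",
    ]
    # One graceful shutdown per system VM domain known to libvirt.
    plan += [f"virsh shutdown {domain}" for domain in system_vm_domains]
    plan.append("service libvirtd stop")
    # Unmount primary and secondary storage before rebooting.
    plan += [f"umount {path}" for path in mount_points]
    plan.append("reboot")
    return plan


if __name__ == "__main__":
    # Hypothetical domain names and mount points, not taken from the post.
    for command in build_shutdown_plan(
        ["s-1-VM", "v-2-VM", "r-4-VM"], ["/primary", "/secondary"]
    ):
        print(command)
```

Keeping the plan as plain strings makes the ordering easy to check before anything touches libvirtd, which matters here because the order (agent first, libvirtd last, storage unmounted before reboot) is the point of the procedure.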
console proxy times out
So I got the console proxy working via HTTPS (by managing my own "realhostip.com" DNS) last week and everything was working fine. Today, all of a sudden, the console proxy stopped working again. The browser says, "Connecting to 192-168-100-159.realhostip.com..." and eventually times out. I tried to restart it and it went into a "Stopping" state that never completed and the Agent State was "Disconnected." I could not shut down the VM using virsh or with "kill -9" because libvirtd kept saying, "cannot acquire state change lock," so I gracefully shut down the remaining instances and rebooted the entire management server/hypervisor. Start over.

When it came back up, the SSVM and console proxy started but the virtual router was stopped. I was able to manually start it from the UI. The console proxy still times out when I try to access it from a browser. I don't see any errors in the management or agent logs, just this:

2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Sending { Cmd , MgmtId: 55157049428734, via: 1(virthost1.redacted.com), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.GetVncPortCommand":{"id":4,"name":"r-4-VM","wait":0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (AgentManager-Handler-3:null) Seq 1-2130378876: Processing: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5902,"result":true,"wait":0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Received: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Port info 192.168.100.6
2014-05-20 18:04:27,684 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) the console url is :: r-4-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A ">
2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545: { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] }

If I try to restart the system VMs with cloudstack-sysvmadm, it says:

Stopping and starting 1 secondary storage vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop secondary storage vm with id 1

Done stopping and starting secondary storage vm(s)

Stopping and starting 1 console proxy vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop console proxy vm with id 2

Done stopping and starting console proxy vm(s) .

Stopping and starting 1 running routing vm(s)...
curl: (7) couldn't connect to host
2
Done restarting router(s).

I notice there are now four entries for the same management server in the mshost table, and they all are in an "Up" state and the "removed" field is NULL. What's wrong with this system?
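When reading management-server.log entries like the ones in this message, it can help to pull the VNC endpoint out of the GetVncPortAnswer line to see which hypervisor address and port the console proxy is being pointed at. The following is a minimal sketch, assuming the log line format shown in this post; the function name is hypothetical and the regular expression is not part of CloudStack.

```python
import json
import re

# Matches the bracketed JSON payload that carries a GetVncPortAnswer.
ANSWER_RE = re.compile(r'\[\{"com\.cloud\.agent\.api\.GetVncPortAnswer":.*\}\]')


def parse_vnc_answer(line):
    """Return (address, port) from a GetVncPortAnswer log line, or None."""
    match = ANSWER_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group(0))
    answer = payload[0]["com.cloud.agent.api.GetVncPortAnswer"]
    return answer["address"], answer["port"]


if __name__ == "__main__":
    # Sample line copied from the log excerpt in this post.
    sample = (
        '2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] '
        '(AgentManager-Handler-3:null) Seq 1-2130378876: Processing: '
        '{ Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, '
        '[{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6",'
        '"port":5902,"result":true,"wait":0}}] }'
    )
    print(parse_vnc_answer(sample))
```

In the excerpt above the answer carries a valid address and port, which suggests the management server and agent are talking to each other and the failure sits further along, between the browser and the console proxy VM.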
Re: can't ping guest network
I forgot to add ingress rules to the security group. It works now.
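For anyone hitting the same symptom, an ingress rule can also be added through the CloudStack API command authorizeSecurityGroupIngress instead of the UI. The sketch below only builds the request parameters for an SSH rule; the group name "default" and the CIDR are assumptions for illustration, and request signing and the API endpoint are deliberately left out.

```python
from urllib.parse import urlencode


def ingress_rule_params(group_name, protocol, start_port, end_port, cidr):
    """Build unsigned query parameters for authorizeSecurityGroupIngress."""
    return {
        "command": "authorizeSecurityGroupIngress",
        "securitygroupname": group_name,
        "protocol": protocol,
        "startport": start_port,
        "endport": end_port,
        "cidrlist": cidr,
        "response": "json",
    }


if __name__ == "__main__":
    # Hypothetical values: allow SSH from an assumed LAN range.
    params = ingress_rule_params("default", "TCP", 22, 22, "192.168.100.0/24")
    print(urlencode(params))
```

Outbound traffic worked in the original report because only inbound connections needed a matching ingress rule to be accepted.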
can't ping guest network
My VMs can reach the rest of our internal network and even the internet but nothing except the management/hypervisor can reach the VMs. I monitored eth0 on one of the VMs while I tried to SSH to it from another workstation and it displayed this: 17:05:29.031584 ARP, Request who-has monitor.cs1cloud.internal tell 192.168.100.166, length 46 I have the network bridge set up correctly and I've tried disabling iptables and SELinux just to rule those out. There must be something simple I overlooked. Why does outbound traffic work but inbound traffic doesn't?
Re: cloudstack 4.3 installation on CentOS
That quick start guide used to have the wrong URL for the system VM template but it looks like it has been corrected since then. Check your command line history and see if the template you downloaded was the same as the one in this section:

http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup

On Wed, May 7, 2014 at 10:26 AM, dimas yoga pratama wrote:
> Yes, I followed the wizard.
> This is my configuration with the basic installation.
>
> Management server IP: 10.151.32.51
>
> Zone Configuration:
> Name: Zone1
> Public DNS1: 202.46.129.2 (College DNS)
> Public DNS2: -
> Internal DNS1: 10.151.32.6 (My Lab DNS)
> Internal DNS2: -
>
> Pod Configuration:
> Name: Pod1
> Gateway: 10.151.32.1
> Netmask: 255.255.255.0
> Start/end reserved system IPs: 10.151.32.60 - 10.151.32.80
> Guest Gateway: 10.151.32.1
> Guest Netmask: 255.255.255.0
> Guest start/end IP: 10.151.32.90 - 10.151.32.200
>
> Cluster Configuration:
> Name: Cluster1
> Hypervisor: KVM
>
> Host Configuration:
> Hostname: 10.151.32.51 (because I built CloudStack on a single machine)
> Username: root
> Password: password
>
> Primary storage:
> Name: Primary1
> Server: 10.151.32.51
> Path: /primary
>
> Secondary storage:
> NFS server: 10.151.32.51
> Path: /secondary
>
> Is there anything wrong with my configuration?
> I want to test CloudStack in my lab environment.
> Thanks.
>
> On Wed, May 7, 2014 at 6:31 AM, Pierre-Luc Dion wrote:
> > Hi,
> > Did you follow the zone creation wizard from the UI?
> >
> > Could it be that there are not enough management IPs in the pod?
> >
> > On Tuesday, May 6, 2014, dimas yoga pratama wrote:
> > > Hi all,
> > >
> > > I'm trying to install CloudStack 4.3 on CentOS 6.5 on a single machine in a proxy environment. I followed this guide: http://cloudstack-installation.readthedocs.org/en/latest/qig.html. I managed to log in to the CloudStack dashboard and succeeded in adding a host. When it comes to the "Creating system VMs (this may take a while)" step, it takes a very long time, and when I refresh the browser it suddenly redirects to the dashboard. I checked the infrastructure tab and found 2 system VMs already created; the VM state showed "starting" but the agent state showed nothing. Also, in the dashboard tab I found this notification: "Management Server: Management network CIDR is not configured original.type 14"
> > > Why is that happening? Is anything wrong with my installation?
> > >
> > > Looking forward to your answer.
> >
> > --
> > Pierre-Luc Dion
> > Cloud Solutions Architect
> > 855-OK-CLOUD (855-652-5683) x1101
> > CloudOps, 420 rue Guy, Montréal QC H3J 1S6
> > www.cloudops.com
> > @CloudOps_
Re: Quick Installation Guide for CentOS 6.5
Two things to check:

That quick start guide used to have the wrong URL for the system VM template but it looks like it has been corrected since then. Check your command line history and see if the template you downloaded was the same as the one in this section:

http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup

For NFSv4 you need to export a pseudo file system designated by fsid=0, which this guide doesn't mention. See section "18.7.1.1. Using exportfs with NFSv4" in the following article for more information:

http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html

On Tue, May 6, 2014 at 5:33 AM, dimas wrote:
> Samuel Winchenbach writes:
> > Hi all,
> >
> > I am following the quick installation guide exactly (I even set up a gateway w/ IP 172.16.10.1) but I cannot get the setup to complete. The System VMs remain stuck on "Starting". Any help would be greatly appreciated!
> >
> > Now I logged into the management console and used the exact settings as listed in the "Quick Install Guide". It seems to hang forever (>2 hours) on "Creating system VMs (this may take a while)"
>
> Hi, I encountered the same problem as you. Did you manage to find a solution?
> Looking forward to your answer.
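To make the NFSv4 point from this reply concrete, here is a minimal sketch that generates /etc/exports lines with a pseudo-root marked fsid=0. The directory layout (/export with primary and secondary beneath it), the client pattern, and the other export options are assumptions for illustration, not settings taken from the guide or from either poster's system.

```python
def export_line(path, client, options):
    """Format one /etc/exports entry: <path> <client>(<opt>,<opt>,...)."""
    return f"{path} {client}({','.join(options)})"


def nfsv4_exports(pseudo_root, subdirs, client):
    """Return exports lines: the pseudo-root with fsid=0, then each subdir."""
    common = ["rw", "async", "no_root_squash"]
    # fsid=0 marks this directory as the root of the NFSv4 pseudo file system.
    lines = [export_line(pseudo_root, client, common + ["fsid=0"])]
    for sub in subdirs:
        lines.append(export_line(f"{pseudo_root}/{sub}", client, common))
    return lines


if __name__ == "__main__":
    # Hypothetical layout; adjust paths and client range to the real host.
    for line in nfsv4_exports("/export", ["primary", "secondary"], "*"):
        print(line)
```

With a pseudo-root in place, NFSv4 clients mount paths relative to it, so an NFSv4 client would refer to the storage as /primary rather than /export/primary under this assumed layout.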
Re: replacement for realhostip
I just realized I had to set the consoleproxy.url.domain field to " realhostip.com" but now when I try to view the console, the browser says "The server refused the connection." Does that indicate a problem with the SSL certificate? management-server.log: 2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Sending { Cmd , MgmtId: 161342909744, via: 1( virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}] } 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 1-90898443: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}] } 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } } 2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Port info 192.168.100.6 2014-05-15 14:43:55,563 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Parse host info returned from executing GetVNCPortCommand. 
host info: 192.168.100.6 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) the console url is :: v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg "> ssl_access_log: 192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET /client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54 HTTP/1.1" 200 405 On Wed, May 14, 2014 at 5:56 PM, Ian Young wrote: > Looks like it's still using HTTP, not HTTPS: > > 2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) > Seq 1-800529939: Sending { Cmd , MgmtId: 161342909744, via: 1( > virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, > [{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}] > } > 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] > (AgentManager-Handler-1:null) Seq 1-800529939: Processing: { Ans: , > MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5903,"result":true,"wait":0}}] > } > 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) > Seq 1-800529939: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, > Flags: 10, { GetVncPortAnswer } } > 2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet] > (catalina-exec-20:null) Port 
info 192.168.100.6 > 2014-05-14 17:52:35,861 INFO [c.c.s.ConsoleProxyServlet] > (catalina-exec-20:null) Parse host info returned from executing > GetVNCPortCommand. host info: 192.168.100.6 > 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] > (catalina-exec-20:null) Compose console url: > http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw > 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] > (catalina-exec-20:null) the console url is :: > phonesynergyhttp://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw > "> > > > On Wed, May 14, 2014 at 5:41 PM, Ian Young wrote: > >> I decided to create my own internal realhostip.com. My DNS servers use >> PowerDNS, not BIND, so the $GENERATE directive was not an option and I >> didn't want to have to populate my DNS servers' databases with a record for >> every possible
Re: cloudstack 4.3 installation on CentOS
Also, which version of NFS are you using? For NFSv4 you need to export a pseudo file system designated by fsid=0, which this guide doesn't mention. See section "18.7.1.1. Using exportfs with NFSv4" in the following article for more information:

http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html

On Thu, May 15, 2014 at 10:37 AM, Ian Young wrote:
> That quick start guide used to have the wrong URL for the system VM template, but it looks like it has been corrected since then. Check your command line history and see if the template you downloaded was the same as the one in this section:
>
> http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup
>
> On Wed, May 7, 2014 at 10:26 AM, dimas yoga pratama wrote:
>> Yes, I followed the wizard. This is my configuration for the basic installation.
>>
>> Management server IP: 10.151.32.51
>>
>> Zone Configuration:
>> Name: Zone1
>> Public DNS1: 202.46.129.2 (College DNS)
>> Public DNS2: -
>> Internal DNS1: 10.151.32.6 (My Lab DNS)
>> Internal DNS2: -
>>
>> Pod Configuration:
>> Name: Pod1
>> Gateway: 10.151.32.1
>> Netmask: 255.255.255.0
>> Start/end reserved system IPs: 10.151.32.60 - 10.151.32.80
>> Guest Gateway: 10.151.32.1
>> Guest Netmask: 255.255.255.0
>> Guest start/end IP: 10.151.32.90 - 10.151.32.200
>>
>> Cluster Configuration:
>> Name: Cluster1
>> Hypervisor: KVM
>>
>> Host Configuration:
>> Hostname: 10.151.32.51 (because I built CloudStack on a single machine)
>> Username: root
>> Password: password
>>
>> Primary storage:
>> Name: Primary1
>> Server: 10.151.32.51
>> Path: /primary
>>
>> Secondary storage:
>> NFS server: 10.151.32.51
>> Path: /secondary
>>
>> Is there anything wrong with my configuration? I want to test CloudStack in my lab environment.
>> Thanks.
>>
>> On Wed, May 7, 2014 at 6:31 AM, Pierre-Luc Dion wrote:
>>> Hi,
>>> Did you follow the zone creation wizard from the UI?
>>>
>>> Could it be that there are not enough management IPs in the pod?
>>>
>>> On Tuesday, May 6, 2014, dimas yoga pratama wrote:
>>>> Hi all,
>>>>
>>>> I'm trying to install CloudStack 4.3 on CentOS 6.5 on a single machine behind a proxy. I followed this guide: http://cloudstack-installation.readthedocs.org/en/latest/qig.html. I managed to log in to the CloudStack dashboard and succeeded in adding a host. When it came to the "Creating system VMs (this may take a while)" step, it took a very long time, and when I refreshed the browser it suddenly redirected to the dashboard. I checked the infrastructure tab and found 2 system VMs already created. The VM state showed "starting" but the agent showed nothing. Also, in the dashboard tab I found this notification: "Management Server: Management network CIDR is not configured original.type 14"
>>>> Why is that happening? Is anything wrong with my installation?
>>>>
>>>> Looking forward to your answer.
>>>
>>> --
>>> Pierre-Luc Dion
>>> Architecte de Solution Cloud | Cloud Solutions Architect
>>> 855-OK-CLOUD (855-652-5683) x1101
>>> - - -
>>> *CloudOps*
>>> 420 rue Guy
>>> Montréal QC H3J 1S6
>>> www.cloudops.com
>>> @CloudOps_
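For reference, the NFSv4 pseudo-root layout mentioned in this reply might be sketched as follows. This is only an illustration under assumptions: the /export root and the primary and secondary directories beneath it are hypothetical (the thread itself exports /primary and /secondary directly), and the export options are ones commonly used for this kind of storage, not something quoted from the guide. The helper only prints candidate /etc/exports lines so they can be reviewed before being applied by hand.

```shell
# Print candidate /etc/exports lines for an NFSv4 pseudo-root.
# $1 = pseudo-root directory (hypothetical, e.g. /export)
# remaining args = subdirectories under that root to export
nfs4_exports_sketch() {
    root="$1"
    shift
    opts="rw,async,no_root_squash,no_subtree_check"
    # The root entry carries fsid=0, marking it as the NFSv4 pseudo file system.
    printf '%s *(%s,fsid=0)\n' "$root" "$opts"
    for dir in "$@"; do
        printf '%s/%s *(%s)\n' "$root" "$dir" "$opts"
    done
}

nfs4_exports_sketch /export primary secondary
```

With a layout like this, NFSv4 clients would normally address paths relative to the pseudo-root, so the storage paths entered in the zone wizard may need to change accordingly. That consequence is an assumption worth verifying on a test mount first.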
Re: replacement for realhostip
I was able to confirm the certificate by going directly to https://192-168-100-159.realhostip.com/ in the browser. I wish there was an easier way to do this. I don't mind the extra step, and the rest of my tech team will understand how it works, but it's going to be a hassle explaining this procedure to everyone else. I really hope someone can think of a more elegant alternative to realhostip.com when 4.4 is released. It will lead to better product adoption.
Re: replacement for realhostip
4.3
consoleproxy.url.domain = realhostip.com

It's working now. I'm just responding to clarify those questions.

On Thu, May 15, 2014 at 10:43 AM, Amogh Vasekar wrote:
> Hi,
>
> Which version of CloudStack are you on?
> Also, what does the config "console proxy.url.domain" refer to?
>
> Thanks,
> Amogh
Re: replacement for realhostip
Ok, so the console proxy needed to be restarted in order for the consoleproxy.url.domain setting to take effect. However, I still can't see the console. In Chrome, it just shows a frowning face with no error message (not very useful). In Firefox, at least it tells me the certificate is not trusted because it is self-signed, but it doesn't give me the option to accept it.

It's not an unreasonable expectation to be able to use self-signed SSL certificates for an internal site. Is there a setting in CloudStack that allows them to be trusted?
Re: replacement for realhostip
The problem appears to be with the console proxy itself. Here are the ports that are listening on the public interface, according to an nmap TCP scan:

PORT    STATE  SERVICE
80/tcp  open   http
443/tcp closed https

When I logged into the console proxy through the link local address, I checked for processes on port 443 and there are none, so obviously an HTTPS connection can't be made. There is a Java process listening on port 80 but nothing on 443. Is there something in the global settings that will enable HTTPS, or is this a bug?

root@v-2-VM:~# netstat -lnp | grep java
tcp        0      0 0.0.0.0:8001            0.0.0.0:*               LISTEN      3491/java
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      3491/java

On Thu, May 15, 2014 at 2:53 PM, Ian Young wrote:
> I just realized I had to set the consoleproxy.url.domain field to "realhostip.com" but now when I try to view the console, the browser says "The server refused the connection." Does that indicate a problem with the SSL certificate?
>
> management-server.log:
> 2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Sending { Cmd , MgmtId: 161342909744, via: 1(virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}] }
> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 1-90898443: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}] }
> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
> 2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Port info 192.168.100.6
> 2014-05-15 14:43:55,563 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) the console url is :: v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg ">
>
> ssl_access_log:
> 192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET /client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54 HTTP/1.1" 200 405
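The manual port check described in this message can be scripted. The following is a minimal sketch, assuming the usual Linux netstat column layout (local address in the fourth column, state in the sixth); the function name is made up for illustration and is not part of CloudStack.

```shell
# Succeeds if the given TCP port appears in LISTEN state.
# $1 = port number; stdin = output of `netstat -lnt` (or `netstat -lnp`)
is_listening() {
    awk -v port="$1" '
        # Match the port at the very end of the local address, e.g. 0.0.0.0:80
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the console proxy VM (not run here):
#   netstat -lnt | is_listening 443 && echo "HTTPS is up" || echo "nothing on 443"
```

Anchoring the match at the end of the address keeps port 80 from being confused with 8001, which is also listening in the output above.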
Re: new installation--ssvm won't start
I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final "Creating system VMs" step. I see two VMs running when I run "virsh list" (the secondary storage VM keeps getting regenerated). In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here?

[root@virthost1 ~]# ls -hl /var/primary/
total 715M
-rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
-rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
-rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30

The management server log keeps reporting that "There is no secondary storage VM for secondary storage host nfs://192.168.100.6/var/secondary." Here is a larger section of logs:

http://pastebin.com/NFf5cBx3
Re: replacement for realhostip
Looks like it's still using HTTP, not HTTPS:

2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Sending { Cmd , MgmtId: 161342909744, via: 1(virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}] }
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) Seq 1-800529939: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5903,"result":true,"wait":0}}] }
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Port info 192.168.100.6
2014-05-14 17:52:35,861 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Compose console url: http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) the console url is :: phonesynergyhttp://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw ">
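The HTTP versus HTTPS observation in this message comes from reading the "Compose console url" entries by eye. A small filter can pull out just the scheme. This is a sketch only: the function name is invented, and it assumes each log entry sits on a single line in the actual log file rather than being wrapped as it is in the email.

```shell
# Print the URL scheme of every "Compose console url" entry read from stdin.
console_url_schemes() {
    sed -n 's/.*Compose console url: *\([a-z][a-z]*\):\/\/.*/\1/p'
}

# Usage on the management server (the log path is an assumption, not run here):
#   console_url_schemes < /var/log/cloudstack/management/management-server.log | sort | uniq -c
```

Counting the schemes before and after restarting the console proxy would show whether a change to consoleproxy.url.domain has actually taken effect.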
replacement for realhostip
I decided to create my own internal realhostip.com. My DNS servers use PowerDNS, not BIND, so the $GENERATE directive was not an option and I didn't want to have to populate my DNS servers' databases with a record for every possible IP address. Fortunately, I found the following Lua script:

https://github.com/terbolous/powerdns-cloudstack-proxy-dns

I can confirm the Lua script works as expected and my CloudStack server can be tricked into believing my internal DNS servers are the authority for realhostip.com:

[root@virthost1 ]# dig +short 1-2-3-4.realhostip.com
1.2.3.4

I followed this guide and updated the console proxy/SSVM SSL certificate with my own *.realhostip.com certificate.

http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain

The console proxy restarted but it's still blank when I try to view the console. Does the domain have to be something other than realhostip.com?
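The name-to-address mapping that the dig test above relies on is simple enough to sketch in a few lines. This is not the linked Lua script, only an illustration of the convention it serves: the first DNS label is an IPv4 address with dashes in place of dots. The function name is made up for the example.

```shell
# Convert a realhostip-style name (dashed IPv4 address as the first label)
# into the dotted address it encodes.
realhostip_to_ip() {
    label="${1%%.*}"                        # keep only the first DNS label
    printf '%s\n' "$label" | sed 's/-/./g'  # 192-168-100-159 -> 192.168.100.159
}

realhostip_to_ip 1-2-3-4.realhostip.com   # prints 1.2.3.4
```

Comparing this output with what `dig +short` returns for the same name is a quick way to confirm the internal DNS servers are answering for the zone as intended.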
Re: new installation--ssvm won't start
I know this has something to do with idmapd and NFS. This error keeps appearing in /var/log/messages:

May 8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
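The rpc.idmapd error reported in this message names the domain that idmapd is using. One thing that can be checked, offered here as an assumption rather than a confirmed cause, is which Domain value is actually active in /etc/idmapd.conf, since a commented-out line is ignored. A minimal sketch, with a hypothetical function name:

```shell
# Print the active (uncommented) Domain setting from an idmapd.conf-style file.
# $1 = path to the file (defaults to /etc/idmapd.conf)
idmapd_domain() {
    conf="${1:-/etc/idmapd.conf}"
    # Lines starting with "#" do not match, so commented-out settings are skipped.
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*\(.*\)$/\1/p' "$conf"
}

# Usage (not run here):
#   idmapd_domain              # inspect the local /etc/idmapd.conf
#   idmapd_domain /tmp/copy    # inspect a copy taken from another machine
```

An empty result would mean no Domain line is in effect. On a setup with separate NFS client and server machines, the value would be expected to match on both sides.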
Re: new installation--ssvm won't start
I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab, the Secondary Storage says "Allocated 0.00 KB / 0.00 KB". However, the secondary storage NFS mount is listed in Home > Infrastructure > Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable?

On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote:
> I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says "Creating system VMs (this may take a while)" but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is.
>
> I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM:
>
> http://pastebin.com/X11A51bh
>
> NFS appears to be functional, since CloudStack automatically mounted the primary storage.
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda3              20G  1.8G   17G  10% /
> tmpfs                  32G     0   32G   0% /dev/shm
> /dev/sda1             194M   42M  143M  23% /boot
> /dev/sda4             1.8T  1.9G  1.7T   1% /var
> 192.168.100.6:/var/primary
>                       1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af
>
> How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering:
>
> http://pastebin.com/XsPGJQik
Re: new installation--ssvm won't start
I'm using 4.3. The Quick Installation Guide for CentOS (which is what I was following) still has the old URL. I forgot to mention that changing the URL was another thing I did differently in order to get it working.

http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/latest/qig.html

On Tue, May 13, 2014 at 2:01 AM, sebgoa wrote:
> On May 13, 2014, at 10:02 AM, Geoff Higginbottom <geoff.higginbot...@shapeblue.com> wrote:
>
>> Just for the record, the latest install doc does have the correct URLs for the System VM Templates
>>
>> http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html
>
> Yep, I just checked the master and the 4.3 version and the URL seems correct:
>
> http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template
>
> If it's not, let me know or submit a patch.
>
>> Regards
>>
>> Geoff Higginbottom
>>
>> -----Original Message-----
>> From: dimas yoga pratama [mailto:smid...@gmail.com]
>> Sent: 12 May 2014 17:44
>> To: users@cloudstack.apache.org
>> Subject: Re: new installation--ssvm won't start
>>
>> Which version of CloudStack did you install? If you followed the CloudStack 4.3 installation guide, there is a mistake in the system template setup section. You should replace the old URL with:
>>
>> http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
>>
>> Hope it works.
Re: new installation--ssvm won't start
Exactly. That's the URL I was referring to. I changed it to the 2014-01-14 template and it worked. On Tue, May 13, 2014 at 9:52 AM, dimas yoga pratama wrote: > oh okay, I should have read that part as well. > What I mean is this : > > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/qig.html#system-template-setup > > > > On Tue, May 13, 2014 at 4:01 PM, sebgoa wrote: > > > > > On May 13, 2014, at 10:02 AM, Geoff Higginbottom < > > geoff.higginbot...@shapeblue.com> wrote: > > > > > Just for the record, the latest install doc does have the correct URLs > > for the System VM Templates > > > > > > > > > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html > > > > > > > Yep, I just checked the master and the 4.3 version and the url seem > > correct: > > > > > > > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template > > > > if it' snot let me know or submit a patch > > > > > > > > Regards > > > > > > Geoff Higginbottom > > > > > > D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 > > > > > > geoff.higginbot...@shapeblue.com > > > > > > -Original Message- > > > From: dimas yoga pratama [mailto:smid...@gmail.com] > > > Sent: 12 May 2014 17:44 > > > To: users@cloudstack.apache.org > > > Subject: Re: new installation--ssvm won't start > > > > > > Which version of Cloudstack you installd? If you follow the Cloudstack > > 4.3 installation guide there is a mistake in system template setup > section, > > > > > > you should change the old URL with: > > > > > > http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 > > > > > > hope it works. > > > > > > On Fri, May 9, 2014 at 7:20 AM, Ian Young > > wrote: > > > > > >> I wiped the server clean and started over again today. In the > > >> process, I realized that, the previous time, I forgot to uncomment the > > >> Domain line in /etc/idmapd.conf. 
However, even though I included the > > >> step this time, the GUI installer still seems to hang on the final > > "Creating system VMs" step. > > >> I see two VMs running when I run "virsh list" (the secondary storage > > >> VM keeps getting regenerated). In the primary storage, it looks like > > >> there is one complete 693 MB image but the other two are only 11 and > > >> 12 MB, although they are gradually growing. What's happening here? > > >> > > >> [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 > > >> nobody nobody 11M May 8 09:55 > > >> 54de167f-ad9c-453b-91c7-fdd644922932 > > >> -rwxr--r--. 1 nobody nobody 12M May 8 09:55 > > >> 91069b66-b1b3-41aa-8995-874fd4353473 > > >> -rwxr--r--. 1 nobody nobody 693M May 8 09:16 > > >> c2e6efba-d6c7-11e3-9e76-002590c96d30 > > >> > > >> The management server log keeps reporting that "There is no secondary > > >> storage VM for secondary storage host nfs:// > 192.168.100.6/var/secondary > > ." > > >> Here is a larger section of logs: > > >> http://pastebin.com/NFf5cBx3 > > >> > > >> > > >> On Wed, May 7, 2014 at 10:49 AM, Ian Young > > wrote: > > >> > > >>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources > > >>> tab, the Secondary Storage says "Allocated 0.00 KB / 0.00 KB". > > >>> However, the secondary storage NFS mount is listed in Home > > > >>> Infrastructure > > > >> Secondary > > >>> Storage and the URL is correct. Does this mean the secondary > > >>> storage is unreachable? > > >>> > > >>> > > >>> On Wed, May 7, 2014 at 10:26 AM, Ian Young > > >> wrote: > > >>> > > >>>> I reinstalled my single server CloudStack system yesterday, > > >>>> following > > >> the > > >>>> quick start guide precisely. The only difference was that I used > > >>>> /var/primary and /var/secondary instead of /primary and /secondary, > > >> because > > >>>> the /var partition on this ma
Re: new installation--ssvm won't start
I was able to complete the installation on Friday. Two things I did differently that were not mentioned in the quick start guide were to disable requiretty in /etc/sudoers and to set up NFSv4 correctly (i.e. set up a global root directory with fsid=0). I'm not sure how much impact the sudoers configuration had on my problem but I'm pretty sure the NFS setup was the main issue. On Thu, May 8, 2014 at 5:33 PM, Ian Young wrote: > I know this has something to do with idmapd and NFS. This error keeps > appearing in /var/log/messages: > May 8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does > not map into domain 'redacted.com' > > > On Thu, May 8, 2014 at 5:20 PM, Ian Young wrote: > >> I wiped the server clean and started over again today. In the process, I >> realized that, the previous time, I forgot to uncomment the Domain line in >> /etc/idmapd.conf. However, even though I included the step this time, the >> GUI installer still seems to hang on the final "Creating system VMs" step. >> I see two VMs running when I run "virsh list" (the secondary storage VM >> keeps getting regenerated). In the primary storage, it looks like there is >> one complete 693 MB image but the other two are only 11 and 12 MB, although >> they are gradually growing. What's happening here? >> >> [root@virthost1 ~]# ls -hl /var/primary/ >> total 715M >> -rwxr--r--. 1 nobody nobody 11M May 8 09:55 >> 54de167f-ad9c-453b-91c7-fdd644922932 >> -rwxr--r--. 1 nobody nobody 12M May 8 09:55 >> 91069b66-b1b3-41aa-8995-874fd4353473 >> -rwxr--r--. 1 nobody nobody 693M May 8 09:16 >> c2e6efba-d6c7-11e3-9e76-002590c96d30 >> >> The management server log keeps reporting that "There is no secondary >> storage VM for secondary storage host nfs://192.168.100.6/var/secondary." 
>> Here is a larger section of logs: >> http://pastebin.com/NFf5cBx3 >> >> >> On Wed, May 7, 2014 at 10:49 AM, Ian Young wrote: >> >>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab, >>> the Secondary Storage says "Allocated 0.00 KB / 0.00 KB". However, the >>> secondary storage NFS mount is listed in Home > Infrastructure > Secondary >>> Storage and the URL is correct. Does this mean the secondary storage is >>> unreachable? >>> >>> >>> On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote: >>> >>>> I reinstalled my single server CloudStack system yesterday, following >>>> the quick start guide precisely. The only difference was that I used >>>> /var/primary and /var/secondary instead of /primary and /secondary, because >>>> the /var partition on this machine is very large. The UI installer reached >>>> the point where it says "Creating system VMs (this may take a while)" but >>>> never finished. I left it overnight and it still hadn't completed. This >>>> is typically the step that fails, most of the times I've installed >>>> CloudStack, so I imagine I must be making the same fundamental mistake each >>>> time, and I'd like to know what that is. >>>> >>>> I checked management.log and it's in a loop where it creates a >>>> secondary storage VM, fails to start it, destroys it, and tries again. It >>>> says Host 1 is unreachable but I'm using the correct password, SELinux is >>>> permissive, and all the iptables rules are in place. In what way is it >>>> trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages >>>> related to the SSVM: >>>> >>>> http://pastebin.com/X11A51bh >>>> >>>> NFS appears to be functional, since CloudStack automatically mounted >>>> the primary storage. 
>>>> >>>> FilesystemSize Used Avail Use% Mounted on >>>> /dev/sda3 20G 1.8G 17G 10% / >>>> tmpfs 32G 0 32G 0% /dev/shm >>>> /dev/sda1 194M 42M 143M 23% /boot >>>> /dev/sda4 1.8T 1.9G 1.7T 1% /var >>>> 192.168.100.6:/var/primary >>>> 1.8T 1.9G 1.7T 1% >>>> /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af >>>> >>>> How can I identify whatever it is that's preventing the SSVM from >>>> starting? Here is another log excerpt, without any filtering: >>>> >>>> http://pastebin.com/XsPGJQik >>>> >>> >>> >> >
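For readers who hit the same hang at "Creating system VMs", the two changes described at the top of this message (an NFSv4 root export with fsid=0, and requiretty disabled in /etc/sudoers) can be checked with a small script. This is only a sketch under stated assumptions: the /export layout in the comments and the helper names has_nfs4_root and requiretty_enabled are hypothetical and are not the poster's actual configuration. Only fsid=0, requiretty, and the 192.168.100.0/24 network come from the thread.

```shell
#!/bin/sh
# Hypothetical /etc/exports layout for an NFSv4 root export. The paths are
# assumptions (the thread only says a global root directory with fsid=0
# was set up); the storage directories would live or be bind-mounted
# beneath the root:
#
#   /export            192.168.100.0/24(rw,fsid=0,no_subtree_check)
#   /export/primary    192.168.100.0/24(rw,no_root_squash,no_subtree_check)
#   /export/secondary  192.168.100.0/24(rw,no_root_squash,no_subtree_check)

# has_nfs4_root FILE
# Succeeds if FILE contains an uncommented export line whose option list
# includes fsid=0.
has_nfs4_root() {
    grep -Eq '^[^#]*[(,]fsid=0[,)]' "$1"
}

# requiretty_enabled FILE
# Succeeds if FILE still has an active "Defaults requiretty" line.
# Commented-out lines and "Defaults !requiretty" do not match.
requiretty_enabled() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}

# Example use (as root, since /etc/sudoers is not world-readable):
#   has_nfs4_root /etc/exports || echo "no fsid=0 export found"
#   requiretty_enabled /etc/sudoers && echo "requiretty is still active"
```

The checks only read the files. Whether these two settings are the actual cause on another system is not something the thread establishes; the poster was unsure how much the sudoers change mattered.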
new installation--ssvm won't start
I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says "Creating system VMs (this may take a while)" but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is.

I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM:

http://pastebin.com/X11A51bh

NFS appears to be functional, since CloudStack automatically mounted the primary storage.

Filesystem                  Size  Used Avail Use% Mounted on
/dev/sda3                    20G  1.8G   17G  10% /
tmpfs                        32G     0   32G   0% /dev/shm
/dev/sda1                   194M   42M  143M  23% /boot
/dev/sda4                   1.8T  1.9G  1.7T   1% /var
192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af

How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering:

http://pastebin.com/XsPGJQik
Re: basic networking, single server
I forgot to mention this system is entirely for internal purposes. We don't need a public network.
basic networking, single server
I'm reinstalling CloudStack on a single server with lots of RAM, CPU cores, and storage. I also have a single 192.168.100.0/24 private network, which was set up before I was hired and can't be easily reconfigured due to the high number of employee workstations currently connected to it and occupying IP addresses across this range. I see that the CloudStack documentation strongly recommends separate NICs for management traffic and guest traffic. This server does have two NICs, so what would be the ideal way to configure the network? Another switch with a different subnet for the management network? What about the storage network?
Re: failed to start virtual router
I downgraded to 4.2.1, restored the database backup I made during the initial upgrade to 4.3, copied the rpmsave files over /etc/cloudstack/agent/agent.properties and /etc/cloudstack/management/db.properties, and started the agent and management services. The service_ip field in the mshost table changed to 127.0.0.1 again, so I changed it to the actual IP address of the server. Then I ran

cloud-install-sys-tmplt -m /var/storage/secondary -u http://download.cloud.com/templates/4.2/systemvmtemplate-2013-06-12-master-kvm.qcow2.bz2 -h kvm -F

which said it successfully installed the system VM template to /var/storage/secondary/template/tmpl/1/3/. Next, I restarted the management service. When I logged in, the management console said there was an SSVM, console proxy, and router present but they were all stopped. I tried starting the SSVM but it failed to start. Here is an excerpt from management.log:

http://pastebin.com/uW4NPZbC

Is there some sort of general troubleshooting script that could identify major configuration problems? I feel like this system is mostly functional but one or two misconfigured things are causing it to break. I'd hate to have to wipe it clean and start over, losing all my VMs in the process.
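The template installation step in the message above is easy to mistype because each option needs its own space-separated argument. A minimal sketch that only prints the command for review, rather than running it, might look like this. The helper name systemvm_install_cmd is hypothetical; the mount point, URL, and the -m, -u, -h, and -F options are the ones quoted in the thread. The meaning of -F is not explained there, so check the script's own usage text before relying on it.

```shell
#!/bin/sh
# Values taken from the message above; adjust them for your installation.
SECONDARY=/var/storage/secondary
TEMPLATE_URL=http://download.cloud.com/templates/4.2/systemvmtemplate-2013-06-12-master-kvm.qcow2.bz2

# systemvm_install_cmd MOUNTPOINT URL HYPERVISOR
# Prints (does not run) a cloud-install-sys-tmplt invocation:
#   -m  secondary storage mount point
#   -u  URL of the system VM template
#   -h  hypervisor name
#   -F  passed through as in the thread; see the script's usage text
systemvm_install_cmd() {
    printf 'cloud-install-sys-tmplt -m %s -u %s -h %s -F\n' "$1" "$2" "$3"
}

# Show the command that would be run for the KVM template.
systemvm_install_cmd "$SECONDARY" "$TEMPLATE_URL" kvm
```

Printing the command first makes a missing space between the URL and -h obvious before anything touches secondary storage.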
management server IP address
My management server IP address has always been 192.168.100.6. Ever since I upgraded to 4.3, it's been set to 127.0.0.1 (mshost.service_ip in the database). Where is this value being set and how can I change it back to the original IP address?
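Another message in this thread describes changing the mshost.service_ip field by hand. A rough sketch of that workaround follows; it generates the SQL so it can be reviewed before being applied. The helper name mshost_fix_sql is hypothetical, and the MySQL user and database name "cloud" in the commented command are assumed defaults, not something the thread confirms. The thread also reports the field going back to 127.0.0.1 afterwards, so this may only be temporary, and it does not answer where the value is set.

```shell
#!/bin/sh
# mshost_fix_sql NEW_IP
# Prints an UPDATE statement that replaces a loopback service_ip in the
# mshost table with NEW_IP. Nothing is executed against the database.
mshost_fix_sql() {
    printf "UPDATE mshost SET service_ip='%s' WHERE service_ip='127.0.0.1';\n" "$1"
}

# Review the statement first (192.168.100.6 is the address from the thread).
mshost_fix_sql 192.168.100.6

# After backing up the database, it could be applied with something like
# (user "cloud" and database "cloud" are assumptions):
#   mshost_fix_sql 192.168.100.6 | mysql -u cloud -p cloud
```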
Re: failed to start virtual router
I read this article about upgrading the system VMs: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+4.2+(KVM)+System+Vm+Upgrade However, there's just an empty set in the template_host_ref table. Is this no longer used in 4.3?
Re: failed to start virtual router
I've tried upgrading to 4.3 again. After poking around some more in the database, I've discovered that the KVM system VM template was only 27% downloaded. I think this is why the virtual router was unable to start--the template was incomplete. Is there a way to force it to resume downloading?
Re: failed to start virtual router
I notice my dashboard says "Management server node 127.0.0.1 is up." It used to have an actual address, not localhost. Could this be causing problems and if so, how can I set it back?
Re: failed to start virtual router
The address in Infrastructure > Hosts > (management server) is set to the correct IP address, not 127.0.0.1. Why are the logs referring to 127.0.0.1?
Re: failed to start virtual router
Yes, I replaced the new files with the rpmsave ones, which allowed the agent to start. However, most of the functions in the management console fail. On Wed, Apr 30, 2014 at 12:34 PM, stevenliang wrote: > Do you have the file db.properties.rpmsave on management server and > agent.properties.rpmsave on agents? If so, and the date is correct, you can > use it rather than db.properties and agent.properties. > And then restart management and agent services. > > > On 30/04/14 03:26 PM, Ian Young wrote: > >> Yes, I restored the DB from the backup. When I try to start the router it >> says: >> >> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to >> Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, >> not >> retrying >> >> The management server log says: >> >> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl] >> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ]) >> Unexpected exception while executing >> org.apache.cloudstack.api.command.admin.router.StartRouterCmd >> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is >> unreachable: Host 1: Unable to start instance due to Unable to start >> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying >> >> >> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang >> wrote: >> >> I think you had backed up database, when you upgraded. >>> When you downgraded CS, you also need to restore DB. >>> >>> >>> On 30/04/14 02:58 PM, Ian Young wrote: >>> >>> I think my problem stems from a partially downloaded system VM template. >>>> I >>>> just noticed systemvm-kvm-4.3 is stuck at 27% downloaded. It must have >>>> been interrupted during the upgrade to 4.3. At the moment I've rolled >>>> back >>>> to 4.2.1 with a somewhat usable management interface, although the >>>> system >>>> VMs won't start. I suspect there is something in the database that is >>>> causing it to try to use the 4.3 template. 
How can I delete the >>>> template >>>> and make sure the management server is using the older one? >>>> >>>> >>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young >>>> wrote: >>>> >>>> Ok, so I've figured out a way to identify volumes in the filesystem. >>>> For >>>> >>>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is >>>>> the >>>>> root volume for an instance I want to back up. Is this in qcow2 format >>>>> or >>>>> something else? I'm using KVM. >>>>> >>>>> >>>>> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young >>>>> wrote: >>>>> >>>>> Now I can't start cloudstack-agent. The agent.log says: >>>>> >>>>>> Unable to start agent: Failed to get private nic name >>>>>> >>>>>> I know this is because the network bridge is no longer set up >>>>>> correctly. >>>>>>I used to have a cloud0 and a cloudbr0 interface. Now I only have >>>>>> cloudbr0. I haven't changed my network configuration. Somehow it's >>>>>> been >>>>>> changed by CloudStack during the upgrade/downgrade. This is getting >>>>>> worse >>>>>> and worse the more I try to recover my data. Is there any way to back >>>>>> up >>>>>> the instances' volumes via the command line? I can't tell which is >>>>>> which >>>>>> because the filenames are all hashes. I really need to get these >>>>>> instances >>>>>> up and running--there are several months worth of work at stake here. >>>>>> >>>>>> >>>>>> On Tue, Apr 29, 2014 at 6:13 PM, ma y wrote: >>>>>> >>>>>> I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 >>>>>> safely? >>>>>> >>>>>>> >>>>>>> 2014-04-30 8:45 GMT+08:00 Ian Young : >>>>>>> >>>>>>> Ok, my Cloudstack installation is now so broken that I think it's >>>>>>> probably >>>>>>> >>>>>>> best to backup all my instances and templates, wipe the databases, >>&g
Re: failed to start virtual router
Yes, I restored the DB from the backup. When I try to start the router it says:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying

The management server log says:

2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ]) Unexpected exception while executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd
com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying

On Wed, Apr 30, 2014 at 12:02 PM, stevenliang wrote:
> I think you backed up the database when you upgraded.
> When you downgrade CS, you also need to restore the DB.
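When the management server log is long, it can help to filter it down to entries like the one quoted above. The following is a minimal sketch in Python, not a CloudStack tool; the regular expression is an assumption based only on the layout of the log lines shown in this thread (timestamp, level, logger in brackets, thread in parentheses, message).

```python
import re

# Layout assumed from the quoted lines: timestamp LEVEL [logger] (thread) message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_only(lines):
    """Keep only the parsed entries whose level is ERROR."""
    parsed = (parse_log_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "ERROR"]
```

Continuation lines such as the exception text do not start with a timestamp, so `parse_log_line` returns None for them and `errors_only` skips them.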
Re: failed to start virtual router
I think my problem stems from a partially downloaded system VM template. I just noticed systemvm-kvm-4.3 is stuck at 27% downloaded. It must have been interrupted during the upgrade to 4.3. At the moment I've rolled back to 4.2.1 with a somewhat usable management interface, although the system VMs won't start. I suspect there is something in the database that is causing it to try to use the 4.3 template. How can I delete the template and make sure the management server is using the older one?
Re: failed to start virtual router
Ok, so I've figured out a way to identify volumes in the filesystem. For instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is the root volume for an instance I want to back up. Is this in qcow2 format or something else? I'm using KVM.
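The format question above can be answered from the file header itself. `qemu-img info <path>` is the usual tool for this; as a minimal sketch for hosts where only Python is at hand, the function below reads the first eight bytes of a file and reports the QCOW version if the magic bytes are present. The function name `qcow_version` is made up for this example, and a result of None only means the file is not qcow/qcow2 (it may be raw or something else).

```python
import struct

# 'Q', 'F', 'I', 0xfb: the first four bytes of a qcow/qcow2 image header
QCOW_MAGIC = b"QFI\xfb"


def qcow_version(path):
    """Return the QCOW format version (2 or 3 for qcow2), or None if the magic is absent."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    # The version follows the magic as a big-endian unsigned 32-bit integer.
    (version,) = struct.unpack(">I", header[4:8])
    return version
```

The file is opened read-only, so running this against a volume in primary storage does not modify it.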
Re: failed to start virtual router
Now I can't start cloudstack-agent. The agent.log says:

Unable to start agent: Failed to get private nic name

I know this is because the network bridge is no longer set up correctly. I used to have a cloud0 and a cloudbr0 interface. Now I only have cloudbr0. I haven't changed my network configuration. Somehow it's been changed by CloudStack during the upgrade/downgrade. This is getting worse and worse the more I try to recover my data. Is there any way to back up the instances' volumes via the command line? I can't tell which is which because the filenames are all hashes. I really need to get these instances up and running. There are several months' worth of work at stake here.

On Tue, Apr 29, 2014 at 6:13 PM, ma y wrote:
> I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
Re: failed to start virtual router
Ok, my CloudStack installation is now so broken that I think it's probably best to back up all my instances and templates, wipe the databases, and start from scratch. However, I can't take snapshots or download volumes anymore. What's causing these errors?

2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,269 WARN [o.a.c.s.d.ObjectInDataStoreManagerImpl] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object (VOLUME, org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901), no need to delete from object in store ref table
2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher] (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
com.cloud.utils.exception.CloudRuntimeException: Failed to copy the volume from the source primary storage pool to secondary storage.
2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to copy the volume from the source primary storage pool to secondary storage."}
Re: failed to start virtual router
I downgraded to 4.2.1 again but cloudstack-management won't start because the database is version 4.3. Is it safe to restore the database backup I made prior to this whole process? In the meantime I have destroyed and created system VMs, so I'm not sure it's a good idea.
Re: failed to start virtual router
@stevenliang: I take it back. You can't set the VM size when you register the template.

On Tue, Apr 29, 2014 at 3:02 PM, motty cruz wrote:
> Yes, you would have to shut down the router, then click on "Change Service
> Offering" and restart the VR.
>
> To Ian,
>
> I suspect you forgot the last step: "cloudstack-setup-management"
>
> That would fix your issue, I think.
>
> Thanks,
>
> On Tue, Apr 29, 2014 at 2:57 PM, stevenliang wrote:
>> Oh, then change the service offering for the VR?
Re: failed to start virtual router
I downgraded to 4.2.1 and then upgraded to 4.3. Now the cloudstack-management service can't start because it can't connect to the database.

2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to get a new db connection
Caused by: java.sql.SQLException: Access denied for user 'cloud'@'localhost' (using password: YES)

Where are the credentials stored?
Re: failed to start virtual router
I think you can do that when you register the new templates in step 1 of this guide:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3

On Tue, Apr 29, 2014 at 2:53 PM, motty cruz wrote:

> for my VR, I created a new
>
> "System Offering For Software Router"
> CPU in (MHz) 1.00GHz
> Memory (in MB) 1.00GB
>
> this are my current offerings, I'm sure the more RAM and CPU better
> performance.
>
> Thanks,
>
> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang wrote:
>
> > Thank you again, motty.
> > I didn't notice this earlier.
> > BTW, how did you make your vr had 1GB CPU and 512MB RAM?
> >
> > On 29/04/14 05:33 PM, motty cruz wrote:
> >
> >> Stevellang,
> >> I not sure if you saw this in the forums earlier :
> >> http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com%3E
> >>
> >> I don't know if the bug was fixed yet,
> >>
> >> I will try upgrade in the next couple of days on a testing cluster, will
> >> report back if the bug was fixed.
> >>
> >> Thanks,
> >>
> >> On Tue, Apr 29, 2014 at 2:25 PM, stevenliang wrote:
> >>
> >>> Thank you, motty.
> >>> I am also running kvm. Since that time I failed upgrade, I am still using
> >>> 4.2.1. I'll try as your advice.
> >>>
> >>> On 29/04/14 05:19 PM, motty cruz wrote:
> >>>
> >>>> Stevenllang,
> >>>>
> >>>> I had the similar issue with VR, I notice it was because I leave the
> >>>> default system specs on the VR, for instance by default 500MHz on CPU and
> >>>> 128MB on RAM, if you upgrade to at least 1GB on CPU and 512MB of RAM your
> >>>> VR will survive the upgrade from 4.2.1 to 4.3.1.
> >>>>
> >>>> I am running KVM, when I upgrade from 4.2.1 to 4.3 my VMs were not able to
> >>>> access outside world, even if I created a new router.
> >>>>
> >>>> wish you the best,
> >>>> -motty
> >>>>
> >>>> On Tue, Apr 29, 2014 at 2:13 PM, stevenliang wrote:
> >>>>
> >>>>> Yes, I had two zones(one is basic, another is advanced mode).
> >>>>> After I upgraded from 4.2.1 to 4.3, the vrouter lost.
> >>>>> So I rolled back to 4.2.1, the vrouter came back.
> >>>>>
> >>>>> On 29/04/14 04:54 PM, Ian Young wrote:
> >>>>>
> >>>>>> Did rolling back to 4.2 fix the problem?
> >>>>>>
> >>>>>> On Tue, Apr 29, 2014 at 1:22 PM, stevenliang wrote:
> >>>>>>
> >>>>>>> I met your situation before. Finally I rolled back to 4.2
> >>>>>>>
> >>>>>>> On 29/04/14 04:18 PM, Ian Young wrote:
> >>>>>>>
> >>>>>>>> I destroyed the old virtual router and was able to create a new one by
> >>>>>>>> adding a new instance. However, this new router also failed to start,
> >>>>>>>> citing the same error. After that, the expungement delay elapsed and the
> >>>>>>>> virtual router was expunged, so now I have none.
> >>>>>>>>
> >>>>>>>> On Mon, Apr 28, 2014 at 8:52 PM, Ian Young <iyo...@ratespecial.com> wrote:
> >>>>>>>>
> >>>>>>>>> I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:
> >>>>>>>>>
> >>>>>>>>> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
> >>>>>>>>>
> >>>>>>>>> At the last step, I tried to restart the system VMs. The virtual router
> >>>>>>>>> failed to start. Here is the message that was displayed in the web UI:
> >>>>>>>>>
> >>>>>>>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying
> >>>>>>>>>
> >>>>>>>>> I tried running the script to restart the VMs but this time it failed to
> >>>>>>>>> start the console proxy:
> >>>>>>>>>
> >>>>>>>>> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 1 secondary storage vm(s)...
> >>>>>>>>> Done stopping and starting secondary storage vm(s)
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 1 console proxy vm(s)...
> >>>>>>>>> ERROR: Failed to start console proxy vm with id 2
> >>>>>>>>>
> >>>>>>>>> Done stopping and starting console proxy vm(s) .
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 0 running routing vm(s)...
> >>>>>>>>>
> >>>>>>>>> Is there a way to wipe the system VMs out and start over?
Re: failed to start virtual router
Did rolling back to 4.2 fix the problem?

On Tue, Apr 29, 2014 at 1:22 PM, stevenliang wrote:

> I met your situation before. Finally I rolled back to 4.2
>
> On 29/04/14 04:18 PM, Ian Young wrote:
>
>> I destroyed the old virtual router and was able to create a new one by
>> adding a new instance. However, this new router also failed to start,
>> citing the same error. After that, the expungement delay elapsed and the
>> virtual router was expunged, so now I have none.
>>
>> On Mon, Apr 28, 2014 at 8:52 PM, Ian Young wrote:
>>
>>> I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:
>>>
>>> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
>>>
>>> At the last step, I tried to restart the system VMs. The virtual router
>>> failed to start. Here is the message that was displayed in the web UI:
>>>
>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying
>>>
>>> I tried running the script to restart the VMs but this time it failed to
>>> start the console proxy:
>>>
>>> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
>>>
>>> Stopping and starting 1 secondary storage vm(s)...
>>> Done stopping and starting secondary storage vm(s)
>>>
>>> Stopping and starting 1 console proxy vm(s)...
>>> ERROR: Failed to start console proxy vm with id 2
>>>
>>> Done stopping and starting console proxy vm(s) .
>>>
>>> Stopping and starting 0 running routing vm(s)...
>>>
>>> Is there a way to wipe the system VMs out and start over?
Re: failed to start virtual router
I destroyed the old virtual router and was able to create a new one by adding a new instance. However, this new router also failed to start, citing the same error. After that, the expungement delay elapsed and the virtual router was expunged, so now I have none.

On Mon, Apr 28, 2014 at 8:52 PM, Ian Young wrote:

> I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:
>
> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
>
> At the last step, I tried to restart the system VMs. The virtual router
> failed to start. Here is the message that was displayed in the web UI:
>
> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying
>
> I tried running the script to restart the VMs but this time it failed to
> start the console proxy:
>
> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
>
> Stopping and starting 1 secondary storage vm(s)...
> Done stopping and starting secondary storage vm(s)
>
> Stopping and starting 1 console proxy vm(s)...
> ERROR: Failed to start console proxy vm with id 2
>
> Done stopping and starting console proxy vm(s) .
>
> Stopping and starting 0 running routing vm(s)...
>
> Is there a way to wipe the system VMs out and start over?
failed to start virtual router
I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3

At the last step, I tried to restart the system VMs. The virtual router failed to start. Here is the message that was displayed in the web UI:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying

I tried running the script to restart the VMs but this time it failed to start the console proxy:

[root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a

Stopping and starting 1 secondary storage vm(s)...
Done stopping and starting secondary storage vm(s)

Stopping and starting 1 console proxy vm(s)...
ERROR: Failed to start console proxy vm with id 2

Done stopping and starting console proxy vm(s) .

Stopping and starting 0 running routing vm(s)...

Is there a way to wipe the system VMs out and start over?
Re: SSVM Public IP not pinging
What about the SSVM's link-local address? Can you ping that? It should begin with 169.254.

On Feb 23, 2014 9:41 PM, "Tejas Gadaria" wrote:

> I have created a basic zone, with no security group, on CS 4.0.2 and the
> hypervisor is XenServer. The SSVM and CPVM are in the running state but I am
> not able to ping the SSVM public IP. It says "Destination Host Unreachable".
> So I am not able to download the CentOS template and hence not able to
> create any guest VM.
>
> need help.
>
> Regards,
> Tejas
Re: Management node is detected inactive by timestamp but is pingable
I found this article, which showed me how to clear the never-ending expungement:

http://support.citrix.com/article/CTX139482

After clearing it, I was able to launch a new instance with the same name. However, the "inactive management node" notices continue to fill the logs.

On Fri, Feb 21, 2014 at 10:18 AM, Ian Young wrote:

> All the records in the mshost table have a NULL value in the 'removed'
> column (and there are more of them now since I've restarted the service a
> few times). These should have a timestamp instead, right? If so, what
> should I do about this?
> http://pastebin.com/Vz3XnBqx
>
> My expunge.delay and expunge.interval values are set to 24 hours but this
> instance has been in that state for over 48 hours so far. This is the
> record in vm_instance for the problematic instance:
> http://pastebin.com/kZVDy1My
>
> As you can see, the 'removed' value is NULL. If I set it to a timestamp,
> will it go away or are there other references to it in the database?
> CloudStack won't let me create another instance with the same name until
> this one has been completely removed, and this particular name (web01) is a
> rather useful one for my purposes.
>
> On Fri, Feb 21, 2014 at 1:36 AM, Geoff Higginbottom <geoff.higginbot...@shapeblue.com> wrote:
>
>> Ian,
>>
>> When you delete a VM, it eventually gets expunged (deleted); the time
>> this actually takes is controlled by the global settings 'expunge.delay'
>> and 'expunge.interval'
>>
>> Once the VM has been expunged its state in the DB table vm_instance will
>> be 'expunging' and there will be a date/time in the 'removed' column (there
>> it is again!)
>>
>> Regards
>>
>> Geoff Higginbottom
>>
>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>>
>> geoff.higginbot...@shapeblue.com
>>
>> -----Original Message-----
>> From: Ian Young [mailto:iyo...@ratespecial.com]
>> Sent: 20 February 2014 19:37
>> To: users@cloudstack.apache.org
>> Subject: Re: Management node is detected inactive by timestamp but is pingable
>>
>> Restarting cloudstack-agent and cloudstack-management made the inactive
>> management node notice go away but the instance is still stuck in an
>> "expunging" state. How can I get rid of it?
>>
>> On Thu, Feb 20, 2014 at 10:55 AM, Ian Young wrote:
>>
>> > I noticed that there are 12 records in the cloud.mshost table, all of
>> > which have an "Up" state. I only have one management server. Should
>> > I delete the other 11 records?
>> >
>> > On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
>> >
>> >> Yesterday I tried to start an existing instance but it failed. Since
>> >> it was basically a brand new installation, I just decided to destroy
>> >> it and start over. However, it stayed in an "expunging" state and
>> >> remains so today. I cannot create new instances now. The
>> >> management-server.log shows numerous messages like this:
>> >>
>> >> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
>> >> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
>> >> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable
>> >>
>> >> The management server and the hypervisor host are the same machine
>> >> (budgetary constraints necessitated this setup) so, obviously, it
>> >> should be able to connect to itself. What is this timestamp it's
>> >> referring to? Is it simply a matter of updating this so the
>> >> management server is no longer considered inactive?
>>
>> Need Enterprise Grade Support for Apache CloudStack?
>> Our CloudStack Infrastructure Support<http://shapeblue.com/cloudstack-infrastructure-support/> offers the best
>> 24/7 SLA for CloudStack Environments.
>>
>> Apache CloudStack Bootcamp training courses
>>
>> **NEW!** CloudStack 4.2.1 training<http://shapeblue.com/cloudstack-training/>
>> 18th-19th February 2014, Brazil. Classroom<http://shapeblue.com
Re: Management node is detected inactive by timestamp but is pingable
All the records in the mshost table have a NULL value in the 'removed' column (and there are more of them now since I've restarted the service a few times). These should have a timestamp instead, right? If so, what should I do about this?

http://pastebin.com/Vz3XnBqx

My expunge.delay and expunge.interval values are set to 24 hours but this instance has been in that state for over 48 hours so far. This is the record in vm_instance for the problematic instance:

http://pastebin.com/kZVDy1My

As you can see, the 'removed' value is NULL. If I set it to a timestamp, will it go away or are there other references to it in the database? CloudStack won't let me create another instance with the same name until this one has been completely removed, and this particular name (web01) is a rather useful one for my purposes.

On Fri, Feb 21, 2014 at 1:36 AM, Geoff Higginbottom <geoff.higginbot...@shapeblue.com> wrote:

> Ian,
>
> When you delete a VM, it eventually gets expunged (deleted); the time this
> actually takes is controlled by the global settings 'expunge.delay' and
> 'expunge.interval'
>
> Once the VM has been expunged its state in the DB table vm_instance will
> be 'expunging' and there will be a date/time in the 'removed' column (there
> it is again!)
>
> Regards
>
> Geoff Higginbottom
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
>
> -----Original Message-----
> From: Ian Young [mailto:iyo...@ratespecial.com]
> Sent: 20 February 2014 19:37
> To: users@cloudstack.apache.org
> Subject: Re: Management node is detected inactive by timestamp but is pingable
>
> Restarting cloudstack-agent and cloudstack-management made the inactive
> management node notice go away but the instance is still stuck in an
> "expunging" state. How can I get rid of it?
>
> On Thu, Feb 20, 2014 at 10:55 AM, Ian Young wrote:
>
> > I noticed that there are 12 records in the cloud.mshost table, all of
> > which have an "Up" state. I only have one management server. Should
> > I delete the other 11 records?
> >
> > On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
> >
> >> Yesterday I tried to start an existing instance but it failed. Since
> >> it was basically a brand new installation, I just decided to destroy
> >> it and start over. However, it stayed in an "expunging" state and
> >> remains so today. I cannot create new instances now. The
> >> management-server.log shows numerous messages like this:
> >>
> >> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
> >> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
> >> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable
> >>
> >> The management server and the hypervisor host are the same machine
> >> (budgetary constraints necessitated this setup) so, obviously, it
> >> should be able to connect to itself. What is this timestamp it's
> >> referring to? Is it simply a matter of updating this so the
> >> management server is no longer considered inactive?
Re: Management node is detected inactive by timestamp but is pingable
Restarting cloudstack-agent and cloudstack-management made the inactive management node notice go away but the instance is still stuck in an "expunging" state. How can I get rid of it?

On Thu, Feb 20, 2014 at 10:55 AM, Ian Young wrote:

> I noticed that there are 12 records in the cloud.mshost table, all of
> which have an "Up" state. I only have one management server. Should I
> delete the other 11 records?
>
> On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
>
>> Yesterday I tried to start an existing instance but it failed. Since it
>> was basically a brand new installation, I just decided to destroy it and
>> start over. However, it stayed in an "expunging" state and remains so
>> today. I cannot create new instances now. The management-server.log shows
>> numerous messages like this:
>>
>> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
>> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
>> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable
>>
>> The management server and the hypervisor host are the same machine
>> (budgetary constraints necessitated this setup) so, obviously, it should be
>> able to connect to itself. What is this timestamp it's referring to? Is
>> it simply a matter of updating this so the management server is no longer
>> considered inactive?
Re: Management node is detected inactive by timestamp but is pingable
I noticed that there are 12 records in the cloud.mshost table, all of which have an "Up" state. I only have one management server. Should I delete the other 11 records?

On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:

> Yesterday I tried to start an existing instance but it failed. Since it
> was basically a brand new installation, I just decided to destroy it and
> start over. However, it stayed in an "expunging" state and remains so
> today. I cannot create new instances now. The management-server.log shows
> numerous messages like this:
>
> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
> 2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable
>
> The management server and the hypervisor host are the same machine
> (budgetary constraints necessitated this setup) so, obviously, it should be
> able to connect to itself. What is this timestamp it's referring to? Is
> it simply a matter of updating this so the management server is no longer
> considered inactive?
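Before deleting anything from cloud.mshost, it may help to list which rows still look live. The sketch below is only an illustration of that check: it builds a toy mshost table in an in-memory SQLite database rather than connecting to the real MySQL `cloud` database, the sample rows are made up, and the column name `state` is an assumption (only the "Up" value and the 'removed' column are mentioned in this thread).

```python
import sqlite3

# Toy stand-in for cloud.mshost; the real table lives in MySQL and has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mshost (id INTEGER PRIMARY KEY, state TEXT, removed TEXT)"
)

# Made-up rows: one record already marked removed, two still marked Up with no timestamp.
conn.executemany(
    "INSERT INTO mshost (id, state, removed) VALUES (?, ?, ?)",
    [
        (10, "Up", "2014-02-19 09:00:00"),
        (11, "Up", None),
        (12, "Up", None),
    ],
)

# Rows that are Up and have never been marked removed.
live = conn.execute(
    "SELECT id FROM mshost WHERE state = 'Up' AND removed IS NULL ORDER BY id"
).fetchall()
print(live)
```

With a single management server, more than one row coming back from a query like this would match the symptom described in the message above.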
Management node is detected inactive by timestamp but is pingable
Yesterday I tried to start an existing instance but it failed. Since it was basically a brand new installation, I just decided to destroy it and start over. However, it stayed in an "expunging" state and remains so today. I cannot create new instances now. The management-server.log shows numerous messages like this:

2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable

The management server and the hypervisor host are the same machine (budgetary constraints necessitated this setup) so, obviously, it should be able to connect to itself. What is this timestamp it's referring to? Is it simply a matter of updating this so the management server is no longer considered inactive?
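To see how often each management node id is being flagged, the notices quoted above can be counted per node. This is a minimal sketch that assumes every log entry sits on a single line and uses the exact wording from the excerpt; the function name `count_inactive_notices` is made up for illustration.

```python
import re
from collections import Counter

# Wording copied from the heartbeat notice in the log excerpt above.
INACTIVE_RE = re.compile(
    r"Management node (\d+) is detected inactive by timestamp but is pingable"
)


def count_inactive_notices(lines):
    """Return a Counter mapping management node id to the number of notices."""
    counts = Counter()
    for line in lines:
        match = INACTIVE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The three log entries from the message above, one per line.
sample = [
    "2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] "
    "(Cluster-Heartbeat-1:null) Detected management node left, id:11, "
    "nodeIP:192.168.100.6",
    "2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] "
    "(Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6",
    "2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] "
    "(Cluster-Heartbeat-1:null) Management node 11 is detected inactive by "
    "timestamp but is pingable",
]

print(count_inactive_notices(sample))
```

To run it against a real log, pass an open file object for management-server.log instead of `sample`; the path to that file depends on the installation.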
Re: Change of guest IP address
> > On 19-Dec-2013, at 3:58 PM, Andrei Mikhailovsky wrote:
> >
> > Do you know if there is an easier way? Like via the api calls or the
> > cloudmonkey command? Or is it currently the only way?
> >
> > ----- Original Message -----
> > From: "Jayapal Reddy Uradi"
> > Sent: Thursday, 19 December, 2013 9:25:05 AM
> > Subject: Re: Change of guest IP address
> >
> > Hi,
> >
> > If your VM is in isolated network please do the following
> >
> > 1. edit the nics table ip4_address column for your instance_id to new ip.
> > 2. login to the router corresponds to the network and replace old ip with new ip in below files.
> >    a. /var/lib/misc/dnsmasq.leases
> >    b. /etc/dhcphosts.txt
> > 3. restart the dnsmasq in router (service dnsmasq restart)
> > 4. Reboot the VM or restart the network service in Vm so that VM gets the new ip from the dhcp.
> >
> > Thanks,
> > Jayapal

I put Jayapal's solution into a script for convenience:

http://pastebin.com/7yJtjNQX

Just edit the first group of variables according to your needs and run it like this:

set-vm-ip.sh old-address new-address
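One detail of step 2 that is easy to get wrong is the text replacement itself: a plain search-and-replace for 10.1.1.5 would also rewrite 10.1.1.50. The sketch below is not the pastebin script; it is a small illustration of replacing an address only where it appears as a complete IP. The function name `replace_ip` and the sample file content are made up, and the sample line layout is only an assumed example of what a dhcphosts-style file might contain.

```python
import re


def replace_ip(text, old_ip, new_ip):
    """Replace old_ip with new_ip only where old_ip is a complete address."""
    # The lookarounds reject matches that are part of a longer address,
    # e.g. 10.1.1.5 inside 10.1.1.50 or inside 110.1.1.5.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?![\d.])")
    return pattern.sub(lambda _match: new_ip, text)


# Made-up sample content; the second entry must stay untouched.
sample = (
    "02:00:1c:c8:00:03,10.1.1.5,web01,infinite\n"
    "02:00:1c:c8:00:04,10.1.1.50,web02,infinite\n"
)

updated = replace_ip(sample, "10.1.1.5", "10.1.1.9")
print(updated)
```

The database update in step 1 and the dnsmasq restart in step 3 are not covered here; they still have to be done as Jayapal describes.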